[
https://issues.apache.org/jira/browse/IMPALA-8722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong resolved IMPALA-8722.
-----------------------------------
Resolution: Cannot Reproduce
> test_hbase_col_filter failure
> -----------------------------
>
> Key: IMPALA-8722
> URL: https://issues.apache.org/jira/browse/IMPALA-8722
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 3.3.0
> Reporter: Bikramjeet Vig
> Assignee: Vihang Karajgaonkar
> Priority: Critical
> Labels: broken-build, flaky
>
> test_hbase_col_filter failed with the following exec params in one of the
> exhaustive runs:
> {noformat}
> query_test.test_hbase_queries.TestHBaseQueries.test_hbase_col_filter[protocol:
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> hbase/none]
> {noformat}
> From the logs it seems that the insert query completed around 23:27:42, while
> the invalidate metadata in Impala was run around 23:27:32.
> I am not sure whether this is just the log being buffered and written later,
> but if the insert actually finished after the invalidate metadata, then Impala
> probably did not pick up the necessary file data, so the query that expected 3
> rows returned none. Note: the insert is run in Hive using
> "self.run_stmt_in_hive(add_data, username='hdfs')".
> HiveServer2 logs:
> {noformat}
> 2019-06-26T23:27:42,456 INFO [LocalJobRunner Map Task Executor #0]
> exec.TableScanOperator: Initializing operator TS[0]
> 2019-06-26T23:27:42,456 INFO [LocalJobRunner Map Task Executor #0]
> exec.SelectOperator: Initializing operator SEL[1]
> 2019-06-26T23:27:42,456 INFO [LocalJobRunner Map Task Executor #0]
> exec.SelectOperator: SELECT
> struct<tmp_values_col1:string,tmp_values_col2:string>
> 2019-06-26T23:27:42,456 INFO [LocalJobRunner Map Task Executor #0]
> exec.FileSinkOperator: Initializing operator FS[2]
> 2019-06-26T23:27:42,465 INFO [LocalJobRunner Map Task Executor #0]
> hadoop.InternalParquetRecordReader: block read in memory in 17 ms. row count
> = 2133979
> 2019-06-26T23:27:42,469 INFO [LocalJobRunner Map Task Executor #0]
> exec.FileSinkOperator: Using serializer : class
> org.apache.hadoop.hive.hbase.HBaseSerDe[[:key,cf:c:[k, c]:[string, string]]]
> and formatter :
> org.apache.hadoop.hive.hbase.HiveHBaseTableOutputFormat@c73c2ac
> 2019-06-26T23:27:42,471 INFO [LocalJobRunner Map Task Executor #0]
> exec.FileSinkOperator: New Final Path: FS
> hdfs://localhost:20500/test-warehouse/test_hbase_col_filter_2598223d.db/_tmp.hbase_col_filter_testkeyx/000000_0
> 2019-06-26T23:27:42,479 ERROR [LocalJobRunner Map Task Executor #0]
> hadoop.ParquetRecordReader: Can not initialize counter due to context is not
> a instance of TaskInputOutputContext, but is
> org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl
> 2019-06-26T23:27:42,482 INFO [LocalJobRunner Map Task Executor #0]
> hadoop.InternalParquetRecordReader: RecordReader initialized will read a
> total of 2142543 records.
> 2019-06-26T23:27:42,482 INFO [LocalJobRunner Map Task Executor #0]
> hadoop.InternalParquetRecordReader: at row 0. reading next block
> 2019-06-26T23:27:42,496 INFO [ReadOnlyZKClient-localhost:2181@0x4d49ce4e]
> zookeeper.ZooKeeper: Initiating client connection,
> connectString=localhost:2181 sessionTimeout=90000
> watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$47/372191955@532dae72
> 2019-06-26T23:27:42,497 INFO
> [ReadOnlyZKClient-localhost:2181@0x4d49ce4e-SendThread(localhost:2181)]
> zookeeper.ClientCnxn: Opening socket connection to server
> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL
> (unknown error)
> 2019-06-26T23:27:42,498 INFO
> [ReadOnlyZKClient-localhost:2181@0x4d49ce4e-SendThread(localhost:2181)]
> zookeeper.ClientCnxn: Socket connection established, initiating session,
> client: /0:0:0:0:0:0:0:1:55090, server: localhost/0:0:0:0:0:0:0:1:2181
> 2019-06-26T23:27:42,499 INFO
> [ReadOnlyZKClient-localhost:2181@0x4d49ce4e-SendThread(localhost:2181)]
> zookeeper.ClientCnxn: Session establishment complete on server
> localhost/0:0:0:0:0:0:0:1:2181, sessionid = 0x16b96a782de00c5, negotiated
> timeout = 90000
> 2019-06-26T23:27:42,503 INFO [LocalJobRunner Map Task Executor #0]
> exec.FileSinkOperator: FS[2]: records written - 1
> 2019-06-26T23:27:42,504 INFO [LocalJobRunner Map Task Executor #0]
> exec.MapOperator: MAP[0]: records read - 1
> 2019-06-26T23:27:42,504 INFO [LocalJobRunner Map Task Executor #0]
> exec.MapOperator: MAP[0]: Total records read - 1. abort - false
> 2019-06-26T23:27:42,504 INFO [LocalJobRunner Map Task Executor #0]
> exec.MapOperator: DESERIALIZE_ERRORS:0, RECORDS_IN:1,
> 2019-06-26T23:27:42,504 INFO [LocalJobRunner Map Task Executor #0]
> exec.FileSinkOperator: FS[2]: records written - 1
> 2019-06-26T23:27:42,511 INFO [SpillThread] mapred.MapTask: Finished spill 3
> 2019-06-26T23:27:42,514 INFO [LocalJobRunner Map Task Executor #0]
> exec.FileSinkOperator: TOTAL_TABLE_ROWS_WRITTEN:1,
> RECORDS_OUT_1_test_hbase_col_filter_2598223d.hbase_col_filter_testkeyx:1,
> 2019-06-26T23:27:42,517 INFO [LocalJobRunner Map Task Executor #0]
> mapred.Task: Task:attempt_local258751042_0023_m_000000_0 is done. And is in
> the process of committing
> 2019-06-26T23:27:42,518 INFO [LocalJobRunner Map Task Executor #0]
> mapred.Task: Task 'attempt_local258751042_0023_m_000000_0' done.
> 2019-06-26T23:27:42,518 INFO [LocalJobRunner Map Task Executor #0]
> mapred.Task: Final Counters for attempt_local258751042_0023_m_000000_0:
> Counters: 26
> File System Counters
>   FILE: Number of bytes read=823764465
>   FILE: Number of bytes written=1184809057
>   FILE: Number of read operations=0
>   FILE: Number of large read operations=0
>   FILE: Number of write operations=0
>   HDFS: Number of bytes read=49758306
>   HDFS: Number of bytes written=1721
>   HDFS: Number of read operations=2308
>   HDFS: Number of large read operations=0
>   HDFS: Number of write operations=907
>   HDFS: Number of bytes read erasure-coded=0
> Map-Reduce Framework
>   Map input records=1
>   Map output records=0
>   Input split bytes=276
>   Spilled Records=0
>   Failed Shuffles=0
>   Merged Map outputs=0
>   GC time elapsed (ms)=0
>   Total committed heap usage (bytes)=1310720000
> HIVE
>   CREATED_FILES=1
>   DESERIALIZE_ERRORS=0
>   RECORDS_IN=1
>   RECORDS_OUT_1_test_hbase_col_filter_2598223d.hbase_col_filter_t=1
>   TOTAL_TABLE_ROWS_WRITTEN=1
> File Input Format Counters
>   Bytes Read=0
> File Output Format Counters
>   Bytes Written=0
> {noformat}
> Failed test log:
> {noformat}
> Stacktrace
> query_test/test_hbase_queries.py:91: in test_hbase_col_filter
> self.run_test_case('QueryTest/hbase-col-filter', vector, unique_database)
> common/impala_test_suite.py:592: in run_test_case
> self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:431: in __verify_results_and_errors
> replace_filenames_with_placeholder)
> common/test_result_verifier.py:449: in verify_raw_results
> VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:271: in verify_query_result_is_equal
> assert expected_results == actual_results
> E assert Comparing QueryTestResults (expected vs actual):
> E 3 != 0
> Standard Error
> SET
> client_identifier=query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab;
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_hbase_col_filter_2598223d` CASCADE;
> -- 2019-06-26 23:27:31,609 INFO MainThread: Started query
> d342e8d640a4497a:27d6b2ff00000000
> SET
> client_identifier=query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab;
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_hbase_col_filter_2598223d`;
> -- 2019-06-26 23:27:32,524 INFO MainThread: Started query
> 6245118433382eee:491c3c5f00000000
> -- 2019-06-26 23:27:32,525 INFO MainThread: Created database
> "test_hbase_col_filter_2598223d" for test ID
> "query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:
> beeswax | exec_option: {'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> hbase/none]"
> -- executing against localhost:21000
> invalidate metadata test_hbase_col_filter_2598223d.hbase_col_filter_testkeyx;
> -- 2019-06-26 23:27:43,606 INFO MainThread: Started query
> 8042663402a431b0:d18c468900000000
> SET
> client_identifier=query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab;
> -- executing against localhost:21000
> use test_hbase_col_filter_2598223d;
> -- 2019-06-26 23:27:43,613 INFO MainThread: Started query
> b34ba175430fa8ab:d338ded600000000
> SET
> client_identifier=query_test/test_hbase_queries.py::TestHBaseQueries::()::test_hbase_col_filter[protocol:beeswax|exec_option:{'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_threshold':0}|tab;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> select count(*) from hbase_col_filter_testkeyx
> WHERE k != 'row1';
> -- 2019-06-26 23:27:47,629 INFO MainThread: Started query
> 564e1694a90e24b3:2e5ccc1f00000000
> -- 2019-06-26 23:27:47,703 ERROR MainThread: Comparing QueryTestResults
> (expected vs actual):
> 3 != 0
> {noformat}