[ 
https://issues.apache.org/jira/browse/IMPALA-7335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16579113#comment-16579113
 ] 

ASF subversion and git services commented on IMPALA-7335:
---------------------------------------------------------

Commit 2e5df138aaf4354fd4ada69b627842c34fef2e05 in impala's branch 
refs/heads/master from poojanilangekar
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=2e5df13 ]

IMPALA-7335/IMPALA-7418: Add logs to HdfsScanNode to debug the issues

IMPALA-7335 and IMPALA-7418 are failing some builds on Jenkins.
However, there is no deterministic way to reproduce them
locally, so it is difficult to determine the cause of the
failures. From the existing logs, it appears that the status
generated by HdfsScanNode::ProcessSplit() is lost. The added
logging should help determine the conditions under which the
failures occur.

Change-Id: I68698c90031edc6ee8c31e9ce3d52dade9d8f6f1
Reviewed-on: http://gerrit.cloudera.org:8080/11174
Reviewed-by: Bikramjeet Vig <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Assertion Failure - test_corrupt_files
> --------------------------------------
>
>                 Key: IMPALA-7335
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7335
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 3.1.0
>            Reporter: nithya
>            Assignee: Pooja Nilangekar
>            Priority: Critical
>              Labels: broken-build
>
> test_corrupt_files fails 
>  
> query_test.test_scanners.TestParquet.test_corrupt_files[exec_option: 
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None, 
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from 
> pytest)
>  
> {code:java}
> Error Message
> query_test/test_scanners.py:300: in test_corrupt_files
>     self.run_test_case('QueryTest/parquet-abort-on-error', vector)
> common/impala_test_suite.py:420: in run_test_case
>     assert False, "Expected exception: %s" % expected_str
> E   AssertionError: Expected exception: Column metadata states there are 11 
> values, but read 10 values from column id.
> STACKTRACE
> query_test/test_scanners.py:300: in test_corrupt_files
>     self.run_test_case('QueryTest/parquet-abort-on-error', vector)
> common/impala_test_suite.py:420: in run_test_case
>     assert False, "Expected exception: %s" % expected_str
> E   AssertionError: Expected exception: Column metadata states there are 11 
> values, but read 10 values from column id.
> Standard Error
> -- executing against localhost:21000
> use functional_parquet;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=0;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id, cnt from bad_column_metadata t, (select count(*) cnt from 
> t.int_array) v;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id from bad_column_metadata;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> -- executing against localhost:21000
> SELECT * from bad_parquet_strings_negative_len;
> -- executing against localhost:21000
> SELECT * from bad_parquet_strings_out_of_bounds;
> -- executing against localhost:21000
> use functional_parquet;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id, cnt from bad_column_metadata t, (select count(*) cnt from 
> t.int_array) v;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id from bad_column_metadata;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> {code}
>  
>  



