[
https://issues.apache.org/jira/browse/IMPALA-7335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16562750#comment-16562750
]
Pooja Nilangekar edited comment on IMPALA-7335 at 7/31/18 6:53 PM:
-------------------------------------------------------------------
This appears to be a flaky issue. From the cluster logs, it was not clear what
the cause of failure was.
Since the file is not generated on the fly, the failure can't be due to errors
while generating it. Additionally, the code path that detects invalid metadata
is deterministic and should be hit whenever the file contains fewer values
than the column metadata states. According to the cluster logs, the test failed
because all fragments completed successfully, so the expected exception was
never raised.
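To illustrate why a successful query counts as a test failure here, the
following is a minimal sketch of an "expected exception" check. It is not the
actual code in common/impala_test_suite.py; the function name
run_expecting_error and the execute callable are hypothetical and only mirror
the assertion message seen in the stack trace.

```python
def run_expecting_error(execute, query, expected_str):
    """Hypothetical stand-in for a harness check that expects a query to fail.

    `execute` is any callable that runs the query and raises on error.
    """
    try:
        execute(query)
    except Exception as e:
        # The query failed, as expected: verify it failed for the right reason.
        assert expected_str in str(e), "Unexpected exception: %s" % e
        return str(e)
    # Reached only if the query completed successfully, which is the
    # situation reported in this issue.
    raise AssertionError("Expected exception: %s" % expected_str)
```

With this shape, a scan that never reaches the metadata check returns
normally and the harness reports "Expected exception: ..." rather than an
error from the scanner itself.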
I tried to reproduce the error by running the test in an infinite loop with the
same build parameters. Over a 24-hour run, it completed more than 10,000
successful iterations without a single failure. We can wait and see whether the
issue reproduces in the next week or so. If not, we can close it until it
reappears.
was (Author: poojanilangekar):
This appears to be a flaky issue. From the cluster logs, it was not clear what
the cause of failure was.
Since the file is not generated on the fly, it can't be a case of failure due
to errors while generating the file. Additionally, the code path to determine
invalid metadata is deterministic and should be hit as long as the file
contains fewer values than expected. In the cluster logs, the test failed
because all fragments completed successfully.
I tried to recreate the error by running the test in an infinite loop with the
same build parameters. As of now, it has had over 3700 successful iterations
without any failure. We can wait and see if the issue is reproduced in the next
week or so. If not, we can close it until it reappears.
> Assertion Failure - test_corrupt_files
> --------------------------------------
>
> Key: IMPALA-7335
> URL: https://issues.apache.org/jira/browse/IMPALA-7335
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 3.1.0
> Reporter: nithya
> Assignee: Pooja Nilangekar
> Priority: Critical
> Labels: broken-build
>
> test_corrupt_files fails
>
> query_test.test_scanners.TestParquet.test_corrupt_files[exec_option:
> {'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0,
> 'disable_codegen': False, 'abort_on_error': 1, 'debug_action': None,
> 'exec_single_node_rows_threshold': 0} | table_format: parquet/none] (from
> pytest)
>
> {code:java}
> Error Message
> query_test/test_scanners.py:300: in test_corrupt_files
> self.run_test_case('QueryTest/parquet-abort-on-error', vector)
> common/impala_test_suite.py:420: in run_test_case
> assert False, "Expected exception: %s" % expected_str
> E AssertionError: Expected exception: Column metadata states there are 11
> values, but read 10 values from column id.
> STACKTRACE
> query_test/test_scanners.py:300: in test_corrupt_files
> self.run_test_case('QueryTest/parquet-abort-on-error', vector)
> common/impala_test_suite.py:420: in run_test_case
> assert False, "Expected exception: %s" % expected_str
> E AssertionError: Expected exception: Column metadata states there are 11
> values, but read 10 values from column id.
> Standard Error
> -- executing against localhost:21000
> use functional_parquet;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=0;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id, cnt from bad_column_metadata t, (select count(*) cnt from
> t.int_array) v;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id from bad_column_metadata;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> -- executing against localhost:21000
> SELECT * from bad_parquet_strings_negative_len;
> -- executing against localhost:21000
> SELECT * from bad_parquet_strings_out_of_bounds;
> -- executing against localhost:21000
> use functional_parquet;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id, cnt from bad_column_metadata t, (select count(*) cnt from
> t.int_array) v;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> -- executing against localhost:21000
> set num_nodes=1;
> -- executing against localhost:21000
> set num_scanner_threads=1;
> -- executing against localhost:21000
> select id from bad_column_metadata;
> -- executing against localhost:21000
> SET NUM_NODES="0";
> -- executing against localhost:21000
> SET NUM_SCANNER_THREADS="0";
> {code}
>
>