[ https://issues.apache.org/jira/browse/IMPALA-13887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joe McDonnell resolved IMPALA-13887.
------------------------------------
    Fix Version/s: Impala 5.0.0
       Resolution: Fixed

> TestParquet.test_resolution_by_name fails with tuple caching enabled
> --------------------------------------------------------------------
>
>                 Key: IMPALA-13887
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13887
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 5.0.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Critical
>             Fix For: Impala 5.0.0
>
>
> When running TestParquet.test_resolution_by_name with tuple caching enabled,
> it fails with a correctness issue:
> {noformat}
> TestParquet.test_resolution_by_name[protocol: beeswax | table_format: parquet/none | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0, 'disable_codegen_rows_threshold': 0, 'disable_codegen': True, 'abort_on_error': 1, 'debug_action': 'HDFS_SCANNER_THREAD_CHECK_SOFT_MEM_LIMIT:FAIL@0.5', 'exec_single_node_rows_threshold': 0}]
> [gw0] linux2 -- Python 2.7.16 /home/joemcdonnell/upstream/Impala/bin/../infra/python/env-gcc10.4.0/bin/python
> query_test/test_scanners.py:1052: in test_resolution_by_name
>     use_db=unique_database)
> common/impala_test_suite.py:904: in run_test_case
>     self.__verify_results_and_errors(vector, test_section, result, use_db)
> common/impala_test_suite.py:737: in __verify_results_and_errors
>     replace_filenames_with_placeholder)
> common/test_result_verifier.py:523: in verify_raw_results
>     VERIFIER_MAP[verifier](expected, actual)
> common/test_result_verifier.py:305: in verify_query_result_is_equal
>     assert expected_results == actual_results
> E   assert Comparing QueryTestResults (expected vs actual):
> E     'NULL' == 'NULL'
> E     'NULL' == 'NULL'
> E     'NULL' == 'NULL'
> E     'NULL' == 'NULL'
> E     'NULL' == 'NULL'
> E     'NULL' != 'aaa'
> E     'NULL' != 'aaa'
> E     'NULL' != 'bbb'
> E     'NULL' != 'bbb'
> E     'NULL' != 'c'
> E     'NULL' != 'c'
> E     'NULL' != 'nonnullable'
> {noformat}
> The test alters a table to change the name of a column, which actually
> changes the meaning of the statement when using
> parquet_fallback_schema_resolution=name. The issue is that the cache key
> doesn't contain the actual column names. These are the SQLs:
> {noformat}
> select tmp.f from nested_resolution_by_name_test.nested_struct.c.d.item tmp;
> # Renames 'f' to 'renamed'
> alter table nested_resolution_by_name_test change nested_struct nested_struct struct<b: array<int>, a: int, c: struct<d: array<array<struct<renamed: string>>>>>;
> select tmp.renamed from nested_resolution_by_name_test.nested_struct.c.d.item tmp;
> {noformat}
> The cache key should incorporate the column/field names.
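
The collision described in the issue can be illustrated with a small sketch. This is not Impala's cache-key code: the functions `positional_key` and `named_key`, the path representation, and the field positions are all hypothetical, chosen only to show why a key built without field names cannot tell the two queries apart, while a key that includes them can.

```python
import hashlib

# Hypothetical model: a slot path is a sequence of (position, name) pairs.
# The positions below are illustrative; the point is that renaming a field
# changes its name but not its position.
TABLE = "nested_resolution_by_name_test"
BEFORE = ((0, "nested_struct"), (2, "c"), (0, "d"), (0, "item"), (0, "f"))
AFTER = ((0, "nested_struct"), (2, "c"), (0, "d"), (0, "item"), (0, "renamed"))


def _digest(parts):
    # Join with a separator that cannot appear in identifiers so that
    # distinct part lists never serialize to the same byte string.
    return hashlib.sha256("\x00".join(parts).encode("utf-8")).hexdigest()


def positional_key(table, path):
    """Key built from the table and field positions only (assumed buggy form)."""
    return _digest([table] + [str(pos) for pos, _name in path])


def named_key(table, path):
    """Key built from the table, field positions, and field names."""
    parts = [table]
    for pos, name in path:
        parts.extend([str(pos), name])
    return _digest(parts)


# Without names, the query on 'f' and the query on 'renamed' share a key,
# so a result cached for one could be returned for the other.
same_without_names = positional_key(TABLE, BEFORE) == positional_key(TABLE, AFTER)

# With names, the rename produces a different key.
differ_with_names = named_key(TABLE, BEFORE) != named_key(TABLE, AFTER)
```

Under parquet_fallback_schema_resolution=name the field name decides which Parquet column is read, so in this model the name has to be part of the key for the two statements to be cached separately.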