[
https://issues.apache.org/jira/browse/IMPALA-11807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17651409#comment-17651409
]
ASF subversion and git services commented on IMPALA-11807:
----------------------------------------------------------
Commit 4a05eaf988f3a613ff86b934dd077c80070b4ca0 in impala's branch
refs/heads/master from noemi
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=4a05eaf98 ]
IMPALA-11807: Fix TestIcebergTable.test_avro_file_format and
test_mixed_file_format
Iceberg hardcodes URIs in metadata files. If the table was written
in a certain storage location and then moved to another file system,
the hardcoded URIs will still point to the old location instead of
the current one. Therefore Impala will be unable to read the table.
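
The mismatch can be seen by comparing the file system of the table
location with the file system of a URI stored in the metadata. The
following is a minimal sketch using the two URIs from the failure reported
in this issue; `same_filesystem` is a hypothetical helper written for
illustration and is not part of Impala or Iceberg.

```python
from urllib.parse import urlsplit


def same_filesystem(table_location, metadata_uri):
    """Return True if both URIs share the same scheme and authority.

    Hypothetical helper for illustration only; not part of Impala or Iceberg.
    """
    table = urlsplit(table_location)
    meta = urlsplit(metadata_uri)
    return (table.scheme, table.netloc) == (meta.scheme, meta.netloc)


# Both URIs are taken from the error message quoted in this issue.
table_location = ("s3a://impala-test-uswest2-2/test-warehouse/"
                  "functional_parquet.db/iceberg_avro_format")
snapshot_uri = ("hdfs://localhost:20500/test-warehouse/"
                "functional_parquet.db/iceberg_avro_format/metadata/"
                "snap-5594844384179945437-1-6b11ef63-7b9a-48a5-a448-7cc329eb85ec.avro")

# The table lives on S3, but its metadata still points at HDFS.
print(same_filesystem(table_location, snapshot_uri))  # False
```
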
TestIcebergTable.test_avro_file_format and test_mixed_file_format
use Hive from Impala to write tables. If the tables are created in
a different file system than the one they will be read from, the tests
fail due to the invalid URIs.
These two tests are now skipped when testing is not done on HDFS.
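
A conditional skip of this kind might look roughly like the sketch below.
It uses the standard-library `unittest` module to stay self-contained; the
environment variable name `TARGET_FILESYSTEM`, its `hdfs` default, the
helper `is_hdfs`, and the class name are assumptions made for
illustration, and the actual change may use different helpers.

```python
import os
import unittest


def is_hdfs(environ=os.environ):
    """Report whether the tests target HDFS.

    Assumption: the target file system is exposed through an environment
    variable named TARGET_FILESYSTEM that defaults to "hdfs".
    """
    return environ.get("TARGET_FILESYSTEM", "hdfs") == "hdfs"


# Reusable decorator: run the test only when the target file system is HDFS.
skip_unless_hdfs = unittest.skipUnless(
    is_hdfs(), "Iceberg metadata written by Hive hardcodes HDFS URIs")


class TestIcebergTableSketch(unittest.TestCase):
    @skip_unless_hdfs
    def test_avro_file_format(self):
        # Placeholder body; the real test runs QueryTest/iceberg-avro.
        pass

    @skip_unless_hdfs
    def test_mixed_file_format(self):
        # Placeholder body for the second affected test.
        pass
```

On a non-HDFS build the test runner reports both tests as skipped
instead of failing on the unreachable URIs.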
Also updated the data load schema of the two test tables created by Hive
and set LOCATION the same way as in the previous test tables. If this
later makes it possible to rewrite the URIs in the metadata, so that the
tables become accessible from another file system as well, then the
tests can be enabled again.
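
As a rough illustration of what rewriting a URI would involve, the sketch
below swaps a location prefix on a single string. The function
`rewrite_uri` is hypothetical, and the prefixes are the ones visible in
the failure reported in this issue. A real rewrite would also have to
update the URIs stored inside the Avro manifest files, which this sketch
does not attempt.

```python
def rewrite_uri(uri, old_prefix, new_prefix):
    """Replace old_prefix at the start of uri with new_prefix.

    URIs that do not start with old_prefix are returned unchanged.
    Hypothetical helper for illustration only.
    """
    if uri.startswith(old_prefix):
        return new_prefix + uri[len(old_prefix):]
    return uri


old_prefix = "hdfs://localhost:20500/test-warehouse"
new_prefix = "s3a://impala-test-uswest2-2/test-warehouse"

hdfs_uri = ("hdfs://localhost:20500/test-warehouse/"
            "functional_parquet.db/iceberg_avro_format")
print(rewrite_uri(hdfs_uri, old_prefix, new_prefix))
# s3a://impala-test-uswest2-2/test-warehouse/functional_parquet.db/iceberg_avro_format
```
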
Testing:
- Tested locally on an HDFS minicluster
- Triggered an Ozone build to verify that the tests are skipped on a
different file system
Change-Id: Ie2f126de80c6e7f825d02f6814fcf69ae320a781
Reviewed-on: http://gerrit.cloudera.org:8080/19387
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> TestIcebergTable.test_avro_file_format and
> TestIcebergTable.test_mixed_file_format failed
> -----------------------------------------------------------------------------------------
>
> Key: IMPALA-11807
> URL: https://issues.apache.org/jira/browse/IMPALA-11807
> Project: IMPALA
> Issue Type: Bug
> Components: Backend, Frontend
> Affects Versions: Impala 4.3.0
> Reporter: Wenzhe Zhou
> Assignee: Noemi Pap-Takacs
> Priority: Major
>
> TestIcebergTable.test_avro_file_format failed after merging patch
> IMPALA-11708 (Add support for mixed Iceberg tables with AVRO file format).
> {code:java}
> *Error Message*
> query_test/test_iceberg.py:906: in test_avro_file_format
> self.run_test_case('QueryTest/iceberg-avro', vector, unique_database)
> common/impala_test_suite.py:712: in run_test_case
> result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:650: in __exec_in_impala
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:986: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:212: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:189: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:365: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:359: in execute_query_async
> handle = self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:522: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E ImpalaBeeswaxException: ImpalaBeeswaxException:
> E INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
> E MESSAGE: AnalysisException: Failed to load metadata for table:
> 'functional_parquet.iceberg_avro_format'
> E CAUSED BY: TableLoadingException: IcebergTableLoadingException: Error
> loading metadata for Iceberg table
> s3a://impala-test-uswest2-2/test-warehouse/functional_parquet.db/iceberg_avro_format
> E CAUSED BY: RuntimeIOException: Failed to open input stream for file:
> hdfs://localhost:20500/test-warehouse/functional_parquet.db/iceberg_avro_format/metadata/snap-5594844384179945437-1-6b11ef63-7b9a-48a5-a448-7cc329eb85ec.avro
> E CAUSED BY: ConnectException: Call From
> impala-ec2-centos79-m6i-4xlarge-ondemand-1b22.vpc.cloudera.com/127.0.0.1 to
> localhost:20500 failed on connection exception: java.net.ConnectException:
> Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused
> E CAUSED BY: ConnectException: Connection refused
> *Stacktrace*
> query_test/test_iceberg.py:906: in test_avro_file_format
> self.run_test_case('QueryTest/iceberg-avro', vector, unique_database)
> common/impala_test_suite.py:712: in run_test_case
> result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:650: in __exec_in_impala
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:986: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:212: in execute
> return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:189: in execute
> handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:365: in __execute_query
> handle = self.execute_query_async(query_string, user=user)
> beeswax/impala_beeswax.py:359: in execute_query_async
> handle = self.__do_rpc(lambda: self.imp_service.query(query,))
> beeswax/impala_beeswax.py:522: in __do_rpc
> raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> E ImpalaBeeswaxException: ImpalaBeeswaxException:
> E INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
> E MESSAGE: AnalysisException: Failed to load metadata for table:
> 'functional_parquet.iceberg_avro_format'
> E CAUSED BY: TableLoadingException: IcebergTableLoadingException: Error
> loading metadata for Iceberg table
> s3a://impala-test-uswest2-2/test-warehouse/functional_parquet.db/iceberg_avro_format
> E CAUSED BY: RuntimeIOException: Failed to open input stream for file:
> hdfs://localhost:20500/test-warehouse/functional_parquet.db/iceberg_avro_format/metadata/snap-5594844384179945437-1-6b11ef63-7b9a-48a5-a448-7cc329eb85ec.avro
> E CAUSED BY: ConnectException: Call From
> impala-ec2-centos79-m6i-4xlarge-ondemand-1b22.vpc.cloudera.com/127.0.0.1 to
> localhost:20500 failed on connection exception: java.net.ConnectException:
> Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused
> E CAUSED BY: ConnectException: Connection refused
> *Standard Error*
> SET
> client_identifier=query_test/test_iceberg.py::TestIcebergTable::()::test_avro_file_format[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_thresho;
> SET sync_ddl=False;
> -- executing against localhost:21000
> DROP DATABASE IF EXISTS `test_avro_file_format_857d7a53` CASCADE;
> -- 2022-12-16 22:28:50,516 INFO MainThread: Started query
> e74f5b79e5a239bd:99af979600000000
> SET
> client_identifier=query_test/test_iceberg.py::TestIcebergTable::()::test_avro_file_format[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_thresho;
> SET sync_ddl=False;
> -- executing against localhost:21000
> CREATE DATABASE `test_avro_file_format_857d7a53`;
> -- 2022-12-16 22:28:57,091 INFO MainThread: Started query
> 934c3b1446ff501e:f7e9e85900000000
> -- 2022-12-16 22:28:57,394 INFO MainThread: Created database
> "test_avro_file_format_857d7a53" for test ID
> "query_test/test_iceberg.py::TestIcebergTable::()::test_avro_file_format[protocol:
> beeswax | exec_option: {'test_replan': 1, 'batch_size': 0, 'num_nodes': 0,
> 'disable_codegen_rows_threshold': 0, 'disable_codegen': False,
> 'abort_on_error': 1, 'exec_single_node_rows_threshold': 0} | table_format:
> parquet/none]"
> SET
> client_identifier=query_test/test_iceberg.py::TestIcebergTable::()::test_avro_file_format[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_thresho;
> -- executing against localhost:21000
> use test_avro_file_format_857d7a53;
> -- 2022-12-16 22:28:57,396 INFO MainThread: Started query
> b44407d013839e5e:7f22ab5400000000
> SET
> client_identifier=query_test/test_iceberg.py::TestIcebergTable::()::test_avro_file_format[protocol:beeswax|exec_option:{'test_replan':1;'batch_size':0;'num_nodes':0;'disable_codegen_rows_threshold':0;'disable_codegen':False;'abort_on_error':1;'exec_single_node_rows_thresho;
> SET test_replan=1;
> SET batch_size=0;
> SET num_nodes=0;
> SET disable_codegen_rows_threshold=0;
> SET disable_codegen=False;
> SET abort_on_error=1;
> SET exec_single_node_rows_threshold=0;
> -- 2022-12-16 22:28:57,396 INFO MainThread: Loading query test file:
> /data/jenkins/workspace/impala-asf-master-core-s3-data-cache/repos/Impala/testdata/workloads/functional-query/queries/QueryTest/iceberg-avro.test
> -- executing against localhost:21000
> select * from functional_parquet.iceberg_avro_format;
> {code}