[
https://issues.apache.org/jira/browse/IMPALA-13284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875222#comment-17875222
]
ASF subversion and git services commented on IMPALA-13284:
----------------------------------------------------------
Commit d2e495e83a3962277a538360752d20e0ad5ab323 in impala's branch
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d2e495e83 ]
IMPALA-13284: Loading test data on Apache Hive3
There are some failures in loading test data on Apache Hive 3.1.3:
- STORED AS JSONFILE is not supported
- STORED BY ICEBERG is not supported. Similarly, STORED BY ICEBERG
STORED AS AVRO is not supported.
- The iceberg-hive-runtime jar is missing from the CLASSPATH of HMS and
  Tez jobs.
- Tables created in Impala are not translated to EXTERNAL tables in HMS
- Hive INSERT on insert-only tables fails when generating InsertEvents
  (HIVE-20067).
This patch fixes the syntax issues by falling back to the older syntax
accepted by Apache Hive 3.1.3:
- Convert STORED AS JSONFILE to ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.JsonSerDe'
- Convert STORED BY ICEBERG to STORED BY
'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
- Convert STORED BY ICEBERG STORED AS AVRO to the above one with
tblproperties('write.format.default'='avro')
Most of the conversions are done in generate-schema-statements.py. One
exception is testdata/bin/load-dependent-tables.sql, for which we
generate a new file with the conversions applied when using it.
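The rewrites above amount to simple text substitutions on the DDL. A
minimal Python sketch of the idea (the function name and regex patterns
here are illustrative, not the actual code in
generate-schema-statements.py):

```python
import re

def convert_for_apache_hive3(stmt):
    """Rewrite newer DDL clauses into syntax Apache Hive 3.1.3 accepts.

    Illustrative sketch only; the real conversions live in
    testdata/bin/generate-schema-statements.py.
    """
    # STORED BY ICEBERG STORED AS AVRO -> explicit storage handler plus
    # the write-format table property. Must run before the plain
    # STORED BY ICEBERG rewrite below.
    stmt = re.sub(
        r"STORED BY ICEBERG\s+STORED AS AVRO",
        "STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'\n"
        "TBLPROPERTIES('write.format.default'='avro')",
        stmt, flags=re.IGNORECASE)
    # STORED BY ICEBERG -> explicit storage handler class
    stmt = re.sub(
        r"STORED BY ICEBERG",
        "STORED BY 'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'",
        stmt, flags=re.IGNORECASE)
    # STORED AS JSONFILE -> explicit JSON SerDe
    stmt = re.sub(
        r"STORED AS JSONFILE",
        "ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.JsonSerDe'",
        stmt, flags=re.IGNORECASE)
    return stmt

print(convert_for_apache_hive3("CREATE TABLE t (i INT) STORED AS JSONFILE"))
```

Ordering matters: the combined ICEBERG-plus-AVRO clause must be rewritten
before the bare STORED BY ICEBERG pattern, or the latter would match first
and drop the Avro write format.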
The missing iceberg-hive-runtime jar is added to HIVE_AUX_JARS_PATH in
bin/impala-config.sh. Note that this is only needed by Apache Hive3,
since CDP Hive3 already ships the hive-iceberg-handler jar in its lib
folder.
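The CLASSPATH fix boils down to appending the jar to HIVE_AUX_JARS_PATH.
A sketch of the idea in shell (the jar path below is hypothetical; the
real location is computed in bin/impala-config.sh):

```shell
# Sketch: make iceberg-hive-runtime visible to HMS and Tez jobs on
# Apache Hive3. The jar path is a placeholder, not the real one.
ICEBERG_HIVE_RUNTIME_JAR="${IMPALA_HOME}/toolchain/iceberg/lib/iceberg-hive-runtime.jar"
if [ -n "${HIVE_AUX_JARS_PATH}" ]; then
  # Preserve any jars already on the aux path.
  export HIVE_AUX_JARS_PATH="${HIVE_AUX_JARS_PATH}:${ICEBERG_HIVE_RUNTIME_JAR}"
else
  export HIVE_AUX_JARS_PATH="${ICEBERG_HIVE_RUNTIME_JAR}"
fi
echo "${HIVE_AUX_JARS_PATH}"
```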
To fix the InsertEvents failure, we add the patch of HIVE-20067 and
modify testdata/bin/patch_hive.sh to also recompile the
standalone-metastore submodule.
Also modified some statements in
testdata/datasets/functional/functional_schema_template.sql to be more
reliable on retry.
Tests
- Verified the test data can be loaded in ubuntu-20.04-from-scratch
Change-Id: I8f52c91602da8822b0f46f19dc4111c7187ce400
Reviewed-on: http://gerrit.cloudera.org:8080/21657
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Loading test data on Apache Hive3
> ---------------------------------
>
> Key: IMPALA-13284
> URL: https://issues.apache.org/jira/browse/IMPALA-13284
> Project: IMPALA
> Issue Type: Task
> Components: Infrastructure
> Reporter: Quanlong Huang
> Assignee: Quanlong Huang
> Priority: Major
>
> When building on Apache Hive3, there are some failures in loading test data,
> e.g.
> {noformat}
> ERROR: ALTER TABLE functional_parquet.iceberg_lineitem_sixblocks CONVERT TO ICEBERG
> Traceback (most recent call last):
>   File "/media/quanlong/hdd-backup/impala-apache-hive/bin/load-data.py", line 195, in exec_impala_query_from_file
>     result = impala_client.execute(query)
>   File "/media/quanlong/hdd-backup/impala-apache-hive/tests/beeswax/impala_beeswax.py", line 191, in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
>   File "/media/quanlong/hdd-backup/impala-apache-hive/tests/beeswax/impala_beeswax.py", line 382, in __execute_query
>     handle = self.execute_query_async(query_string, user=user)
>   File "/media/quanlong/hdd-backup/impala-apache-hive/tests/beeswax/impala_beeswax.py", line 376, in execute_query_async
>     handle = self.__do_rpc(lambda: self.imp_service.query(query,))
>   File "/media/quanlong/hdd-backup/impala-apache-hive/tests/beeswax/impala_beeswax.py", line 539, in __do_rpc
>     raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> ImpalaBeeswaxException: ImpalaBeeswaxException:
>  INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
>  MESSAGE: AnalysisException: CONVERT TO ICEBERG is not supported for managed tables{noformat}
> This is because the table is created as a managed table in Hive.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)