[ 
https://issues.apache.org/jira/browse/IMPALA-13284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17875222#comment-17875222
 ] 

ASF subversion and git services commented on IMPALA-13284:
----------------------------------------------------------

Commit d2e495e83a3962277a538360752d20e0ad5ab323 in impala's branch 
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=d2e495e83 ]

IMPALA-13284: Loading test data on Apache Hive3

There are some failures in loading test data on Apache Hive 3.1.3:
 - STORED AS JSONFILE is not supported
 - STORED BY ICEBERG is not supported. Similarly, STORED BY ICEBERG
   STORED AS AVRO is not supported.
 - The iceberg-hive-runtime jar is missing from the CLASSPATH of HMS and
   Tez jobs.
 - Tables created in Impala are not translated to EXTERNAL tables in HMS.
 - Hive INSERT on insert-only tables fails to generate InsertEvents
   (HIVE-20067).

This patch fixes the syntax issues by using the older syntax that Apache
Hive 3.1.3 supports:
 - Convert STORED AS JSONFILE to ROW FORMAT SERDE
   'org.apache.hadoop.hive.serde2.JsonSerDe'
 - Convert STORED BY ICEBERG to STORED BY
   'org.apache.iceberg.mr.hive.HiveIcebergStorageHandler'
 - Convert STORED BY ICEBERG STORED AS AVRO to the above one with
   tblproperties('write.format.default'='avro')

Most of the conversions are done in generate-schema-statements.py. The
one exception is testdata/bin/load-dependent-tables.sql, for which we
generate a new, converted copy of the file when it is used.
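
The three conversions listed above can be illustrated with a minimal
Python sketch. This is not the code from generate-schema-statements.py;
the function names to_apache_hive3_syntax and convert_sql_file are
hypothetical, and the regex-based rewrite assumes the statement has no
existing ROW FORMAT or TBLPROPERTIES clause that would conflict with the
inserted one.

```python
import re

# Class names taken from the commit message above.
ICEBERG_HANDLER = "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler"
JSON_SERDE = "org.apache.hadoop.hive.serde2.JsonSerDe"


def to_apache_hive3_syntax(stmt):
    """Rewrite newer DDL shorthands into forms Apache Hive 3.1.3 accepts.

    Hypothetical helper: a plain textual rewrite, not a SQL parser.
    """
    # Handle the AVRO variant first so that the plain STORED BY ICEBERG
    # rule below does not consume its prefix.
    stmt = re.sub(
        r"\bSTORED\s+BY\s+ICEBERG\s+STORED\s+AS\s+AVRO\b",
        "STORED BY '%s' tblproperties('write.format.default'='avro')"
        % ICEBERG_HANDLER,
        stmt,
        flags=re.IGNORECASE,
    )
    stmt = re.sub(
        r"\bSTORED\s+BY\s+ICEBERG\b",
        "STORED BY '%s'" % ICEBERG_HANDLER,
        stmt,
        flags=re.IGNORECASE,
    )
    stmt = re.sub(
        r"\bSTORED\s+AS\s+JSONFILE\b",
        "ROW FORMAT SERDE '%s'" % JSON_SERDE,
        stmt,
        flags=re.IGNORECASE,
    )
    return stmt


def convert_sql_file(src_path, dst_path):
    """Write a converted copy of a SQL file, leaving the original as-is."""
    with open(src_path) as src:
        converted = to_apache_hive3_syntax(src.read())
    with open(dst_path, "w") as dst:
        dst.write(converted)
```

Writing the converted statements to a separate file, as convert_sql_file
does, mirrors the approach described for load-dependent-tables.sql: the
checked-in file keeps the newer syntax and only the generated copy is
passed to Apache Hive 3.1.3.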

The missing iceberg-hive-runtime jar is added to HIVE_AUX_JARS_PATH in
bin/impala-config.sh. Note that this is only needed for Apache Hive3,
since CDP Hive3 has the hive-iceberg-handler jar in its lib folder.

To fix the InsertEvents failure, we apply the HIVE-20067 patch and
modify testdata/bin/patch_hive.sh to also recompile the
standalone-metastore submodule.

Also modified some statements in
testdata/datasets/functional/functional_schema_template.sql so that they
are more reliable when retried.

Tests
 - Verified the testdata can be loaded in ubuntu-20.04-from-scratch

Change-Id: I8f52c91602da8822b0f46f19dc4111c7187ce400
Reviewed-on: http://gerrit.cloudera.org:8080/21657
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>


> Loading test data on Apache Hive3
> ---------------------------------
>
>                 Key: IMPALA-13284
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13284
>             Project: IMPALA
>          Issue Type: Task
>          Components: Infrastructure
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>
> When building on Apache Hive3, there are some failures in loading test data, 
> e.g.
> {noformat}
> ERROR: ALTER TABLE functional_parquet.iceberg_lineitem_sixblocks CONVERT TO ICEBERG
> Traceback (most recent call last):
>   File "/media/quanlong/hdd-backup/impala-apache-hive/bin/load-data.py", line 195, in exec_impala_query_from_file
>     result = impala_client.execute(query)
>   File "/media/quanlong/hdd-backup/impala-apache-hive/tests/beeswax/impala_beeswax.py", line 191, in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
>   File "/media/quanlong/hdd-backup/impala-apache-hive/tests/beeswax/impala_beeswax.py", line 382, in __execute_query
>     handle = self.execute_query_async(query_string, user=user)
>   File "/media/quanlong/hdd-backup/impala-apache-hive/tests/beeswax/impala_beeswax.py", line 376, in execute_query_async
>     handle = self.__do_rpc(lambda: self.imp_service.query(query,))
>   File "/media/quanlong/hdd-backup/impala-apache-hive/tests/beeswax/impala_beeswax.py", line 539, in __do_rpc
>     raise ImpalaBeeswaxException(self.__build_error_message(b), b)
> ImpalaBeeswaxException: ImpalaBeeswaxException:
>  INNER EXCEPTION: <class 'beeswaxd.ttypes.BeeswaxException'>
>  MESSAGE: AnalysisException: CONVERT TO ICEBERG is not supported for managed tables
> {noformat}
> This is because the table is created as a managed table in Hive.


