Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14527 )

Change subject: IMPALA-9071: Handle translated external HDFS table in CATS
......................................................................


Patch Set 7:

(17 comments)

Thank Csaba! Addressed your comments.

> About the design: if I understand correctly, Hive does some transformation on 
> the table we want to create, and we rely on HiveConf to do a part of that 
> transformation (changing path) the same way as Hive. This seems error prone 
> to me, as there can be other things in the transformation that we do not 
> copy, and I am not sure that Impala sees the correct hive-site.xml in every 
> installation.
>
> I would prefer to do the non-acid non-external->external transformation on 
> Impala side completely to avoid relying on HMS translation logic. This way we 
> could simply use a flag with the default external table path and users could 
> set this independently from Hive.

Agree with this. It's not easy to handle the transformation as same as hive 
(IMPALA-9088). Avoiding our tables being translated is a better way.

> I do not want to block the commit if you want to merge it urgently, so I can 
> give +2 if my comments for the test are resolved.

I reached to Joe before. He might want to review the infra changes. If not or 
if this also looks good to him. Please help to run the GVO for me.

http://gerrit.cloudera.org:8080/#/c/14527/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14527/7//COMMIT_MSG@14
PS7, Line 14: locate
> nit: be located
Done


http://gerrit.cloudera.org:8080/#/c/14527/7//COMMIT_MSG@33
PS7, Line 33:  - To support customizing hive configuration, add a env var,
> nit: an
Done


http://gerrit.cloudera.org:8080/#/c/14527/7/bin/create-test-configuration.sh
File bin/create-test-configuration.sh:

http://gerrit.cloudera.org:8080/#/c/14527/7/bin/create-test-configuration.sh@143
PS7, Line 143: ext
> Can you give this a more descriptive name?
Done


http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@459
PS7, Line 459: results
> nit: result
Done


http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java:

http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@937
PS7, Line 937: not allows
> nit: doesn't allow
Done


http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@938
PS7, Line 938: s
> nit: result
Done


http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/test/resources/hive-site.xml.py
File fe/src/test/resources/hive-site.xml.py:

http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/test/resources/hive-site.xml.py@63
PS7, Line 63: if variant is not None:
            :   CONFIG.update({
            :     'hive.metastore.warehouse.external.dir': 
'${WAREHOUSE_LOCATION_PREFIX}/test-warehouse-external',
            :   })
> Can you also check the value of the variant? We could also raise an excepti
Sure. Good point!


http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py
File tests/custom_cluster/test_custom_hive_configs.py:

http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@39
PS7, Line 39: test_ctas_read_write_consistence
> Does it make sense to run this test in Hive 2 configurations? If I understa
Yes, we can skip this test in Hive-2 until HIVE-19837 is backported to CDH6 
someday.


http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@41
PS7, Line 41:     IMPALA-9071: Check CTAS inserts data to the correct directory 
when
> nit: missing "that"?
Done


http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@42
PS7, Line 42: with
> nit: from
Done


http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@46
PS7, Line 46:       self.execute_query_expect_success(
            :         self.client, 'create database if not exists ' + 
unique_database)
> I think that this is not necessary, as unique_database is created by defaul
Ah, yes!


http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@49
PS7, Line 49: s
> nit: +2 indent
Done


http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@51
PS7, Line 51:         self.client, 'select * from %s.ctas_tbl' % 
unique_database)
> nit: +2 indent
Done


http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@54
PS7, Line 54:         self.client, 'create external table %s.ctas_ext_tbl as 
select 1, 2, "name"' %
> nit: +2 indent
Done


http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@57
PS7, Line 57:         self.client, 'select * from %s.ctas_ext_tbl' % 
unique_database)
> nit: +2 indent
Done


http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@60
PS7, Line 60:       # Set "external.table.purge"="true" so we can clean files 
of the external table
            :       self.execute_query_expect_success(
            :         self.client, 'alter table %s.ctas_ext_tbl set 
tblproperties'
            :                      '("external.table.purge"="true")' % 
unique_database)
> I would move this just after table creation, or merge it in the creation SQ
OK. Move it after the table creation. BTW, we can't specify table properties in 
the CTAS statement, so an Alter statement is needed.


http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@64
PS7, Line 64:       self.execute_query_expect_success(
            :         self.client, 'drop database if exists cascade' + 
unique_database)
> Similarly to line 46, the unique_database's cleanup does not have to be don
hmm. I found the database is dropped but the underlying files of the external 
table still exists. So I add this with "cascade" to clean all generated files. 
Just want to make the test idempotent. Otherwise, running the test at the 
second time, there will be double rows found in the external table.

Let me remove the try-finally clause to avoid omitting the original error.



--
To view, visit http://gerrit.cloudera.org:8080/14527
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I460a57dc877ef68ad7dd0864a33b1599b1e9a8d9
Gerrit-Change-Number: 14527
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Anurag Mantripragada <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Thu, 24 Oct 2019 15:42:35 +0000
Gerrit-HasComments: Yes

Reply via email to