Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/14527 )
Change subject: IMPALA-9071: Handle translated external HDFS table in CATS ...................................................................... Patch Set 7: (17 comments) Thank Csaba! Addressed your comments. > About the design: if I understand correctly, Hive does some transformation on > the table we want to create, and we rely on HiveConf to do a part of that > transformation (changing path) the same way as Hive. This seems error prone > to me, as there can be other things in the transformation that we do not > copy, and I am not sure that Impala sees the correct hive-site.xml in every > installation. > > I would prefer to do the non-acid non-external->external transformation on > Impala side completely to avoid relying on HMS translation logic. This way we > could simply use a flag with the default external table path and users could > set this independently from Hive. Agree with this. It's not easy to handle the transformation as same as hive (IMPALA-9088). Avoiding our tables being translated is a better way. > I do not want to block the commit if you want to merge it urgently, so I can > give +2 if my comments for the test are resolved. I reached to Joe before. He might want to review the infra changes. If not or if this also looks good to him. Please help to run the GVO for me. http://gerrit.cloudera.org:8080/#/c/14527/7//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/14527/7//COMMIT_MSG@14 PS7, Line 14: locate > nit: be located Done http://gerrit.cloudera.org:8080/#/c/14527/7//COMMIT_MSG@33 PS7, Line 33: - To support customizing hive configuration, add a env var, > nit: an Done http://gerrit.cloudera.org:8080/#/c/14527/7/bin/create-test-configuration.sh File bin/create-test-configuration.sh: http://gerrit.cloudera.org:8080/#/c/14527/7/bin/create-test-configuration.sh@143 PS7, Line 143: ext > Can you give this a more descriptive name? Done http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java@459 PS7, Line 459: results > nit: result Done http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java File fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java: http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@937 PS7, Line 937: not allows > nit: doesn't allow Done http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java@938 PS7, Line 938: s > nit: result Done http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/test/resources/hive-site.xml.py File fe/src/test/resources/hive-site.xml.py: http://gerrit.cloudera.org:8080/#/c/14527/7/fe/src/test/resources/hive-site.xml.py@63 PS7, Line 63: if variant is not None: : CONFIG.update({ : 'hive.metastore.warehouse.external.dir': '${WAREHOUSE_LOCATION_PREFIX}/test-warehouse-external', : }) > Can you also check the value of the variant? We could also raise an excepti Sure. Good point! http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py File tests/custom_cluster/test_custom_hive_configs.py: http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@39 PS7, Line 39: test_ctas_read_write_consistence > Does it make sense to run this test in Hive 2 configurations? If I understa Yes, we can skip this test in Hive-2 until HIVE-19837 is backported to CDH6 someday. http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@41 PS7, Line 41: IMPALA-9071: Check CTAS inserts data to the correct directory when > nit: missing "that"? Done http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@42 PS7, Line 42: with > nit: from Done http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@46 PS7, Line 46: self.execute_query_expect_success( : self.client, 'create database if not exists ' + unique_database) > I think that this is not necessary, as unique_database is created by defaul Ah, yes! http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@49 PS7, Line 49: s > nit: +2 indent Done http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@51 PS7, Line 51: self.client, 'select * from %s.ctas_tbl' % unique_database) > nit: +2 indent Done http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@54 PS7, Line 54: self.client, 'create external table %s.ctas_ext_tbl as select 1, 2, "name"' % > nit: +2 indent Done http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@57 PS7, Line 57: self.client, 'select * from %s.ctas_ext_tbl' % unique_database) > nit: +2 indent Done http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@60 PS7, Line 60: # Set "external.table.purge"="true" so we can clean files of the external table : self.execute_query_expect_success( : self.client, 'alter table %s.ctas_ext_tbl set tblproperties' : '("external.table.purge"="true")' % unique_database) > I would move this just after table creation, or merge it in the creation SQ OK. Move it after the table creation. BTW, we can't specify table properties in the CTAS statement, so an Alter statement is needed. http://gerrit.cloudera.org:8080/#/c/14527/7/tests/custom_cluster/test_custom_hive_configs.py@64 PS7, Line 64: self.execute_query_expect_success( : self.client, 'drop database if exists cascade' + unique_database) > Similarly to line 46, the unique_database's cleanup does not have to be don hmm. I found the database is dropped but the underlying files of the external table still exists. So I add this with "cascade" to clean all generated files. Just want to make the test idempotent. Otherwise, running the test at the second time, there will be double rows found in the external table. Let me remove the try-finally clause to avoid omitting the original error. -- To view, visit http://gerrit.cloudera.org:8080/14527 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I460a57dc877ef68ad7dd0864a33b1599b1e9a8d9 Gerrit-Change-Number: 14527 Gerrit-PatchSet: 7 Gerrit-Owner: Quanlong Huang <[email protected]> Gerrit-Reviewer: Anurag Mantripragada <[email protected]> Gerrit-Reviewer: Attila Jeges <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Thu, 24 Oct 2019 15:42:35 +0000 Gerrit-HasComments: Yes
