Dimitris Tsirogiannis has posted comments on this change. Change subject: Simplify creating external Kudu tables and add DROP DATABASE CASCADE ......................................................................
Patch Set 10: (4 comments) http://gerrit.cloudera.org:8080/#/c/2617/10/fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java File fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java: Line 1113: // The use of DdlDelegates combined with the fact that the catalog is disconnected : // from the source of truth, the Hive Metastore, leads to some complications here. : // If 'db' is null, that should mean either the database doesn't really exist or the : // database was created outside of Impala and the user didn't run "invalidate : // metadata". If 'db' is not null, there still may be tables in the Metastore that : // would require a DdlDelegate. Another scenario is when the user creates tables : // that require a DdlDelegate in Impala, then drops the database in Hive. Ideally the : // user visible result of DROP DATABASE CASCADE would be the same in any case. The : // best solution is probably to move the DdlDelegate functionality into Hive. For : // now Impala will assume that any table not in its cache also doesn't require the : // use of a DdlDelegate (ignoring the no-op UnsupportedDDLDelegate class). If that : // assumption isn't correct, the database will still be dropped in the Metastore : // but the underlying data would remain. Users can issue a REFRESH command to load : // the database metadata before dropping to ensure delegates will be used when needed. I feel this comment is still a bit confusing. First of all it lists 3 cases that need to be handled consistently when the user issues a DROP DB CASCADE but the code below doesn't seem to be handling the db == null case. What happens in this case in terms of Kudu tables? Also, the last part, wrt the assumption that Impala is making regarding the use of DdlDelegate or not. It is still not clear how this translates to the code below. I believe part of my confusion stems from the this DDlDelegate concept :). My recommendation would be to remove this thing and then rewrite the comment to simply describe what the code block is doing (if the code is not self descriptive). http://gerrit.cloudera.org:8080/#/c/2617/10/fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java File fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java: Line 1580: } If you don't have that already can you add the following test cases: 1. table name in CREATE TABLE is not the same as the table name in tblproperties. 2. table name in CREATE TABLE is fully qualified (i.e. has a db name). http://gerrit.cloudera.org:8080/#/c/2617/10/tests/common/kudu_test_suite.py File tests/common/kudu_test_suite.py: Line 52: def get_db_name(cls): : # When py.test runs with the xdist plugin, several processes are started and each : # process runs some partition of the tests. It's possible that multiple processes : # will call this method. A random value is generated so the processes won't try : # to use the same database at the same time. The value is cached so within a single : # process the same database name is always used for the class. This doesn't need to : # be thread-safe since multi-threading is never used. : if not cls.__DB_NAME: : cls.__DB_NAME = \ : choice(ascii_lowercase) + "".join(sample(ascii_lowercase + digits, 5)) : return cls.__DB_NAME Can't you use the db fixture that Michael created instead? http://gerrit.cloudera.org:8080/#/c/2617/10/tests/query_test/test_kudu.py File tests/query_test/test_kudu.py: Line 230: Can you add a test (if not exists) where a database that has kudu tables is dropped using hive and then a drop database cascade stmt is run through Impala? Also, what will happen if the user drops the db (that contains managed kudu tables) through Hive and then runs invalidate metadata in Impala? Can we add a test to describe the expected behavior? -- To view, visit http://gerrit.cloudera.org:8080/2617 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ic141102818b6dad3016181b179a14024d0ff709d Gerrit-PatchSet: 10 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Casey Ching <[email protected]> Gerrit-Reviewer: Casey Ching <[email protected]> Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-HasComments: Yes
