Dimitris Tsirogiannis has posted comments on this change.

Change subject: Simplify creating external Kudu tables and add DROP DATABASE 
CASCADE
......................................................................


Patch Set 10:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/2617/10/fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java
File fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java:

Line 1113: // The use of DdlDelegates combined with the fact that the catalog 
is disconnected
         :     // from the source of truth, the Hive Metastore, leads to some 
complications here.
         :     // If 'db' is null, that should mean either the database doesn't 
really exist or the
         :     // database was created outside of Impala and the user didn't 
run "invalidate
         :     // metadata". If 'db' is not null, there still may be tables in 
the Metastore that
         :     // would require a DdlDelegate. Another scenario is when the 
user creates tables
         :     // that require a DdlDelegate in Impala, then drops the database 
in Hive. Ideally the
         :     // user visible result of DROP DATABASE CASCADE would be the 
same in any case. The
         :     // best solution is probably to move the DdlDelegate 
functionality into Hive. For
         :     // now Impala will assume that any table not in its cache also 
doesn't require the
         :     // use of a DdlDelegate (ignoring the no-op 
UnsupportedDDLDelegate class). If that
         :     // assumption isn't correct, the database will still be dropped 
in the Metastore
         :     // but the underlying data would remain. Users can issue a 
REFRESH command to load
         :     // the database metadata before dropping to ensure delegates 
will be used when needed.
I feel this comment is still a bit confusing. First of all it lists 3 cases 
that need to be handled consistently when the user issues a DROP DB CASCADE but 
the code below doesn't seem to be handling the db == null case. What happens in 
this case in terms of Kudu tables? Also, the last part, wrt the assumption that 
Impala is making regarding the use of DdlDelegate or not. It is still not clear 
how this translates to the code below. I believe part of my confusion stems 
from the this DDlDelegate concept :). My recommendation would be to remove this 
thing and then rewrite the comment to simply describe what the code block is 
doing (if the code is not self descriptive).


http://gerrit.cloudera.org:8080/#/c/2617/10/fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java:

Line 1580:     }
If you don't have that already can you add the following test cases:
1. table name in CREATE TABLE is not the same as the table name in 
tblproperties.
2. table name in CREATE TABLE is fully qualified (i.e. has a db name).


http://gerrit.cloudera.org:8080/#/c/2617/10/tests/common/kudu_test_suite.py
File tests/common/kudu_test_suite.py:

Line 52: def get_db_name(cls):
       :     # When py.test runs with the xdist plugin, several processes are 
started and each
       :     # process runs some partition of the tests. It's possible that 
multiple processes
       :     # will call this method. A random value is generated so the 
processes won't try
       :     # to use the same database at the same time. The value is cached 
so within a single
       :     # process the same database name is always used for the class. 
This doesn't need to
       :     # be thread-safe since multi-threading is never used.
       :     if not cls.__DB_NAME:
       :       cls.__DB_NAME = \
       :           choice(ascii_lowercase) + "".join(sample(ascii_lowercase + 
digits, 5))
       :     return cls.__DB_NAME
Can't you use the db fixture that Michael created instead?


http://gerrit.cloudera.org:8080/#/c/2617/10/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

Line 230: 
Can you add a test (if not exists) where a database that has kudu tables is 
dropped using hive and then a drop database cascade stmt is run through Impala?

Also, what will happen if the user drops the db (that contains managed kudu 
tables) through Hive and then runs invalidate metadata in Impala? Can we add a 
test to describe the expected behavior?


-- 
To view, visit http://gerrit.cloudera.org:8080/2617
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic141102818b6dad3016181b179a14024d0ff709d
Gerrit-PatchSet: 10
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Casey Ching <[email protected]>
Gerrit-Reviewer: Casey Ching <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-HasComments: Yes

Reply via email to