[Impala-CR](cdh5-trunk) Simplify creating external Kudu tables and add DROP DATABASE CASCADE

Dimitris Tsirogiannis (Code Review) Wed, 27 Apr 2016 20:34:32 -0700

Dimitris Tsirogiannis has posted comments on this change.

Change subject: Simplify creating external Kudu tables and add DROP DATABASE CASCADE
......................................................................

Patch Set 10:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/2617/10/fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java
File fe/src/main/java/com/cloudera/impala/service/CatalogOpExecutor.java:

Line 1113: // The use of DdlDelegates combined with the fact that the catalog is disconnected
: // from the source of truth, the Hive Metastore, leads to some complications here.
: // If 'db' is null, that should mean either the database doesn't really exist or the
: // database was created outside of Impala and the user didn't run "invalidate
: // metadata". If 'db' is not null, there still may be tables in the Metastore that
: // would require a DdlDelegate. Another scenario is when the user creates tables
: // that require a DdlDelegate in Impala, then drops the database in Hive. Ideally the
: // user visible result of DROP DATABASE CASCADE would be the same in any case. The
: // best solution is probably to move the DdlDelegate functionality into Hive. For
: // now Impala will assume that any table not in its cache also doesn't require the
: // use of a DdlDelegate (ignoring the no-op UnsupportedDDLDelegate class). If that
: // assumption isn't correct, the database will still be dropped in the Metastore
: // but the underlying data would remain. Users can issue a REFRESH command to load
: // the database metadata before dropping to ensure delegates will be used when needed.
I feel this comment is still a bit confusing. First, it lists three cases that
need to be handled consistently when the user issues a DROP DATABASE CASCADE,
but the code below doesn't seem to handle the db == null case. What happens to
Kudu tables in that case? Second, it is still not clear how the assumption that
Impala makes about whether a DdlDelegate is needed translates to the code
below. I believe part of my confusion stems from the DdlDelegate concept
itself :). My recommendation would be to remove it and then rewrite the comment
to simply describe what the code block is doing (if the code is not
self-descriptive).

http://gerrit.cloudera.org:8080/#/c/2617/10/fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java
File fe/src/test/java/com/cloudera/impala/analysis/AnalyzeDDLTest.java:

Line 1580: }
If you don't have them already, can you add the following test cases:
1. The table name in CREATE TABLE is not the same as the table name in tblproperties.
2. The table name in CREATE TABLE is fully qualified (i.e. has a db name).

http://gerrit.cloudera.org:8080/#/c/2617/10/tests/common/kudu_test_suite.py
File tests/common/kudu_test_suite.py:

Line 52: def get_db_name(cls):
:   # When py.test runs with the xdist plugin, several processes are started and each
:   # process runs some partition of the tests. It's possible that multiple processes
:   # will call this method. A random value is generated so the processes won't try
:   # to use the same database at the same time. The value is cached so within a single
:   # process the same database name is always used for the class. This doesn't need to
:   # be thread-safe since multi-threading is never used.
:   if not cls.__DB_NAME:
:     cls.__DB_NAME = \
:         choice(ascii_lowercase) + "".join(sample(ascii_lowercase + digits, 5))
:   return cls.__DB_NAME
Can't you use the db fixture that Michael created instead?
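
To illustrate what a fixture-style helper could offer over the cached class
attribute, here is a minimal sketch. It is not the existing db fixture, whose
details are not shown in this review: scoped_database, random_db_name and the
execute callable are hypothetical names, and the only thing taken from the
quoted code is the naming scheme. The point is that creation and cleanup of the
database sit in one place, so a test cannot forget to drop it.

```python
import contextlib
from random import choice, sample
from string import ascii_lowercase, digits


def random_db_name():
    # Same scheme as the quoted get_db_name(): a leading letter followed by
    # five distinct characters drawn from lowercase letters and digits.
    return choice(ascii_lowercase) + "".join(sample(ascii_lowercase + digits, 5))


@contextlib.contextmanager
def scoped_database(execute):
    # 'execute' is a hypothetical callable that runs a single SQL statement.
    name = random_db_name()
    execute("CREATE DATABASE " + name)
    try:
        yield name
    finally:
        # Cleanup runs even if the body of the 'with' block raises.
        execute("DROP DATABASE IF EXISTS " + name + " CASCADE")


# Usage with a stand-in for a real client: statements are only recorded.
statements = []
with scoped_database(statements.append) as db:
    statements.append("CREATE TABLE " + db + ".t (i INT)")
```

With py.test, the same create/yield/drop shape could be wrapped in a fixture so
each test receives its own database name as an argument.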

http://gerrit.cloudera.org:8080/#/c/2617/10/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

Line 230:
Can you add a test (if one doesn't already exist) where a database that has
Kudu tables is dropped using Hive and then a DROP DATABASE CASCADE statement is
run through Impala?

Also, what will happen if the user drops the db (that contains managed Kudu
tables) through Hive and then runs INVALIDATE METADATA in Impala? Can we add a
test to describe the expected behavior?

--
To view, visit http://gerrit.cloudera.org:8080/2617
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic141102818b6dad3016181b179a14024d0ff709d
Gerrit-PatchSet: 10
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Casey Ching <[email protected]>
Gerrit-Reviewer: Casey Ching <[email protected]>
Gerrit-Reviewer: Dimitris Tsirogiannis <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-HasComments: Yes
