Dan Burkert has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10075 )

Change subject: KUDU-2191: Metadata Upgrade Tool
......................................................................


Patch Set 1:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/10075/1/java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java
File 
java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java:

http://gerrit.cloudera.org:8080/#/c/10075/1/java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java@174
PS1, Line 174:     // Test altering a Kudu (or a historical) table.
nit: I think the term 'legacy' is better than historical, because historical 
can mean a bunch of different things.  But even better would be to be specific 
with what type of table this is, e.g. an Impala managed table or Impala 
external table.


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/hms/hms_catalog.h
File src/kudu/hms/hms_catalog.h:

http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/hms/hms_catalog.h@95
PS1, Line 95:   Status RetrieveKuduTablesList(const char* kudu_storage_handler,
As far as I can tell this isn't specific to Kudu at all, it's a generic lookup 
for any table with the provided storage handler.


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/hms/hms_client.h
File src/kudu/hms/hms_client.h:

http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/hms/hms_client.h@94
PS1, Line 94:   static const char* const kKuduTableNameKey;
Are we going to be using this going forward?  If not, maybe add 'Deprecated' to 
the name.


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/mini-cluster/external_mini_cluster.h
File src/kudu/mini-cluster/external_mini_cluster.h:

http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/mini-cluster/external_mini_cluster.h@148
PS1, Line 148:   bool enable_metastore_integration;
Could you tweak this a bit so there is only one flag to flip in the common case 
of wanting a mini HMS plus wanting the mini kudu cluster to have the 
integration enabled?  I think the case of wanting a mini HMS but not wanting 
the integration enabled will be very rare.  One way that could be done is have 
this new flag imply enable_hive_metastore, and switch all call sites just to 
use the new one.


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc
File src/kudu/tools/kudu-tool-test.cc:

http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc@1854
PS1, Line 1854:   // Do not add other parameters such as 
'kudu.master_addresses' since
This is problematic because today's HMS table entries that impala creates can 
have the kudu.master_addresses property.  Here's a sample:

  parameters={
    STATS_GENERATED: TASK,
    kudu.master_addresses: nightly6x-1.gce.cloudera.com,
    kudu.table_name: impala::default.managed_table,
    numRows: 80241,
    storage_handler: com.cloudera.kudu.hive.KuduStorageHandler,
    transient_lastDdlTime: 1523648851
  },


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc@1870
PS1, Line 1870:          !MatchPattern(table_name,
Shouldn't this be a strict prefix match?  If so HasPrefixString is probably 
more straightforward.


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc@1874
PS1, Line 1874: TEST_F(ToolTest, TestHmsUpgrade) {
There are three distinct kinds of tables we need to upgrade:

* managed impala table
* external impala table
* non-impala Kudu table which doesn't have an HMS entry

This test should differentiate these cases and ensure each works.


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc@1888
PS1, Line 1888:   string historical_table_name = 
StrCat(HmsClient::kHistoricalTablePrefix,
The existing managed table format is 'impala::my_db.my_table'.


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc@1904
PS1, Line 1904: StrCat(HmsClient::kHistoricalTablePrefix, table_name)
historical_table_name


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc@1945
PS1, Line 1945:         ASSERT_EQ(database_name, dbname);
If there is only one database and one table, why the for loops?


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc
File src/kudu/tools/tool_action_hms.cc:

http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@64
PS1, Line 64: const char* const kMasterAddressesArg = "master_addresses";
This is defined in tool_action_common.h.


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@66
PS1, Line 66: const char* const kHmsUrisArg = "hms_uris";
Could you use the existing hive_metastore_uris flag to be consistent?


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@69
PS1, Line 69: bool IsHistoricalTable(const string& table_name) {
Doc this?  Kudu doesn't currently have any restrictions on table names, so what 
significance is the period?


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@84
PS1, Line 84:       // Renaming existing Impala-managed table name to drop the 
‘impala::’ prefix
Looks like this is handling managed tables, but not external tables (where the 
Kudu table name is contained in the storage property).


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@94
PS1, Line 94:         kudu_table_name = StrCat(HmsClient::kDefaultDatabaseName, 
".", table_name);
What's the reasoning behind having a default database, especially one other 
than 'default'?  What if this database doesn't exist?  What about sentry 
permissions?  I'm pretty uneasy about renaming tables silently.


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@107
PS1, Line 107:   vector<string> master_addresses = Split(master_addresses_str, 
",");
not used?


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@138
PS1, Line 138:     string historical_table_name = 
StrCat(HmsClient::kHistoricalTablePrefix,
impala::db_name.table_name


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@142
PS1, Line 142: AlterHistoricalTable
Instead of 'historical' everywhere, I think it would be better to name this 
something like 'AlterImpalaFormattedTable'.  Open to suggestions, but 
'historical' is kind of vague.


http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@155
PS1, Line 155:   // TODO(Hao): update to use AddOptionalParameter once 
hive_metastore_uris
ah, so we can't have experimental flags be included in tools?



--
To view, visit http://gerrit.cloudera.org:8080/10075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c8488e1c7b04b8bedc31c83b1496b53bae49cb3
Gerrit-Change-Number: 10075
Gerrit-PatchSet: 1
Gerrit-Owner: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Mon, 16 Apr 2018 18:40:37 +0000
Gerrit-HasComments: Yes

Reply via email to