Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/10075 )
Change subject: KUDU-2191: Metadata Upgrade Tool ...................................................................... Patch Set 1: (19 comments) http://gerrit.cloudera.org:8080/#/c/10075/1/java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java File java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java: http://gerrit.cloudera.org:8080/#/c/10075/1/java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java@174 PS1, Line 174: // Test altering a Kudu (or a historical) table. nit: I think the term 'legacy' is better than historical, because historical can mean a bunch of different things. But even better would be to be specific with what type of table this is, e.g. an Impala managed table or Impala external table. http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/hms/hms_catalog.h File src/kudu/hms/hms_catalog.h: http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/hms/hms_catalog.h@95 PS1, Line 95: Status RetrieveKuduTablesList(const char* kudu_storage_handler, As far as I can tell this isn't specific to Kudu at all, it's a generic lookup for any table with the provided storage handler. http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/hms/hms_client.h File src/kudu/hms/hms_client.h: http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/hms/hms_client.h@94 PS1, Line 94: static const char* const kKuduTableNameKey; Are we going to be using this going forward? If not, maybe add 'Deprecated' to the name. http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/mini-cluster/external_mini_cluster.h File src/kudu/mini-cluster/external_mini_cluster.h: http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/mini-cluster/external_mini_cluster.h@148 PS1, Line 148: bool enable_metastore_integration; Could you tweak this a bit so there is only one flag to flip in the common case of wanting a mini HMS plus wanting the mini kudu cluster to have the integration enabled? I think the case of wanting a mini HMS but not wanting the integration enabled will be very rare. One way that could be done is have this new flag imply enable_hive_metastore, and switch all call sites just to use the new one. http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc File src/kudu/tools/kudu-tool-test.cc: http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc@1854 PS1, Line 1854: // Do not add other parameters such as 'kudu.master_addresses' since This is problematic because today's HMS table entries that impala creates can have the kudu.master_addresses property. Here's a sample: parameters={ STATS_GENERATED: TASK, kudu.master_addresses: nightly6x-1.gce.cloudera.com, kudu.table_name: impala::default.managed_table, numRows: 80241, storage_handler: com.cloudera.kudu.hive.KuduStorageHandler, transient_lastDdlTime: 1523648851 }, http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc@1870 PS1, Line 1870: !MatchPattern(table_name, Shouldn't this be a strict prefix match? If so HasPrefixString is probably more straightforward. http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc@1874 PS1, Line 1874: TEST_F(ToolTest, TestHmsUpgrade) { There are three distinct kinds of tables we need to upgrade: * managed impala table * external impala table * non-impala Kudu table which doesn't have an HMS entry This test should differentiate these cases and ensure each works. http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc@1888 PS1, Line 1888: string historical_table_name = StrCat(HmsClient::kHistoricalTablePrefix, The existing managed table format is 'impala::my_db.my_table'. http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc@1904 PS1, Line 1904: StrCat(HmsClient::kHistoricalTablePrefix, table_name) historical_table_name http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/kudu-tool-test.cc@1945 PS1, Line 1945: ASSERT_EQ(database_name, dbname); If there is only one database and one table, why the for loops? http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc File src/kudu/tools/tool_action_hms.cc: http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@64 PS1, Line 64: const char* const kMasterAddressesArg = "master_addresses"; This is defined in tool_action_common.h. http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@66 PS1, Line 66: const char* const kHmsUrisArg = "hms_uris"; Could you use the existing hive_metastore_uris flag to be consistent? http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@69 PS1, Line 69: bool IsHistoricalTable(const string& table_name) { Doc this? Kudu doesn't currently have any restrictions on table names, so what significance is the period? http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@84 PS1, Line 84: // Renaming existing Impala-managed table name to drop the ‘impala::’ prefix Looks like this is handling managed tables, but not external tables (where the Kudu table name is contained in the storage property). http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@94 PS1, Line 94: kudu_table_name = StrCat(HmsClient::kDefaultDatabaseName, ".", table_name); What's the reasoning behind having a default database, especially one other than 'default'? What if this database doesn't exist? What about sentry permissions? I'm pretty uneasy about renaming tables silently. http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@107 PS1, Line 107: vector<string> master_addresses = Split(master_addresses_str, ","); not used? http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@138 PS1, Line 138: string historical_table_name = StrCat(HmsClient::kHistoricalTablePrefix, impala::db_name.table_name http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@142 PS1, Line 142: AlterHistoricalTable Instead of 'historical' everywhere, I think it would be better to name this something like 'AlterImpalaFormattedTable'. Open to suggestions, but 'historical' is kind of vague. http://gerrit.cloudera.org:8080/#/c/10075/1/src/kudu/tools/tool_action_hms.cc@155 PS1, Line 155: // TODO(Hao): update to use AddOptionalParameter once hive_metastore_uris ah, so we can't have experimental flags be included in tools? -- To view, visit http://gerrit.cloudera.org:8080/10075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1c8488e1c7b04b8bedc31c83b1496b53bae49cb3 Gerrit-Change-Number: 10075 Gerrit-PatchSet: 1 Gerrit-Owner: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Dan Burkert <danburk...@apache.org> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Mon, 16 Apr 2018 18:40:37 +0000 Gerrit-HasComments: Yes