Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/10075 )
Change subject: KUDU-2191: Metadata Upgrade Tool ...................................................................... Patch Set 4: (22 comments) http://gerrit.cloudera.org:8080/#/c/10075/4/java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java File java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java: http://gerrit.cloudera.org:8080/#/c/10075/4/java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java@61 PS4, Line 61: DEPRECATED nit: LEGACY http://gerrit.cloudera.org:8080/#/c/10075/4/java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java@127 PS4, Line 127: checkKuduProperties(newTable); This is going to reject any alterations of a legacy Kudu table entry that doesn't upgrade it to the new format. I _think_ that's OK, but I want to call it out just so we make sure to think through the ramifications of that. http://gerrit.cloudera.org:8080/#/c/10075/4/java/kudu-hive/src/main/java/org/apache/kudu/hive/metastore/KuduMetastorePlugin.java@150 PS4, Line 150: deprecated nit: legacy http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/hms/hms_catalog.h File src/kudu/hms/hms_catalog.h: http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/hms/hms_catalog.h@89 PS4, Line 89: Status AlterImpalaFormattedTable(const std::string& id, maybe this should be called something like 'UpgradeLegacyTable'? I think that may capture the intended usage better. http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/hms/hms_catalog.h@91 PS4, Line 91: const Schema& schema) WARN_UNUSED_RESULT; the ordering of the new methods is flipped WRT the .cc http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/hms/hms_catalog.h@95 PS4, Line 95: sList It's not clear what is 'List' means here. http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/hms/hms_catalog.h@96 PS4, Line 96: <std::string Please doc what this key is. http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/hms/hms_catalog.cc File src/kudu/hms/hms_catalog.cc: http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/hms/hms_catalog.cc@183 PS4, Line 183: Status HmsCatalog::RetrieveTablesList(const char* storage_handler, Add a unit test for this method. http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/hms/hms_catalog.cc@190 PS4, Line 190: RETURN_NOT_OK(client->GetAllTables(database_name, &table_names)); Probably worth adding a 'table_names.clear();' line above this, just to be defensive. http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/hms/hms_catalog.cc@196 PS4, Line 196: hms_tables->emplace(table_name, move(hms_table)); What happens if you have tables 'a.foo' and 'b.foo', won't this only retain one of those? http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/hms/hms_catalog.cc@216 PS4, Line 216: A begin status messages with lowercase. Also maybe reword this to call out the error that occcurred (right now it sounds like an action that will be taken). http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/mini-cluster/external_mini_cluster.h File src/kudu/mini-cluster/external_mini_cluster.h: http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/mini-cluster/external_mini_cluster.h@332 PS4, Line 332: SetMetastoreIntegration Maybe call it EnableMetastoreIntegration in order to be consistent (here and below). http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/tools/kudu-tool-test.cc File src/kudu/tools/kudu-tool-test.cc: http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/tools/kudu-tool-test.cc@1903 PS4, Line 1903: const shared_ptr<KuduClient>& kudu_client, alignment http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/tools/kudu-tool-test.cc@1971 PS4, Line 1971: // 3. Create a non-impala Kudu table. Would be good to add test cases for a table which already has a period in the name (foo.bar), and a table which has an invalid hive character (foo!). http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/tools/kudu-tool-test.cc@2007 PS4, Line 2007: NO_FATALS(ValidateHmsEntries(&hms_client, kudu_client, default_database_name, Should the external table entry be validated? http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/tools/tool_action_hms.cc File src/kudu/tools/tool_action_hms.cc: http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/tools/tool_action_hms.cc@73 PS4, Line 73: Status AlterLegacyKuduTables(const client::sp::shared_ptr<KuduClient>& kudu_client, In general I'm finding it difficult to figure out which kinds of tables are being handled here vs in HmsUpgrade itself. Maybe more documentation or examples would help. http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/tools/tool_action_hms.cc@85 PS4, Line 85: // Renaming existing Impala-managed table name to drop the ‘impala::’ prefix Shouldn't this case have already been handled in step 2. below? http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/tools/tool_action_hms.cc@98 PS4, Line 98: RETURN_NOT_OK(alterer->RenameTo(new_table_name) We talked yesterday about changing HmsCatalog to not ccreate table entries when they aren't present in the HMS. I think that would break this workflow, because it's relying on that behavior, right? I don't immediately see a solution to that, but I'll think about it. Curious if you have thoughts. http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/tools/tool_action_hms.cc@118 PS4, Line 118: // }, This indenting is a little funky, I think the parameters should be indented one more level, as well as the closing brace. http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/tools/tool_action_hms.cc@165 PS4, Line 165: for (auto itr = hms_tables_map.begin(); itr != hms_tables_map.end(); itr = std::next(itr)) { Can be a simple for/each loop? for (pair<string, hive::Table>& entry : hms_tables_map) { } http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/tools/tool_action_hms.cc@179 PS4, Line 179: legit s/legit/Hive-compatible http://gerrit.cloudera.org:8080/#/c/10075/4/src/kudu/tools/tool_action_hms.cc@202 PS4, Line 202: hsm_upgrade hms -- To view, visit http://gerrit.cloudera.org:8080/10075 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1c8488e1c7b04b8bedc31c83b1496b53bae49cb3 Gerrit-Change-Number: 10075 Gerrit-PatchSet: 4 Gerrit-Owner: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Alexey Serbin <aser...@cloudera.com> Gerrit-Reviewer: Dan Burkert <danburk...@apache.org> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Fri, 20 Apr 2018 18:23:20 +0000 Gerrit-HasComments: Yes