[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@268 PS8, Line 268: ASSERT_TRUE(s.IsIllegalState()) << s.ToString(); : ASSERT_STR_CONTAINS(s.ToString(), " > Do you have a suggestion for how to rename? We have three table name local Hmm. The "hms" prefix is doubly confusing in that it is used for hms_client_ too. So I'd replace that with "full" or some other descriptive prefix. And instead of omitting the prefix for Kudu names, maybe prefix with "kudu" instead. It's also possible that a scheme like "table1", "table2", "table3", etc. would be easier in that it's easier to search and track. -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 10 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 03 Apr 2018 18:04:05 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@268 PS8, Line 268: ASSERT_TRUE(s.IsIllegalState()) << s.ToString(); : ASSERT_STR_CONTAINS(s.ToString(), " > I don't think swapping variables for literals will necessarily change anyth Do you have a suggestion for how to rename? We have three table name locals that are being used here: table_name external_table_name renamed_table_name It's additionally confusing because there are additional locals which are bound to just the table portion of then names (as opposed to database.table). -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 10 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 03 Apr 2018 17:56:50 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 10: (1 comment) http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@268 PS8, Line 268: ASSERT_TRUE(s.IsIllegalState()) << s.ToString(); : ASSERT_STR_CONTAINS(s.ToString(), " > 1. Done I don't think swapping variables for literals will necessarily change anything. I mean, if you use literals with the same names as the variables, that just means you're adding double quotes, right? So I think improving the names is what's more important. -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 10 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 03 Apr 2018 17:23:00 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 10: (7 comments) http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/hms_client.cc File src/kudu/hms/hms_client.cc: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/hms_client.cc@202 PS8, Line 202: // operations properly. > Yes, I believe it's taken directly from the hive-site.xml so it's possible Gotcha. Thanks for the explanation. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@162 PS8, Line 162: } : : // Checks that a table does not exist in the Kudu and HMS catalogs. : void CheckCatalogsNegative(const string& database_n > I've tried to reduce the number of tests cases since spinning up an HMS is No good ideas here short of changing the MiniHMS lifecycle to outlive its minicluster (so that it's reset once per test suite rather than once per test). But that's a really bad idea for other reasons. If you have the time, it might be fruitful to profile the HMS startup and figure out why it takes so long. If that time is spent searching an enormous classpath, we may be able to prune it to improve startup time. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@268 PS8, Line 268: ASSERT_TRUE(s.IsIllegalState()) << s.ToString(); : ASSERT_STR_CONTAINS(s.ToString(), " > You may be reading too much into this. This is just setting up a hypotheti But isn't the CheckCatalogs() call on L277 ensuring that the Kudu table (renamed_table_name) and the HMS table (hms_renamed_table_name) have the same names? I was still confused so I went through the entire test and wrote down the actions taking place: // Create the Kudu table. Create Kudu table rename_db.kudu_table - Implicit create of same HMS table // Create a non-Kudu HMS table entry. Create non-Kudu HMS table rename_db.external_table // Drop the HMS table entry and rename the table. This tests that the // HmsCatalog will create a new entry when necessary. Drop HMS table rename_db.kudu_table Rename Kudu table from rename_db.kudu_table to rename_db.renamed_table - Implicit create of Kudu HMS table rename_db.renamed_table // Rename the table back to the original name. This tests the happy path. Rename Kudu table from rename_db.renamed_table to rename_db.kudu_table - Implicit rename of Kudu HMS table rename_db.renamed_table to rename_db.kudu_table // Drop the HMS table entry, then create a non-Kudu table entry in it's place, // and attempt to rename the table. Drop Kudu HMS table rename_db.kudu_table Create non-Kudu HMS table rename_db.kudu_table Rename Kudu table from rename_db.kudu_table to rename_db.renamed_table - No implicit rename because rename_db.kudu_table is non-Kudu HMS entry - Implicit create of Kudu HMS table rename_db.renamed_table What I was missing was that in the last Kudu rename, not only does Kudu NOT rename the existing HMS entry (because it's a non-Kudu entry), but it'll implicitly create a new HMS entry for the destination table name. A couple things that would make this easier to understand: 1. hms_client_->CreateTable() and CreateTable() are similar looking; rename CreateTable() to something else. 2. The table name variables are also pretty similar looking; rename them so that they stand out from one another more. http://gerrit.cloudera.org:8080/#/c/8312/9/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/8312/9/src/kudu/master/catalog_manager.cc@1755 PS9, Line 1755: Schema schema; > It's a hack that allows using the RETURN_NOT_OK_LOG macro inside the lambda Hmm, I do think it's somewhat unintuitive; I could see someone else reading this and thinking that they've misunderstood how ScopedCleanup works. I wouldn't mind it if there was a comment saying the return value is ignored. Or if you wrote it out manually. http://gerrit.cloudera.org:8080/#/c/8312/9/src/kudu/master/hms_catalog.cc File src/kudu/master/hms_catalog.cc: http://gerrit.cloudera.org:8080/#/c/8312/9/src/kudu/master/hms_catalog.cc@201 PS9, Line 201: HMS table entry: " : << s.ToString(); : return create_table(); > Well you're right to expect there are TOCTOU races going on all over the pl I don't understand your explanation. On L172 we retrieve the original HMS table entry. On L207, if that entry's name matches the new name, we short-circuit. What's stopping another HMS client (perhaps another Kudu cluster?) from renaming the entry from the original
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Dan Burkert has abandoned this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Abandoned split into smaller chunks -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 10 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 10: > Patch Set 10: > > (6 comments) This has been split into the following reviews: https://gerrit.cloudera.org/c/9861/ https://gerrit.cloudera.org/c/9862/ https://gerrit.cloudera.org/c/9863/ -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 10 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Fri, 30 Mar 2018 01:26:26 + Gerrit-HasComments: No
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 10: As discussed on Slack I'm going to split this commit in two, but I wanted to put up a revision with the fixes made in response to Adar's comments. -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 10 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 28 Mar 2018 23:24:18 + Gerrit-HasComments: No
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, Hao Hao, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8312 to look at the new patch set (#10). Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. KUDU-2191 (6/n): Hive Metastore catalog manager integration This commit adds integration with the Hive Metastore to CatalogManager, which can be enabled with a new flag: 'hive-metastore-addresses'. When the flag is provided, the Master will push DDL changes in its catalog to the Hive Metastore. In particular: - Create table, delete table, and alter table operations in the Kudu master will synchronously update the corresponding entry in the HMS. - New table and column names are required to be valid according to the Hive identifier rules, which are much stricter than Kudu's existing identifier rules. I think the code-paths in hms_catalog are pretty well covered by the tests added in this commit, however there is a dearth of stress and chaos tests covering the actual CatalogManager integration. In particular I've left some TODOs of roll-back code that could benefit from more coverage, as well as a TODO in master-stress-test about enabling HMS integration. Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf --- M java/kudu-hive/src/test/java/org/apache/kudu/hive/metastore/TestKuduMetastorePlugin.java M src/kudu/hms/hms_client-test.cc M src/kudu/hms/hms_client.cc M src/kudu/hms/hms_client.h M src/kudu/hms/mini_hms.cc M src/kudu/integration-tests/CMakeLists.txt M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/master-stress-test.cc M src/kudu/integration-tests/master_failover-itest.cc A src/kudu/integration-tests/master_hms-itest.cc M src/kudu/master/CMakeLists.txt M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h A src/kudu/master/hms_catalog-test.cc A src/kudu/master/hms_catalog.cc A src/kudu/master/hms_catalog.h M src/kudu/master/master.proto M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/util/net/net_util.cc M src/kudu/util/net/net_util.h 20 files changed, 1,405 insertions(+), 58 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/12/8312/10 -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 10 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 9: (52 comments) http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/hms_client.cc File src/kudu/hms/hms_client.cc: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/hms_client.cc@193 PS8, Line 193: // Also check that the HMS is configured to allow changing the type of : // columns, which is required to support dropping columns. > Could you elaborate on this a bit? It's not intuitive why it needs to be le Added more elaboration. tldr; hive is a real :dumpster_fire2: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/hms_client.cc@202 PS8, Line 202: if (boost::iequals(disallow_incompatible_column_type_changes, "true")) { > Hmm, is it really possible for the config's value to have a mixed case? Ask Yes, I believe it's taken directly from the hive-site.xml so it's possible to have mixed case. Line 185 is matching on a class name, and Java is case sensitive so it's not a valid configuration to screw that up. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/hms_client.cc@218 PS8, Line 218: bool HmsClient::IsConnected() { > Maybe add a bit of test coverage for this new method? Done http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/mini_hms.cc File src/kudu/hms/mini_hms.cc: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/mini_hms.cc@214 PS8, Line 214: jdbc:derby:$1/metadb;create=true > Why the change from 'memory' back to 'directory'? I changed it originally in the hope that it would make things a bit faster if derby didn't have to write to disk. It's being changed back because I've added tests that stop and restart the HMS, and if it's not on disk the HMS will come back with an empty DB. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/CMakeLists.txt File src/kudu/integration-tests/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/CMakeLists.txt@87 PS8, Line 87: ADD_KUDU_TEST(master_hms-itest) > Could you run this new test with KUDU_MEASURE_TEST_CPU_CONSUMPTION=1 (see r Working on this. http://gerrit.cloudera.org:8080/#/c/8312/9/src/kudu/integration-tests/master-stress-test.cc File src/kudu/integration-tests/master-stress-test.cc: http://gerrit.cloudera.org:8080/#/c/8312/9/src/kudu/integration-tests/master-stress-test.cc@343 PS9, Line 343: return Substitute("default.table_$0", oid_generator_.Next()); > Not necessary yet? No, it's not but it doesn't hurt anything so I figured I may as well. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@122 PS8, Line 122: > Could RETURN_NOT_OK instead. Done http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@162 PS8, Line 162: string table_name = Substitute("$0.$1", hms_database_name, hms_table_name); : : // Attempt to create the table before the database is created. : Status s = CreateTable(hms_database_name, hms_table > Could be its own test (i.e. TestCreateTableWithoutDatabase) I've tried to reduce the number of tests cases since spinning up an HMS is very expensive (~10 seconds), so having many short tests becomes really expensive. If you have an idea for how to fix that, or still feel like they should be split up I can go ahead, though. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@190 PS8, Line 190: : // Shutdown the HMS and try to create a table. : ASSERT_OK(StopHms()); : : s = CreateTable(hms_database_name, "foo"); : ASSERT_TRUE(s.IsNetworkError()) << s.ToString(); : : // Start the HMS and try again. : ASSERT_OK(StartHms()); : ASSERT_OK(CreateTable(hms_database_name, "foo")); > Could break out into a separate test since it only shares HMS database stat Likewise. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@207 PS8, Line 207: : // Create the database. : hive::Database db; : db.name = hms_database_name; : ASSERT_OK(hms_client_->CreateDatabase(db)); > Every test does this; consider doing it in the test fixture. Done http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@255 PS8, Line 255: hive::EnvironmentContext env_ctx; > Why is it necessary to provide the table's ID here, on L270, and on L371, b This wasn't really necessary. If you do provide the table ID when dropping a Kudu table, the
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 9: (55 comments) http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/hms_client.cc File src/kudu/hms/hms_client.cc: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/hms_client.cc@193 PS8, Line 193: // Also check that the HMS is configured to allow changing the type of : // columns, which is required to support dropping columns. Could you elaborate on this a bit? It's not intuitive why it needs to be legal to change a column's type in order to drop it. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/hms_client.cc@202 PS8, Line 202: if (boost::iequals(disallow_incompatible_column_type_changes, "true")) { Hmm, is it really possible for the config's value to have a mixed case? Asking because L185 doesn't do a case-insensitive comparison when verifying that the required listeners are in hive.metastore.transactional.event.listeners. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/hms_client.cc@218 PS8, Line 218: bool HmsClient::IsConnected() { Maybe add a bit of test coverage for this new method? http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/mini_hms.cc File src/kudu/hms/mini_hms.cc: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/mini_hms.cc@186 PS8, Line 186: //Configures the HMS to allow altering and dropping columns. Nit: indentation is off by one character. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/hms/mini_hms.cc@214 PS8, Line 214: jdbc:derby:$1/metadb;create=true Why the change from 'memory' back to 'directory'? http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/CMakeLists.txt File src/kudu/integration-tests/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/CMakeLists.txt@87 PS8, Line 87: ADD_KUDU_TEST(master_hms-itest) Could you run this new test with KUDU_MEASURE_TEST_CPU_CONSUMPTION=1 (see run-test.sh) to figure out whether to set RUN_SERIAL or to enumerate the number of PROCESSORS? http://gerrit.cloudera.org:8080/#/c/8312/9/src/kudu/integration-tests/master-stress-test.cc File src/kudu/integration-tests/master-stress-test.cc: http://gerrit.cloudera.org:8080/#/c/8312/9/src/kudu/integration-tests/master-stress-test.cc@343 PS9, Line 343: return Substitute("default.table_$0", oid_generator_.Next()); Not necessary yet? http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@122 PS8, Line 122: Could RETURN_NOT_OK instead. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@162 PS8, Line 162: string table_name = Substitute("$0.$1", hms_database_name, hms_table_name); : : // Attempt to create the table before the database is created. : Status s = CreateTable(hms_database_name, hms_table Could be its own test (i.e. TestCreateTableWithoutDatabase) http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@190 PS8, Line 190: : // Shutdown the HMS and try to create a table. : ASSERT_OK(StopHms()); : : s = CreateTable(hms_database_name, "foo"); : ASSERT_TRUE(s.IsNetworkError()) << s.ToString(); : : // Start the HMS and try again. : ASSERT_OK(StartHms()); : ASSERT_OK(CreateTable(hms_database_name, "foo")); Could break out into a separate test since it only shares HMS database state with the previous test. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@202 PS8, Line 202: Maybe add a test that if the HMS is down renaming fails? http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@207 PS8, Line 207: : // Create the database. : hive::Database db; : db.name = hms_database_name; : ASSERT_OK(hms_client_->CreateDatabase(db)); Every test does this; consider doing it in the test fixture. http://gerrit.cloudera.org:8080/#/c/8312/8/src/kudu/integration-tests/master_hms-itest.cc@243 PS8, Line 243: // TODO(HIVE-18852): match on the error message. : : // Attempt to rename the Kudu table to an invalid table name. : table_alterer.reset(client_->NewTableAlterer(table_name)); : s = table_alterer->RenameTo(Substitute("$0.☃", hms_database_name))->Alter(); : ASSERT_TRUE(s.IsIllegalState()) << s.ToString(); Would be a good negative test for creation too, right?
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, Hao Hao, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8312 to look at the new patch set (#9). Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. KUDU-2191 (6/n): Hive Metastore catalog manager integration This commit adds integration with the Hive Metastore to CatalogManager, which can be enabled with a new flag: 'hive-metastore-addresses'. When the flag is provided, the Master will push DDL changes in its catalog to the Hive Metastore. In particular: - Create table, delete table, and alter table operations in the Kudu master will synchronously update the corresponding entry in the HMS. - New table and column names are required to be valid according to the Hive identifier rules, which are much stricter than Kudu's existing identifier rules. I think the code-paths in hms_catalog are pretty well covered by the tests added in this commit, however there is a dearth of stress and chaos tests covering the actual CatalogManager integration. In particular I've left some TODOs of roll-back code that could benefit from more coverage, as well as a TODO in master-stress-test about enabling HMS integration. Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf --- M src/kudu/hms/hms_client-test.cc M src/kudu/hms/hms_client.cc M src/kudu/hms/hms_client.h M src/kudu/hms/mini_hms.cc M src/kudu/integration-tests/CMakeLists.txt M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/master-stress-test.cc M src/kudu/integration-tests/master_failover-itest.cc A src/kudu/integration-tests/master_hms-itest.cc M src/kudu/master/CMakeLists.txt M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h A src/kudu/master/hms_catalog-test.cc A src/kudu/master/hms_catalog.cc A src/kudu/master/hms_catalog.h M src/kudu/master/master.proto M src/kudu/mini-cluster/external_mini_cluster-test.cc M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/util/net/net_util.cc M src/kudu/util/net/net_util.h 20 files changed, 1,309 insertions(+), 42 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/12/8312/9 -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 9 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 7: (7 comments) http://gerrit.cloudera.org:8080/#/c/8312/5/src/kudu/hms/hms_client.cc File src/kudu/hms/hms_client.cc: http://gerrit.cloudera.org:8080/#/c/8312/5/src/kudu/hms/hms_client.cc@204 PS5, Line 204: "Hive Metastore configuration is invalid: $0 must be set to false", > Add "to support dropping columns" to specify the reason. I'm not against adding that, but consider that this error will occur for every operation that hits the HMS (create table/all alter tables/drop table), not just ALTER TABLE DROP COLUMN ops. In that context I think it makes sense to just have a blanket statement 'this config is required', and not get in to too much detail. This is the same reason I didn't include the reasoning in the kudu metastore plugin check above (plus that one is way more nuanced :). http://gerrit.cloudera.org:8080/#/c/8312/5/src/kudu/integration-tests/alter_table-randomized-test.cc File src/kudu/integration-tests/alter_table-randomized-test.cc: http://gerrit.cloudera.org:8080/#/c/8312/5/src/kudu/integration-tests/alter_table-randomized-test.cc@79 PS5, Line 79: const char* kTableName = "default.test_table"; > Do we need to add a comment that when enable_hive_metastore is true, the ta We're going to have to document this new behavior very prominently, because it will surely trip up pretty much every current Kudu user that goes to turn on the HMS integration. Given that it will hopefully be documented widely in other public facing locations, I'm not sure it's worth calling out in specific unit tests. http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc@176 PS6, Line 176: ASSERT_EQ(table->id(), hms_table.parameters[hms::HmsClient::kKuduTableIdKey]); > Assert master addresses as well? Done http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc@236 PS6, Line 236: // Drop the HMS table entry and rename the table. > I do not quite follow why do we need to drop the hms table entry? Will the I've added the comment, and added the 'happy' path, I was so focused on the edge cases I forgot about that! http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc@245 PS6, Line 245: ASSERT_OK(table_alterer->RenameTo(renamed_table_name)->Alter()); > Assert table id/master addresses/table schema. Done http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc@285 PS6, Line 285: ASSERT_OK(hms_client_->AlterTable(hms_database_name, hms_table_name, hms_table)); > So alter column through HMS would not affect the table schema stored in Kud Correct. http://gerrit.cloudera.org:8080/#/c/8312/7/src/kudu/master/hms_catalog.cc File src/kudu/master/hms_catalog.cc: http://gerrit.cloudera.org:8080/#/c/8312/7/src/kudu/master/hms_catalog.cc@164 PS7, Line 164: create it in the HMS > I am not sure it is good to create a HMS entry if not present. What if the Users can't directly call this method, its called by the catalog manager, who checks that the Kudu table does exist before proceeding with an alter operation. It's important to support this usecase so that users can rename 'legacy' Kudu tables that don't have an entry in the HMS, and have that create the new entry with the corrected name. -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 7 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 18:57:57 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, Hao Hao, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8312 to look at the new patch set (#8). Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. KUDU-2191 (6/n): Hive Metastore catalog manager integration This commit adds integration with the Hive Metastore to CatalogManager, which can be enabled with a new flag: 'hive-metastore-addresses'. When the flag is provided, the Master will push DDL changes in its catalog to the Hive Metastore. In particular: - Create table, delete table, and alter table operations in the Kudu master will synchronously update the corresponding entry in the HMS. - New table and column names are required to be valid according to the Hive identifier rules, which are much stricter than Kudu's existing identifier rules. I think the code-paths in hms_catalog are pretty well covered by the tests added in this commit, however there is a dearth of stress and chaos tests covering the actual CatalogManager integration. In particular I've left some TODOs of roll-back code that could benefit from more coverage, as well as a TODO in master-stress-test about enabling HMS integration. Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf --- M src/kudu/hms/hms_client-test.cc M src/kudu/hms/hms_client.cc M src/kudu/hms/hms_client.h M src/kudu/hms/mini_hms.cc M src/kudu/integration-tests/CMakeLists.txt M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/master-stress-test.cc M src/kudu/integration-tests/master_failover-itest.cc A src/kudu/integration-tests/master_hms-itest.cc M src/kudu/master/CMakeLists.txt M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h A src/kudu/master/hms_catalog-test.cc A src/kudu/master/hms_catalog.cc A src/kudu/master/hms_catalog.h M src/kudu/master/master.proto M src/kudu/mini-cluster/external_mini_cluster-test.cc M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/util/net/net_util.cc M src/kudu/util/net/net_util.h 20 files changed, 1,300 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/12/8312/8 -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 8 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Hao Hao has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 7: (7 comments) http://gerrit.cloudera.org:8080/#/c/8312/5/src/kudu/hms/hms_client.cc File src/kudu/hms/hms_client.cc: http://gerrit.cloudera.org:8080/#/c/8312/5/src/kudu/hms/hms_client.cc@204 PS5, Line 204: "Hive Metastore configuration is invalid: $0 must be set to false", Add "to support dropping columns" to specify the reason. http://gerrit.cloudera.org:8080/#/c/8312/5/src/kudu/integration-tests/alter_table-randomized-test.cc File src/kudu/integration-tests/alter_table-randomized-test.cc: http://gerrit.cloudera.org:8080/#/c/8312/5/src/kudu/integration-tests/alter_table-randomized-test.cc@79 PS5, Line 79: const char* kTableName = "default.test_table"; Do we need to add a comment that when enable_hive_metastore is true, the table name has to have databasename.tablename pattern? http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc@176 PS6, Line 176: ASSERT_EQ(table->id(), hms_table.parameters[hms::HmsClient::kKuduTableIdKey]); Assert master addresses as well? http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc@236 PS6, Line 236: // Drop the HMS table entry and rename the table. I do not quite follow why do we need to drop the hms table entry? Will the drop table in HMS also trigger drop table in Kudu? If you are testing the corner case when the entry is not present in HMS, maybe add more comment here to be more clear. Can we have a most common rename table test case without any corner cases? http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc@245 PS6, Line 245: ASSERT_OK(table_alterer->RenameTo(renamed_table_name)->Alter()); Assert table id/master addresses/table schema. http://gerrit.cloudera.org:8080/#/c/8312/6/src/kudu/integration-tests/master_hms-itest.cc@285 PS6, Line 285: ASSERT_OK(hms_client_->AlterTable(hms_database_name, hms_table_name, hms_table)); So alter column through HMS would not affect the table schema stored in Kudu? http://gerrit.cloudera.org:8080/#/c/8312/7/src/kudu/master/hms_catalog.cc File src/kudu/master/hms_catalog.cc: http://gerrit.cloudera.org:8080/#/c/8312/7/src/kudu/master/hms_catalog.cc@164 PS7, Line 164: create it in the HMS I am not sure it is good to create a HMS entry if not present. What if the users make some mistakes when specifying the original table name? -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 7 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 06:40:50 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, Hao Hao, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8312 to look at the new patch set (#6). Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. KUDU-2191 (6/n): Hive Metastore catalog manager integration This commit adds integration with the Hive Metastore to CatalogManager, which can be enabled with a new flag: 'hive-metastore-addresses'. When the flag is provided, the Master will push DDL changes in its catalog to the Hive Metastore. In particular: - Create table, delete table, and alter table operations in the Kudu master will synchronously update the corresponding entry in the HMS. - New table and column names are required to be valid according to the Hive identifier rules, which are much stricter than Kudu's existing identifier rules. I think the code-paths in hms_catalog are pretty well covered by the tests added in this commit, however there is a dearth of stress and chaos tests covering the actual CatalogManager integration. In particular I've left some TODOs of roll-back code that could benefit from more coverage, as well as a TODO in master-stress-test about enabling HMS integration. Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf --- M src/kudu/hms/hms_client-test.cc M src/kudu/hms/hms_client.cc M src/kudu/hms/hms_client.h M src/kudu/hms/mini_hms.cc M src/kudu/integration-tests/CMakeLists.txt M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/master-stress-test.cc M src/kudu/integration-tests/master_failover-itest.cc A src/kudu/integration-tests/master_hms-itest.cc M src/kudu/master/CMakeLists.txt M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h A src/kudu/master/hms_catalog-test.cc A src/kudu/master/hms_catalog.cc A src/kudu/master/hms_catalog.h M src/kudu/master/master.proto M src/kudu/mini-cluster/external_mini_cluster-test.cc M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/util/net/net_util.cc M src/kudu/util/net/net_util.h 20 files changed, 1,263 insertions(+), 24 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/12/8312/6 -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 6 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 4: (7 comments) http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/CMakeLists.txt File src/kudu/integration-tests/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/CMakeLists.txt@79 PS2, Line 79: ADD_KUDU_TEST(fuzz-itest RUN_SERIAL true) > I think you missed this one. Done http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc@1066 PS2, Line 1066: return s; > I think you missed this one. I just did it in reverse order of Init(), I'm not aware of any ordering issues per se, as long as Shutdown() isn't called concurrently with any other use of the hms catalog. http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc@1598 PS2, Line 1598: } > Hmm, the new behavior treats HMS table entry dropping as fatal, so this new Done http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc@2092 PS2, Line 2092: case AlterTableRequestPB::DROP_RANGE_PARTITION: { > I think you missed this. Done http://gerrit.cloudera.org:8080/#/c/8312/4/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/8312/4/src/kudu/master/catalog_manager.cc@1561 PS4, Line 1561: // TODO(dan): figure out how to test this. > You can inject faults into SysCatalogTable::SyncWrite. Maybe that will help That's a good idea, I'm going to leave these here for now until I can write some fault tests. Until then, I've changed MasterFailoverTest to have the HMS enabled and it does appear to be triggering these codepaths. http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h File src/kudu/master/hms_catalog.h: http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h@29 PS3, Line 29: #include "kudu/util/net/net_util.h" > Not using optional anymore? right, but HmsClient is now a field so I don't think it can be forward declared. http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h@51 PS3, Line 51: ~HmsCatalog(); > Please doc the class and its methods. I'm especially interested in the sync I've documented the public api in the .h, there are already pretty extensive notes in the .cc about synchronization, retry behavior, etc. -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 4 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Thu, 08 Mar 2018 00:47:12 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, Hao Hao, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8312 to look at the new patch set (#5). Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. KUDU-2191 (6/n): Hive Metastore catalog manager integration This commit adds integration with the Hive Metastore to CatalogManager, which can be enabled with a new flag: 'hive-metastore-addresses'. When the flag is provided, the Master will push DDL changes in its catalog to the Hive Metastore. In particular: - Create table, delete table, and alter table operations in the Kudu master will synchronously update the corresponding entry in the HMS. - New table and column names are required to be valid according to the Hive identifier rules, which are much stricter than Kudu's existing identifier rules. I think the code-paths in hms_catalog are pretty well covered by the tests added in this commit, however there is a dearth of stress and chaos tests covering the actual CatalogManager integration. In particular I've left some TODOs of roll-back code that could benefit from more coverage, as well as a TODO in master-stress-test about enabling HMS integration. Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf --- M src/kudu/hms/hms_client-test.cc M src/kudu/hms/hms_client.cc M src/kudu/hms/hms_client.h M src/kudu/hms/mini_hms.cc M src/kudu/integration-tests/CMakeLists.txt M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/master-stress-test.cc M src/kudu/integration-tests/master_failover-itest.cc A src/kudu/integration-tests/master_hms-itest.cc M src/kudu/master/CMakeLists.txt M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h A src/kudu/master/hms_catalog-test.cc A src/kudu/master/hms_catalog.cc A src/kudu/master/hms_catalog.h M src/kudu/master/master.proto M src/kudu/mini-cluster/external_mini_cluster-test.cc M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/util/net/net_util.cc M src/kudu/util/net/net_util.h 20 files changed, 1,263 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/12/8312/5 -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 5 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 4: (10 comments) Just responding to your responses, will defer a rereview until the tests pass (it's been long enough since my first review that I've forgotten all of the new code). http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/CMakeLists.txt File src/kudu/integration-tests/CMakeLists.txt: http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/CMakeLists.txt@79 PS2, Line 79: ADD_KUDU_TEST(fuzz-itest RUN_SERIAL true) > Nit: sort order. I think you missed this one. http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/master_hms-itest.cc@75 PS2, Line 75: op()); > Maybe use ExternalMiniClusterITestBase? Done http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/master_hms-itest.cc@79 PS2, Line 79: Status StopHms() { > Can this be omitted? Done http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/master_hms-itest.cc@112 PS2, Line 112: b.AddColumn("decimal32_val")->Type(KuduColumnSchema::DECIMAL) > ??? The table_name() function expects a cref string, so std::move(table_name) has no effect. http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc@1066 PS2, Line 1066: return s; > The order of operations in CatalogManager::Shutdown is tricky and has bitte I think you missed this one. http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc@1598 PS2, Line 1598: } > Why is it OK if this succeeds but step 3 fails? I can take my answer in the Hmm, the new behavior treats HMS table entry dropping as fatal, so this new comment is no longer correct. http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc@2092 PS2, Line 2092: case AlterTableRequestPB::DROP_RANGE_PARTITION: { > Shouldn't some aspect of this be conditioned on one (or several) of has_foo I think you missed this. http://gerrit.cloudera.org:8080/#/c/8312/4/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/8312/4/src/kudu/master/catalog_manager.cc@1561 PS4, Line 1561: // TODO(dan): figure out how to test this. You can inject faults into SysCatalogTable::SyncWrite. Maybe that will help? http://gerrit.cloudera.org:8080/#/c/8312/4/src/kudu/master/catalog_manager.cc@2253 PS4, Line 2253: // TODO(dan): figure out how to test this. See an earlier comment about injecting failures into syscatalog writes. http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h File src/kudu/master/hms_catalog.h: http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h@29 PS3, Line 29: #include "kudu/util/net/net_util.h" > optional seems to require the include in order to compile. Not using optional anymore? -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 4 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Hao Hao Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Tue, 06 Mar 2018 19:48:33 + Gerrit-HasComments: Yes
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/8312 ) Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. Patch Set 4: (39 comments) http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/master_hms-itest.cc@144 PS2, Line 144: db.name = hms_database_name; > I think you can use ContainsKey here, from map-util.h. ContainsKey doesn't work on vector. http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/integration-tests/master_hms-itest.cc@155 PS2, Line 155: s = CreateTable(hms_database_name, hms_table_name); > ContainsKey here too. Ack http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/integration-tests/master_hms-itest.cc@102 PS3, Line 102: b.AddColumn("int8_val")->Type(KuduColumnSchema::INT8); > warning: passing result of std::move() as a const reference argument; no mo Done http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.h@33 PS2, Line 33: #include > Not used? Done http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc@231 PS2, Line 231: TAG_FLAG(catalog_manager_inject_latency_load_ca_info_ms, hidden); > Should probably beef this up a bit to explain what setting this actually do Done http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc@634 PS2, Line 634: // Propagate the 'read_default' to the 'write_default' in 'col', > 9083 is the default HMS port? Probably deserves more visibility than inline Done http://gerrit.cloudera.org:8080/#/c/8312/2/src/kudu/master/catalog_manager.cc@1415 PS2, Line 1415: > It's worth noting what happens to this table if step e fails. Done http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/catalog_manager.cc@231 PS3, Line 231: TAG_FLAG(catalog_manager_inject_latency_load_ca_info_ms, hidden); > a bit more docs here, eg what the multiple addrs mean, etc. Also does the H Done http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/catalog_manager.cc@1419 PS3, Line 1419: } > I think it's worth logging such errors as well, since it may be an end user Done http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h File src/kudu/master/hms_catalog.h: http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h@29 PS3, Line 29: #include "kudu/util/net/net_util.h" > Seems like you may be able to forward declare HmsClient. optional seems to require the include in order to compile. http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h@42 PS3, Line 42: // > Maybe define this as a private nested class of HmsCatalog, to make it clear Done http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h@70 PS3, Line 70: st, Te > Can you get away with forward declaring Schema? Done http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h@90 PS3, Line 90:std::string* hms_table) WARN_UNUSED_RESULT; > Maybe it shouldn't be a class member then? It could be declared on the stac Done http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.h@91 PS3, Line 91: > Worth a comment explaining why this is optional. Done http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc File src/kudu/master/hms_catalog.cc: http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc@31 PS3, Line 31: #include "kudu/common/schema.h" : #include "kudu/gutil/strings/split.h > Why do you need these? Done http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc@34 PS3, Line 34: #include "kudu/hms/hive_metastore_constants.h" > Not used? Done http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc@38 PS3, Line 38: #include "kudu/util/net/net_util.h" : #include "kudu/util/thr > These aren't used. Done http://gerrit.cloudera.org:8080/#/c/8312/3/src/kudu/master/hms_catalog.cc@73 PS3, Line 73: RETURN_NOT_OK(ThreadPoolBuilder("hms-catalog") > Should this also null out worker_? What's the expected behavior of multiple Stop() should be idempotent, which I think it is here, assuming ThreadJoiner::Join is idempotent (which it appears to be from reading its source).
[kudu-CR] KUDU-2191 (6/n): Hive Metastore catalog manager integration
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, Todd Lipcon, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/8312 to look at the new patch set (#4). Change subject: KUDU-2191 (6/n): Hive Metastore catalog manager integration .. KUDU-2191 (6/n): Hive Metastore catalog manager integration This commit adds integration with the Hive Metastore to CatalogManager, which can be enabled with a new flag: 'hive-metastore-addresses'. When the flag is provided, the Master will push DDL changes in its catalog to the Hive Metastore. In particular: - Create table, delete table, and alter table operations in the Kudu master will synchronously update the corresponding entry in the HMS. - New table and column names are required to be valid according to the Hive identifier rules, which are much stricter than Kudu's existing identifier rules. I think the code-paths in hms_catalog are pretty well covered by the tests added in this commit, however there is a dearth of stress and chaos tests covering the actual CatalogManager integration. In particular I've left some TODOs of roll-back code that could benefit from more coverage, as well as a TODO in master-stress-test about enabling HMS integration. Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf --- M src/kudu/hms/hms_client-test.cc M src/kudu/hms/hms_client.cc M src/kudu/hms/hms_client.h M src/kudu/hms/mini_hms.cc M src/kudu/integration-tests/CMakeLists.txt M src/kudu/integration-tests/alter_table-randomized-test.cc M src/kudu/integration-tests/master-stress-test.cc A src/kudu/integration-tests/master_hms-itest.cc M src/kudu/master/CMakeLists.txt M src/kudu/master/catalog_manager.cc M src/kudu/master/catalog_manager.h A src/kudu/master/hms_catalog-test.cc A src/kudu/master/hms_catalog.cc A src/kudu/master/hms_catalog.h M src/kudu/master/master.proto M src/kudu/mini-cluster/external_mini_cluster-test.cc M src/kudu/mini-cluster/external_mini_cluster.cc M src/kudu/util/net/net_util.cc M src/kudu/util/net/net_util.h 19 files changed, 1,214 insertions(+), 10 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/12/8312/4 -- To view, visit http://gerrit.cloudera.org:8080/8312 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I1b8f360408b05d6ac4f7359a8ac0dd889b196baf Gerrit-Change-Number: 8312 Gerrit-PatchSet: 4 Gerrit-Owner: Dan BurkertGerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Dan Burkert Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Reviewer: Todd Lipcon