Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8313 )
Change subject: KUDU-2191 (5/n): WIP: Hive Metastore notification log event listener ...................................................................... Patch Set 3: (24 comments) http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/integration-tests/master_hms-itest.cc File src/kudu/integration-tests/master_hms-itest.cc: http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/integration-tests/master_hms-itest.cc@183 PS3, Line 183: ASSERT_EQ(0, CountTableRows(table.get())); Is this important? OpenTable() already verified the existence of the table, no? http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/integration-tests/master_hms-itest.cc@188 PS3, Line 188: bool exists; Nit: I think you're indenting two chars too much here and in the other ASSERT_EVENTUALLY. http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/catalog_manager.h@520 PS3, Line 520: // Delete the specified table Please update the method docs (for AlterTable too). http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/catalog_manager.h@527 PS3, Line 527: boost::optional<int64_t> notification_log_event_id = boost::none); Nit: maybe hms_notification_log_event_id so it's clear this is an HMS construct? http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/catalog_manager.cc@1600 PS3, Line 1600: if (notification_log_event_id) { Should include a comment about how if there's this ID, it means the request originated in HMS so we shouldn't go right back to HMS to do it. In AlterTable too. http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/catalog_manager.cc@1601 PS3, Line 1601: actions.hms_notification_log_event_id = notification_log_event_id; std::move? I thought optionals should be moved, though maybe that depends on their contents. http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.h File src/kudu/master/hms_catalog.h: http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.h@54 PS3, Line 54: HmsCatalog(std::vector<HostPort> addresses, CatalogManager* catalog_manager); This dependency implies something about the create/destruct ordering of HmsCatalog w.r.t. CatalogManager (and its SysCatalogTable). That's worth documenting in CatalogManager::Init() and Shutdown(), and possibly here too. http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc File src/kudu/master/hms_catalog.cc: http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@270 PS3, Line 270: Status s = catalog_manager_->sys_catalog()->GetNotificationLogEventId(&last_event_id); You may want to have CatalogManager mediate this access, so it can fail when the SysCatalogTable doesn't exist, or when the catalog manager isn't the leader in a multi-master group, etc. Plus it means HmsCatalog won't need sys_catalog.h. http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@306 PS3, Line 306: if (rpc_queue_.empty()) { What happened here? http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@343 PS3, Line 343: CatalogManager::ScopedLeaderSharedLock l(catalog_manager_); : if (!l.first_failed_status().ok()) { : VLOG(1) << "Catalog manager not leader; skipping notification log retrieval: " : << l.first_failed_status().ToString(); : continue; : } Why is this necessary? This will be checked by the CatalogManager in AlterTable/DropTable anyway. http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@360 PS3, Line 360: LOG(WARNING) << "Failed to retrieve notification log events: " << s.ToString(); This was already warned on L356. http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@361 PS3, Line 361: continue; What's preventing this from creating a tight loop, if GetNotificationEvents() fails over and over? Worse, if the failure was an IOError, when we loop back we'll dereference a null pointer on L354. http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@373 PS3, Line 373: if (event.eventType == "ALTER_TABLE") { : s = HandleAlterTableEvent(event); : } else if (event.eventType == "DROP_TABLE") { : s = HandleDropTableEvent(event); : } I'm too lazy to check, but can eventType be switched on? http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@381 PS3, Line 381: LOG(INFO) << "Term changed while processing notification log; " : "skipping remaining entries: " : << s.ToString(); Can't you extract this from the status instead of relying on the ScopedLeaderSharedLock instead? http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@421 PS3, Line 421: Document doc; Unused? http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@423 PS3, Line 423: if (event.messageFormat != "json-0.2" || Might be nice to include a comment with a sample JSON representation of an event, so this is more clear. http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@429 PS3, Line 429: return Status::Corruption("invalid Hive Metastore event", table_name); Should we include the event too, so we can more easily debug what happened? http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@432 PS3, Line 432: hive::Table before_table; : const Value& beforeTableObjJson = document[kBeforeTableJsonKey]; : RETURN_NOT_OK(hms::HmsClient::DeserializeJsonTable(Slice(beforeTableObjJson.GetString(), : beforeTableObjJson.GetStringLength()), : &before_table)); : : const string* storage_handler = : FindOrNull(before_table.parameters, hive::g_hive_metastore_constants.META_TABLE_STORAGE); : if (!storage_handler || *storage_handler != hms::HmsClient::kKuduStorageHandler) { : // Not a Kudu table; skip it. : return Status::OK(); : } This is kind of duplicated below; can you decompose it into a helper method? http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@492 PS3, Line 492: Document doc; : CHECK_EQ(event.messageFormat, "json-0.2"); : CHECK(!doc.Parse<0>(event.message.c_str()).HasParseError()); : CHECK(doc.HasMember(kTableJsonKey)) << event; : CHECK(doc[kTableJsonKey].IsString()); Holdover from before your checks below? http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/hms_catalog.cc@498 PS3, Line 498: if (event.messageFormat != "json-0.2" || Likewise, a comment here with a sample JSON. http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/sys_catalog.h File src/kudu/master/sys_catalog.h: http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/sys_catalog.h@121 PS3, Line 121: HMS_NOTIFICATION_LOG = 5, // HMS notification log latest event ID. Would be good to explain somewhere (maybe in the class comment) what the data model for this is. From reading the code I see it's a singleton that's always UPSERTed. http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/sys_catalog.h@170 PS3, Line 170: Status GetNotificationLogEventId(int64_t* event_id) WARN_UNUSED_RESULT; Nit: could you work the word 'latest' or 'newest' into the name, since it's important? http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/sys_catalog.cc File src/kudu/master/sys_catalog.cc: http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/sys_catalog.cc@647 PS3, Line 647: *event_id = entry_data.value(); There should only be one of these entries, right? So we could DCHECK that event_id is -1? http://gerrit.cloudera.org:8080/#/c/8313/3/src/kudu/master/sys_catalog.cc@810 PS3, Line 810: enc.Add(RowOperationsPB::UPSERT, row); How do we guarantee that, if this becomes an UPDATE, event_id is newer than whatever was stored before? Do we just trust the HMS on this? -- To view, visit http://gerrit.cloudera.org:8080/8313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32ed099c44a593ffe514152135957018f21ed775 Gerrit-Change-Number: 8313 Gerrit-PatchSet: 3 Gerrit-Owner: Dan Burkert <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Thu, 19 Oct 2017 00:52:31 +0000 Gerrit-HasComments: Yes
