Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/8313 )
Change subject: KUDU-2191 (10/n): Hive Metastore notification log event listener ...................................................................... Patch Set 7: (13 comments) http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/catalog_manager.h File src/kudu/master/catalog_manager.h: http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/catalog_manager.h@980 PS7, Line 980: Protected by leader_lock_; I think it makes more sense to protect with 'lock_', since this is cached in-memory state much like the info maps. But see the comment I left around StoreLatestNotificationLogEventId; I think this will be cleanest if converted into a CowObject. http://gerrit.cloudera.org:8080/#/c/8313/4/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/8313/4/src/kudu/master/catalog_manager.cc@2285 PS4, Line 2285: es while the HMS is : // unavailable, because that would cause the catalogs t > I've added more commentary, let me know if it's still confusing. I was actually wrong in my statement anyway: we don't roll back HMS-initiated DROP TABLE either. The only inconsistency between the two in behavior is that we'll send an ALTER to HMS even if it was originally initiated by HMS. You wrote that this helps keep the catalogs converged. Can you provide an example scenario? I'm asking because this seems a little fragile to me: I think it'd be easy for someone to accidentally (or intentionally) change HMS's no-op ALTER behavior to publish notification events anyway, at which point this ALTER would generate an infinite loop. http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/catalog_manager.cc File src/kudu/master/catalog_manager.cc: http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/catalog_manager.cc@2304 PS7, Line 2304: Rollback is skipped when the : // alter is in response to a notification log event, since unhandled : // notification log events are automatically retried during the next polling : // period. This seems somewhat fragile since there's nothing actually enforcing that the polling task retries events whose Kudu-side changes failed. Meaning, someone could change the retry behavior tomorrow and then we'd have diverging catalogs. Wouldn't it be simpler to do the rollback here anyway? SysCatalogTable::Write() failures are super rare, so it's not like it'd be a perf hit to rollback, right? http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/catalog_manager.cc@2423 PS7, Line 2423: // Update the cached HMS notification log event ID, if it changed. I think it'd be cleaner if you routed all requests to update hms_notification_log_event_id_ through StoreLatestNotificationLogEventId(). For example, it'll let you centralize all of the locking in one place. To do that you'll need to parameterize the SysCatalogTable::Write(). http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/catalog_manager.cc@3935 PS7, Line 3935: Status CatalogManager::InitLatestNotificationLogEventId() { DCHECK(hms_catalog_) here too, and in StoreLatest? http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/catalog_manager.cc@3949 PS7, Line 3949: // It wouldn't typically be correct to write to the field under only a read : // lock, however this instance is safe because it's only called by the : // HmsNotificationLogListenerTask singleton thread. I'm not a fan of this shortcut. Why not use proper locking? Something like this: 1. Create lock::guard for some mutex (maybe a new one). 2. Create unique_lock lock_ 3. Short-circuit success if hms_notification_log_event_id_ >= event_id 4. Release lock_ 5. Perform the Write() 6. If Write() failed, return the error 7. Lock the unique_lock for lock_ 8. Update hms_notification_log_event_id_ 9. Return success It's actually not much more work to store hms_notification_log_event_id_ as a CowObject and protect it via CowLock. It may even be simpler, since it'll hide the new mutex from the rest of CatalogManager by building it (well, an RWCLock) into the CowObject. Then this looks something like this: 1. CowLock hms_notification_log_event_id_ for write mode 2. Short-circuit success if hms_notification_log_event_id_ >= event_id 3. Update the value of hms_notification_log_event_id_ via mutable_data 4. Perform the Write() 5. If Write() failed, return the error 6. Commit the CowLock 7. Return success http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/hms_notification_log_listener.h File src/kudu/master/hms_notification_log_listener.h: http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/hms_notification_log_listener.h@86 PS7, Line 86: // Initialize the HMS notification log listener. The catalog manager must be Nit: Initializes (to match the tense in Handle*Event). Please do the same for Init, Shutdown, and Poll too. http://gerrit.cloudera.org:8080/#/c/8313/4/src/kudu/master/hms_notification_log_listener.cc File src/kudu/master/hms_notification_log_listener.cc: http://gerrit.cloudera.org:8080/#/c/8313/4/src/kudu/master/hms_notification_log_listener.cc@35 PS4, Line 35: #include "kudu/hms/hive_metastore_constants.h" > I'm always in favor of shorter flag names over longer, however I feel it's OK, that's a reasonable argument. http://gerrit.cloudera.org:8080/#/c/8313/4/src/kudu/master/hms_notification_log_listener.cc@116 PS4, Line 116: ask::Poll( > Isn't that just a special case of 'not leader'? I suppose, but if you look at the various exit paths out of the ScopedLeaderSharedLock constructor, you'll see that not every error message is about not being the leader. Anyway, it's just a VLOG, so I don't really care. http://gerrit.cloudera.org:8080/#/c/8313/4/src/kudu/master/hms_notification_log_listener.cc@291 PS4, Line 291: // Kudu catalog's metadata for the table. Checking the name in addition to the : // ID prevents the HMS from altering a Kudu table when it has an outdated : / > No, the columns could have changed or some other piece of important metadat Yeah but you're not including the new schema in 'req'. I assumed that was intentional; that this is for renames only. Was not the case? http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/hms_notification_log_listener.cc File src/kudu/master/hms_notification_log_listener.cc: http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/hms_notification_log_listener.cc@87 PS7, Line 87: CHECK_OK(ThreadJoiner(thread_.get()).Join()); Should probably null out thread_ when you're done so the CHECK on L85 will fire if someone calls Shutdown twice. http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/hms_notification_log_listener.cc@201 PS7, Line 201: // The durable event ID gets updated every time we make a change in response : // to a log notification, however not every log notification results in a : // change (for instance, a notification pertaining to a Parquet table). To : // avoid replaying these notifications we persist the latest processed : // notification log event ID after polling. This is best effort, since failing : // to update the ID should only results in wasted work, not an unsynchronized : // catalog. Hmm, upon second thought, updating the event ID with every reflected operation could lead to a lot of unnecessary deltas in the master tablet after having worked through a particularly large poll. Perhaps we shouldn't do that, and just update once here when the poll is finished? The disadvantage to doing just one update is that if we lose leadership as we're processing HMS events, the new leader master will have to retry the entire HMS event set. But maybe that exact scenario is rare enough that the trade-off (reduced delta compaction in the master) is worth it. Certainly the code is less complex if we just update once. http://gerrit.cloudera.org:8080/#/c/8313/7/src/kudu/master/hms_notification_log_listener.cc@290 PS7, Line 290: drop alter -- To view, visit http://gerrit.cloudera.org:8080/8313 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32ed099c44a593ffe514152135957018f21ed775 Gerrit-Change-Number: 8313 Gerrit-PatchSet: 7 Gerrit-Owner: Dan Burkert <danburk...@apache.org> Gerrit-Reviewer: Adar Dembo <a...@cloudera.com> Gerrit-Reviewer: Dan Burkert <danburk...@apache.org> Gerrit-Reviewer: Hao Hao <hao....@cloudera.com> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Tidy Bot Gerrit-Comment-Date: Wed, 11 Apr 2018 18:31:38 +0000 Gerrit-HasComments: Yes