Bharath Vissapragada has posted comments on this change. ( http://gerrit.cloudera.org:8080/8449 )
Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber instances ...................................................................... Patch Set 2: (7 comments) http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore-subscriber.h File be/src/statestore/statestore-subscriber.h: http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore-subscriber.h@165 PS2, Line 165: typedef TUniqueId RegistrationId; > You can move this typedef to statestore.h and use the same type in statesto Done http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h File be/src/statestore/statestore.h: http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@383 PS2, Line 383: SubscriberId > Where ever the 'SubscriberId' is > required, we just get it from the Subscriber object anyway, and > that object can be retrieved using the unique registration id. Not sure I follow. If you see OfferUpdate()/DoSubscriberUpdate(), we only keep track of ScheduledSubscriberUpdate objects and get the corresponding Subscriber based on ScheduledSubscriberUpdate.subscriber_Id. So we don't have a Subscriber object handy to get the subscriber_id in all cases. Also, if we remove SubscriberId everywhere, this means that we need to change the subscribers_ map structure to map from RegistrationId -> Subscriber objects. Doing so we can't look up by subscriber_id, which is required in RegisterSubscriber() (At that point, we don't assign a RegistrationId yet to the new instance) SubscriberMap::iterator subscriber_it = subscribers_.find(subscriber_id); if (subscriber_it != subscribers_.end()) { UnregisterSubscriber(subscriber_it->second.get()); } We can still figure out a way, but it seemed unnecessarily complex to me. Thoughts? http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@481 PS2, Line 481: subscriber exists > Could you clarify what "exists" means here exactly? It could be confused wi Done http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@482 PS2, Line 482: registration_ids > nit: registration_id Done http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@484 PS2, Line 484: std::shared_ptr<Subscriber>* subscriber > Add a comment about what is returned in this out parameter. clarified http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.cc File be/src/statestore/statestore.cc: http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.cc@414 PS2, Line 414: onst SubscriberId& subscriber_id, : const TUniqueId& registration_id > It looks like it just makes sense to pass 'const Subscriber&' here? Is ther Not sure I understand. We get the subscriber/registration_id from the ScheduledSubscriberUpdate object and not the Subscriber object. Do you mean we should pass directly pass ScheduledSubscriberUpdate instead? If thats the case, the signature seems kinda weird :) http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.cc@634 PS2, Line 634: if (!RegisteredSubscriberExists(subscriber_to_update.first, subscriber_to_update.second, > I'm a little worried that we're contending for a mutex two more times in th Good point. I too think (theoretically) spinlock is probably a better choice to avoid context switching. Also, ~1000 entries seem like a reasonable estimate for foreseeable future :). Also, I think for string based hashing of ~1000 entries, it is reasonable to assume a O(1) average case lookup (even though the worst case is O(N)). I created a microbenchmark to see if the lock type makes a difference. In the benchmark, I measured how long it takes to get 100 heartbeats for a given subscriber (with heart beating thread pool sizes of 10/100). I didn't see any noticeable difference for 100 subscribers, but beyond that, the test runs into flaky socket connection issues. I admit that this is not representative of the real world use case because in real clusters, the statestore CPU would be much busier and the context switching could be more expensive. -- To view, visit http://gerrit.cloudera.org:8080/8449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65 Gerrit-Change-Number: 8449 Gerrit-PatchSet: 2 Gerrit-Owner: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com> Gerrit-Comment-Date: Fri, 03 Nov 2017 06:22:18 +0000 Gerrit-HasComments: Yes