Hello Sailesh Mukil, Dimitris Tsirogiannis, Alex Behm, Dan Hecht, I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/8449 to look at the new patch set (#8). Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber instances ...................................................................... IMPALA-3613: Avoid topic updates to unregistered subscriber instances Bug: Without this patch, when a subscriber repeatedly reconnects to the statestore, the latter queues the initial heartbeat message and a bunch of topic updates to every instance of the registered subscriber. These queued updates are eventually picked up by the heartbeating/topic update threads and the corresponding RPCs are made to the subscribers. The subscriber then rejects these updates since they were meant for an earlier registration. This is usually possible if the subscriber has some network problems leading to failing RPCs. Such a node is eventually marked by the statestore as bad, but depending on the configurations set, the issue can snowball into DDOS kind of attack when the entire thread pool of heartbeating/topic updates is filled with instances from the problematic host. This can result in the statestore missing timely heartbeats to other subscribers making them reconnect. This worsens the situation and the resulting topic updates for the reconnects will fully saturate the network on the statestore host, until the statestore daemon is restarted. Fix: This patch maps topic updates/heartbeats to a specific subscriber registered instance rather to a subscriber id (that stays same across reconnects). That way, when we encounter a topic update that was meant to a stale subscriber, we can simply reject it. Testing: Tested this locally by adding relevant logging. I made the subscribers to reconnect aggressively(a) and delaying heartbeats from the statestore side (b,c). (a) --statestore_subscriber_timeout_seconds=1 (b) --statestore_max_missed_heartbeats=1000 (c) --statestore_heartbeat_frequency_ms=60000 Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65 --- M be/src/statestore/statestore-subscriber.cc M be/src/statestore/statestore-subscriber.h M be/src/statestore/statestore.cc M be/src/statestore/statestore.h 4 files changed, 88 insertions(+), 45 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/8449/8 -- To view, visit http://gerrit.cloudera.org:8080/8449 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65 Gerrit-Change-Number: 8449 Gerrit-PatchSet: 8 Gerrit-Owner: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com> Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogian...@cloudera.com> Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com>