[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..

IMPALA-3613: Avoid topic updates to unregistered subscriber instances

Bug:

Without this patch, when a subscriber repeatedly reconnects to the
statestore, the statestore queues the initial heartbeat message and a
batch of topic updates for every registration instance of that
subscriber. These queued updates are eventually picked up by the
heartbeat/topic-update threads, which issue the corresponding RPCs to
the subscriber. The subscriber then rejects these updates because they
were meant for an earlier registration. This typically happens when
the subscriber has network problems that cause RPCs to fail.

Such a node is eventually marked as bad by the statestore, but
depending on the configuration, the issue can snowball into a
DDoS-like situation in which the entire heartbeat/topic-update thread
pool is filled with instances from the problematic host. The
statestore can then miss timely heartbeats to other subscribers,
causing them to reconnect. This worsens the situation, and the topic
updates resulting from those reconnects fully saturate the network on
the statestore host until the statestore daemon is restarted.

Fix:

This patch maps topic updates/heartbeats to a specific registered
subscriber instance rather than to a subscriber id (which stays the
same across reconnects). That way, when we encounter a topic update
that was meant for a stale subscriber instance, we can simply reject it.
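
The idea in the Fix paragraph can be sketched roughly as follows. This
is only an illustration, not the code in this change: the class and
function names (SubscriberRegistry, QueuedUpdate, ShouldSend) and the
use of an integer counter as the registration id are assumptions made
for the example.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical, simplified types; the real statestore classes differ.
using SubscriberId = std::string;
using RegistrationId = std::uint64_t;  // assumed: unique per (re)registration

class SubscriberRegistry {
 public:
  // Registers (or re-registers) a subscriber and returns the id of this
  // particular registration. A reconnect replaces the previous registration.
  RegistrationId Register(const SubscriberId& id) {
    RegistrationId reg = next_registration_id_++;
    current_[id] = reg;
    return reg;
  }

  // Returns true only if 'reg' is the subscriber's current registration.
  bool IsCurrent(const SubscriberId& id, RegistrationId reg) const {
    auto it = current_.find(id);
    return it != current_.end() && it->second == reg;
  }

 private:
  std::map<SubscriberId, RegistrationId> current_;
  RegistrationId next_registration_id_ = 1;
};

// A queued update remembers which registration it was scheduled for,
// not just which subscriber id.
struct QueuedUpdate {
  SubscriberId subscriber_id;
  RegistrationId registration_id;
};

// Returns true if the update should be sent, false if it targets a stale
// registration and should be dropped without making an RPC.
bool ShouldSend(const SubscriberRegistry& registry, const QueuedUpdate& update) {
  return registry.IsCurrent(update.subscriber_id, update.registration_id);
}
```

With this shape, an update queued before a reconnect no longer matches
the current registration and can be dropped on the statestore side.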

Testing:

Tested this locally by adding relevant logging. I made the subscribers
reconnect aggressively (a) and delayed heartbeats from the statestore
side (b, c).

(a) --statestore_subscriber_timeout_seconds=1
(b) --statestore_max_missed_heartbeats=1000
(c) --statestore_heartbeat_frequency_ms=6

Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Reviewed-on: http://gerrit.cloudera.org:8080/8449
Reviewed-by: Bharath Vissapragada 
Tested-by: Impala Public Jenkins
---
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
4 files changed, 86 insertions(+), 45 deletions(-)

Approvals:
  Bharath Vissapragada: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/8449
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 10
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 9:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1478/


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 9
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 15 Nov 2017 20:32:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 9: Code-Review+2

Carrying +2.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 9
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 15 Nov 2017 20:31:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-15 Thread Bharath Vissapragada (Code Review)
Hello Sailesh Mukil, Dimitris Tsirogiannis, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8449

to look at the new patch set (#9).

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..

IMPALA-3613: Avoid topic updates to unregistered subscriber instances

Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
---
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
4 files changed, 86 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/8449/9

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 9
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-15 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.h@381
PS8, Line 381:  both kinds of subscriber up
> or just say "in Unix time"
Done


http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.cc@415
PS8, Line 415: Id& registration_id, share
> Seems like this should just be "FindSubscriber()" or "FindRegisteredSubscri
FindSubscriber() sounds better.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 15 Nov 2017 20:28:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 8: Code-Review+2

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.h@381
PS8, Line 381: in microseconds since epoch)
or just say "in Unix time"


http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h@385
PS3, Line 385: strationId of the registered subscriber
> Discussed this a little more with Dimitris, leaving it as-is for now. We di
Thanks. Yes, let's avoid shared_ptrs and especially weak_ptrs, and move toward 
single ownership when possible.


http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc@278
PS3, Line 278:   lock_guard l(subscribers_lock_);
 :   lock_guard t(topic_lock_);
> Fair point, I'll revert the spinlock change. Maybe we can address it again
Note that SpinLock is not a traditional spin-lock -- it's adaptive and will 
block like a mutex after attempting to spin for a while. So, it's pretty 
general-purpose.
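
To illustrate what "adaptive" means here, the following is a minimal
sketch of a lock that spins for a bounded number of attempts and then
falls back to blocking. It is not Impala's SpinLock; the class name
AdaptiveLock, the spin limit, and the helper function are assumptions
made for the example.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Illustrative only: NOT Impala's SpinLock implementation.
class AdaptiveLock {
 public:
  void lock() {
    // Phase 1: spin for a bounded number of attempts, hoping the current
    // holder releases the lock quickly.
    for (int i = 0; i < kMaxSpins; ++i) {
      if (mu_.try_lock()) return;
      std::this_thread::yield();
    }
    // Phase 2: give up spinning and block like an ordinary mutex.
    mu_.lock();
  }

  void unlock() { mu_.unlock(); }

 private:
  static constexpr int kMaxSpins = 100;  // assumed value, for illustration
  std::mutex mu_;
};

// Works with std::lock_guard because it provides lock() and unlock().
int IncrementUnderLock(AdaptiveLock& lock, int value) {
  std::lock_guard<AdaptiveLock> guard(lock);
  return value + 1;
}
```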


http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/8449/8/be/src/statestore/statestore.cc@415
PS8, Line 415: RegisteredSubscriberExists
Seems like this should just be "FindSubscriber()" or 
"FindRegisteredSubscriber()" but okay to leave if you prefer the "exists" name. 
 The "exists" naming makes it a bit surprising that it also returns the pointer.
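
A rough sketch of the lookup shape being discussed, where the function
both reports whether a matching registration exists and hands back the
subscriber. The Registry and Subscriber types, the Add() helper, and
the exact parameter list are assumptions for illustration; only the
name FindSubscriber() comes from the review comment.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Hypothetical, simplified types.
using SubscriberId = std::string;
using RegistrationId = std::uint64_t;

struct Subscriber {
  SubscriberId id;
  RegistrationId registration_id;
};

class Registry {
 public:
  // Adds a subscriber, replacing any earlier registration with the same id.
  void Add(const SubscriberId& id, RegistrationId reg) {
    subscribers_[id] = std::make_shared<Subscriber>(Subscriber{id, reg});
  }

  // Returns true and sets '*subscriber' if a subscriber with this id is
  // registered under exactly this registration id. Otherwise returns false
  // and leaves '*subscriber' untouched.
  bool FindSubscriber(const SubscriberId& id, const RegistrationId& reg,
                      std::shared_ptr<Subscriber>* subscriber) const {
    auto it = subscribers_.find(id);
    if (it == subscribers_.end() || it->second->registration_id != reg) {
      return false;
    }
    *subscriber = it->second;
    return true;
  }

 private:
  std::map<SubscriberId, std::shared_ptr<Subscriber>> subscribers_;
};
```

A "Find" name signals that the caller gets the object back, which an
"Exists" name does not.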



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 8
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 15 Nov 2017 19:12:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-14 Thread Bharath Vissapragada (Code Review)
Hello Sailesh Mukil, Dimitris Tsirogiannis, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8449

to look at the new patch set (#8).

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..

IMPALA-3613: Avoid topic updates to unregistered subscriber instances

Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
---
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
4 files changed, 88 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/8449/8

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 8
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-14 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8449/7/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/7/be/src/statestore/statestore.h@394
PS7, Line 394:   typedef std::pair 
ScheduledSubscriberUpdate;
> I meant flatten both pairs -- i.e. turn ScheduledSubscriberUpdate into a st
Oops, sorry, I got confused because your comment highlighted only the second 
part of the pair. Redid this; it makes more sense to flatten the whole thing.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 7
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 15 Nov 2017 02:27:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-13 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 7:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8449/7/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/7/be/src/statestore/statestore.h@394
PS7, Line 394:   typedef std::pair 
ScheduledSubscriberUpdate;
I meant flatten both pairs -- i.e. turn ScheduledSubscriberUpdate into a struct 
(with three fields).
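
For illustration, flattening a two-level pair into a struct with three
named fields might look like the sketch below. The template arguments
of the original typedef are not visible in this thread, so the element
types and the field names (deadline, id, registration_id) are
assumptions, as is the Flatten() helper.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical element types; the real ones are not shown in this thread.
using SubscriberId = std::string;
using RegistrationId = std::uint64_t;

// Nested-pair form: call sites end up reading 'update.second.first'.
using NestedUpdate =
    std::pair<std::int64_t, std::pair<SubscriberId, RegistrationId>>;

// Flattened form: one struct with three named fields.
struct ScheduledSubscriberUpdate {
  std::int64_t deadline;           // assumed: when the update is due
  SubscriberId id;                 // which subscriber to update
  RegistrationId registration_id;  // which registration it was queued for
};

// Converts the nested representation into the flattened one.
ScheduledSubscriberUpdate Flatten(const NestedUpdate& nested) {
  return ScheduledSubscriberUpdate{nested.first, nested.second.first,
                                   nested.second.second};
}
```

Call sites then read update.deadline or update.registration_id instead
of chains of .first and .second.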



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 7
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 14 Nov 2017 05:40:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-13 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8449/6/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/6/be/src/statestore/statestore.h@383
PS6, Line 383: std::pair
> once we have two level pair, I think it's time to start naming the fields.
Done



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 6
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 14 Nov 2017 00:18:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-13 Thread Bharath Vissapragada (Code Review)
Hello Sailesh Mukil, Dimitris Tsirogiannis, Alex Behm, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8449

to look at the new patch set (#7).

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..

IMPALA-3613: Avoid topic updates to unregistered subscriber instances

Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
---
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
4 files changed, 91 insertions(+), 42 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/8449/7

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 7
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-13 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8449/6/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/6/be/src/statestore/statestore.h@383
PS6, Line 383: std::pair
Once we have a two-level pair, I think it's time to start naming the fields. How 
about defining a struct for this thing instead?



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 6
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 13 Nov 2017 21:13:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-08 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 6: Code-Review+1

Carrying +1 (thanks, Sailesh). Any volunteers for a +2 review? Thanks.


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 6
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Wed, 08 Nov 2017 21:08:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-06 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 5: Code-Review+1


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Tue, 07 Nov 2017 00:29:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-06 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h@385
PS3, Line 385: te;
> I was chatting with Michael and he mentioned that in general we don't recom
Discussed this a little more with Dimitris, leaving it as-is for now. We didn't 
want to extend the usage of shared/weak_ptrs, for readability's sake.


http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc@278
PS3, Line 278:   lock_guard l(subscribers_lock_);
 :   lock_guard t(topic_lock_);
> My general opinion on this is "if it ain't broke, don't fix it". Do we have
Fair point, I'll revert the spinlock change. Maybe we can address it again if 
it really turns out to be a bottleneck.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 06 Nov 2017 23:55:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-06 Thread Bharath Vissapragada (Code Review)
Hello Sailesh Mukil, Dimitris Tsirogiannis, Alex Behm,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8449

to look at the new patch set (#5).

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..

IMPALA-3613: Avoid topic updates to unregistered subscriber instances

Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
---
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
4 files changed, 76 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/8449/5

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 5
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-06 Thread Alex Behm (Code Review)
Alex Behm has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc@278
PS3, Line 278:   lock_guard l(subscribers_lock_);
 :   lock_guard t(topic_lock_);
> IMO, we shouldn't use spinlock for topic_lock_ since we can potentially do
My general opinion on this is "if it ain't broke, don't fix it". Do we have any 
evidence that these locks are a problem, or is this a case of premature 
optimization?

In any case, this is not the main focus of this patch, so let's avoid creeping 
in unrelated changes. I vote for reverting.

If we have evidence that a mutex is not good here, then let's change that in a 
separate patch.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Alex Behm 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 06 Nov 2017 22:05:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-06 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h@385
PS3, Line 385: std::pair
> It will until doSubscriberUpdate is called which will remove the entry. You
I was chatting with Michael and he mentioned that in general we don't recommend 
the use of weak_ptrs. So, if keeping the entries around for some time because 
of shared_ptrs is a concern, you may want to ignore my recommendation.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 06 Nov 2017 21:44:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-06 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h@385
PS3, Line 385: std::pair
> Wouldn't that keep a bunch of unregistered 'Subscriber' objects around due
It will, until doSubscriberUpdate is called, which will remove the entry. You 
can even use a weak_ptr here if keeping these entries for some time is a concern.
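
To make the suggestion concrete, here is a minimal sketch of the pointer-comparison 
idea using a weak_ptr. The names ScheduledUpdate, SubscriberMap and IsCurrent are 
hypothetical and the Subscriber struct is a stand-in, not the real class; the 
actual patch may look quite different.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for the statestore's Subscriber class.
struct Subscriber {
  std::string id;
};

// A queued update that remembers which Subscriber instance it was scheduled
// for. The weak_ptr does not keep an unregistered instance alive.
struct ScheduledUpdate {
  std::string subscriber_id;
  std::weak_ptr<Subscriber> target;
};

using SubscriberMap = std::map<std::string, std::shared_ptr<Subscriber>>;

// Returns true only if 'update' was scheduled for the instance that is
// currently registered under the same subscriber id.
bool IsCurrent(const SubscriberMap& subscribers, const ScheduledUpdate& update) {
  std::shared_ptr<Subscriber> target = update.target.lock();
  if (target == nullptr) return false;  // The old instance is already gone.
  auto it = subscribers.find(update.subscriber_id);
  return it != subscribers.end() && it->second == target;
}
```

With a shared_ptr in place of the weak_ptr the comparison is the same, but the 
queued entry would keep the old Subscriber alive until the entry is removed.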




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 06 Nov 2017 21:28:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-06 Thread Bharath Vissapragada (Code Review)
Hello Sailesh Mukil, Dimitris Tsirogiannis,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8449

to look at the new patch set (#4).

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..

IMPALA-3613: Avoid topic updates to unregistered subscriber instances

Bug:

Without this patch, when a subscriber repeatedly reconnects to the
statestore, the latter queues the initial heartbeat message and a
bunch of topic updates to every instance of the registered subscriber.
These queued updates are eventually picked up by the heartbeating/topic
update threads and the corresponding RPCs are made to the subscribers.
The subscriber then rejects these updates since they were meant for an
earlier registration. This typically happens when the subscriber has
network problems that cause RPCs to fail.

Such a node is eventually marked as bad by the statestore, but depending
on the configuration, the issue can snowball into a DDoS-like situation
in which the entire heartbeat/topic-update thread pool is filled with
instances from the problematic host. The statestore can then miss timely
heartbeats to other subscribers, making them reconnect. This worsens the
situation, and the topic updates resulting from the reconnects will fully
saturate the network on the statestore host until the statestore daemon
is restarted.

Fix:

This patch maps topic updates/heartbeats to a specific registered
subscriber instance rather than to a subscriber id (which stays the same
across reconnects). That way, when we encounter a topic update that was
meant for a stale subscriber instance, we can simply reject it.
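
As an illustration only, the check described above could be sketched as
follows. MiniStatestore, Register and ShouldSendUpdate are hypothetical
names, and a plain integer stands in for the TUniqueId-based registration
id; this is not the code in the patch.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins for the statestore types.
using SubscriberId = std::string;
using RegistrationId = std::uint64_t;

struct Subscriber {
  SubscriberId id;
  RegistrationId registration_id;
};

class MiniStatestore {
 public:
  // (Re-)registers a subscriber. A reconnect keeps the subscriber id but
  // replaces the old instance and gets a fresh registration id.
  RegistrationId Register(const SubscriberId& id) {
    RegistrationId reg = next_registration_id_++;
    subscribers_[id] = std::make_shared<Subscriber>(Subscriber{id, reg});
    return reg;
  }

  // Returns true only if the queued update targets the instance that is
  // currently registered; updates for stale instances are rejected.
  bool ShouldSendUpdate(
      const std::pair<SubscriberId, RegistrationId>& update) const {
    auto it = subscribers_.find(update.first);
    if (it == subscribers_.end()) return false;
    return it->second->registration_id == update.second;
  }

 private:
  std::map<SubscriberId, std::shared_ptr<Subscriber>> subscribers_;
  RegistrationId next_registration_id_ = 1;
};
```

An update queued for the first registration of a subscriber is dropped once
the same subscriber id registers again, so no RPC is made for it.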

Testing:

Tested this locally by adding relevant logging. I made the subscribers
reconnect aggressively (a) and delayed heartbeats from the statestore
side (b, c).

(a) --statestore_subscriber_timeout_seconds=1
(b) --statestore_max_missed_heartbeats=1000
(c) --statestore_heartbeat_frequency_ms=6

Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
---
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
4 files changed, 81 insertions(+), 47 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/8449/4

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 4
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-06 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h@385
PS3, Line 385: std::pair
> I think the code would be much simpler if you stored a pointer (probably a
Wouldn't that keep a bunch of unregistered 'Subscriber' objects around due to 
shared_ptr references? I agree the code might be simpler though.


http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc@278
PS3, Line 278:   lock_guard l(subscribers_lock_);
 :   lock_guard t(topic_lock_);
> I just noticed this. Getting a SpinLock before getting a mutex is an anti-p
IMO, we shouldn't use a spinlock for topic_lock_ since we can potentially do 
some heavy work in GatherTopicUpdates(). If this is an anti-pattern, I'm OK 
with reverting the change to a mutex. Maybe we can ask for others' opinions on it?

Dimitris/Dan/Alex do you have any opinion on this?


http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc@415
PS3, Line 415: const TUniqueId&
> const RegistrationId&
Changed at other places too.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 06 Nov 2017 20:38:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-06 Thread Dimitris Tsirogiannis (Code Review)
Dimitris Tsirogiannis has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.h@385
PS3, Line 385: std::pair
I think the code would be much simpler if you stored a pointer (probably a 
shared_ptr is needed) to the Subscriber here and simply compared it to the 
registered Subscriber.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Dimitris Tsirogiannis 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Mon, 06 Nov 2017 20:08:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-03 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@383
PS2, Line 383: d subscriber
> >  Where ever the 'SubscriberId' is
Thanks for the explanation.

Yea my point was if we're going to have a unique RegistrationId anyway, why 
have a SubscriberId. It seemed redundant.

But as you pointed out, it looks like the subscriber chooses the subscriber_id 
and not the statestore. So, it would be hard to enforce this.

Let's leave this for now.


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.cc@414
PS2, Line 414: onst SubscriberId& subscriber_id,
 : const TUniqueId& registration_id
> Not sure I understand. We get the subscriber/registration_id from the Sched
Nvm, my bad, I thought both were coming from the Subscriber object. Ignore this.


http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc@278
PS3, Line 278:   lock_guard l(subscribers_lock_);
 :   lock_guard t(topic_lock_);
I just noticed this. Getting a SpinLock before getting a mutex is an 
anti-pattern.

Even attempting to get a spinlock while already holding a spinlock is not 
exactly a great idea. However, our SpinLock implementation sleeps after a few 
cycles of trying to obtain the lock anyway.

Do we know if we do a lot of work holding the topic_lock_? If not, let's change 
this to a SpinLock too. (The GatherTopicUpdates() holds topic_lock_ and 
iterates through a nested loop, but I'm not sure how many iterations that would 
be in the worst case).

If it looks like we will end up doing a lot of work holding the lock, we can be 
safe and just turn the 'subscribers_lock_' back to a mutex.
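
For readers unfamiliar with the pattern, below is a rough sketch of a spinlock 
that backs off after a bounded number of spins. It is not Impala's SpinLock; 
BackoffSpinLock, kMaxSpins and IncrementUnderLock are made-up names, and the 
sketch yields the CPU rather than sleeping.

```cpp
#include <atomic>
#include <mutex>
#include <thread>

// Illustrative spinlock: busy-waits briefly, then yields so that a long
// critical section in another thread does not burn a whole core.
class BackoffSpinLock {
 public:
  void lock() {
    int spins = 0;
    while (flag_.test_and_set(std::memory_order_acquire)) {
      if (++spins >= kMaxSpins) {
        std::this_thread::yield();  // Back off instead of spinning forever.
        spins = 0;
      }
    }
  }

  bool try_lock() { return !flag_.test_and_set(std::memory_order_acquire); }

  void unlock() { flag_.clear(std::memory_order_release); }

 private:
  static constexpr int kMaxSpins = 64;  // Arbitrary threshold for the sketch.
  std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Spinlocks suit short critical sections like this one; heavy work (such as
// iterating nested loops) is usually better protected by a mutex.
int IncrementUnderLock(BackoffSpinLock& lock, int& counter) {
  std::lock_guard<BackoffSpinLock> guard(lock);
  return ++counter;
}
```

Because the class provides lock(), try_lock() and unlock(), it works with 
std::lock_guard in the same way a mutex does.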


http://gerrit.cloudera.org:8080/#/c/8449/3/be/src/statestore/statestore.cc@415
PS3, Line 415: const TUniqueId&
const RegistrationId&




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Sat, 04 Nov 2017 00:18:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-03 Thread Bharath Vissapragada (Code Review)
Hello Sailesh Mukil,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/8449

to look at the new patch set (#3).

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..

IMPALA-3613: Avoid topic updates to unregistered subscriber instances

Bug:

Without this patch, when a subscriber repeatedly reconnects to the
statestore, the latter queues the initial heartbeat message and a
bunch of topic updates to every instance of the registered subscriber.
These queued updates are eventually picked up by the heartbeating/topic
update threads and the corresponding RPCs are made to the subscribers.
The subscriber then rejects these updates since they were meant for an
earlier registration. This typically happens when the subscriber has
network problems that cause RPCs to fail.

Such a node is eventually marked as bad by the statestore, but depending
on the configuration, the issue can snowball into a DDoS-like situation
in which the entire heartbeat/topic-update thread pool is filled with
instances from the problematic host. The statestore can then miss timely
heartbeats to other subscribers, making them reconnect. This worsens the
situation, and the topic updates resulting from the reconnects will fully
saturate the network on the statestore host until the statestore daemon
is restarted.

Fix:

This patch maps topic updates/heartbeats to a specific registered
subscriber instance rather than to a subscriber id (which stays the same
across reconnects). That way, when we encounter a topic update that was
meant for a stale subscriber instance, we can simply reject it.

Testing:

Tested this locally by adding relevant logging. I made the subscribers
reconnect aggressively (a) and delayed heartbeats from the statestore
side (b, c).

(a) --statestore_subscriber_timeout_seconds=1
(b) --statestore_max_missed_heartbeats=1000
(c) --statestore_heartbeat_frequency_ms=6

Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
---
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
4 files changed, 78 insertions(+), 44 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/8449/3

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 3
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-03 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore-subscriber.h
File be/src/statestore/statestore-subscriber.h:

http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore-subscriber.h@165
PS2, Line 165:   typedef TUniqueId RegistrationId;
> You can move this typedef to statestore.h and use the same type in statesto
Done


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@383
PS2, Line 383: SubscriberId
>  Where ever the 'SubscriberId' is
 >  required, we just get it from the Subscriber object anyway, and
 >  that object can be retrieved using the unique registration id.

Not sure I follow. If you see OfferUpdate()/DoSubscriberUpdate(), we only keep 
track of ScheduledSubscriberUpdate objects and get the corresponding Subscriber 
based on ScheduledSubscriberUpdate.subscriber_Id. So we don't have a Subscriber 
object handy to get the subscriber_id in all cases.

Also, if we remove SubscriberId everywhere, we need to change the subscribers_ 
map structure to map from RegistrationId -> Subscriber objects. Then we can't 
look up by subscriber_id, which is required in RegisterSubscriber() (at that 
point, the new instance has not been assigned a RegistrationId yet):

SubscriberMap::iterator subscriber_it = subscribers_.find(subscriber_id);
if (subscriber_it != subscribers_.end()) {
  UnregisterSubscriber(subscriber_it->second.get());
}

We can still figure out a way, but it seemed unnecessarily complex to me. 
Thoughts?


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@481
PS2, Line 481: subscriber exists
> Could you clarify what "exists" means here exactly? It could be confused wi
Done


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@482
PS2, Line 482: registration_ids
> nit: registration_id
Done


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@484
PS2, Line 484: std::shared_ptr* subscriber
> Add a comment about what is returned in this out parameter.
clarified


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.cc@414
PS2, Line 414: onst SubscriberId& subscriber_id,
 : const TUniqueId& registration_id
> It looks like it just makes sense to pass 'const Subscriber&' here? Is ther
Not sure I understand. We get the subscriber/registration_id from the 
ScheduledSubscriberUpdate object and not the Subscriber object. Do you mean we 
should pass ScheduledSubscriberUpdate directly instead? If that's the case, 
the signature seems kinda weird :)


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.cc@634
PS2, Line 634:   if (!RegisteredSubscriberExists(subscriber_to_update.first, 
subscriber_to_update.second,
> I'm a little worried that we're contending for a mutex two more times in th
Good point. I too think (theoretically) a spinlock is probably a better choice 
to avoid context switching. Also, ~1000 entries seems like a reasonable 
estimate for the foreseeable future :). Also, I think for string-based hashing 
of ~1000 entries, it is reasonable to assume an O(1) average-case lookup (even 
though the worst case is O(N)).

I created a microbenchmark to see if the lock type makes a difference. In the 
benchmark, I measured how long it takes to get 100 heartbeats for a given 
subscriber (with heartbeating thread pool sizes of 10/100).
I didn't see any noticeable difference for 100 subscribers, but beyond that, 
the test runs into flaky socket connection issues. I admit that this is not 
representative of the real-world use case because in real clusters, the 
statestore CPU would be much busier and the context switching could be more 
expensive.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Fri, 03 Nov 2017 06:22:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-01 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/8449 )

Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..


Patch Set 2:

(7 comments)

Thanks for doing this patch. This will help reduce a lot of unnecessary network 
traffic.

http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore-subscriber.h
File be/src/statestore/statestore-subscriber.h:

http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore-subscriber.h@165
PS2, Line 165:   typedef TUniqueId RegistrationId;
You can move this typedef to statestore.h and use the same type in 
statestore.h/cc too.


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h
File be/src/statestore/statestore.h:

http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@383
PS2, Line 383: SubscriberId
Do we even need to store the SubscriberId here? Can't we just store a unique 
registration ID? Wherever the 'SubscriberId' is required, we just get it from 
the Subscriber object anyway, and that object can be retrieved using the unique 
registration id.


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@481
PS2, Line 481: subscriber exists
Could you clarify what "exists" means here exactly? It could be confused with a 
node just existing as a part of the cluster. I think we want to say that it 
exists in the subscribers_ map.


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@482
PS2, Line 482: registration_ids
nit: registration_id


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.h@484
PS2, Line 484: std::shared_ptr* subscriber
Add a comment about what is returned in this out parameter.


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.cc
File be/src/statestore/statestore.cc:

http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.cc@414
PS2, Line 414: onst SubscriberId& subscriber_id,
 : const TUniqueId& registration_id
It looks like it just makes sense to pass 'const Subscriber&' here? Is there a 
case where we would not get a subscriber_id and a registration_id from the same 
Subscriber object while calling this function?


http://gerrit.cloudera.org:8080/#/c/8449/2/be/src/statestore/statestore.cc@634
PS2, Line 634:   if (!RegisteredSubscriberExists(subscriber_to_update.first, 
subscriber_to_update.second,
I'm a little worried that we're contending for a mutex two more times in this 
function. Do you anticipate any performance regression due to increased context 
switching?

Consider using a spin lock if we won't have more than ~1000 entries in the map 
at one time. (unordered_map has a worst-case O(N) time complexity)
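
To illustrate the worst-case remark, here is a small sketch (not taken from the 
patch; ConstantHash and LargestBucket are made-up names) showing that a 
degenerate hash puts every key of a std::unordered_map into one bucket, which 
makes a lookup linear in the number of entries, whereas the default string hash 
spreads keys across buckets.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Deliberately bad hash: every key collides.
struct ConstantHash {
  std::size_t operator()(const std::string&) const { return 0; }
};

// Inserts n string keys and returns the size of the fullest bucket, i.e. the
// number of entries a worst-case lookup may have to scan.
template <typename Hash>
std::size_t LargestBucket(std::size_t n) {
  std::unordered_map<std::string, int, Hash> m;
  for (std::size_t i = 0; i < n; ++i) {
    m.emplace("subscriber-" + std::to_string(i), static_cast<int>(i));
  }
  std::size_t largest = 0;
  for (std::size_t b = 0; b < m.bucket_count(); ++b) {
    largest = std::max(largest, m.bucket_size(b));
  }
  return largest;
}
```

With ConstantHash and 1000 keys the fullest bucket holds all 1000 entries; with 
std::hash<std::string> it holds far fewer, which is why average-case lookups 
are treated as O(1).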




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Comment-Date: Thu, 02 Nov 2017 03:19:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3613: Avoid topic updates to unregistered subscriber instances

2017-11-01 Thread Bharath Vissapragada (Code Review)
Bharath Vissapragada has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/8449 )


Change subject: IMPALA-3613: Avoid topic updates to unregistered subscriber 
instances
..

IMPALA-3613: Avoid topic updates to unregistered subscriber instances

Bug:

Without this patch, when a subscriber repeatedly reconnects to the
statestore, the latter queues the initial heartbeat message and a
bunch of topic updates to every instance of the registered subscriber.
These queued updates are eventually picked up by the heartbeating/topic
update threads and the corresponding RPCs are made to the subscribers.
The subscriber then rejects these updates since they were meant for an
earlier registration. This typically happens when the subscriber has
network problems that cause RPCs to fail.

Such a node is eventually marked as bad by the statestore, but depending
on the configuration, the issue can snowball into a DDoS-like situation
in which the entire heartbeat/topic-update thread pool is filled with
instances from the problematic host. The statestore can then miss timely
heartbeats to other subscribers, making them reconnect. This worsens the
situation, and the topic updates resulting from the reconnects will fully
saturate the network on the statestore host until the statestore daemon
is restarted.

Fix:

This patch maps topic updates/heartbeats to a specific registered
subscriber instance rather than to a subscriber id (which stays the same
across reconnects). That way, when we encounter a topic update that was
meant for a stale subscriber instance, we can simply reject it.

Testing:

Tested this locally by adding relevant logging. I made the subscribers
reconnect aggressively (a) and delayed heartbeats from the statestore
side (b, c).

(a) --statestore_subscriber_timeout_seconds=1
(b) --statestore_max_missed_heartbeats=1000
(c) --statestore_heartbeat_frequency_ms=6

Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
---
M be/src/statestore/statestore-subscriber.cc
M be/src/statestore/statestore-subscriber.h
M be/src/statestore/statestore.cc
M be/src/statestore/statestore.h
4 files changed, 65 insertions(+), 33 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/8449/2

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0329ae7d23dc6e9b04b7bc3ee8d89cbc73756f65
Gerrit-Change-Number: 8449
Gerrit-PatchSet: 2
Gerrit-Owner: Bharath Vissapragada