Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/12647 )
Change subject: [TS heartbeater] avoid reconnecting to master too often ...................................................................... Patch Set 1: (6 comments) http://gerrit.cloudera.org:8080/#/c/12647/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12647/1//COMMIT_MSG@9 PS1, Line 9: don't > "won't" or "doesn't" Done http://gerrit.cloudera.org:8080/#/c/12647/1/src/kudu/tserver/heartbeater.cc File src/kudu/tserver/heartbeater.cc: http://gerrit.cloudera.org:8080/#/c/12647/1/src/kudu/tserver/heartbeater.cc@592 PS1, Line 592: Pretty arbitrary number to determine > Can you reword this? "Pretty arbitrary" makes this sound like not a whole l Done http://gerrit.cloudera.org:8080/#/c/12647/1/src/kudu/tserver/heartbeater.cc@597 PS1, Line 597: (heartbeat_rpc_timeout / heartbeat_interval) : // time interval > This isn't a timeout-- it's a unitless quantity, the number of heartbeats e yep, that should have been heartbeat_rpc_timeout; sorry for the mess. http://gerrit.cloudera.org:8080/#/c/12647/1/src/kudu/tserver/heartbeater.cc@600 PS1, Line 600: num_consecutive_failures_proxy_reset > Would be nice to describe what this variable is. Something like "the period Done http://gerrit.cloudera.org:8080/#/c/12647/1/src/kudu/tserver/heartbeater.cc@607 PS1, Line 607: s.IsNetworkError() > How did you test this? I thought I saw cases where this actually was a netw I'm not sure what you want me to test, but I added the corresponding comment into the commit message. In my scenario I clearly saw it was not a network error, but ServiceUnavailable each heartbeat while master was bootstrapping. However, the proxy was reset each hearbeat after FLAGS_heartbeat_max_failures_before_backoff ServiceUnavailable errors in a row. And that's exactly what I'm addressing here -- we don't want to reset the proxy and incur connection negotiation latency plus 'Ping' RPC after each heartbeat in such a case. http://gerrit.cloudera.org:8080/#/c/12647/1/src/kudu/tserver/heartbeater.cc@607 PS1, Line 607: consecutive_failed_heartbeats_ % : num_consecutive_failures_proxy_reset == 0 > Could you split the line so each condition goes on a separate line? Done -- To view, visit http://gerrit.cloudera.org:8080/12647 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I961ae453ffd6ce343574ce58cb0e13fdad218078 Gerrit-Change-Number: 12647 Gerrit-PatchSet: 1 Gerrit-Owner: Alexey Serbin <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Andrew Wong <[email protected]> Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Will Berkeley <[email protected]> Gerrit-Comment-Date: Thu, 07 Mar 2019 22:29:59 +0000 Gerrit-HasComments: Yes
