Jean-Daniel Cryans has submitted this change and it was merged.

Change subject: [java client] RPCs can get lost in a TabletClient race
......................................................................

[java client] RPCs can get lost in a TabletClient race

We saw hangs after running ITBLL for hours. Turns out that the recent fixes in
TabletClient introduced a new race condition.

rpcs_inflight is being cleaned in cleanup() by copying all the elements from it
and then calling clear(). Even though this is done under a lock, that lock isn't
protecting rpcs_inflight so it's possible to clear() rpcs that were not copied
out.

I haven't been able to recreate this race in unit tests, but it fixed ITBLL.

Change-Id: Iaff89eb832d0d6f0dede198661856fae1a8585a0
Reviewed-on: http://gerrit.cloudera.org:8080/3541
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <a...@cloudera.com>
---
M java/kudu-client/src/main/java/org/kududb/client/TabletClient.java
1 file changed, 20 insertions(+), 11 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/3541
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Iaff89eb832d0d6f0dede198661856fae1a8585a0
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <d...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jdcry...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
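
For readers who want to see the race from the commit message in isolation, here is a minimal sketch. It is not the actual patch to TabletClient.java. The class name, method names, and the `concurrentSend` hook are hypothetical, and it assumes the in-flight RPCs live in a `ConcurrentHashMap` keyed by RPC id. The hook stands in for another thread registering an RPC at the worst possible moment, so the lost entry shows up deterministically instead of only after hours of ITBLL.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only; all names here are hypothetical.
public class InflightDrainSketch {

  // Racy pattern described in the commit message: copy everything, then clear().
  // An entry added between the two steps is cleared without ever being copied.
  static <V> List<V> drainCopyThenClear(ConcurrentHashMap<Integer, V> inflight,
                                        Runnable concurrentSend) {
    List<V> copied = new ArrayList<>(inflight.values());
    concurrentSend.run();  // simulates another thread adding an RPC right here
    inflight.clear();      // wipes the late RPC too
    return copied;
  }

  // One possible alternative: remove entries one at a time. remove() is atomic
  // per key, so each RPC is either handed to the caller or still in the map.
  static <V> List<V> drainByRemoval(ConcurrentHashMap<Integer, V> inflight,
                                    Runnable concurrentSend) {
    List<V> drained = new ArrayList<>();
    for (Integer id : inflight.keySet()) {  // weakly consistent iteration
      V rpc = inflight.remove(id);
      if (rpc != null) {
        drained.add(rpc);
      }
    }
    concurrentSend.run();  // a late RPC stays visible in the map
    return drained;
  }

  public static void main(String[] args) {
    ConcurrentHashMap<Integer, String> racy = new ConcurrentHashMap<>();
    racy.put(1, "rpc-1");
    racy.put(2, "rpc-2");
    List<String> copied = drainCopyThenClear(racy, () -> racy.put(3, "rpc-3"));
    System.out.println("copy-then-clear: drained=" + copied.size()
        + " left in map=" + racy.size());

    ConcurrentHashMap<Integer, String> safe = new ConcurrentHashMap<>();
    safe.put(1, "rpc-1");
    safe.put(2, "rpc-2");
    List<String> drained = drainByRemoval(safe, () -> safe.put(3, "rpc-3"));
    System.out.println("drain-by-removal: drained=" + drained.size()
        + " left in map=" + safe.size());
  }
}
```

In the first call, three RPCs were registered but only two are handed back and none remain in the map, so the third can never be completed or failed, which is the kind of loss that would show up as a hang. In the second call, every RPC is accounted for: it is either in the returned list or still in the map for a later pass. Holding one lock around both the insert path and the drain would be another way to close the window.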