Henry Robinson has uploaded a new change for review. http://gerrit.cloudera.org:8080/7036
Change subject: IMPALA-5377: Impala may crash if given a fragment instance while restarting
......................................................................

IMPALA-5377: Impala may crash if given a fragment instance while restarting

If an Impala daemon restarts quickly enough so that the statestore does
not detect a failure, coordinators will still issue fragment instances
to that daemon. There's a race in that situation, where the BE service
port was opened before ExecEnv::StartServices() had initialized the
process-wide memtracker; trying to run the finstance would then crash
the process.

This patch inverts the order of ExecEnv::StartServices() and starting
the backend and client servers.

Testing: reproduced by manually injecting sleeps after
be_server->Start() and running a query on the cluster. After the patch,
sleeping *before or after* be_server->Start() did not trigger a crash,
and the query failed as expected if it was executed in that narrow
window.

No automated testing added: this is a hard bug to hit without sleeps
and would need some repeated kill-and-restart loops to have a
reasonable chance of triggering.

Change-Id: I1d53b36304cb86c43e110e10cf76a39976ae3bd5
---
M be/src/service/impalad-main.cc
1 file changed, 8 insertions(+), 7 deletions(-)

git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/7036/1
--
To view, visit http://gerrit.cloudera.org:8080/7036
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I1d53b36304cb86c43e110e10cf76a39976ae3bd5
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <[email protected]>
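
Illustrative note on the ordering issue described in the commit message: the sketch below is a simplified, single-threaded model of the race, not the code in impalad-main.cc. Every type and function in it (MemTrackerSketch, ExecEnvSketch, BackendServerSketch, RequestBetweenSteps, RequestAfterStartup) is a hypothetical stand-in invented for illustration. It assumes only what the message states: the backend port used to open before ExecEnv::StartServices() had initialized the process-wide memtracker, and the patch swaps the two steps.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the process-wide memory tracker.
struct MemTrackerSketch {
  long long consumption = 0;
};

// Hypothetical stand-in for ExecEnv: owns state fragment instances rely on.
class ExecEnvSketch {
 public:
  // Plays the role of ExecEnv::StartServices(): initializes the tracker.
  void StartServices() { mem_tracker_.reset(new MemTrackerSketch()); }
  MemTrackerSketch* process_mem_tracker() const { return mem_tracker_.get(); }

 private:
  std::unique_ptr<MemTrackerSketch> mem_tracker_;
};

enum class ExecStatus { kOk, kRefused, kWouldCrash };

// Hypothetical stand-in for the backend (BE) service.
class BackendServerSketch {
 public:
  explicit BackendServerSketch(ExecEnvSketch* env) : env_(env) {}

  // Plays the role of be_server->Start(): the port starts accepting requests.
  void Start() { accepting_ = true; }

  // Models a coordinator sending a fragment instance to this daemon.
  ExecStatus ExecFragmentInstance() const {
    // Port not open yet: the request fails cleanly on the coordinator side.
    if (!accepting_) return ExecStatus::kRefused;
    // Port open but tracker missing: the real process would dereference
    // uninitialized state here; the sketch reports it instead of crashing.
    if (env_->process_mem_tracker() == nullptr) return ExecStatus::kWouldCrash;
    return ExecStatus::kOk;
  }

 private:
  ExecEnvSketch* env_;
  bool accepting_ = false;
};

enum class StartupOrder { kServerFirst, kServicesFirst };

// Returns the outcome of a request that arrives in the window between the
// two startup steps, for the given ordering.
ExecStatus RequestBetweenSteps(StartupOrder order) {
  ExecEnvSketch env;
  BackendServerSketch be_server(&env);
  if (order == StartupOrder::kServerFirst) {
    be_server.Start();  // pre-patch order: port opens first
    ExecStatus status = be_server.ExecFragmentInstance();
    env.StartServices();
    return status;
  }
  env.StartServices();  // post-patch order: state is initialized first
  ExecStatus status = be_server.ExecFragmentInstance();
  be_server.Start();
  return status;
}

// Returns the outcome of a request that arrives after both steps finished.
ExecStatus RequestAfterStartup(StartupOrder order) {
  ExecEnvSketch env;
  BackendServerSketch be_server(&env);
  if (order == StartupOrder::kServerFirst) {
    be_server.Start();
    env.StartServices();
  } else {
    env.StartServices();
    be_server.Start();
  }
  return be_server.ExecFragmentInstance();
}
```

In this model, RequestBetweenSteps(StartupOrder::kServerFirst) lands in the crash window, while the same request under StartupOrder::kServicesFirst is refused, which corresponds to the query failing as expected in the testing notes. Once startup completes, both orderings behave the same.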
