Thomas Tauber-Marshall has posted comments on this change. ( http://gerrit.cloudera.org:8080/15666 )
Change subject: IMPALA-5746: Test case for remote fragments releasing memory
......................................................................

Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15666/1/tests/custom_cluster/test_restart_services.py
File tests/custom_cluster/test_restart_services.py:

http://gerrit.cloudera.org:8080/#/c/15666/1/tests/custom_cluster/test_restart_services.py@240
PS1, Line 240: status_report_max_retry_s
So it's true that IMPALA-2990 ensures that fragments will eventually get cancelled after the coordinator goes down, but the default value for this flag is 10 minutes, which is a long time to be holding on to resources after a coordinator failure.

Of course, if the fragment has finished executing, then all we're doing for those 10 minutes is retrying the final status report, which shouldn't require holding on to many resources. I'm not sure, though, whether we're smart enough to release everything that's not needed before the final status report is done. It would be good to figure that out. If we are smart enough, then maybe add a test where we leave this at the default 10 minutes and check that the resources are released quickly anyway, even if the fragments stick around. If we're not smart enough, then that would be a good thing to fix.

FWIW, I was just working on a case yesterday where we think this was an issue: a coordinator went down and clients were able to fail over to another one, but the autoscaler was triggered, presumably because it detected a spike in load due to the old queries not getting cancelled quickly.
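To make the suggested check concrete, here is a rough sketch of the polling such a test might do, assuming the test can read an executor's reserved memory through some callable. The helper name wait_until_released and the get_reserved_bytes callable are hypothetical placeholders, not existing Impala test utilities; the real test would need to wire in whatever metric accessor the test framework provides.

```python
import time


def wait_until_released(get_reserved_bytes, threshold_bytes=0, timeout_s=60.0,
                        interval_s=0.5, clock=time.monotonic, sleep=time.sleep):
    """Poll get_reserved_bytes() until it reports <= threshold_bytes.

    get_reserved_bytes is a hypothetical callable standing in for whatever
    the test uses to read an executor's reserved memory.
    Returns the elapsed seconds; raises AssertionError on timeout.
    """
    start = clock()
    while True:
        reserved = get_reserved_bytes()
        elapsed = clock() - start
        if reserved <= threshold_bytes:
            return elapsed
        if elapsed >= timeout_s:
            raise AssertionError(
                "still holding %d bytes after %.1fs" % (reserved, elapsed))
        sleep(interval_s)
```

The idea would be to leave status_report_max_retry_s at its default, kill the coordinator, and then call this with a timeout far below 10 minutes, so the test fails if resources are only freed once the fragment itself is finally cancelled.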
-- 
To view, visit http://gerrit.cloudera.org:8080/15666
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If9fe8309f80f797d205b756ba58219f595aba4e5
Gerrit-Change-Number: 15666
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Comment-Date: Tue, 07 Apr 2020 17:20:07 +0000
Gerrit-HasComments: Yes
