Hi Bikas, no problem. Thanks for the clear answer.

Regards
Fabio

On Tue, Mar 3, 2015 at 8:21 PM, Bikas Saha <[email protected]> wrote:

> Apologies for the delayed response.
>
> IIRC, the scheduler in the AM does not release containers upon AM exit.
> Perhaps we should do that to make the hand-off to the RM explicit.
> The ContainerLauncher, though, does try to send a stop request to the NM
> for the containers it has launched.
>
> You are right, the RM will garbage collect all containers from an
> application after the AM for that application has finished. This would
> happen in the next heartbeat with the NMs after the RM has figured out that
> the application is done.
>
> Bikas
>
> *From:* Fabio C. [mailto:[email protected]]
> *Sent:* Friday, February 20, 2015 2:10 AM
> *To:* [email protected]
> *Subject:* Container release time at end of AM
>
> Hi guys,
> I was measuring the time it takes for a delayed container (kept for
> container reuse) to be released when the Tez application master is about to
> shut down at the end of its life.
> I ran the same Hive-on-Tez query 100 times, and as you can see in the
> attached plot there is something strange:
> - most of the containers (around 80%) are released in almost exactly one
> second
> - a few containers are released in a time that spans from a few
> milliseconds to approximately the AM-RM heartbeat interval
> (suggesting that the AM is the one telling the RM about the end of the
> container).
> The NM-RM heartbeat time is 1s and I consider the release interval to be
> between the "Sending a stop request to the NM for ContainerId" log entry
> (AM side) and the queue update (RM side).
> I could manually check just a few logs, but it seems the second case
> happens when the container is actually able to stop before the end of the
> AM, while if the AM dies we fall in the first case.
> I suspect that if the AM is dead, the RM will wait for the NM
> heartbeat to consider the resources available. What I would expect
> in this case, though, is a uniform distribution between delta and 1s+delta
> (with delta equal to a few ms).
> What is really happening here in your opinion? How can the variance of the
> first case be so small?
>
> Thanks
>
> Fabio
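
The expectation stated in the original question (release times spread uniformly between delta and 1s+delta) can be made concrete with a small simulation. This is only a sketch of that hypothesis, not a model of actual YARN or Tez internals: it assumes the stop request lands at a uniformly random point within the NM-RM heartbeat interval and that the resources are freed at the next heartbeat plus a fixed processing delay. The function name and parameters are hypothetical.

```python
import random


def simulate_release_delays(n, heartbeat_s=1.0, delta_s=0.005, seed=0):
    """Simulate n release delays under the 'wait for next NM heartbeat' hypothesis.

    Assumption (not taken from YARN/Tez code): the stop request arrives at a
    uniformly random phase of the heartbeat interval, so the wait until the
    next heartbeat is uniform on [0, heartbeat_s]; delta_s is a fixed
    processing overhead of a few milliseconds.
    """
    rng = random.Random(seed)
    delays = []
    for _ in range(n):
        wait_for_heartbeat = rng.uniform(0.0, heartbeat_s)
        delays.append(wait_for_heartbeat + delta_s)
    return delays


if __name__ == "__main__":
    delays = simulate_release_delays(100)
    print("min  = %.3f s" % min(delays))
    print("max  = %.3f s" % max(delays))
    print("mean = %.3f s" % (sum(delays) / len(delays)))
```

Under this hypothesis the delays would be spread roughly evenly over the whole interval, which is what makes the observed cluster at almost exactly one second surprising.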
