Hi Bikas, no problem.
Thanks for the clear answer.

Regards

Fabio

On Tue, Mar 3, 2015 at 8:21 PM, Bikas Saha <[email protected]> wrote:

> Apologies for the delayed response.
>
> IIRC, the scheduler in the AM does not release containers upon AM exit.
> Perhaps we should do that to make the hand-off to the RM explicit.
>
> The ContainerLauncher, though, does try to send a stop request to the NM
> for the containers it has launched.
>
> You are right, the RM will garbage collect all containers from an
> application after the AM for that application has finished. This would
> happen in the next heartbeat with the NMs after the RM has figured out that
> the application is done.
>
> Bikas
>
> *From:* Fabio C. [mailto:[email protected]]
> *Sent:* Friday, February 20, 2015 2:10 AM
> *To:* [email protected]
> *Subject:* Container release time at end of AM
>
> Hi guys,
> I was measuring the time it takes for a delayed container (one kept for
> container reuse) to be released when the Tez application master shuts
> down at the end of its life.
> I ran the same Hive-on-Tez query 100 times, and as you can see in the
> attached plot there is something strange:
> - most of the containers (around 80%) are released in almost exactly one
> second
> - a few containers are released in a time that ranges from a few
> milliseconds up to approximately the AM-RM heartbeat interval
> (suggesting that the AM is the one telling the RM that the container
> has ended).
> The NM-RM heartbeat interval is 1s, and I take the release interval to be
> the time between the "Sending a stop request to the NM for ContainerId"
> log entry (AM side) and the queue update (RM side).
> I could manually check only a few logs, but it seems the second case
> happens when the container actually manages to stop before the AM exits,
> while if the AM dies we fall into the first case.
> I suspect that if the AM is dead, the RM waits for the NM heartbeat
> before considering the resources available. Even so, what I would expect
> in this case is a uniform distribution between delta and 1s+delta
> (with delta equal to a few ms).
> What is really happening here in your opinion? How can the variance of the
> first case be so small?
>
> Thanks
>
> Fabio
>
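
The expectation stated in the original question (release latency spread
uniformly between delta and 1s+delta) can be illustrated with a small
simulation. This is only a sketch under stated assumptions, not a model of
the actual YARN or Tez code paths: it assumes the RM learns about a stopped
container only at the next NM heartbeat, that heartbeats fire at exact
multiples of the interval, and that stop requests arrive at times unrelated
to the heartbeat phase. The names release_latency and simulate, and the 5 ms
value for delta, are hypothetical.

```python
import random


def release_latency(stop_time, heartbeat_interval=1.0, delta=0.005):
    """Time from the stop request until the RM notices the release.

    Assumption: the RM only finds out at the next NM heartbeat, and
    heartbeats fire at exact multiples of heartbeat_interval.
    delta is a hypothetical fixed processing overhead in seconds.
    """
    next_beat = (stop_time // heartbeat_interval + 1) * heartbeat_interval
    return (next_beat - stop_time) + delta


def simulate(n=100_000, heartbeat_interval=1.0, delta=0.005, seed=0):
    """Draw stop times with no relation to the heartbeat phase."""
    rng = random.Random(seed)
    return [
        release_latency(rng.uniform(0.0, 1000.0), heartbeat_interval, delta)
        for _ in range(n)
    ]


latencies = simulate()
mean_latency = sum(latencies) / len(latencies)
print(f"min={min(latencies):.3f}s max={max(latencies):.3f}s "
      f"mean={mean_latency:.3f}s")
```

Under these assumptions the latencies fill the whole range from delta to
1s+delta with a mean near 0.5s+delta. A tight cluster at almost exactly one
second, as described in the question, is not what this simple model
produces.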
