Re: Spark hangs on bad Mesos slave

Gary Malouf Wed, 11 Dec 2013 05:38:34 -0800

As an addendum, I see a large number of the following warnings in the Mesos
slave INFO logs:


W1211 05:44:37.057456 14205 monitor.cpp:186] Failed to collect resource
usage for executor '201312061449-1315739402-5050-23513-0' of framework
'201312061449-1315739402-5050-23513-0026': Future discarded

W1211 05:44:42.057998 14207 monitor.cpp:186] Failed to collect resource
usage for executor '201312061449-1315739402-5050-23513-0' of framework
'201312061449-1315739402-5050-23513-0026': Future discarded
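
To get a rough count of these warnings per executor, something like the
following Python sketch could be used. It assumes each warning sits on a
single line in the slave log (the entries above are wrapped by the mail
client) and that the prefix follows the usual glog layout of severity letter,
date, time, thread id and file:line. The function name
count_collection_failures is purely illustrative and not part of Mesos.

```python
import re
from collections import Counter

# Assumed glog-style layout: "Wmmdd hh:mm:ss.uuuuuu <thread> <file>:<line>] <message>"
WARNING_RE = re.compile(
    r"^W\d{4} \d{2}:\d{2}:\d{2}\.\d+\s+\d+ \S+:\d+\] "
    r"Failed to collect resource usage for executor '(?P<executor>[^']+)' "
    r"of framework '(?P<framework>[^']+)': (?P<reason>.*)$"
)


def count_collection_failures(lines):
    """Count 'Failed to collect resource usage' warnings.

    Returns a Counter keyed by (executor id, framework id, reason).
    Lines that do not match the assumed layout are ignored.
    """
    counts = Counter()
    for line in lines:
        match = WARNING_RE.match(line.strip())
        if match:
            key = (
                match.group("executor"),
                match.group("framework"),
                match.group("reason"),
            )
            counts[key] += 1
    return counts
```

Passing an open log file to count_collection_failures and printing
counts.most_common() would show which executor and framework pairs produce
the most warnings on a given slave.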




On Tue, Dec 10, 2013 at 6:27 PM, Gary Malouf <[email protected]> wrote:

> Hi guys,
>
> For reference, we are on a master build of Spark from November 19 and
> Mesos 0.13.
>
> Periodically, we run into an issue where one of our Mesos slaves takes
> some tasks from a Spark query and, according to the Mesos UI, they stay
> stuck in 'STAGING'. This blocks the query from running and blocks
> future queries until we stop and restart the slave in question.
>
> Has anyone else seen and/or resolved this type of issue?
>
> Thanks,
>
> Gary
>
