[ 
https://issues.apache.org/jira/browse/MESOS-9667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804778#comment-16804778
 ] 

Benjamin Bannier commented on MESOS-9667:
-----------------------------------------

We saw this again when an agent which already had such tasks running was 
restarted. This ticket should be a blocker.

> Check failure when executor for task using resource provider resources 
> subscribes before agent is registered
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-9667
>                 URL: https://issues.apache.org/jira/browse/MESOS-9667
>             Project: Mesos
>          Issue Type: Bug
>          Components: agent
>    Affects Versions: 1.8.0
>            Reporter: Benjamin Bannier
>            Priority: Blocker
>              Labels: foundations, mesosphere, mesosphere-dss-ga
>
> When an executor for a task using resource provider resources subscribes 
> before the agent has registered with the master, we trigger a fatal assertion,
> {code:java}
> Mar 21 13:42:47 agent1 mesos-agent[17277]: F0321 13:42:46.845535 17295 
> slave.cpp:8834] Check failed: 'resourceProviderManager.get()' Must be non NULL
> Mar 21 13:42:47 agent1 mesos-agent[17277]: *** Check failure stack trace: ***
> {code}
> The reason for this failure is that we attempt to publish resources to the 
> resource provider via the resource provider manager, but the resource 
> provider manager is only created once the agent has registered with the 
> master.
> As a workaround, one can terminate the executors and their tasks and let the 
> framework relaunch the tasks (provided it supports that).
> Alternatively, the agent could prevent such executors from subscribing 
> until the resource provider manager is available.
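
The idea of holding back executor subscriptions until the resource provider
manager exists can be sketched roughly as follows. This is a minimal,
self-contained model and not Mesos code: the Agent and
ResourceProviderManager types, and the subscribe(), registered(),
pendingCount() and publishedCount() members, are hypothetical names chosen
for illustration only. It shows a subscription being queued while the
manager pointer is still null and replayed once registration creates the
manager.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the resource provider manager (not the Mesos class).
struct ResourceProviderManager {
  std::vector<std::string> published;

  // Records that resources were published on behalf of an executor.
  void publish(const std::string& executorId) {
    published.push_back(executorId);
  }
};

// Hypothetical, heavily simplified agent model.
class Agent {
public:
  // Called when an executor subscribes. While the manager does not exist,
  // the subscription is queued instead of dereferencing a null pointer.
  void subscribe(const std::string& executorId) {
    if (!manager_) {
      pending_.push_back(executorId);
      return;
    }
    manager_->publish(executorId);
  }

  // Called once the agent has registered with the master: create the
  // manager if needed, then replay the queued subscriptions in order.
  void registered() {
    if (!manager_) {
      manager_ = std::make_unique<ResourceProviderManager>();
    }
    std::vector<std::string> queued;
    queued.swap(pending_);
    for (const std::string& id : queued) {
      manager_->publish(id);
    }
  }

  std::size_t pendingCount() const { return pending_.size(); }

  std::size_t publishedCount() const {
    return manager_ ? manager_->published.size() : 0;
  }

private:
  std::unique_ptr<ResourceProviderManager> manager_;  // null until registered
  std::vector<std::string> pending_;                  // deferred subscriptions
};
```

In this model, calling subscribe() before registered() only grows the
pending queue, and registered() drains it, so the null manager is never
dereferenced. Whether the real agent can defer subscriptions this way
depends on details not covered in this ticket.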



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
