[ https://issues.apache.org/jira/browse/MESOS-8963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16492803#comment-16492803 ]

Benno Evers commented on MESOS-8963:
------------------------------------

https://reviews.apache.org/r/67345/

> Executor crash trying to print container ID
> -------------------------------------------
>
>                 Key: MESOS-8963
>                 URL: https://issues.apache.org/jira/browse/MESOS-8963
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Benno Evers
>            Priority: Major
>
> As observed in an internal cluster:
> {noformat}
> mesos-default-executor: /pkg/src/mesos/3rdparty/stout/include/stout/option.hpp:112: T& Option<T>::get() & [with T = mesos::ContainerID]: Assertion `isSome()' failed.
> *** Aborted at 1527514147 (unix time) try "date -d @1527514147" if you are using GNU date ***
> PC: @     0x7f9fe3b5c1f7 (unknown)
> *** SIGABRT (@0x6300000005) received by PID 5 (TID 0x7f9fdfe8e700) from PID 5; stack trace: ***
>     @     0x7f9fe3ef95e0 (unknown)
>     @     0x7f9fe3b5c1f7 (unknown)
>     @     0x7f9fe3b5d8e8 (unknown)
>     @     0x7f9fe3b55266 (unknown)
>     @     0x7f9fe3b55312 (unknown)
>     @     0x7f9fe581b9b0 _ZNR6OptionIN5mesos11ContainerIDEE3getEv.part.134
>     @     0x7f9fe58a19f5 _ZZN5mesos8internal6checks14CheckerProcess18nestedCommandCheckEvENKUlRKN7process4http8ResponseEE0_clES7_
>     @     0x7f9fe66a8edc process::ProcessManager::resume()
>     @     0x7f9fe66ae856 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUlvE_vEEE6_M_runEv
>     @     0x7f9fe46d32b0 (unknown)
>     @     0x7f9fe3ef1e25 (unknown)
>     @     0x7f9fe3c1f34d (unknown)
> {noformat}
> The issue is caused by this block in CheckerProcess not checking that 
> previousCheckContainerId is still some after it has yielded control:
> {noformat}
> // checker_process.cpp:649
> LOG(WARNING) << "Connection to remove the nested container '"
>              << previousCheckContainerId.get() << "' used for the "
>              << name << " for task '" << taskId << "' failed: "
>              << failure;
> {noformat}
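
For illustration only, the guard that the description says is missing could look roughly like the sketch below. This is not the patch under review at the link above. It assumes std::optional as a stand-in for stout's Option<T>, a plain string in place of mesos::ContainerID, and a hypothetical helper named describeFailure that builds the message instead of logging it through LOG(WARNING).

```cpp
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper, not part of Mesos. std::optional stands in for
// stout's Option<T>, and std::string for mesos::ContainerID.
std::string describeFailure(
    const std::optional<std::string>& previousCheckContainerId,
    const std::string& name,
    const std::string& taskId,
    const std::string& failure)
{
  std::ostringstream out;

  // Check that the value is still present before dereferencing it. It may
  // have been reset while the caller had yielded control.
  out << "Connection to remove the nested container '"
      << (previousCheckContainerId.has_value()
            ? *previousCheckContainerId
            : std::string("<unknown>"))
      << "' used for the " << name << " for task '" << taskId
      << "' failed: " << failure;

  return out.str();
}
```

Calling describeFailure with std::nullopt as the first argument yields a message containing '<unknown>' rather than aborting, which is the behavior the unguarded get() in the quoted block cannot provide.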



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
