> On Feb. 27, 2014, 11:09 a.m., Ben Mahler wrote:
> > Hey Dominic, I found an issue applying these Option patches, looks like 
> > slave.cpp:3001 ( 
> > https://github.com/apache/mesos/blob/20a6993e1b50bf46089cd5e059634ac446b08ac2/src/slave/slave.cpp#L3001
> >  ) is problematic:
> > 
> > struct ExecutorState
> > {
> >   ExecutorID id;
> >   Option<ExecutorInfo> info;
> >   Option<ContainerID> latest;
> >   hashmap<ContainerID, RunState> runs;
> >   unsigned int errors;
> > };
> > 
> > void Framework::recoverExecutor(const ExecutorState& state)
> > {
> >   ...
> >   const RunState& run = state.runs.get(latest).get();
> >                         <----- rvalue ------->
> >                         < lvalue of temp Option::t >
> > }
> > 
> > I'm not 100% clear on what the standard specifies here, but it doesn't look 
> > like we're extending the life of a temporary here, since we're grabbing a 
> > _member_ from a temporary.
> > 
> > [ RUN      ] SlaveRecoveryTest/0.RecoverStatusUpdateManager
> > Detaching after fork from child process 51693.
> > Detaching after fork from child process 51694.
> > Detaching after fork from child process 51708.
> > Detaching after fork from child process 51709.
> > WARNING: Logging before InitGoogleLogging() is written to STDERR
> > I0227 18:46:06.952373 51709 exec.cpp:131] Version: 0.19.0
> > I0227 18:46:06.956939 51742 exec.cpp:205] Executor registered on slave 
> > 2014-02-27-18:46:06-1740121354-59276-51387-0
> > Registered executor on smfd-bkq-03-sr4.devel.twitter.com
> > Starting task 9fe870d9-4b47-489f-b863-2a929ade31a8
> > sh -c 'sleep 1000'
> > Forked command at 51749
> > 
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7fffecb03940 (LWP 51403)]
> > 0x0000000000a09656 in 
> > boost::unordered::detail::table<boost::unordered::detail::map<std::allocator<std::pair<mesos::TaskID
> >  const, mesos::internal::slave::state::TaskState> >, mesos::TaskID, 
> > mesos::internal::slave::state::TaskState, boost::hash<mesos::TaskID>, 
> > std::equal_to<mesos::TaskID> > >::begin (this=0x7fffd803d810) at 
> > ../3rdparty/libprocess/3rdparty/boost-1.53.0/boost/unordered/detail/table.hpp:249
> > 249             return size_ ? iterator(get_previous_start()->next_) : 
> > iterator();
> > (gdb) where
> > #0  0x0000000000a09656 in 
> > boost::unordered::detail::table<boost::unordered::detail::map<std::allocator<std::pair<mesos::TaskID
> >  const, mesos::internal::slave::state::TaskState> >, mesos::TaskID, 
> > mesos::internal::slave::state::TaskState, boost::hash<mesos::TaskID>, 
> > std::equal_to<mesos::TaskID> > >::begin (this=0x7fffd803d810) at 
> > ../3rdparty/libprocess/3rdparty/boost-1.53.0/boost/unordered/detail/table.hpp:249
> > #1  0x00007ffff62deccb in boost::unordered::unordered_map<mesos::TaskID, 
> > mesos::internal::slave::state::TaskState, boost::hash<mesos::TaskID>, 
> > std::equal_to<mesos::TaskID>, std::allocator<std::pair<mesos::TaskID const, 
> > mesos::internal::slave::state::TaskState> > >::begin (this=0x7fffd803d810) 
> > at 
> > ../3rdparty/libprocess/3rdparty/boost-1.53.0/boost/unordered/unordered_map.hpp:209
> > #2  0x00007ffff62decf7 in 
> > boost::range_detail::range_begin<hashmap<mesos::TaskID, 
> > mesos::internal::slave::state::TaskState> const> (c=...)
> >     at ../3rdparty/libprocess/3rdparty/boost-1.53.0/boost/range/begin.hpp:49
> > #3  0x00007ffff62ded0f in 
> > boost::range_adl_barrier::begin<hashmap<mesos::TaskID, 
> > mesos::internal::slave::state::TaskState> > (r=...) at 
> > ../3rdparty/libprocess/3rdparty/boost-1.53.0/boost/range/begin.hpp:119
> > #4  0x00007ffff62ded37 in 
> > boost::foreach_detail_::begin<hashmap<mesos::TaskID, 
> > mesos::internal::slave::state::TaskState>, mpl_::bool_<true> > (col=...)
> >     at ../3rdparty/libprocess/3rdparty/boost-1.53.0/boost/foreach.hpp:674
> > #5  0x00007ffff62a4868 in 
> > mesos::internal::slave::Framework::recoverExecutor (this=0x7fffd804d6c0, 
> > state=...) at ../../src/slave/slave.cpp:3025
> > #6  0x00007ffff62aa668 in mesos::internal::slave::Slave::recoverFramework 
> > (this=0x7fffdc01c230, state=...) at ../../src/slave/slave.cpp:2767
> > #7  0x00007ffff62ab115 in mesos::internal::slave::Slave::recover 
> > (this=0x7fffdc01c230, _state=...) at ../../src/slave/slave.cpp:2576
> > #8  0x00007ffff62c77c4 in std::tr1::_Mem_fn<process::Future<Nothing> 
> > (mesos::internal::slave::Slave::*)(Result<mesos::internal::slave::state::SlaveState>
> >  const&)>::operator() (this=0x7fffe0036a40,
> >     __object=0x7fffdc01c230, __a1=...) at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:214
> > #9  0x00007ffff62cbdb1 in 
> > std::tr1::_Bind<std::tr1::_Mem_fn<process::Future<Nothing> 
> > (mesos::internal::slave::Slave::*)(const 
> > Result<mesos::internal::slave::state::SlaveState>&)> 
> > ()(std::tr1::_Placeholder<1>, 
> > Result<mesos::internal::slave::state::SlaveState>)>::operator()<mesos::internal::slave::Slave*>(mesos::internal::slave::Slave
> >  *&) (this=0x7fffe0036a40, __u1=@0x7fffecb02d40)
> >     at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/bind_iterate.h:45
> > #10 0x00007ffff62cbdf1 in 
> > std::tr1::_Function_handler<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*),std::tr1::_Bind<std::tr1::_Mem_fn<process::Future<Nothing>
> >  (mesos::internal::slave::Slave::*)(const 
> > Result<mesos::internal::slave::state::SlaveState>&)> 
> > ()(std::tr1::_Placeholder<1>, 
> > Result<mesos::internal::slave::state::SlaveState>)> >::_M_invoke(const 
> > std::tr1::_Any_data &, mesos::internal::slave::Slave *) (__functor=..., 
> > __a1=0x7fffdc01c230) at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:488
> > #11 0x00007ffff62e1b4d in std::tr1::function<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*)>::operator()(mesos::internal::slave::Slave
> >  *) const (this=0x7fffe0036760, __a1=0x7fffdc01c230)
> >     at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:868
> > #12 0x00007ffff62f8a5d in process::internal::pdispatcher<Nothing, 
> > mesos::internal::slave::Slave>(process::ProcessBase *, 
> > std::tr1::shared_ptr<std::tr1::function<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*)> >, 
> > std::tr1::shared_ptr<process::Promise<Nothing> >) (process=0x7fffdc01c590, 
> > thunk=std::tr1::shared_ptr (count 2) 0x7fffecb02e50, 
> > promise=std::tr1::shared_ptr (count -536638560) 0x7fffecb02e30)
> >     at ../../3rdparty/libprocess/include/process/dispatch.hpp:88
> > #13 0x00007ffff62eedd5 in std::tr1::_Bind<void (* 
> > ()(std::tr1::_Placeholder<1>, 
> > std::tr1::shared_ptr<std::tr1::function<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*)> >, 
> > std::tr1::shared_ptr<process::Promise<Nothing> >))(process::ProcessBase*, 
> > std::tr1::shared_ptr<std::tr1::function<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*)> >, 
> > std::tr1::shared_ptr<process::Promise<Nothing> 
> > >)>::operator()<process::ProcessBase*>(process::ProcessBase *&) 
> > (this=0x7fffe002d340, __u1=@0x7fffecb02ea0) at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/bind_iterate.h:45
> > #14 0x00007ffff62eee4d in std::tr1::_Function_handler<void 
> > ()(process::ProcessBase*),std::tr1::_Bind<void (* 
> > ()(std::tr1::_Placeholder<1>, 
> > std::tr1::shared_ptr<std::tr1::function<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*)> >, 
> > std::tr1::shared_ptr<process::Promise<Nothing> >))(process::ProcessBase*, 
> > std::tr1::shared_ptr<std::tr1::function<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*)> >, 
> > std::tr1::shared_ptr<process::Promise<Nothing> >)> >::_M_invoke(const 
> > std::tr1::_Any_data &, process::ProcessBase *) (__functor=..., 
> > __a1=0x7fffdc01c590)
> >     at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:502
> > #15 0x00007ffff66bcded in std::tr1::function<void 
> > ()(process::ProcessBase*)>::operator()(process::ProcessBase *) const 
> > (this=0x7fffe0035500, __a1=0x7fffdc01c590)
> >     at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:868
> > #16 0x00007ffff6679593 in process::ProcessBase::visit (this=0x7fffdc01c590, 
> > event=...) at ../../../3rdparty/libprocess/src/process.cpp:3183
> > #17 0x00007ffff6690e3e in process::DispatchEvent::visit 
> > (this=0x7fffe0035580, visitor=0x7fffdc01c590) at 
> > ../../../3rdparty/libprocess/include/process/event.hpp:136
> > #18 0x000000000066b1fa in process::ProcessBase::serve (this=0x7fffdc01c590, 
> > event=...) at ../../3rdparty/libprocess/include/process/process.hpp:38
> > #19 0x00007ffff66840f4 in process::ProcessManager::resume (this=0x10c41d0, 
> > process=0x7fffdc01c590) at ../../../3rdparty/libprocess/src/process.cpp:2611
> > #20 0x00007ffff6684948 in process::schedule (arg=0x0) at 
> > ../../../3rdparty/libprocess/src/process.cpp:1305
> > #21 0x00007ffff4ffe83d in start_thread () from /lib64/libpthread.so.0
> > #22 0x00007ffff3d6526d in clone () from /lib64/libc.so.6
> 
> Dominic Hamon wrote:
>     If i read this right, the hashmap is returning a copy of an Option<T> and 
> then we call get() on that which should be returning a const RunState&. There 
> shouldn't be any need to extend any lifetime.
>     
>     If the hashmap was returning a reference to an Option, I can see the 
> problem. What am I missing?
> 
> Ben Mahler wrote:
>     Previously, we were binding to an rvalue RunState, as Option::get was 
> returning a copy. The standard guarantees that the lifetime of this RunState 
> temporary would be extended (8.5.3 of the standard).
>     
>     Now Option::get is returning an rvalue of a RunState&, this reference 
> lives inside the intermediate temporary Option in the expression. The 
> lifetime of this Option _must_ be extended for the RunState reference to 
> remain valid. But, I don't think the standard ensure this since the Option 
> rvalue is an intermediate part of the expression. 12.2 of the standard is 
> interesting here, but it's important to note that get() is returning a 
> _reference_, not an object.

the fix for the callsite is to put the Option into an explicit temporary as a 
copy, however this makes me uncomfortable as the error is a runtime one. Ie, 
it's not obvious to someone that writing it as written is invalid.

Another option is to tighten up hashmap so that get returns the value by 
reference, not an Option, and it's illegal to call 'get' when the key doesn't 
exist. Similar to C++11's map::at() method.

The final option is to not change the Option return to a reference.


- Dominic


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18386/#review35673
-----------------------------------------------------------


On Feb. 27, 2014, 9:12 a.m., Dominic Hamon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18386/
> -----------------------------------------------------------
> 
> (Updated Feb. 27, 2014, 9:12 a.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1008
>     https://issues.apache.org/jira/browse/MESOS-1008
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> See summary
> 
> 
> Diffs
> -----
> 
>   src/linux/fs.hpp 1d86dd0d24c3daae957b5eec387638d1e8e6d7db 
>   src/linux/fs.cpp e5f4f9a16becd4e5960d0cbb7f988736188b2426 
>   src/log/log.cpp 62dc9286285d5981aa5fc63e125d3c3f51d4f457 
>   src/sched/sched.cpp dcb3158d2acd93a107fda51b19e92899a092e628 
> 
> Diff: https://reviews.apache.org/r/18386/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 'grep' for cases where Options are reassigned after references are taken.
> 
> 
> Thanks,
> 
> Dominic Hamon
> 
>

Reply via email to