> On Feb. 27, 2014, 7:09 p.m., Ben Mahler wrote:
> > Hey Dominic, I found an issue applying these Option patches, looks like 
> > slave.cpp:3001 ( 
> > https://github.com/apache/mesos/blob/20a6993e1b50bf46089cd5e059634ac446b08ac2/src/slave/slave.cpp#L3001
> >  ) is problematic:
> > 
> > struct ExecutorState
> > {
> >   ExecutorID id;
> >   Option<ExecutorInfo> info;
> >   Option<ContainerID> latest;
> >   hashmap<ContainerID, RunState> runs;
> >   unsigned int errors;
> > };
> > 
> > void Framework::recoverExecutor(const ExecutorState& state)
> > {
> >   ...
> >   const RunState& run = state.runs.get(latest).get();
> >                         <----- rvalue ------->
> >                         < lvalue of temp Option::t >
> > }
> > 
> > I'm not 100% clear on what the standard specifies here, but it doesn't look 
> > like we're extending the life of a temporary here, since we're grabbing a 
> > _member_ from a temporary.
> > 
> > [ RUN      ] SlaveRecoveryTest/0.RecoverStatusUpdateManager
> > Detaching after fork from child process 51693.
> > Detaching after fork from child process 51694.
> > Detaching after fork from child process 51708.
> > Detaching after fork from child process 51709.
> > WARNING: Logging before InitGoogleLogging() is written to STDERR
> > I0227 18:46:06.952373 51709 exec.cpp:131] Version: 0.19.0
> > I0227 18:46:06.956939 51742 exec.cpp:205] Executor registered on slave 
> > 2014-02-27-18:46:06-1740121354-59276-51387-0
> > Registered executor on smfd-bkq-03-sr4.devel.twitter.com
> > Starting task 9fe870d9-4b47-489f-b863-2a929ade31a8
> > sh -c 'sleep 1000'
> > Forked command at 51749
> > 
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x7fffecb03940 (LWP 51403)]
> > 0x0000000000a09656 in 
> > boost::unordered::detail::table<boost::unordered::detail::map<std::allocator<std::pair<mesos::TaskID
> >  const, mesos::internal::slave::state::TaskState> >, mesos::TaskID, 
> > mesos::internal::slave::state::TaskState, boost::hash<mesos::TaskID>, 
> > std::equal_to<mesos::TaskID> > >::begin (this=0x7fffd803d810) at 
> > ../3rdparty/libprocess/3rdparty/boost-1.53.0/boost/unordered/detail/table.hpp:249
> > 249             return size_ ? iterator(get_previous_start()->next_) : 
> > iterator();
> > (gdb) where
> > #0  0x0000000000a09656 in 
> > boost::unordered::detail::table<boost::unordered::detail::map<std::allocator<std::pair<mesos::TaskID
> >  const, mesos::internal::slave::state::TaskState> >, mesos::TaskID, 
> > mesos::internal::slave::state::TaskState, boost::hash<mesos::TaskID>, 
> > std::equal_to<mesos::TaskID> > >::begin (this=0x7fffd803d810) at 
> > ../3rdparty/libprocess/3rdparty/boost-1.53.0/boost/unordered/detail/table.hpp:249
> > #1  0x00007ffff62deccb in boost::unordered::unordered_map<mesos::TaskID, 
> > mesos::internal::slave::state::TaskState, boost::hash<mesos::TaskID>, 
> > std::equal_to<mesos::TaskID>, std::allocator<std::pair<mesos::TaskID const, 
> > mesos::internal::slave::state::TaskState> > >::begin (this=0x7fffd803d810) 
> > at 
> > ../3rdparty/libprocess/3rdparty/boost-1.53.0/boost/unordered/unordered_map.hpp:209
> > #2  0x00007ffff62decf7 in 
> > boost::range_detail::range_begin<hashmap<mesos::TaskID, 
> > mesos::internal::slave::state::TaskState> const> (c=...)
> >     at ../3rdparty/libprocess/3rdparty/boost-1.53.0/boost/range/begin.hpp:49
> > #3  0x00007ffff62ded0f in 
> > boost::range_adl_barrier::begin<hashmap<mesos::TaskID, 
> > mesos::internal::slave::state::TaskState> > (r=...) at 
> > ../3rdparty/libprocess/3rdparty/boost-1.53.0/boost/range/begin.hpp:119
> > #4  0x00007ffff62ded37 in 
> > boost::foreach_detail_::begin<hashmap<mesos::TaskID, 
> > mesos::internal::slave::state::TaskState>, mpl_::bool_<true> > (col=...)
> >     at ../3rdparty/libprocess/3rdparty/boost-1.53.0/boost/foreach.hpp:674
> > #5  0x00007ffff62a4868 in 
> > mesos::internal::slave::Framework::recoverExecutor (this=0x7fffd804d6c0, 
> > state=...) at ../../src/slave/slave.cpp:3025
> > #6  0x00007ffff62aa668 in mesos::internal::slave::Slave::recoverFramework 
> > (this=0x7fffdc01c230, state=...) at ../../src/slave/slave.cpp:2767
> > #7  0x00007ffff62ab115 in mesos::internal::slave::Slave::recover 
> > (this=0x7fffdc01c230, _state=...) at ../../src/slave/slave.cpp:2576
> > #8  0x00007ffff62c77c4 in std::tr1::_Mem_fn<process::Future<Nothing> 
> > (mesos::internal::slave::Slave::*)(Result<mesos::internal::slave::state::SlaveState>
> >  const&)>::operator() (this=0x7fffe0036a40,
> >     __object=0x7fffdc01c230, __a1=...) at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:214
> > #9  0x00007ffff62cbdb1 in 
> > std::tr1::_Bind<std::tr1::_Mem_fn<process::Future<Nothing> 
> > (mesos::internal::slave::Slave::*)(const 
> > Result<mesos::internal::slave::state::SlaveState>&)> 
> > ()(std::tr1::_Placeholder<1>, 
> > Result<mesos::internal::slave::state::SlaveState>)>::operator()<mesos::internal::slave::Slave*>(mesos::internal::slave::Slave
> >  *&) (this=0x7fffe0036a40, __u1=@0x7fffecb02d40)
> >     at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/bind_iterate.h:45
> > #10 0x00007ffff62cbdf1 in 
> > std::tr1::_Function_handler<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*),std::tr1::_Bind<std::tr1::_Mem_fn<process::Future<Nothing>
> >  (mesos::internal::slave::Slave::*)(const 
> > Result<mesos::internal::slave::state::SlaveState>&)> 
> > ()(std::tr1::_Placeholder<1>, 
> > Result<mesos::internal::slave::state::SlaveState>)> >::_M_invoke(const 
> > std::tr1::_Any_data &, mesos::internal::slave::Slave *) (__functor=..., 
> > __a1=0x7fffdc01c230) at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:488
> > #11 0x00007ffff62e1b4d in std::tr1::function<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*)>::operator()(mesos::internal::slave::Slave
> >  *) const (this=0x7fffe0036760, __a1=0x7fffdc01c230)
> >     at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:868
> > #12 0x00007ffff62f8a5d in process::internal::pdispatcher<Nothing, 
> > mesos::internal::slave::Slave>(process::ProcessBase *, 
> > std::tr1::shared_ptr<std::tr1::function<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*)> >, 
> > std::tr1::shared_ptr<process::Promise<Nothing> >) (process=0x7fffdc01c590, 
> > thunk=std::tr1::shared_ptr (count 2) 0x7fffecb02e50, 
> > promise=std::tr1::shared_ptr (count -536638560) 0x7fffecb02e30)
> >     at ../../3rdparty/libprocess/include/process/dispatch.hpp:88
> > #13 0x00007ffff62eedd5 in std::tr1::_Bind<void (* 
> > ()(std::tr1::_Placeholder<1>, 
> > std::tr1::shared_ptr<std::tr1::function<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*)> >, 
> > std::tr1::shared_ptr<process::Promise<Nothing> >))(process::ProcessBase*, 
> > std::tr1::shared_ptr<std::tr1::function<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*)> >, 
> > std::tr1::shared_ptr<process::Promise<Nothing> 
> > >)>::operator()<process::ProcessBase*>(process::ProcessBase *&) 
> > (this=0x7fffe002d340, __u1=@0x7fffecb02ea0) at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/bind_iterate.h:45
> > #14 0x00007ffff62eee4d in std::tr1::_Function_handler<void 
> > ()(process::ProcessBase*),std::tr1::_Bind<void (* 
> > ()(std::tr1::_Placeholder<1>, 
> > std::tr1::shared_ptr<std::tr1::function<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*)> >, 
> > std::tr1::shared_ptr<process::Promise<Nothing> >))(process::ProcessBase*, 
> > std::tr1::shared_ptr<std::tr1::function<process::Future<Nothing> 
> > ()(mesos::internal::slave::Slave*)> >, 
> > std::tr1::shared_ptr<process::Promise<Nothing> >)> >::_M_invoke(const 
> > std::tr1::_Any_data &, process::ProcessBase *) (__functor=..., 
> > __a1=0x7fffdc01c590)
> >     at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:502
> > #15 0x00007ffff66bcded in std::tr1::function<void 
> > ()(process::ProcessBase*)>::operator()(process::ProcessBase *) const 
> > (this=0x7fffe0035500, __a1=0x7fffdc01c590)
> >     at 
> > /usr/lib/gcc/x86_64-redhat-linux/4.1.2/../../../../include/c++/4.1.2/tr1/functional_iterate.h:868
> > #16 0x00007ffff6679593 in process::ProcessBase::visit (this=0x7fffdc01c590, 
> > event=...) at ../../../3rdparty/libprocess/src/process.cpp:3183
> > #17 0x00007ffff6690e3e in process::DispatchEvent::visit 
> > (this=0x7fffe0035580, visitor=0x7fffdc01c590) at 
> > ../../../3rdparty/libprocess/include/process/event.hpp:136
> > #18 0x000000000066b1fa in process::ProcessBase::serve (this=0x7fffdc01c590, 
> > event=...) at ../../3rdparty/libprocess/include/process/process.hpp:38
> > #19 0x00007ffff66840f4 in process::ProcessManager::resume (this=0x10c41d0, 
> > process=0x7fffdc01c590) at ../../../3rdparty/libprocess/src/process.cpp:2611
> > #20 0x00007ffff6684948 in process::schedule (arg=0x0) at 
> > ../../../3rdparty/libprocess/src/process.cpp:1305
> > #21 0x00007ffff4ffe83d in start_thread () from /lib64/libpthread.so.0
> > #22 0x00007ffff3d6526d in clone () from /lib64/libc.so.6
> 
> Dominic Hamon wrote:
>     If i read this right, the hashmap is returning a copy of an Option<T> and 
> then we call get() on that which should be returning a const RunState&. There 
> shouldn't be any need to extend any lifetime.
>     
>     If the hashmap was returning a reference to an Option, I can see the 
> problem. What am I missing?
> 
> Ben Mahler wrote:
>     Previously, we were binding to an rvalue RunState, as Option::get was 
> returning a copy. The standard guarantees that the lifetime of this RunState 
> temporary would be extended (8.5.3 of the standard).
>     
>     Now Option::get is returning an rvalue of a RunState&, this reference 
> lives inside the intermediate temporary Option in the expression. The 
> lifetime of this Option _must_ be extended for the RunState reference to 
> remain valid. But, I don't think the standard ensure this since the Option 
> rvalue is an intermediate part of the expression. 12.2 of the standard is 
> interesting here, but it's important to note that get() is returning a 
> _reference_, not an object.
> 
> Dominic Hamon wrote:
>     the fix for the callsite is to put the Option into an explicit temporary 
> as a copy, however this makes me uncomfortable as the error is a runtime one. 
> Ie, it's not obvious to someone that writing it as written is invalid.
>     
>     Another option is to tighten up hashmap so that get returns the value by 
> reference, not an Option, and it's illegal to call 'get' when the key doesn't 
> exist. Similar to C++11's map::at() method.
>     
>     The final option is to not change the Option return to a reference.

What about just making a copy in the caller? :)

const RunState& run = state.runs.get(latest).get();

becomes:

RunState run = state.runs.get(latest).get();
const RunState run = state.runs.get(latest).get(); // If const semantics are 
strongly desired.


- Ben


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18386/#review35673
-----------------------------------------------------------


On Feb. 27, 2014, 5:12 p.m., Dominic Hamon wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18386/
> -----------------------------------------------------------
> 
> (Updated Feb. 27, 2014, 5:12 p.m.)
> 
> 
> Review request for mesos and Ben Mahler.
> 
> 
> Bugs: MESOS-1008
>     https://issues.apache.org/jira/browse/MESOS-1008
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> See summary
> 
> 
> Diffs
> -----
> 
>   src/linux/fs.hpp 1d86dd0d24c3daae957b5eec387638d1e8e6d7db 
>   src/linux/fs.cpp e5f4f9a16becd4e5960d0cbb7f988736188b2426 
>   src/log/log.cpp 62dc9286285d5981aa5fc63e125d3c3f51d4f457 
>   src/sched/sched.cpp dcb3158d2acd93a107fda51b19e92899a092e628 
> 
> Diff: https://reviews.apache.org/r/18386/diff/
> 
> 
> Testing
> -------
> 
> make check
> 
> 'grep' for cases where Options are reassigned after references are taken.
> 
> 
> Thanks,
> 
> Dominic Hamon
> 
>

Reply via email to