race condition on MesosExecutorDriver destruction for Python
------------------------------------------------------------

                 Key: MESOS-36
                 URL: https://issues.apache.org/jira/browse/MESOS-36
             Project: Mesos
          Issue Type: Bug
         Environment: OS X 10.6.8, Python 2.6.7
            Reporter: brian wickman


There's a race condition between driver.stop/abort and the destruction of the 
Python driver object.

For example:

def main(args, options):
  thermos_executor = ThermosExecutor(options)
  mesos.MesosExecutorDriver(thermos_executor).run()

will routinely crash with the abort() shown in the crash report at the end of 
this message.

If this is changed to:

def main(args, options):
  thermos_executor = ThermosExecutor(options)
  drv = mesos.MesosExecutorDriver(thermos_executor)
  drv.run()

the code works, which points to a problem (visible in the stack trace below) 
with the implicit reference counting on the MesosExecutorDriver object: in the 
first form, no named variable holds the driver while run() executes.
The launchTask in my example does some work, sends a framework message when the 
task is finished, then calls driver.stop().
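
To illustrate the suspected mechanism, here is a minimal pure-Python sketch. It
does not use the Mesos bindings: FakeDriver, callback and demo are hypothetical
names invented for this example, and it assumes CPython's immediate reference
counting. It shows that when the caller keeps no named reference, the last
reference to an object can be dropped on a callback thread, so the finalizer
runs there. This is one reading of frames 13 and 14 of the crashed thread below
(tupledealloc leading into MesosExecutorDriverImpl_dealloc), not a confirmed
diagnosis.

```python
import threading


class FakeDriver(object):
    """Hypothetical stand-in for the native driver wrapper (not the Mesos API)."""

    def __init__(self, log):
        self._log = log

    def __del__(self):
        # Record the name of the thread that dropped the last reference.
        self._log.append(threading.current_thread().name)


def callback(holder, main_done):
    # The callback takes over a reference, much like an argument tuple would.
    driver = holder.pop()
    # Wait until the main thread has finished with the driver.
    main_done.wait()
    # If nobody else holds the driver, it is finalized here, on this thread.
    del driver


def demo(keep_named_reference):
    log = []
    main_done = threading.Event()
    holder = [FakeDriver(log)]
    # Optionally keep a named reference, as in the "drv = ..." variant.
    kept = holder[0] if keep_named_reference else None
    worker = threading.Thread(target=callback, args=(holder, main_done),
                              name="callback")
    worker.start()
    main_done.set()
    worker.join()
    # With a named reference, finalization happens here, on the caller's thread.
    del kept
    return log


if __name__ == "__main__":
    print(demo(keep_named_reference=False))
    print(demo(keep_named_reference=True))
```

With keep_named_reference=False the finalizer is recorded on the "callback"
thread; with keep_named_reference=True it is recorded on the thread that called
demo(). In the real driver the destructor waits on the driver's own process, so
which thread runs it plausibly matters.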


Process:         Python [64054]
Path:            /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python
Identifier:      Python
Version:         ??? (???)
Code Type:       X86-64 (Native)
Parent Process:  java [64016]

Date/Time:       2011-10-07 14:30:01.103 -0700
OS Version:      Mac OS X 10.6.8 (10K549)
Report Version:  6

Interval Since Last Report:          944086 sec
Crashes Since Last Report:           18
Per-App Crashes Since Last Report:   12
Anonymous UUID:                      3A043B60-6C64-4A96-BA3A-C04C21BA960E

Exception Type:  EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Crashed Thread:  1

Application Specific Information:
abort() called

Thread 0:  Dispatch queue: com.apple.main-thread
0   libSystem.B.dylib                   0x00007fff869bba6a __semwait_signal + 10
1   libSystem.B.dylib                   0x00007fff869bf881 _pthread_cond_wait + 1286
2   org.python.python                   0x00000001000ef554 PyThread_acquire_lock + 116
3   org.python.python                   0x00000001000b499a PyEval_RestoreThread + 58
4   org.python.python                   0x00000001000f3f20 lock_PyThread_acquire_lock + 80
5   org.python.python                   0x00000001000bc558 PyEval_EvalFrameEx + 28696
6   org.python.python                   0x00000001000bd305 PyEval_EvalCodeEx + 2197
7   org.python.python                   0x00000001000bb29d PyEval_EvalFrameEx + 23901
8   org.python.python                   0x00000001000bb6fa PyEval_EvalFrameEx + 25018
9   org.python.python                   0x00000001000bb6fa PyEval_EvalFrameEx + 25018
10  org.python.python                   0x00000001000bd305 PyEval_EvalCodeEx + 2197
11  org.python.python                   0x000000010003b50d function_call + 429
12  org.python.python                   0x000000010000bb92 PyObject_Call + 98
13  org.python.python                   0x00000001000b76d2 PyEval_EvalFrameEx + 8594
14  org.python.python                   0x00000001000bd305 PyEval_EvalCodeEx + 2197
15  org.python.python                   0x000000010003b405 function_call + 165
16  org.python.python                   0x000000010000bb92 PyObject_Call + 98
17  org.python.python                   0x00000001000b43c7 PyEval_CallObjectWithKeywords + 87
18  org.python.python                   0x00000001000e202a Py_Finalize + 186
19  org.python.python                   0x00000001000e1b46 handle_system_exit + 246
20  org.python.python                   0x00000001000e1d95 PyErr_PrintEx + 437
21  org.python.python                   0x00000001000f16a4 RunModule + 404
22  org.python.python                   0x00000001000f2109 Py_Main + 2505
23  org.python.python                   0x0000000100000e22 0x100000000 + 3618
24  org.python.python                   0x0000000100000d41 0x100000000 + 3393

Thread 1 Crashed:
0   libSystem.B.dylib                   0x00007fff869f39ce __semwait_signal_nocancel + 10
1   libSystem.B.dylib                   0x00007fff869f38d0 nanosleep$NOCANCEL + 129
2   libSystem.B.dylib                   0x00007fff86a503ce usleep$NOCANCEL + 57
3   libSystem.B.dylib                   0x00007fff86a6fa00 abort + 93
4   _mesos.so                           0x000000010190495c google::LogSink::~LogSink() + 0
5   _mesos.so                           0x000000010190473b google::LogMessage::Fail() + 13
6   _mesos.so                           0x0000000101909cae google::LogMessage::SendToLog() + 1212
7   _mesos.so                           0x00000001019066c6 google::LogMessage::Flush() + 418
8   _mesos.so                           0x0000000101907e5e google::LogMessageFatal::~LogMessageFatal() + 22
9   _mesos.so                           0x0000000101915e4c process::ProcessManager::wait(process::ProcessBase*, process::UPID const&) + 670
10  _mesos.so                           0x0000000101920eb9 process::wait(process::UPID const&, double) + 183
11  _mesos.so                           0x000000010186c359 process::wait(process::ProcessBase const*, double) + 45
12  _mesos.so                           0x000000010185d2c6 mesos::MesosExecutorDriver::~MesosExecutorDriver() + 98
13  _mesos.so                           0x000000010171791a mesos::python::MesosExecutorDriverImpl_dealloc(mesos::python::MesosExecutorDriverImpl*) + 42 (mesos_executor_driver_impl.cpp:159)
14  org.python.python                   0x0000000100069521 tupledealloc + 129
15  org.python.python                   0x0000000100010a0a PyObject_CallMethod + 474
16  _mesos.so                           0x000000010171b477 mesos::python::ProxyExecutor::launchTask(mesos::ExecutorDriver*, mesos::TaskDescription const&) + 103 (proxy_executor.cpp:68)
17  _mesos.so                           0x0000000101860641 boost::_mfi::mf2<void, mesos::Executor, mesos::ExecutorDriver*, mesos::TaskDescription const&>::operator()(mesos::Executor*, mesos::ExecutorDriver*, mesos::TaskDescription const&) const + 113
18  _mesos.so                           0x0000000101861010 void boost::_bi::list3<boost::_bi::value<mesos::Executor*>, boost::_bi::value<mesos::MesosExecutorDriver*>, boost::reference_wrapper<mesos::TaskDescription const> >::operator()<boost::_mfi::mf2<void, mesos::Executor, mesos::ExecutorDriver*, mesos::TaskDescription const&>, boost::_bi::list0>(boost::_bi::type<void>, boost::_mfi::mf2<void, mesos::Executor, mesos::ExecutorDriver*, mesos::TaskDescription const&>&, boost::_bi::list0&, int) + 118
19  _mesos.so                           0x0000000101861052 boost::_bi::bind_t<void, boost::_mfi::mf2<void, mesos::Executor, mesos::ExecutorDriver*, mesos::TaskDescription const&>, boost::_bi::list3<boost::_bi::value<mesos::Executor*>, boost::_bi::value<mesos::MesosExecutorDriver*>, boost::reference_wrapper<mesos::TaskDescription const> > >::operator()() + 54
20  ???                                 0x0000000401861071 0 + 17205432433
21  libSystem.B.dylib                   0x00007fff86a62dc9 setcontext + 25
22  libSystem.B.dylib                   0x00007fff869b9fd6 _pthread_start + 331
23  libSystem.B.dylib                   0x00007fff869b9e89 thread_start + 13

Thread 2:
0   libSystem.B.dylib                   0x00007fff869c4932 select$DARWIN_EXTSN + 10
1   _mesos.so                           0x0000000101956518 select_poll + 168
2   _mesos.so                           0x0000000101957327 ev_loop + 631
3   _mesos.so                           0x00000001019195ff process::serve(void*) + 26
4   libSystem.B.dylib                   0x00007fff869b9fd6 _pthread_start + 331
5   libSystem.B.dylib                   0x00007fff869b9e89 thread_start + 13

Thread 1 crashed with X86 Thread State (64-bit):
  rax: 0x000000000000003c  rbx: 0x00000001023e3340  rcx: 0x00000001023e32f8  rdx: 0x0000000000000001
  rdi: 0x000000000000030f  rsi: 0x0000000000000000  rbp: 0x00000001023e3330  rsp: 0x00000001023e32f8
   r8: 0x0000000000000000   r9: 0x0000000000989680  r10: 0x0000000000000001  r11: 0x0000000000000246
  r12: 0x0000000000000000  r13: 0x0000000102264cf8  r14: 0x0000000000000002  r15: 0x0000000100179110
  rip: 0x00007fff869f39ce  rfl: 0x0000000000000247  cr2: 0x000000010158cc00

Binary Images:
       0x100000000 -        0x100000ff7 +org.python.python 2.6.7 (2.6.7) <DE73C8D7-8FE7-91E3-E747-F9410238C08F> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/Resources/Python.app/Contents/MacOS/Python
       0x100003000 -        0x100153ff7 +org.python.python 2.6.7, (c) 2004-2008 Python Software Foundation. (2.6.7) <F55830D0-BF78-CD2D-D7C3-8B06D61B89A2> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/Python
       0x1002fb000 -        0x1002fcfff +_json.so ??? (???) <E7E8F1AE-788B-4C9C-1F32-4563FDBB9B32> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/_json.so
       0x100440000 -        0x100443ff7 +zlib.so ??? (???) <7F0DCA3D-EE00-0742-4F3F-F858148BD0F8> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/zlib.so
       0x100488000 -        0x10048bfff +math.so ??? (???) <B8550672-AC88-C793-81A8-2C9B8F568541> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/math.so
       0x100491000 -        0x100492ff7 +time.so ??? (???) <5B4EB63E-F442-0C01-3EFA-85E1578C71BF> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/time.so
       0x100497000 -        0x10049aff7 +select.so ??? (???) <5F113BD9-E2E6-32FA-F5EC-8D1CB714D39D> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/select.so
       0x10049f000 -        0x1004a0ff7 +fcntl.so ??? (???) <0C585E11-AA6A-C307-43FD-52E1195AC9E6> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/fcntl.so
       0x1004e3000 -        0x1004e8ff7 +_struct.so ??? (???) <DE726454-5661-6198-09A3-914217769C47> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/_struct.so
       0x1004ef000 -        0x1004f1fe7 +binascii.so ??? (???) <F377A449-B4B9-3A96-3EC7-7D1DB272A0C2> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/binascii.so
       0x1004f5000 -        0x1004f6fff +cStringIO.so ??? (???) <F4762F1D-BC48-5D5F-8747-1FECBE43DFAB> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/cStringIO.so
       0x10053b000 -        0x10053cff7 +_hashlib.so ??? (???) <25BE55BB-FA03-671D-1697-6181AB32DB9B> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/_hashlib.so
       0x100540000 -        0x100541fff +termios.so ??? (???) <D2D1DE14-3799-B958-3EEB-70C834D1EEE0> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/termios.so
       0x100546000 -        0x100554fe7 +datetime.so ??? (???) <5975895D-4A18-EBFB-FFB9-B3C815D0116D> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/datetime.so
       0x100560000 -        0x100564fff +_collections.so ??? (???) <7C600BC4-8E88-B5D7-4C3F-C47BE74460BE> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/_collections.so
       0x10056a000 -        0x10056efff +operator.so ??? (???) <8B28D217-1C9C-19AA-3952-C1FF8FFA248A> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/operator.so
       0x100575000 -        0x100576fff +_random.so ??? (???) <574D30E0-2727-337F-F39C-1701CAA0F85A> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/_random.so
       0x1007e6000 -        0x1007e9ff7 +strop.so ??? (???) <15BD9A78-D5BE-6E04-5595-CD0132F5C32C> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/strop.so
       0x1007ee000 -        0x1007efff7 +_functools.so ??? (???) <8947EAB7-5F3E-412A-2EF6-81DE9BFDF702> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/_functools.so
       0x1007f2000 -        0x1007f4ff7 +_locale.so ??? (???) <3674FAA6-719B-8284-AABF-6381A8B97577> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/_locale.so
       0x1007fb000 -        0x1007fbfff +_weakref.so ??? (???) <89494552-9A1C-C2B3-5ADE-2F8A8DD6F977> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/_weakref.so
       0x101430000 -        0x101464fff +pyexpat.so ??? (???) <E028F381-F4B5-8148-84CF-9099B43A3A8B> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/pyexpat.so
       0x1014f6000 -        0x1014faff7 +_ssl.so ??? (???) <01812E71-E811-0EB0-70A6-7729BA0BDFEF> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/_ssl.so
       0x101700000 -        0x101b40fff +_mesos.so ??? (???) <5D8D6FDF-58AC-990A-AE20-EA1BFA36B739> /Users/wickman/.python-eggs/mesos-68-py2.6-macosx-10.4-x86_64.egg-tmp/_mesos.so
       0x102312000 -        0x102319fff +_socket.so ??? (???) <DD52D015-A984-CB02-BE51-EC8F9C0B2559> /Users/wickman/Local/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/lib-dynload/_socket.so
    0x7fff5fc00000 -     0x7fff5fc3bdef  dyld 132.1 (???) <DB8B8AB0-0C97-B51C-BE8B-B79895735A33> /usr/lib/dyld
    0x7fff808ed000 -     0x7fff809a3ff7  libobjc.A.dylib 227.0.0 (compatibility 1.0.0) <03140531-3B2D-1EBA-DA7F-E12CC8F63969> /usr/lib/libobjc.A.dylib
    0x7fff8203b000 -     0x7fff82078ff7  libssl.0.9.8.dylib 0.9.8 (compatibility 0.9.8) <F743389F-F25A-A77D-4FCA-D6B01AF2EE6D> /usr/lib/libssl.0.9.8.dylib
    0x7fff83494000 -     0x7fff835b3fe7  libcrypto.0.9.8.dylib 0.9.8 (compatibility 0.9.8) <14115D29-432B-CF02-6B24-A60CC533A09E> /usr/lib/libcrypto.0.9.8.dylib
    0x7fff83a5d000 -     0x7fff83bd4fe7  com.apple.CoreFoundation 6.6.5 (550.43) <31A1C118-AD96-0A11-8BDF-BD55B9940EDC> /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation
    0x7fff84ea2000 -     0x7fff84ea3ff7  com.apple.TrustEvaluationAgent 1.1 (1) <A91CE5B9-3C63-5F8C-5052-95CCAB866F72> /System/Library/PrivateFrameworks/TrustEvaluationAgent.framework/Versions/A/TrustEvaluationAgent
    0x7fff86980000 -     0x7fff86b41fef  libSystem.B.dylib 125.2.11 (compatibility 1.0.0) <9AB4F1D1-89DC-0E8A-DC8E-A4FE4D69DB69> /usr/lib/libSystem.B.dylib
    0x7fff86c32000 -     0x7fff86df0fff  libicucore.A.dylib 40.0.0 (compatibility 1.0.0) <4274FC73-A257-3A56-4293-5968F3428854> /usr/lib/libicucore.A.dylib
    0x7fff87c06000 -     0x7fff87c0aff7  libmathCommon.A.dylib 315.0.0 (compatibility 1.0.0) <95718673-FEEE-B6ED-B127-BCDBDB60D4E5> /usr/lib/system/libmathCommon.A.dylib
    0x7fff88a86000 -     0x7fff88ad2fff  libauto.dylib ??? (???) <F7221B46-DC4F-3153-CE61-7F52C8C293CF> /usr/lib/libauto.dylib
    0x7fff895bd000 -     0x7fff895ceff7  libz.1.dylib 1.2.3 (compatibility 1.0.0) <5BAFAE5C-2307-C27B-464D-582A10A6990B> /usr/lib/libz.1.dylib
    0x7fff89786000 -     0x7fff89803fef  libstdc++.6.dylib 7.9.0 (compatibility 7.0.0) <35ECA411-2C08-FD7D-11B1-1B7A04921A5C> /usr/lib/libstdc++.6.dylib
    0x7fffffe00000 -     0x7fffffe01fff  libSystem.B.dylib ??? (???) <9AB4F1D1-89DC-0E8A-DC8E-A4FE4D69DB69> /usr/lib/libSystem.B.dylib

Model: MacBookAir3,1, BootROM MBA31.0061.B01, 2 processors, Intel Core 2 Duo, 1.6 GHz, 4 GB, SMC 1.67f4
Graphics: NVIDIA GeForce 320M, NVIDIA GeForce 320M, PCI, 256 MB
Memory Module: global_name
AirPort: spairport_wireless_card_type_airport_extreme (0x14E4, 0xD1), Broadcom BCM43xx 1.0 (5.10.131.42.4)
Bluetooth: Version 2.4.5f3, 2 service, 12 devices, 1 incoming serial ports
Network Service: AirPort, AirPort, en0
Serial ATA Device: APPLE SSD SM128C, 113 GB
USB Device: FaceTime Camera (Built-in), 0x05ac  (Apple Inc.), 0x850a, 0x24600000 / 2
USB Device: BRCM2070 Hub, 0x0a5c  (Broadcom Corp.), 0x4500, 0x04500000 / 3
USB Device: Bluetooth USB Host Controller, 0x05ac  (Apple Inc.), 0x821b, 0x04530000 / 6
USB Device: Apple Internal Keyboard / Trackpad, 0x05ac  (Apple Inc.), 0x0242, 0x04300000 / 2


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
