[ https://issues.apache.org/jira/browse/MESOS-8239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16544593#comment-16544593 ]
Benjamin Hindman commented on MESOS-8239:
-----------------------------------------
https://reviews.apache.org/r/67921
> LIFO semaphore does not decommission correctly.
> -----------------------------------------------
>
> Key: MESOS-8239
> URL: https://issues.apache.org/jira/browse/MESOS-8239
> Project: Mesos
> Issue Type: Bug
> Components: libprocess
> Reporter: Benjamin Mahler
> Assignee: Benjamin Hindman
> Priority: Major
>
> When building with the {{DecomissionableLastInFirstOutFixedSizeSemaphore}},
> it seems that libprocess can get stuck during finalization:
> {noformat}
> ../configure CXX=clang++ CC=clang --disable-python --disable-java
> --enable-ssl --enable-libevent --enable-lock-free-run-queue
> --enable-lock-free-event-queue --enable-last-in-first-out-fixed-size-semaphore
> {noformat}
> {code}
> Thread 2 (Thread 0x7f939ffff700 (LWP 39226)):
> #0 0x00007f94641d3a0b in futex_abstimed_wait (cancel=true,
> private=<optimized out>, abstime=0x0, expected=0, futex=0x7f945001edc0) at
> ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
> #1 do_futex_wait (sem=sem@entry=0x7f945001edc0, abstime=0x0) at
> ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:223
> #2 0x00007f94641d3a9f in __new_sem_wait_slow (sem=0x7f945001edc0,
> abstime=0x0) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:292
> #3 0x00007f94641d3b3b in __new_sem_wait (sem=<optimized out>) at
> ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:28
> #4 0x0000000000a7225c in KernelSemaphore::wait() () at
> ../../../3rdparty/libprocess/src/semaphore.hpp:115
> #5 0x0000000000a72069 in wait () at
> ../../../3rdparty/libprocess/src/semaphore.hpp:371
> #6 0x0000000000a3afbc in process::RunQueue::wait() () at
> ../../../3rdparty/libprocess/src/run_queue.hpp:147
> #7 0x0000000000a1f8f5 in dequeue () at
> ../../../3rdparty/libprocess/src/process.cpp:3647
> #8 0x0000000000a287e1 in operator() () at
> ../../../3rdparty/libprocess/src/process.cpp:2859
> #9 0x0000000000a286d5 in void
> std::_Bind_simple<process::ProcessManager::init_threads()::$_9
> ()>::_M_invoke<>(std::_Index_tuple<>) ()
> at
> /opt/rh/devtoolset-4/root/usr/lib/gcc/x86_64-redhat-linux/5.3.1/../../../../include/c++/5.3.1/functional:1530
> #10 0x0000000000a286a5 in
> std::_Bind_simple<process::ProcessManager::init_threads()::$_9
> ()>::operator()() ()
> at
> /opt/rh/devtoolset-4/root/usr/lib/gcc/x86_64-redhat-linux/5.3.1/../../../../include/c++/5.3.1/functional:1520
> #11 0x0000000000a28599 in
> std::thread::_Impl<std::_Bind_simple<process::ProcessManager::init_threads()::$_9
> ()> >::_M_run() ()
> at
> /opt/rh/devtoolset-4/root/usr/lib/gcc/x86_64-redhat-linux/5.3.1/../../../../include/c++/5.3.1/thread:115
> #12 0x0000000000bae180 in execute_native_thread_routine ()
> #13 0x00007f94641cde25 in start_thread (arg=0x7f939ffff700) at
> pthread_create.c:308
> #14 0x00007f94632cf34d in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
> Thread 1 (Thread 0x7f9466bd68c0 (LWP 37342)):
> #0 0x00007f94641cef57 in pthread_join (threadid=140272021272320,
> thread_return=0x0) at pthread_join.c:92
> #1 0x00007f9463b67077 in __gthread_join (__value_ptr=0x0,
> __threadid=<optimized out>)
> at
> /usr/src/debug/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/x86_64-redhat-linux/libstdc++-v3/include/x86_64-redhat-linux/bits/gthr-default.h:668
> #2 std::thread::join (this=0x2ca3740) at
> ../../../../../libstdc++-v3/src/c++11/thread.cc:107
> #3 0x0000000000a0e212 in process::ProcessManager::finalize() () at
> ../../../3rdparty/libprocess/src/process.cpp:2797
> #4 0x0000000000a0cfc3 in process::finalize(bool) () at
> ../../../3rdparty/libprocess/src/process.cpp:1407
> #5 0x0000000000a0ce3d in process::reinitialize(Option<std::string> const&,
> Option<std::string> const&, Option<std::string> const&) () at
> ../../../3rdparty/libprocess/src/process.cpp:1092
> #6 0x00000000005f16ed in HTTPTest::TearDownTestCase() () at
> ../../../3rdparty/libprocess/src/tests/http_tests.cpp:203
> #7 0x00000000008be4b3 in testing::TestCase::RunTearDownTestCase() () at
> googletest-release-1.8.0/googletest/include/gtest/gtest.h:891
> #8 0x00000000008d542a in void
> testing::internal::HandleSehExceptionsInMethodIfSupported<testing::TestCase,
> void>(testing::TestCase*, void (testing::TestCase::*)(), char const*) ()
> at googletest-release-1.8.0/googletest/src/gtest.cc:2402
> #9 0x00000000008be271 in void
> testing::internal::HandleExceptionsInMethodIfSupported<testing::TestCase,
> void>(testing::TestCase*, void (testing::TestCase::*)(), char const*) ()
> at googletest-release-1.8.0/googletest/src/gtest.cc:2438
> #10 0x000000000089ee61 in testing::TestCase::Run() () at
> googletest-release-1.8.0/googletest/src/gtest.cc:2779
> #11 0x00000000008a6361 in testing::internal::UnitTestImpl::RunAllTests() ()
> at googletest-release-1.8.0/googletest/src/gtest.cc:4649
> #12 0x00000000008d6f4a in bool
> testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,
> bool>(testing::internal::UnitTestImpl*, bool
> (testing::internal::UnitTestImpl::*)(), char const*) () at
> googletest-release-1.8.0/googletest/src/gtest.cc:2402
> #13 0x00000000008bf511 in bool
> testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,
> bool>(testing::internal::UnitTestImpl*, bool
> (testing::internal::UnitTestImpl::*)(), char const*) () at
> googletest-release-1.8.0/googletest/src/gtest.cc:2438
> #14 0x00000000008a6033 in testing::UnitTest::Run() () at
> googletest-release-1.8.0/googletest/src/gtest.cc:4257
> #15 0x0000000000694f31 in RUN_ALL_TESTS() () at
> ../googletest-release-1.8.0/googletest/include/gtest/gtest.h:2233
> #16 0x0000000000693e6b in main () at
> ../../../3rdparty/libprocess/src/tests/main.cpp:111
> {code}
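> It looks like there is a bug in the decommission logic: thread 2 is still
> blocked in {{KernelSemaphore::wait()}} while thread 1 waits on
> {{std::thread::join}} inside {{process::ProcessManager::finalize()}}.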
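
For illustration only, here is a minimal sketch of the property the traces suggest is being violated: once a semaphore is decommissioned, every thread blocked in wait() has to return so that the joining thread can make progress. This is not the libprocess implementation. It uses a mutex and a condition variable instead of the lock-free LIFO design, and the names DecommissionableSemaphore and runWorkers are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a decommissionable semaphore. It is NOT the
// libprocess class; it only demonstrates the wake-all-on-decommission rule.
class DecommissionableSemaphore
{
public:
  // Blocks until a signal is available or the semaphore is decommissioned.
  // Returns true if a signal was consumed, false if the caller should exit.
  bool wait()
  {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this]() { return count_ > 0 || decommissioned_; });
    if (count_ > 0) {
      --count_;
      return true;
    }
    return false; // Decommissioned and nothing left to consume.
  }

  void signal()
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      ++count_;
    }
    cv_.notify_one();
  }

  // Must wake every waiter; waking only one would leave the remaining
  // threads blocked and the joining thread stuck.
  void decommission()
  {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      decommissioned_ = true;
    }
    cv_.notify_all();
  }

private:
  std::mutex mutex_;
  std::condition_variable cv_;
  int count_ = 0;
  bool decommissioned_ = false;
};

// Starts `workers` threads that consume signals, sends `signals` signals,
// decommissions the semaphore, and joins all threads. Returns the number
// of signals that were consumed.
int runWorkers(int workers, int signals)
{
  assert(workers >= 0 && signals >= 0);

  DecommissionableSemaphore semaphore;
  std::atomic<int> handled(0);

  std::vector<std::thread> threads;
  for (int i = 0; i < workers; ++i) {
    threads.emplace_back([&semaphore, &handled]() {
      while (semaphore.wait()) {
        ++handled;
      }
    });
  }

  for (int i = 0; i < signals; ++i) {
    semaphore.signal();
  }

  semaphore.decommission();

  // With a correct decommission, every join below returns.
  for (std::thread& thread : threads) {
    thread.join();
  }

  return handled.load();
}
```

In this sketch, a call such as runWorkers(4, 3) returns only if all four workers left wait() after decommission(). A worker that stayed blocked would hang the join, which is the kind of hang the traces above show.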