[ 
https://issues.apache.org/jira/browse/MESOS-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16474172#comment-16474172
 ] 

Benjamin Bannier commented on MESOS-6985:
-----------------------------------------

[~ipronin] Would you mind sharing your POC? I am interested in a fix for 
MESOS-3475, which I suspect is related and which has caused issues like the 
one reported here in a number of places.

> os::getenv() can segfault
> -------------------------
>
>                 Key: MESOS-6985
>                 URL: https://issues.apache.org/jira/browse/MESOS-6985
>             Project: Mesos
>          Issue Type: Bug
>          Components: stout
>         Environment: ASF CI, Ubuntu 14.04 and CentOS 7 both with and without libevent/SSL
>            Reporter: Greg Mann
>            Assignee: Ilya Pronin
>            Priority: Major
>              Labels: flaky-test, reliability, stout
>         Attachments: MasterMaintenanceTest.InverseOffersFilters-truncated.txt, MasterTest.MultipleExecutors.txt
>
>
> This was observed on ASF CI. The segfault first showed up on CI on 9/20/16 
> and has been produced by the tests {{MasterTest.MultipleExecutors}} and 
> {{MasterMaintenanceTest.InverseOffersFilters}}. In both cases, 
> {{os::getenv()}} segfaults with the same stack trace:
> {code}
> *** Aborted at 1485241617 (unix time) try "date -d @1485241617" if you are using GNU date ***
> PC: @     0x2ad59e3ae82d (unknown)
> I0124 07:06:57.422080 28619 exec.cpp:162] Version: 1.2.0
> *** SIGSEGV (@0xf0) received by PID 28591 (TID 0x2ad5a7b87700) from PID 240; stack trace: ***
> I0124 07:06:57.422336 28615 exec.cpp:212] Executor started at: executor(75)@172.17.0.2:45752 with pid 28591
>     @     0x2ad5ab953197 (unknown)
>     @     0x2ad5ab957479 (unknown)
>     @     0x2ad59e165330 (unknown)
>     @     0x2ad59e3ae82d (unknown)
>     @     0x2ad594631358 os::getenv()
>     @     0x2ad59aba6acf mesos::internal::slave::executorEnvironment()
>     @     0x2ad59ab845c0 mesos::internal::slave::Framework::launchExecutor()
>     @     0x2ad59ab818a2 mesos::internal::slave::Slave::_run()
>     @     0x2ad59ac1ec10 _ZZN7process8dispatchIN5mesos8internal5slave5SlaveERKNS_6FutureIbEERKNS1_13FrameworkInfoERKNS1_12ExecutorInfoERK6OptionINS1_8TaskInfoEERKSF_INS1_13TaskGroupInfoEES6_S9_SC_SH_SL_EEvRKNS_3PIDIT_EEMSP_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_ENKUlPNS_11ProcessBaseEE_clES16_
>     @     0x2ad59ac1e6bf _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal5slave5SlaveERKNS0_6FutureIbEERKNS5_13FrameworkInfoERKNS5_12ExecutorInfoERK6OptionINS5_8TaskInfoEERKSJ_INS5_13TaskGroupInfoEESA_SD_SG_SL_SP_EEvRKNS0_3PIDIT_EEMST_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_
>     @     0x2ad59bce2304 std::function<>::operator()()
>     @     0x2ad59bcc9824 process::ProcessBase::visit()
>     @     0x2ad59bd4028e process::DispatchEvent::visit()
>     @     0x2ad594616df1 process::ProcessBase::serve()
>     @     0x2ad59bcc72b7 process::ProcessManager::resume()
>     @     0x2ad59bcd567c process::ProcessManager::init_threads()::$_2::operator()()
>     @     0x2ad59bcd5585 _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_2vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE
>     @     0x2ad59bcd5555 std::_Bind_simple<>::operator()()
>     @     0x2ad59bcd552c std::thread::_Impl<>::_M_run()
>     @     0x2ad59d9e6a60 (unknown)
>     @     0x2ad59e15d184 start_thread
>     @     0x2ad59e46d37d (unknown)
> make[4]: *** [check-local] Segmentation fault
> {code}
> Find attached the full log from a failed run of 
> {{MasterTest.MultipleExecutors}} and a truncated log from a failed run of 
> {{MasterMaintenanceTest.InverseOffersFilters}}.


