[ 
https://issues.apache.org/jira/browse/MESOS-3235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257751#comment-15257751
 ] 

Bernd Mathiske commented on MESOS-3235:
---------------------------------------

So far I could not reproduce the behavior. Also, a few weeks ago I still saw 
this test failing on several occasions, but lately it has been stable with no 
failures. 

Looking at the log, it seems that all tasks got executed normally. The only 
thing that looks a bit strange is that TASK_KILLED is mentioned after 
TASK_FINISHED. I'll look into that, but on the backburner.

> FetcherCacheHttpTest.HttpCachedSerialized and 
> FetcherCacheHttpTest.HttpCachedConcurrent are flaky
> -------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-3235
>                 URL: https://issues.apache.org/jira/browse/MESOS-3235
>             Project: Mesos
>          Issue Type: Bug
>          Components: fetcher, tests
>    Affects Versions: 0.23.0
>            Reporter: Joseph Wu
>            Assignee: Bernd Mathiske
>              Labels: mesosphere
>             Fix For: 0.27.0
>
>         Attachments: fetchercache_log_centos_6.txt
>
>
> On OSX, {{make clean && make -j8 V=0 check}}:
> {code}
> [----------] 3 tests from FetcherCacheHttpTest
> [ RUN      ] FetcherCacheHttpTest.HttpCachedSerialized
> HTTP/1.1 200 OK
> Date: Fri, 07 Aug 2015 17:23:05 GMT
> Content-Length: 30
> I0807 10:23:05.673596 2085372672 exec.cpp:133] Version: 0.24.0
> E0807 10:23:05.675884 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> I0807 10:23:05.675897 182226944 exec.cpp:207] Executor registered on slave 
> 20150807-102305-139395082-52338-52313-S0
> E0807 10:23:05.683980 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Registered executor on 10.0.79.8
> Starting task 0
> Forked command at 54363
> sh -c './mesos-fetcher-test-cmd 0'
> E0807 10:23:05.694953 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Command exited with status 0 (pid: 54363)
> E0807 10:23:05.793927 184373248 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> I0807 10:23:06.590008 2085372672 exec.cpp:133] Version: 0.24.0
> E0807 10:23:06.592244 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> I0807 10:23:06.592243 353255424 exec.cpp:207] Executor registered on slave 
> 20150807-102305-139395082-52338-52313-S0
> E0807 10:23:06.597995 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Registered executor on 10.0.79.8
> Starting task 1
> Forked command at 54411
> sh -c './mesos-fetcher-test-cmd 1'
> E0807 10:23:06.608708 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> Command exited with status 0 (pid: 54411)
> E0807 10:23:06.707649 355938304 socket.hpp:173] Shutdown failed on fd=18: 
> Socket is not connected [57]
> ../../src/tests/fetcher_cache_tests.cpp:860: Failure
> Failed to wait 15secs for awaitFinished(task.get())
> *** Aborted at 1438968214 (unix time) try "date -d @1438968214" if you are 
> using GNU date ***
> [  FAILED  ] FetcherCacheHttpTest.HttpCachedSerialized (28685 ms)
> [ RUN      ] FetcherCacheHttpTest.HttpCachedConcurrent
> PC: @        0x113723618 process::Owned<>::get()
> *** SIGSEGV (@0x0) received by PID 52313 (TID 0x118d59000) stack trace: ***
>     @     0x7fff8fcacf1a _sigtramp
>     @     0x7f9bc3109710 (unknown)
>     @        0x1136f07e2 mesos::internal::slave::Fetcher::fetch()
>     @        0x113862f9d 
> mesos::internal::slave::MesosContainerizerProcess::fetch()
>     @        0x1138f1b5d 
> _ZZN7process8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKNS2_11ContainerIDERKNS2_11CommandInfoERKNSt3__112basic_stringIcNSC_11char_traitsIcEENSC_9allocatorIcEEEERK6OptionISI_ERKNS2_7SlaveIDES6_S9_SI_SM_SP_EENS_6FutureIT_EERKNS_3PIDIT0_EEMSW_FSU_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_ENKUlPNS_11ProcessBaseEE_clES1D_
>     @        0x1138f18cf 
> _ZNSt3__110__function6__funcIZN7process8dispatchI7NothingN5mesos8internal5slave25MesosContainerizerProcessERKNS5_11ContainerIDERKNS5_11CommandInfoERKNS_12basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEEERK6OptionISK_ERKNS5_7SlaveIDES9_SC_SK_SO_SR_EENS2_6FutureIT_EERKNS2_3PIDIT0_EEMSY_FSW_T1_T2_T3_T4_T5_ET6_T7_T8_T9_T10_EUlPNS2_11ProcessBaseEE_NSI_IS1G_EEFvS1F_EEclEOS1F_
>     @        0x1143768cf std::__1::function<>::operator()()
>     @        0x11435ca7f process::ProcessBase::visit()
>     @        0x1143ed6fe process::DispatchEvent::visit()
>     @        0x1127aaaa1 process::ProcessBase::serve()
>     @        0x114343b4e process::ProcessManager::resume()
>     @        0x1143431ca process::internal::schedule()
>     @        0x1143da646 _ZNSt3__114__thread_proxyINS_5tupleIJPFvvEEEEEEPvS5_
>     @     0x7fff95090268 _pthread_body
>     @     0x7fff950901e5 _pthread_start
>     @     0x7fff9508e41d thread_start
> Failed to synchronize with slave (it's probably exited)
> make[3]: *** [check-local] Segmentation fault: 11
> make[2]: *** [check-am] Error 2
> make[1]: *** [check] Error 2
> make: *** [check-recursive] Error 1
> {code}
> This was encountered just once out of 3+ {{make check}}s.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to