[ 
https://issues.apache.org/jira/browse/MESOS-912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13874417#comment-13874417
 ] 

Benjamin Mahler commented on MESOS-912:
---------------------------------------

Better to fix the cause than the symptom.

We do have a mechanism for suppressing SIGPIPE in those parts of our code where 
it is expected to occur (see signals.hpp for this suppression mechanism). 
Everywhere else, we'd like to know when a SIGPIPE occurs and understand its 
cause. It's not clear from the stack trace in the report what caused this 
SIGPIPE.
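
To illustrate what scoped suppression can look like, here is a minimal sketch. It is not the code in signals.hpp: the names ScopedSigpipeBlock and writeToClosedPipe are hypothetical, and the approach (block SIGPIPE on the calling thread with pthread_sigmask, then consume any pending instance with sigwait before restoring the mask) is an assumption about one reasonable way to do it on POSIX systems.

```cpp
#include <cassert>
#include <cerrno>
#include <signal.h>
#include <unistd.h>

// Hypothetical RAII guard (NOT the class from signals.hpp). While alive, it
// blocks SIGPIPE for the calling thread; on destruction it discards a SIGPIPE
// that was raised in the meantime and restores the previous signal mask.
class ScopedSigpipeBlock {
 public:
  ScopedSigpipeBlock() {
    sigemptyset(&sigpipe_);
    sigaddset(&sigpipe_, SIGPIPE);

    // Remember whether a SIGPIPE was already pending, so that the guard
    // does not swallow a signal it is not responsible for.
    sigset_t pending;
    sigemptyset(&pending);
    sigpending(&pending);
    alreadyPending_ = sigismember(&pending, SIGPIPE) == 1;

    int rc = pthread_sigmask(SIG_BLOCK, &sigpipe_, &previous_);
    assert(rc == 0);
    (void)rc;
  }

  ~ScopedSigpipeBlock() {
    sigset_t pending;
    sigemptyset(&pending);
    sigpending(&pending);

    // Consume the SIGPIPE generated inside the guarded scope, if any.
    if (!alreadyPending_ && sigismember(&pending, SIGPIPE) == 1) {
      int signo = 0;
      sigwait(&sigpipe_, &signo);
    }

    pthread_sigmask(SIG_SETMASK, &previous_, nullptr);
  }

  ScopedSigpipeBlock(const ScopedSigpipeBlock&) = delete;
  ScopedSigpipeBlock& operator=(const ScopedSigpipeBlock&) = delete;

 private:
  sigset_t sigpipe_;
  sigset_t previous_;
  bool alreadyPending_;
};

// Hypothetical demo: writes to a pipe whose read end is closed. Without the
// guard this would raise SIGPIPE; with it, write() just fails with EPIPE.
// Returns the errno observed by write(), or 0 if the write succeeded.
int writeToClosedPipe() {
  int fds[2];
  if (pipe(fds) != 0) {
    return errno;
  }
  close(fds[0]);  // No reader left, so any write triggers SIGPIPE/EPIPE.

  int error = 0;
  {
    ScopedSigpipeBlock guard;
    const char byte = 'x';
    if (write(fds[1], &byte, 1) < 0) {
      error = errno;  // Capture before the guard's destructor runs.
    }
  }

  close(fds[1]);
  return error;
}
```

With the guard in place, the caller sees an ordinary EPIPE error from write() and can handle it, while a SIGPIPE raised anywhere outside such a scope still surfaces and can be investigated.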

> Slave sometimes crashes with SIGPIPE
> ------------------------------------
>
>                 Key: MESOS-912
>                 URL: https://issues.apache.org/jira/browse/MESOS-912
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.17.0
>         Environment: OSX 10.8.5
>            Reporter: Vinod Kone
>             Fix For: 0.17.0
>
>
> ➜  build git:(vinod/vote) ✗ ./bin/mesos-slave.sh --master=127.0.0.1:5055
> I0115 12:15:19.846664 2096390528 main.cpp:118] Build: 2014-01-14 16:52:48 by 
> vinod
> I0115 12:15:19.847189 2096390528 main.cpp:120] Creating "process" isolator
> I0115 12:15:19.847462 2096390528 main.cpp:132] Starting Mesos slave
> I0115 12:15:19.847807 2096390528 slave.cpp:111] Slave started on 
> 1)@172.25.27.97:5051
> I0115 12:15:19.848068 2096390528 slave.cpp:211] Slave resources: cpus(*):4; 
> mem(*):7168; disk(*):481998; ports(*):[31000-32000]
> I0115 12:15:19.852408 175071232 state.cpp:33] Recovering state from 
> '/tmp/mesos/meta'
> I0115 12:15:19.853726 175071232 status_update_manager.cpp:188] Recovering 
> status update manager
> I0115 12:15:19.853798 175071232 process_isolator.cpp:317] Recovering isolator
> I0115 12:15:19.853883 175071232 slave.cpp:2769] Finished recovery
> I0115 12:15:19.854004 173998080 slave.cpp:500] New master detected at 
> master@127.0.0.1:5055
> I0115 12:15:19.854161 175607808 status_update_manager.cpp:162] New master 
> detected at master@127.0.0.1:5055
> I0115 12:15:19.854220 173998080 slave.cpp:525] Detecting new master
> I0115 12:15:19.854409 175607808 slave.cpp:1966] master@127.0.0.1:5055 exited
> W0115 12:15:19.854440 175607808 slave.cpp:1969] Master disconnected! Waiting 
> for a new master to be elected
> W0115 12:15:19.854440 2096390528 logging.cpp:69] RAW: Received signal 
> SIGPIPE; escalating to SIGABRT
> *** Aborted at 1389816919 (unix time) try "date -d @1389816919" if you are 
> using GNU date ***
> PC: @     0x7fff98586d46 __kill
> *** SIGABRT (@0x7fff98586d46) received by PID 21391 (TID 0x7fff7cf46180) 
> stack trace: ***
>     @     0x7fff960b190a _sigtramp
>     @     0x7fff7bf03588 std::string::_Rep::_S_empty_rep_storage
>     @     0x7fff960b190a _sigtramp
>     @                0x0 (unknown)
>     @        0x10956046b process::ProcessManager::wait()
>     @        0x109566e7d process::wait()
>     @        0x10924760a main
>     @     0x7fff947cc7e1 start
>     @                0x2 (unknown)
> [1]    21391 abort      ./bin/mesos-slave.sh --master=127.0.0.1:5055



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
