I want to surface this on the dev@ list to raise some awareness.
Recently with the SIGPIPE bug from libev [1], we've revisited whether it
makes sense to continue down the path of leaving SIGPIPE unblocked and
trying to handle it case by case.

We originally wanted users of libprocess to decide on their own whether
they want to ignore SIGPIPE. However, we'd like to reconsider:

(a) The amount of code that is needed to work around SIGPIPE is
substantial, especially because on OS X SIGPIPE appears to not be delivered
synchronously [2]. Also, it is not possible to create pipes that don't
surface SIGPIPE (unlike sockets), so in order to safely write to a pipe we
need to wrap write() calls with signal suppression blocks (which we don't
do in general!). You can get a sense of the code from [3] and [4].

(b) SIGPIPE seems to be more of a legacy mechanism to shut down a set of
piped programs, and the general recommendation seems to be to ignore it.
Once SIGPIPE is ignored, a write() to a closed pipe fails with EPIPE, which
programs can handle as they would any other error.
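
To make (b) concrete, here is a minimal sketch of what "ignore SIGPIPE and
handle EPIPE" looks like with plain POSIX calls. The function names
(ignoreSigpipe, writeToClosedPipe) are hypothetical and only for
illustration; this is not the code in James' patches.

```cpp
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Hypothetical helper: ignore SIGPIPE for the whole process, so that a
// write() to a pipe or socket with no reader fails with EPIPE instead of
// raising a signal.
void ignoreSigpipe()
{
  std::signal(SIGPIPE, SIG_IGN);
}

// Hypothetical demonstration: create a pipe, close its read end, and
// attempt a write. Returns 0 if the write succeeded, otherwise the errno
// value from the failing call.
int writeToClosedPipe()
{
  ignoreSigpipe();

  int fds[2];
  if (::pipe(fds) != 0) {
    return errno;
  }

  ::close(fds[0]); // No reader is left on this pipe.

  const char byte = 'x';
  ssize_t n = ::write(fds[1], &byte, 1);
  int error = (n < 0) ? errno : 0;

  ::close(fds[1]);
  return error;
}
```

With SIGPIPE ignored, writeToClosedPipe() should return EPIPE rather than
terminating the process, so the caller can treat it like any other failed
write.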

Would love to hear if there are any concerns. I will be glad to shepherd
James' changes here.

[1] https://issues.apache.org/jira/browse/MESOS-2768
[2] https://issues.apache.org/jira/browse/MESOS-2079
[3] https://reviews.apache.org/r/39940/diff/1#index_header
[4] https://github.com/apache/mesos/blob/0.25.0/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/signals.hpp#L101

On Wed, Nov 4, 2015 at 9:20 AM, James Peach (JIRA) <j...@apache.org> wrote:

>
>     [
> https://issues.apache.org/jira/browse/MESOS-2079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14989947#comment-14989947
> ]
>
> James Peach edited comment on MESOS-2079 at 11/4/15 5:19 PM:
> -------------------------------------------------------------
>
> These patches globally ignore {{SIGPIPE}} during libprocess initialization,
> document {{SIGPIPE}} behavior a bit more, and remove various signal
> manipulations that were formerly necessary for disabling {{SIGPIPE}}
> delivery.
>
> https://reviews.apache.org/r/39938/
> https://reviews.apache.org/r/39940/
> https://reviews.apache.org/r/39941/
>
>
>
> was (Author: jamespeach):
> https://reviews.apache.org/r/39938/
> https://reviews.apache.org/r/39940/
> https://reviews.apache.org/r/39941/
>
>
> > IO.Write test is flaky on OS X 10.10.
> > -------------------------------------
> >
> >                 Key: MESOS-2079
> >                 URL: https://issues.apache.org/jira/browse/MESOS-2079
> >             Project: Mesos
> >          Issue Type: Task
> >          Components: libprocess, technical debt, test
> >         Environment: OS X 10.10
> > {noformat}
> > $ clang++ --version
> > Apple LLVM version 6.0 (clang-600.0.54) (based on LLVM 3.5svn)
> > Target: x86_64-apple-darwin14.0.0
> > Thread model: posix
> > {noformat}
> >            Reporter: Benjamin Mahler
> >            Assignee: James Peach
> >              Labels: flaky
> >
> > [~benjaminhindman]: If I recall correctly, this is related to
> > MESOS-1658. Unfortunately, we don't have a stacktrace for SIGPIPE currently:
> > {noformat}
> > [ RUN      ] IO.Write
> > make[5]: *** [check-local] Broken pipe: 13
> > {noformat}
> > Running in gdb, seems to always occur here:
> > {code}
> > Program received signal SIGPIPE, Broken pipe.
> > [Switching to process 56827 thread 0x60b]
> > 0x00007fff9a011132 in __psynch_cvwait ()
> > (gdb) where
> > #0  0x00007fff9a011132 in __psynch_cvwait ()
> > #1  0x00007fff903e7ea0 in _pthread_cond_wait ()
> > #2  0x000000010062f27c in Gate::arrive (this=0x101908a10, old=14780) at gate.hpp:82
> > #3  0x0000000100600888 in process::schedule (arg=0x0) at src/process.cpp:1373
> > #4  0x00007fff903e72fc in _pthread_body ()
> > #5  0x00007fff903e7279 in _pthread_start ()
> > #6  0x00007fff903e54b1 in thread_start ()
> > {code}
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
