subject:"EPIPE"

Re: EPIPE

2000-05-13 Thread Michael Natterer


Raphael Quinet wrote:
> 
> On Thu, 11 May 2000, Michael Natterer <[EMAIL PROTECTED]> wrote:
> >
> > We don't use SA_NODEFER any more. And AFAIK the delivery of SIGCHLD has
> > nothing to do with cleaning up zombies. This is why we loop around
> > waitpid() because POSIX explicitly says that signals arriving close
> > together may be merged by the kernel.
> 
> The delivery of SIGCHLD does not clean up the zombies: this is done by
> calling waitpid() or any wait*() function.  But the call to waitpid()
> will be triggered by a SIGCHLD, and there is a potential race
> condition.  I will try to explain this in a slightly different way:
> regardless of the setting of SA_NODEFER, you cannot ensure that you
> will get one SIGCHLD for every dead process.  Some kernels merge the
> signals that arrive close together, and some of them mask the signal
> while the corresponding handler is being called (some systems will
> re-deliver the masked signal immediately after the handler returns,
> but this is not always the case).

Yeah, my confusing sentence above :) wanted to say: The # of SIGCHLD
arriving doesn't have to correspond with the # of zombies we have to
clean up. Of course just catching SIGCHLD doesn't touch the zombies
at all.

> Now, consider the following scenario for a race condition:
>   * The Gimp starts two plug-ins
>   * plug-in 1 dies
>   * SIGCHLD delivered, handler called
>   * inside the signal handler:
>   * - waitpid() returns the status of the first process
>   * - waitpid() returns 0 and the while() loop ends
>   * plug-in 2 dies (before the signal handler returns)
>   * SIGCHLD cannot be delivered
>   * - the signal handler returns
>   * The Gimp continues and the status of second plug-in is never
> collected, so this creates a zombie.
> Although it is rather unlikely that a context switch happens in the
> signal handler just after the while() loop and before returning, it
> can and will happen.
> 
> If you are sure that all kernels of the systems that Gimp supports
> will re-deliver the second SIGCHLD signal immediately after the signal
> handler returns, then there is no problem in the scenario described
> above: the handler will be called a second time and will collect the
> status of the second plug-in.  But I do not think so; that's why I
> suggested to call waitpid() outside the signal handler, because then
> you avoid the race condition.

Hm. I don't think that this is really a problem. The scenario above is,
as you say unlikely but will definitely happen. But _if_ it happens
we just have a stale zombie hanging around which should be cleaned up
the next time the handler is called.

> > > Yes...  I wrote a few months ago that I would change that and implement
> > > some kind of --enable-stack-trace option, but I never took the time to
> > > do it.
> >
> > Now it's there :) We just have to convert the remaining g_print() to write()
> > and the handler will be totally safe if enable_stack_trace == FALSE.
> 
> Hmmm...  I don't know how you have implemented it (I cannot look at
> the CVS code), but I was thinking of having more options for
> --enable-stack-trace, both for the configure option and for the
> command-line option:
> - "never": disable the stack trace and use the default signal handlers
> - "query": ask the user, as we are doing now.
> - "always": always call the debugger without asking any question (useful
>   if stdin does not work, but stdout is still OK)
> - "wait": always wait 30 seconds, without asking any question (this does
>   not use stdin or stdout and gives some time to the user if she
>   wants to attach a debugger to the process)
> If the code is compiled or run with the "never" option, then the
> signal handlers would never be installed.  The configure option would
> set the default value for this flag, but it would always be possible
> to override it with the command-line option without recompiling.  The
> default for the code in CVS and for the unstable tarballs should be
> "query", and the default for the stable releases could be "never" or
> "wait".

Wow, this is a bit more than what I was thinking about. I think this
would be really useful. Still willing to hack it if comeone kicks you ?-)
Ok, kick, kick, kick...

> > We should probably re-add the reentrancy guards in the fatal handlers and
> > just do a brute force exit() if it's called recursively (which can only
> > happen during the stack trace because that's the only case where the signals
> > are unblocked in the handler).
> 
> Another option would be to set the signals to SIG_DFL as soon as the
> fatal handler is called.  Since the default behavior for these signals
> is to kill the program, this would prevent the endless loop without
> requiring a special flag.

Yep, this sounds more useful than the current state (which just unblocks
the signals in the handler if the trace is called and causes the loop
if another signal arrives during and probably because of the trace).

> On

Re: EPIPE

2000-05-12 Thread Marc Lehmann

On Fri, May 12, 2000 at 05:43:20PM +0200, Raphael Quinet <[EMAIL PROTECTED]> wrote:
>   * The Gimp continues and the status of second plug-in is never
> collected, so this creates a zombie.

> If you are sure that all kernels of the systems that Gimp supports
> will re-deliver the second SIGCHLD signal immediately after the signal
> handler returns,

These kernel exists, but most likely gimp won't compile on them
anyway. There is a large body of programs relying on these
marginally-working-POSIX signals (bash or perl for example), and breaking
the expectations of signal blocking would most likely break these
programs.  However, the bash tries to workaround bugs in various versions
of waitpid

> status of the second plug-in.  But I do not think so; that's why I
> suggested to call waitpid() outside the signal handler, because then
> you avoid the race condition.

This would indeed be the most safe solution.

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED] |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: EPIPE

2000-05-12 Thread Raphael Quinet

On Thu, 11 May 2000, Michael Natterer <[EMAIL PROTECTED]> wrote:
> Raphael Quinet wrote:
> > No, actually it is not safe on all operating systems: as I wrote
> > elsewhere, you cannot always rely on SA_NODEFER.  This means that in
> > some cases, you could miss a SIGCHLD signal that occurs while you are
> > still inside the handler but after the last test on waitpid().  If
> > this happens, the main app will not see that one of the plug-ins has
> > died (until another one dies and the handler collects the status for
> > both).  That's why it is safer to make the tests outside the signal
> > handler.  Otherwise, you could have a race condition on some systems
> > (very seldom, but still...).
> 
> We don't use SA_NODEFER any more. And AFAIK the delivery of SIGCHLD has
> nothing to do with cleaning up zombies. This is why we loop around
> waitpid() because POSIX explicitly says that signals arriving close
> together may be merged by the kernel.

The delivery of SIGCHLD does not clean up the zombies: this is done by
calling waitpid() or any wait*() function.  But the call to waitpid()
will be triggered by a SIGCHLD, and there is a potential race
condition.  I will try to explain this in a slightly different way:
regardless of the setting of SA_NODEFER, you cannot ensure that you
will get one SIGCHLD for every dead process.  Some kernels merge the
signals that arrive close together, and some of them mask the signal
while the corresponding handler is being called (some systems will
re-deliver the masked signal immediately after the handler returns,
but this is not always the case).

Now, consider the following scenario for a race condition:
  * The Gimp starts two plug-ins
  * plug-in 1 dies
  * SIGCHLD delivered, handler called
  * inside the signal handler:
  * - waitpid() returns the status of the first process
  * - waitpid() returns 0 and the while() loop ends
  * plug-in 2 dies (before the signal handler returns)
  * SIGCHLD cannot be delivered
  * - the signal handler returns
  * The Gimp continues and the status of second plug-in is never
collected, so this creates a zombie.
Although it is rather unlikely that a context switch happens in the
signal handler just after the while() loop and before returning, it
can and will happen.

If you are sure that all kernels of the systems that Gimp supports
will re-deliver the second SIGCHLD signal immediately after the signal
handler returns, then there is no problem in the scenario described
above: the handler will be called a second time and will collect the
status of the second plug-in.  But I do not think so; that's why I
suggested to call waitpid() outside the signal handler, because then
you avoid the race condition.

> > Yes...  I wrote a few months ago that I would change that and implement
> > some kind of --enable-stack-trace option, but I never took the time to
> > do it.
> 
> Now it's there :) We just have to convert the remaining g_print() to write()
> and the handler will be totally safe if enable_stack_trace == FALSE.

Hmmm...  I don't know how you have implemented it (I cannot look at
the CVS code), but I was thinking of having more options for
--enable-stack-trace, both for the configure option and for the
command-line option:
- "never": disable the stack trace and use the default signal handlers
- "query": ask the user, as we are doing now.
- "always": always call the debugger without asking any question (useful
  if stdin does not work, but stdout is still OK)
- "wait": always wait 30 seconds, without asking any question (this does
  not use stdin or stdout and gives some time to the user if she
  wants to attach a debugger to the process)
If the code is compiled or run with the "never" option, then the
signal handlers would never be installed.  The configure option would
set the default value for this flag, but it would always be possible
to override it with the command-line option without recompiling.  The
default for the code in CVS and for the unstable tarballs should be
"query", and the default for the stable releases could be "never" or
"wait".

> We should probably re-add the reentrancy guards in the fatal handlers and
> just do a brute force exit() if it's called recursively (which can only
> happen during the stack trace because that's the only case where the signals
> are unblocked in the handler).

Another option would be to set the signals to SIG_DFL as soon as the
fatal handler is called.  Since the default behavior for these signals
is to kill the program, this would prevent the endless loop without
requiring a special flag.

On Fri, 12 May 2000, Michael Natterer <[EMAIL PROTECTED]> wrote:
[...]
> So ignoring SIGPIPE in the app causes it to be ignored in children, too.
> To avoid the need of resetting the signal we could just connect it
> to a dummy handler and let exec() do it's job of resetting it.

Well, as I wrote above and in a previous mail, there is a convenient
SIG_DFL that exits for that purpose

Re: EPIPE

2000-05-12 Thread Michael Natterer

On Fri, 12 May 2000, Marc Lehmann wrote:

> On Thu, May 11, 2000 at 09:05:58PM +0200, Michael Natterer 
><[EMAIL PROTECTED]> wrote:
> > > The libgimp code could try to set the signal handler to SIG_DFL before
> > > executing the code of the plug-in.
> > 
> > We don't need to do this, as (exec()'ed) children can't inherit handlers
> > from their parent anyway.
> 
> If you ignore signals (you said so) this is very well inherited by
> children (this is how programs like nohup work). However, this is not
> necessarily a problem, of course.  It might just be unexpected by libgimp
> in the child, for example, or by plug-ins, and most probably it's not the
> intented action to ignore any signals by default (where the default action
> isn't ignore) in the plug-ins.

Oops, you're right. Of course only signals which are connected to a
handler are reset to their default disposition on exec() at.al.

So ignoring SIGPIPE in the app causes it to be ignored in children, too.
To avoid the need of resetting the signal we could just connect it
to a dummy handler and let exec() do it's job of resetting it.

> I would really prefer leaving most of the signal handlers intact (or under
> program control). It would be nice if a plug-in could choose to call
> 
>   gimp_signal_safetynet ();
> 
> If it wanted all it's signals to be clobbered. And gimp_main could change
> the signals it really requires. The problem is that now, a call to
> gimp_main (which doesn't return) changes signal handlers without leaving
> the caller the option of not (or the option of later re-setting them
> again, since "later" might be never)

Hm, we install the signal stuff before the call to gimp_loop() which in
turn calls the plugin's run() function after which the plugin is
closed anyway.

So why can't you just do the signal setting which have to be special
in the run() procedure. There is nothing really happening in between
(well, except the config message arriving but that shouldn't hurt).

> > > signal handlers installed by the Gimp should call the ones that were
> > > installed before.  But then we have to support the systems that have
> > 
> > Do you really think this is neccessary?
> 
> I can imagine that gimp needs control over a few signals (although I
> cannot imagine which ones at the moment). The current situation, however,
> is, that gimp (or glib?) changes many signals in plug-ins (like SIGPIPE)
> without any visible benefit. Inherited settings are not the problem here
> (no handlers).

Plugins connect fatal signals to the fatal handler (stacktrace),
ignore SIGPIPE to get notified of a crashing gimp by the new
gimp_plugin_io_error_handler() (because SIGPIPE didn't work)
and they connect SIGCHLD to a handler which is identical to the
one in the main app (to clean up zombies).

> > I'd rather like to wait it the current code (which is not really
> > different from the old one except that it uses sigaction()) works as
> > expected.
> 
> It's indeed the behaviour of the old code that's bugging me ;) But I get
> a bit nervous when the number of clobbered signals increases, rather then
> decreases.

Only SIGCHLD and SIGPIPE disposition has changed in plugins. But I'm
of course willing to change my mind if someone shows me where the
problem is.

(the problem may be that I still don't understand why you think
you cannot change gimp_main()'s signal settings :)

bye,
--Mitch

Re: EPIPE

2000-05-12 Thread Austin Donnelly

On Friday, 12 May 2000, Marc Lehmann wrote:

> [1: plugins should have the system default signal handers, not gimp ones]
> [2: main gimp app should have system default SIGPIPE handler]

Point 1: I mainly agree with this.  I can't think of a need to hook
  any signals.  Did plugins used to give backtraces when they crashed?
  I think they did: if so, and people want to keep this functionality,
  then unfortunately plugins need to hook signals in a similar manner
  to the main gimp app.  However, plugins definitely won't benefit
  from SIGCHLD handers or ignoring SIGPIPE.  If a plugin tried to
  write to the gimp pipe and gets a SIGPIPE, it's because the main
  gimp app has died.  The plugin should die as swiftly as possible, in
  this case.

Point 2: The main gimp app should ignore SIGPIPE, so that an EPIPE can
  be generated in a controlled manner to be collected and acted upon
  by the plugin management routines in plug_in.c

With all this vigorous discussion going on, I seem to be a little
confused.  Can someone (Mitch, Garry?) summarise what the current CVS
signal handling situation is, and what changes (if any) are planned?

Austin

Re: EPIPE

2000-05-11 Thread Marc Lehmann

On Thu, May 11, 2000 at 09:05:58PM +0200, Michael Natterer <[EMAIL PROTECTED]> 
wrote:
> > The libgimp code could try to set the signal handler to SIG_DFL before
> > executing the code of the plug-in.
> 
> We don't need to do this, as (exec()'ed) children can't inherit handlers
> from their parent anyway.

If you ignore signals (you said so) this is very well inherited by
children (this is how programs like nohup work). However, this is not
necessarily a problem, of course.  It might just be unexpected by libgimp
in the child, for example, or by plug-ins, and most probably it's not the
intented action to ignore any signals by default (where the default action
isn't ignore) in the plug-ins.

I would really prefer leaving most of the signal handlers intact (or under
program control). It would be nice if a plug-in could choose to call

  gimp_signal_safetynet ();

If it wanted all it's signals to be clobbered. And gimp_main could change
the signals it really requires. The problem is that now, a call to
gimp_main (which doesn't return) changes signal handlers without leaving
the caller the option of not (or the option of later re-setting them
again, since "later" might be never)

> > signal handlers installed by the Gimp should call the ones that were
> > installed before.  But then we have to support the systems that have
> 
> Do you really think this is neccessary?

I can imagine that gimp needs control over a few signals (although I
cannot imagine which ones at the moment). The current situation, however,
is, that gimp (or glib?) changes many signals in plug-ins (like SIGPIPE)
without any visible benefit. Inherited settings are not the problem here
(no handlers).

> I'd rather like to wait it the current code (which is not really
> different from the old one except that it uses sigaction()) works as
> expected.

It's indeed the behaviour of the old code that's bugging me ;) But I get
a bit nervous when the number of clobbered signals increases, rather then
decreases.

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED] |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: EPIPE

2000-05-11 Thread Marc Lehmann

On Thu, May 11, 2000 at 08:33:54PM +0200, Michael Natterer <[EMAIL PROTECTED]> 
wrote:
> But we don't do bad things in the SIGCHLD handler anyway
> (no need for sharing data with the app) and in the fatal handler, we just
> quit the app.

"I'll put all my trust in you" ;)

> > > The reason to ignore SIGPIPE was to make plugins to bahave like
> > > the main app did before :)
> > 
> > I don't think plug-ins should behave like the main application ;) They
> > are, after all, plug-ins.
> 
> This was a joke.

Same here ;)

> But mysteriously, it didn't work. Gimp was always connecting SIGPIPE to
> the fatal signal handler and we _never_ got the signal, though.

Then _is_ SIGPIPE being ignored or not just now? (confused...). However, I
think not connecting SIGPIPE to _anything_ at all is not a good solution.
Unless the plug-in needs to do some specific cleanup just terminating
silently seems like the best behaviour.

And if SIGPIPE isn't sent, changeing it's handler is even less useful.

> You can of course set your own signal handlers or reset the signals to
> their default behaviour.

This is very difficult, since libgimp changes the signal handlers after
claling gimp_main, and control doesn't return to the main application very
determistically :(

> No it hasn't. But as SIGPIPE didn't work (probably because of glib doing
> strange stuff in the main loop or whatever), I had to find another way
> (which seems to work quite nice so far).

Even better, so SIGPIPE is left alone (I presume).

> Nope, signals are set back to their default behaviour when doing an exec().

Hmm, AFAICS gimp simply calls fork and then execvp. Nothing signal related
at all (and the plug-in inherits the signal behaviour)

> > handlers on any plug-in" behaviour is making life very difficult (and this
> > has been the case before you started to change anything!).
> 
> Good to hear that. I tested the perl plugins with the new code and it
> seemed to work, but it feels better to hear it from da perl man :)

Hah! you probably tested it more thoroughly than me, so watch out for the
_real_ complaints! ;)

> > I don't see why replacing the default behaviour of terminating the process
> > is worse than installing a signal handler that just terminates the
> > process anyway.
> 
> And does a stacktrace.

Nobody needs a stacktrace from gimp + as many stacktraces as plug-ins are
currently running. Having a stacktrace for sigpipe in a plug-in seems
counterproductive.

> We can remove them in the final release.  I would however prefer to
> leave them there (because the final release is unlikely to be bug-free).

I still get endless loops in about 2-10% of all crashes (As I said gimp
rarely crashes compared to some months ago). I just close my terminal
window since that seems to really kill it in most cases, but sometimes
gimp crashes and eats 100% cpu power for a long time, until I can detect
that it's still running somewhere in the background (printing messages to
nowehere).

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED] |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: EPIPE

2000-05-11 Thread Marc Lehmann

On Thu, May 11, 2000 at 08:54:44PM +0200, Michael Natterer <[EMAIL PROTECTED]> 
wrote:
> > Yes...  I wrote a few months ago that I would change that and implement
> > some kind of --enable-stack-trace option, but I never took the time to
> > do it.
> 
> Now it's there :) We just have to convert the remaining g_print() to write()
> and the handler will be totally safe if enable_stack_trace == FALSE.

Raphael, Mitch. Just to be clear: what _I_ need is a way to not have the
plug-in signals changed behind my back. Installing signal handlers for
unsuspecting signals, therefore, is my _problem_. Even moving signal
initialization to somewhere before gimp_main would improve my situation.
Making it optional in a plug-in would be best.

I also argued (not independently enough, however, as the topics are
similar) that the stack-tracing stuff (and I meant cathing each and every
fatal signal) makes a lot of problems for the main _gimp_ app, as it's
often not working (no way to get out of the prompt, no sensible stack
trace anyway). It does not limit my "programming creativeness" (if there
is any), like the first problem. I am just afraid that getting a SIGSEGV
is better than looping around in the background.

Catching signals and not offering a menu (as mitch seemed to imply) does not
change the situation a bit.

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED] |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: EPIPE

2000-05-11 Thread Marc Lehmann

(Sorry, I received mitch's mail befoee this one!)

On Thu, May 11, 2000 at 07:23:48PM +0200, Raphael Quinet <[EMAIL PROTECTED]> wrote:
> Hmmm...  In the last discussion, I said that I would implement a
> compile-time and a run-time option for choosing this, because some
> people (including myself) do like the option to have a stack trace
> (when it works).

However, since there were such a large number of problems where you had to
kill -9 gimp because it tried to catch all and everything, wouldn't it be
safer to not try to play games, at least with the more problematic signals
like SIGSEGV or SIGILL?

(OTOH, gimp crashes ever more rarely the closer 1.2 gets!)

> But I did not take the time to do it.  If you kick me enough, I will do
> it... :-)

a compiletime option (per-plug-in) would be perfect, yes, yes, yes! please,
please, please My problems are related to not being able to catch many
signals (TERM, PIPE) in my plug-ins (gimp-perl as well as some specialised
perl-plug-ins like the Perl-Server!)

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED] |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: EPIPE

2000-05-11 Thread Marc Lehmann


On Thu, May 11, 2000 at 06:48:17PM +0200, Raphael Quinet <[EMAIL PROTECTED]> wrote:
> Note that being MT safe is not enough.  For the 4 signals that are
> listed above, you can usually expect that your memory is already
> corrupted.  So if you want to minimize the risks of crashing

If it's important enough to reliably catch these (just for the benefit of
developers...) then one could try to use sigaltstack where available to at
least provide safe local storage.

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED] |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: EPIPE

2000-05-11 Thread Tim Mooney


In regard to: Re: EPIPE, Michael Natterer said (at 9:05pm on May 11, 2000):

>> Another solution is to save and re-install the old signal handler.  We
>> may even want to support some kind of chain of signal handers: the new
>> signal handlers installed by the Gimp should call the ones that were
>> installed before.  But then we have to support the systems that have
>> SA_SIGINFO and those that do not have it (this is not part of POSIX).
>> The systems that have SA_SIGINFO are using extended prototypes for the
>> signal handlers, so some #ifdef's will be necessary if we want to do
>> that in the right way.  I could contribute some code if you want,
>> because I already did that in some other projects.
>
>Do you really think this is neccessary? I'd rather like to wait it the
>current code (which is not really different from the old one except
>that it uses sigaction()) works as expected.
>
>I guess Tim with his hardware zoo will be able to answer the question :)

I'll be checking it this weekend, and I'll definitely report to the "signal
crew" on my findings.  :-)

Tim
-- 
Tim Mooney  [EMAIL PROTECTED]
Information Technology Services (701) 231-1076 (Voice)
Room 242-J1, IACC Building  (701) 231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164

Re: EPIPE

2000-05-11 Thread Michael Natterer


Raphael Quinet wrote:
> 
> On Thu, 11 May 2000, Marc Lehmann <[EMAIL PROTECTED]> wrote:
> [...]
> > What could be the case, however, is, that gimp itself does not reset it's
> > signal handlers when it execs the plug-ins. If this is the case, then this
> > bug (not restoring signal handlers to their default) might cause many other
> > subtle problems.
> 
> The libgimp code could try to set the signal handler to SIG_DFL before
> executing the code of the plug-in.

We don't need to do this, as (exec()'ed) children can't inherit handlers
from their parent anyway.

> Another solution is to save and re-install the old signal handler.  We
> may even want to support some kind of chain of signal handers: the new
> signal handlers installed by the Gimp should call the ones that were
> installed before.  But then we have to support the systems that have
> SA_SIGINFO and those that do not have it (this is not part of POSIX).
> The systems that have SA_SIGINFO are using extended prototypes for the
> signal handlers, so some #ifdef's will be necessary if we want to do
> that in the right way.  I could contribute some code if you want,
> because I already did that in some other projects.

Do you really think this is neccessary? I'd rather like to wait it the
current code (which is not really different from the old one except
that it uses sigaction()) works as expected.

I guess Tim with his hardware zoo will be able to answer the question :)

And BTW, this gimp-developer traffic is cool. Somehow remembers me of
the old days when we used to hack new features :)

bye,
--Mitch

Re: EPIPE

2000-05-11 Thread Michael Natterer


Raphael Quinet wrote:
> 
> On Thu, 11 May 2000, Michael Natterer <[EMAIL PROTECTED]> wrote:
> >
> > This is what currently happens (ok, it happens in the handler, but WNOHANG
> > *should* be absolutely safe).
> 
> No, actually it is not safe on all operating systems: as I wrote
> elsewhere, you cannot always rely on SA_NODEFER.  This means that in
> some cases, you could miss a SIGCHLD signal that occurs while you are
> still inside the handler but after the last test on waitpid().  If
> this happens, the main app will not see that one of the plug-ins has
> died (until another one dies and the handler collects the status for
> both).  That's why it is safer to make the tests outside the signal
> handler.  Otherwise, you could have a race condition on some systems
> (very seldom, but still...).

We don't use SA_NODEFER any more. And AFAIK the delivery of SIGCHLD has
nothing to do with cleaning up zombies. This is why we loop around
waitpid() because POSIX explicitly says that signals arriving close
together may be merged by the kernel.

> [...]
> > The usage of SIGCLD is strongly discouraged by Stevens and some Solaris
> > document I fould recently. But Gimp uses SIGCHLD anyway.
> 
> And here is an excerpt from /usr/include/sys/signal.h on Solaris 2.6:
> 
> #define SIGCLD  18  /* child status change */
> #define SIGCHLD 18  /* child status change alias (POSIX) */
> 
> So it does not make much of a difference under recent versions of
> Solaris, at least... :-)  But they still say in some docs that the
> behavior of SIGCLD might change in future releases, so the POSIX
> version should be used in new programs.

Yeah, although Solaris is definitely more POSIX than Linux, they have
to cook their own stuff :) Stevens calls SIGCLD a "questionable practice".

> > > In one application that wanted to catch SIGSEGV, SIGBUS, SIGILL and
> > > SIGFPE, I created a handler that uses a direct call to write() on an
> > > fd that was previously obtained from fileno(stderr) (this fd is saved
> > > early so that the write() call can work even if the FILE *stderr is
> > > overwritten with garbage).  Doing this is safe, AFAIK.
> >
> > Yep, write() is safe. Gimp uses g_print() which is not really safe, but
> > then we call g_on_error_query() which definitely does a bit more than
> > what's allowed :)
> 
> Yes...  I wrote a few months ago that I would change that and implement
> some kind of --enable-stack-trace option, but I never took the time to
> do it.

Now it's there :) We just have to convert the remaining g_print() to write()
and the handler will be totally safe if enable_stack_trace == FALSE.

> > >From glib/gerror.c:
> >
> > /*
> >  * MT safe ; except for g_on_error_stack_trace, but who wants thread safety
> >  * then
> >  */
> 
> Note that being MT safe is not enough.  For the 4 signals that are
> listed above, you can usually expect that your memory is already
> corrupted.  So if you want to minimize the risks of crashing
> recursively inside the signal handler, you should avoid using
> variables as much as possible.  A handler for SIGSEGV is a good place
> for paranoia...

Like getting another SIGSEGV while glib asks if it should do a stack trace ?-)
That's e-v-i-l. But I'm afraid we either have a waterproof SIGSEGV handler
or the trace, not both :(

We should probably re-add the reentrancy guards in the fatal handlers and
just do a brute force exit() if it's called recursively (which can only happen
during the stack trace because that's the only case where the signals are
unblocked in the handler).

> The other signal handlers do not need so much defensive programming.
> Being MT safe is usually enough.

I think so, too. The SIGCHLD handler is so trivial that it can hardly
cause any harm.

bye,
--Mitch

Re: EPIPE

2000-05-11 Thread Michael Natterer

Marc Lehmann wrote:
> 
> On Thu, May 11, 2000 at 12:47:22PM +0200, Michael Natterer 
><[EMAIL PROTECTED]> wrote:
> > > The only type that is atomic is sig_atomic_t. Everything else is not atomic
> > > one at least one target where gimp runs. Limiting gimp to gnu-libc-platforms
> > > looks very bad.
> >
> > See my other mail. I meant atomic data access which simply has to be
> > the case or the os will go berserk :)
> 
> If you are free to use the type you want, I would opt for sig_atomic_t,
> which should be available on all platforms that gimp targets (of course,
> this is only what I believe ;)

And you're right. But we don't do bad things in the SIGCHLD handler anyway
(no need for sharing data with the app) and in the fatal handler, we just
quit the app.

> > The reason to ignore SIGPIPE was to make plugins to bahave like
> > the main app did before :)
> 
> I don't think plug-ins should behave like the main application ;) They
> are, after all, plug-ins.

This was a joke. Gimp used to die on dying children. But that is what plugins
should do (die on dying gimp) :)

> > They should terminate themselves when Gimp crashes.
> 
> Since the default action of SIGPIPE is to terminate the process, I
> really think that ignoring the signal won't improve the situation. To
> the contrary, changing all sorts of signals behind the applications
> back (remember that there is no way to reset the signal handlers to the
> values an application likes, as they are reset inside gimp_main!) creates
> headaches for non-trviial plug-ins (like some of mine... ;)

But mysteriously, it didn't work. Gimp was always connecting SIGPIPE to
the fatal signal handler and we _never_ got the signal, though.
You can of course set your own signal handlers or reset the signals to
their default behaviour.

> > So I inotrduced a handler for IO errors on the read channel which
> > doesn't let them hang around as they used to do before.
> 
> This has nothing to do with SIGPIPE, however. If the process didn't receive a
> SIGPIPE before (the default action is to terminate the process), ignoring it
> will have no effect.

No it hasn't. But as SIGPIPE didn't work (probably because of glib doing
strange stuff in the main loop or whatever), I had to find another way
(which seems to work quite nice so far).

> What could be the case, however, is, that gimp itself does not reset it's
> signal handlers when it execs the plug-ins. If this is the case, then this
> bug (not restoring signal handlers to their default) might cause many other
> subtle problems.

Nope, signals are set back to their default behaviour when doing an exec().
This is not the case for blocked signals but we don't block signals, we just
ignore them.

> > But I never claimed the new code to be the correct and final solution.
> > So if it breaks perl in any way, we will of course have to change it
> > again.
> 
> It doesn't break perl. However, the whole "we will force our %$&§%& signal
> handlers on any plug-in" behaviour is making life very difficult (and this
> has been the case before you started to change anything!).

Good to hear that. I tested the perl plugins with the new code and it
seemed to work, but it feels better to hear it from da perl man :)

> I thought the last time this was discussed we agreed that, in the release,
> the signal handling code that causes frustration and endless loops will be
> disabled by default? Most of these signals have very sane default actions
> (like SIGPIPE).
> 
> I don't see why replacing the default behaviour of terminating the process
> is worse than installing a signal handler that just terminates the
> process anyway.

And does a stacktrace. We can remove them in the final release.
I would however prefer to leave them there (because the final release
is unlikely to be bug-free).

bye,
--Mitch

Re: EPIPE

2000-05-11 Thread Raphael Quinet

On Thu, 11 May 2000, Marc Lehmann <[EMAIL PROTECTED]> wrote:
[...]
> What could be the case, however, is, that gimp itself does not reset it's
> signal handlers when it execs the plug-ins. If this is the case, then this
> bug (not restoring signal handlers to their default) might cause many other
> subtle problems.

The libgimp code could try to set the signal handler to SIG_DFL before
executing the code of the plug-in.

Another solution is to save and re-install the old signal handler.  We
may even want to support some kind of chain of signal handers: the new
signal handlers installed by the Gimp should call the ones that were
installed before.  But then we have to support the systems that have
SA_SIGINFO and those that do not have it (this is not part of POSIX).
The systems that have SA_SIGINFO are using extended prototypes for the
signal handlers, so some #ifdef's will be necessary if we want to do
that in the right way.  I could contribute some code if you want,
because I already did that in some other projects.

[...]
> I thought the last time this was discussed we agreed that, in the release,
> the signal handling code that causes frustration and endless loops will be
> disabled by default? Most of these signals have very sane default actions
> (like SIGPIPE).

Hmmm...  In the last discussion, I said that I would implement a
compile-time and a run-time option for choosing this, because some
people (including myself) do like the option to have a stack trace
(when it works).  But I did not take the time to do it.  If you kick
me enough, I will do it...  :-)

-Raphael

Re: EPIPE

2000-05-11 Thread Marc Lehmann

On Thu, May 11, 2000 at 12:47:22PM +0200, Michael Natterer <[EMAIL PROTECTED]> 
wrote:
> > The only type that is atomic is sig_atomic_t. Everything else is not atomic
> > one at least one target where gimp runs. Limiting gimp to gnu-libc-platforms
> > looks very bad.
> 
> See my other mail. I meant atomic data access which simply has to be
> the case or the os will go berserk :)

If you are free to use the type you want, I would opt for sig_atomic_t,
which should be available on all platforms that gimp targets (of course,
this is only what I believe ;)

> The reason to ignore SIGPIPE was to make plugins to bahave like
> the main app did before :)

I don't think plug-ins should behave like the main application ;) They
are, after all, plug-ins.

> They should terminate themselves when Gimp crashes.

Since the default action of SIGPIPE is to terminate the process, I
really think that ignoring the signal won't improve the situation. To
the contrary, changing all sorts of signals behind the applications
back (remember that there is no way to reset the signal handlers to the
values an application likes, as they are reset inside gimp_main!) creates
headaches for non-trviial plug-ins (like some of mine... ;)

> So I inotrduced a handler for IO errors on the read channel which
> doesn't let them hang around as they used to do before.

This has nothing to do with SIGPIPE, however. If the process didn't receive a
SIGPIPE before (the default action is to terminate the process), ignoring it
will have no effect.

What could be the case, however, is, that gimp itself does not reset it's
signal handlers when it execs the plug-ins. If this is the case, then this
bug (not restoring signal handlers to their default) might cause many other
subtle problems.

> But I never claimed the new code to be the correct and final solution.
> So if it breaks perl in any way, we will of course have to change it
> again.

It doesn't break perl. However, the whole "we will force our %$&§%& signal
handlers on any plug-in" behaviour is making life very difficult (and this
has been the case before you started to change anything!).

I thought the last time this was discussed we agreed that, in the release,
the signal handling code that causes frustration and endless loops will be
disabled by default? Most of these signals have very sane default actions
(like SIGPIPE).

I don't see why replacing the default behaviour of terminating the process
is worse than installing a signal handler that just terminates the
process anyway.

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED] |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: EPIPE

2000-05-11 Thread Raphael Quinet

On Thu, 11 May 2000, Michael Natterer <[EMAIL PROTECTED]> wrote:
> Raphael Quinet wrote:
[...]
> > - SIGCHLD or SIGCLD: a child process died.  This signal can be
> >   delivered at any time.  Some systems do not provide a reliable way
> >   to know how many processes exited (if they do not support SA_NODEFER
> >   or if their waitpid() or wait3() calls are broken), so it is usually
> >   better to simply set a flag in the signal handler (without calling
> >   any wait*() function) and to check the status of the children outside
> >   the signal handler, until some wait*() function reports that there
> >   are no more dead processes.  For example:
> > while ((pid = waitpid (-1, &status, WNOHANG)) > 0)
> >   { ... /* check WIFEXITED(status) and other things */ }
> 
> This is what currently happens (ok, it happens in the handler, but WNOHANG
> *should* be absolutely safe).

No, actually it is not safe on all operating systems: as I wrote
elsewhere, you cannot always rely on SA_NODEFER.  This means that in
some cases, you could miss a SIGCHLD signal that occurs while you are
still inside the handler but after the last test on waitpid().  If
this happens, the main app will not see that one of the plug-ins has
died (until another one dies and the handler collects the status for
both).  That's why it is safer to make the tests outside the signal
handler.  Otherwise, you could have a race condition on some systems
(very seldom, but still...).

[...]
> The usage of SIGCLD is strongly discouraged by Stevens and some Solaris
> document I fould recently. But Gimp uses SIGCHLD anyway.

And here is an excerpt from /usr/include/sys/signal.h on Solaris 2.6:

#define SIGCLD  18  /* child status change */
#define SIGCHLD 18  /* child status change alias (POSIX) */

So it does not make much of a difference under recent versions of
Solaris, at least... :-)  But they still say in some docs that the
behavior of SIGCLD might change in future releases, so the POSIX
version should be used in new programs.

> > In one application that wanted to catch SIGSEGV, SIGBUS, SIGILL and
> > SIGFPE, I created a handler that uses a direct call to write() on an
> > fd that was previously obtained from fileno(stderr) (this fd is saved
> > early so that the write() call can work even if the FILE *stderr is
> > overwritten with garbage).  Doing this is safe, AFAIK.
> 
> Yep, write() is safe. Gimp uses g_print() which is not really safe, but
> then we call g_on_error_query() which definitely does a bit more than
> what's allowed :)

Yes...  I wrote a few months ago that I would change that and implement
some kind of --enable-stack-trace option, but I never took the time to
do it.

> >From glib/gerror.c:
> 
> /* 
>  * MT safe ; except for g_on_error_stack_trace, but who wants thread safety 
>  * then
>  */

Note that being MT safe is not enough.  For the 4 signals that are
listed above, you can usually expect that your memory is already
corrupted.  So if you want to minimize the risks of crashing
recursively inside the signal handler, you should avoid using
variables as much as possible.  A handler for SIGSEGV is a good place
for paranoia...

The other signal handlers do not need so much defensive programming.
Being MT safe is usually enough.

> killall -SIGGIMP ps

:-)

-Raphael

Re: EPIPE

2000-05-11 Thread Michael Natterer

Marc Lehmann wrote:
> 
> >However, a signal handler can do whatever it likes with the app's structures
> >as long as it uses atomic data access (which can be a pointer, as pointers
> >have the same size as integers, which are atomic. This is true at least on
> >all processors which have a GNU libc port and finding a processor
> >where pointers are not atomic looks very unlikely to me).
> 
> The only type that is atomic is sig_atomic_t. Everything else is not atomic
> one at least one target where gimp runs. Limiting gimp to gnu-libc-platforms
> looks very bad.

See my other mail. I meant atomic data access which simply has to be
the case or the os will go berserk :)

> BTW: what's the reason for messing with SIGPIPE in plug-ins? Wasn't the
> original idea to not mess with signals in the release at all? I suffer
> quite a lot because libgimp resets signal handlers at places where you
> cannot override it ;)

The reason to ignore SIGPIPE was to make plugins to bahave like
the main app did before :) They should terminate themselves when Gimp
crashes. So I inotrduced a handler for IO errors on the read channel
which doesn't let them hang around as they used to do before.

But I never claimed the new code to be the correct and final solution.
So if it breaks perl in any way, we will of course have to change it
again.

bye,
--Mitch

Re: EPIPE

2000-05-11 Thread Michael Natterer


Tim Mooney wrote:
> 
> In regard to: Re: EPIPE, Michael Natterer said (at 12:40am on May 11, 2000):
> 
> >This is what currently happens (ok, it happens in the handler, but WNOHANG
> >*should* be absolutely safe).
> >However, a signal handler can do whatever it likes with the app's structures
> >as long as it uses atomic data access (which can be a pointer, as pointers
> >have the same size as integers, which are atomic. This is true at least on
> >all processors which have a GNU libc port and finding a processor
> >where pointers are not atomic looks very unlikely to me).
> 
> Finding a processor/OS combo where sizeof(pointer) != sizeof(int) is pretty
> easy, however.  How does this change your thinking?

No substantially :) I meant atomic access. A 64bit Processor will still
have to access it's pointers atomically even if they are larger than an
integer.

> >The usage of SIGCLD is strongly discouraged by Stevens and some Solaris
> >document I fould recently. But Gimp uses SIGCHLD anyway.
> 
> SIGCLD != SIGCHLD.  They're distant relations, but that's it.  Is that what
> you were trying to say?  Since you apparently have access to Stevens' APUE,
> look at section 10.7 again if you don't understand what I'm talking about.

Yep, this is my source of info. The only common thing between SIGCHLD
and SIGCLD is that they have something to do with children. SIGCLD
is nonstandard and should therefore not be used.

bye,
--Mitch

Re: EPIPE

2000-05-10 Thread Marc Lehmann


On Wed, May 10, 2000 at 05:19:11PM +0200, Raphael Quinet <[EMAIL PROTECTED]> wrote:
> - SIGABRT: usually triggered by the application calling abort() or by a
>   user who wants to get a core dump from a running process.  It can be

The official signal to get a coredump without process interaction is
SIGQUIT (you can send a SIGQUIT via the keyboard, but you cnanot send
SIGABORT).

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED] |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: EPIPE

2000-05-10 Thread Tim Mooney

In regard to: Re: EPIPE, Nick Lamb said (at 2:57am on May 11, 2000):

>On Wed, May 10, 2000 at 07:15:57PM -0500, Tim Mooney wrote:
>> Finding a processor/OS combo where sizeof(pointer) != sizeof(int) is pretty
>> easy, however.  How does this change your thinking?
>
>Wouldn't atomicity guarantees be a processor feature, and hence tied
>to word size (probably pointer width if you are taking full advantage
>of your CPU) rather than whatever CC might think sizeof(int) is ?
>
>If I've completely forgotten my architecture course, don't hesitate
>to write me a long rant, off-list of course...

Nick-

You may very well be completely correct, but it's essentially circumstantial
to the point I was trying to make.  My whole goal with the text you quoted
was trying to dissuade anyone from making the "All the world's a ILP32 OS"
assumption, which seemed where things might be headed.  These days, recent
versions of Unix from Sun, HP, SGI, IBM, and Compaq (as well as other's I'm
not familiar with) on recent hardware are all LP64 or LP64-capable.  Making
pointer size assumptions was never a good idea, but it's especially problematic
today.

Linux and some of the BSD derivatives also have plans to be LP64 some day too,
so even Linux-only developers want to keep a mindful eye on code portability to
future platforms.

Tim
-- 
Tim Mooney  [EMAIL PROTECTED]
Information Technology Services (701) 231-1076 (Voice)
Room 242-J1, IACC Building  (701) 231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164

Re: EPIPE

2000-05-10 Thread Marc Lehmann


>However, a signal handler can do whatever it likes with the app's structures
>as long as it uses atomic data access (which can be a pointer, as pointers
>have the same size as integers, which are atomic. This is true at least on
>all processors which have a GNU libc port and finding a processor
>where pointers are not atomic looks very unlikely to me).

The only type that is atomic is sig_atomic_t. Everything else is not atomic
one at least one target where gimp runs. Limiting gimp to gnu-libc-platforms
looks very bad.

BTW: what's the reason for messing with SIGPIPE in plug-ins? Wasn't the
original idea to not mess with signals in the release at all? I suffer
quite a lot because libgimp resets signal handlers at places where you
cannot override it ;)

-- 
  -==- |
  ==-- _   |
  ---==---(_)__  __   __   Marc Lehmann  +--
  --==---/ / _ \/ // /\ \/ /   [EMAIL PROTECTED] |e|
  -=/_/_//_/\_,_/ /_/\_\   XX11-RIPE --+
The choice of a GNU generation   |
 |

Re: EPIPE

2000-05-10 Thread Nick Lamb

On Wed, May 10, 2000 at 07:15:57PM -0500, Tim Mooney wrote:
> Finding a processor/OS combo where sizeof(pointer) != sizeof(int) is pretty
> easy, however.  How does this change your thinking?

Wouldn't atomicity guarantees be a processor feature, and hence tied
to word size (probably pointer width if you are taking full advantage
of your CPU) rather than whatever CC might think sizeof(int) is ?

If I've completely forgotten my architecture course, don't hesitate
to write me a long rant, off-list of course...

Nick.

Re: EPIPE

2000-05-10 Thread Robert L Krawitz


   Date: Wed, 10 May 2000 19:15:57 -0500 (CDT)
   From: Tim Mooney <[EMAIL PROTECTED]>
   cc: Raphael Quinet <[EMAIL PROTECTED]>, [EMAIL PROTECTED]

   In regard to: Re: EPIPE, Michael Natterer said (at 12:40am on May 11, 2000):

   >This is what currently happens (ok, it happens in the handler, but
   >WNOHANG *should* be absolutely safe).  However, a signal handler
   >can do whatever it likes with the app's structures as long as it
   >uses atomic data access (which can be a pointer, as pointers have
   >the same size as integers, which are atomic. This is true at least
   >on all processors which have a GNU libc port and finding a
   >processor where pointers are not atomic looks very unlikely to
   >me).

   Finding a processor/OS combo where sizeof(pointer) != sizeof(int) is pretty
   easy, however.  How does this change your thinking?

For example, UltraSPARC in 64-bit mode under Solaris.

-- 
Robert Krawitz <[EMAIL PROTECTED]>  http://www.tiac.net/users/rlk/

Tall Clubs International  --  http://www.tall.org/ or 1-888-IM-TALL-2
Member of the League for Programming Freedom -- mail [EMAIL PROTECTED]
Project lead for The Gimp Print --  http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton

Re: EPIPE

2000-05-10 Thread Tim Mooney


In regard to: Re: EPIPE, Michael Natterer said (at 12:40am on May 11, 2000):

>This is what currently happens (ok, it happens in the handler, but WNOHANG
>*should* be absolutely safe).
>However, a signal handler can do whatever it likes with the app's structures
>as long as it uses atomic data access (which can be a pointer, as pointers
>have the same size as integers, which are atomic. This is true at least on
>all processors which have a GNU libc port and finding a processor
>where pointers are not atomic looks very unlikely to me).

Finding a processor/OS combo where sizeof(pointer) != sizeof(int) is pretty
easy, however.  How does this change your thinking?

>The usage of SIGCLD is strongly discouraged by Stevens and some Solaris
>document I fould recently. But Gimp uses SIGCHLD anyway.

SIGCLD != SIGCHLD.  They're distant relations, but that's it.  Is that what
you were trying to say?  Since you apparently have access to Stevens' APUE,
look at section 10.7 again if you don't understand what I'm talking about.

You're definitely correct that gimp shouldn't be mucking with SIGCLD, and
it's not. 

Tim
-- 
Tim Mooney  [EMAIL PROTECTED]
Information Technology Services (701) 231-1076 (Voice)
Room 242-J1, IACC Building  (701) 231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164

Re: EPIPE

2000-05-10 Thread Michael Natterer

Raphael Quinet wrote:
> 
> [ a cool summary of signals which i'm thinking about to copy for my students :) ]

> - SIGILL: one some processors that do not deliver SIGBUS in all cases,
>   you can get a SIGILL if a pointer to a callback function was
>   overwritten with garbage.  If the pointer is still referencing some
>   area inside a code segment (so that you don't get a SIGSEGV) but not
>   pointing to the start of a valid instruction, you will get the SIGILL.
>   By the way, the Gimp does not catch this one.  Why?

Dunno, most likely it was just forgotten in the first place.

> - SIGABRT: usually triggered by the application calling abort() or by a
>   user who wants to get a core dump from a running process.  It can be
>   caught by an application that wants to perform some specific cleanup
>   tasks, but in most cases it should not be caught by a generic error
>   handler.  I don't understand why the Gimp maps this to the generic
>   gimp_fatal_error() function???

Yep, we should not catch it but let the kernel do it's job. If the user
wants a core dump, she should get one.

> - SIGCHLD or SIGCLD: a child process died.  This signal can be
>   delivered at any time.  Some systems do not provide a reliable way
>   to know how many processes exited (if they do not support SA_NODEFER
>   or if their waitpid() or wait3() calls are broken), so it is usually
>   better to simply set a flag in the signal handler (without calling
>   any wait*() function) and to check the status of the children outside
>   the signal handler, until some wait*() function reports that there
>   are no more dead processes.  For example:
> while ((pid = waitpid (-1, &status, WNOHANG)) > 0)
>   { ... /* check WIFEXITED(status) and other things */ }

This is what currently happens (ok, it happens in the handler, but WNOHANG
*should* be absolutely safe).
However, a signal handler can do whatever it likes with the app's structures
as long as it uses atomic data access (which can be a pointer, as pointers
have the same size as integers, which are atomic. This is true at least on
all processors which have a GNU libc port and finding a processor
where pointers are not atomic looks very unlikely to me).
The usage of SIGCLD is strongly discouraged by Stevens and some Solaris
document I fould recently. But Gimp uses SIGCHLD anyway.

> In most of the applications that I wrote, the signal handlers do
> nothing directly: they only set a flag that is checked by the main
> loop (in an idle function for GTK+ apps, or after poll() or select()
> for applications that use low-level calls).  I define one flag for
> each signal (got_sigchld, got_sigterm, ...) and a master flag that
> tells if any of the signal-specific flags have been set.  Sometimes I
> also use counters instead of boolean flags, but on some systems the
> counters are not reliable (especially if there is no SA_NODEFER) so
> most of the time they are meaningless.
> 
> In one application that wanted to catch SIGSEGV, SIGBUS, SIGILL and
> SIGFPE, I created a handler that uses a direct call to write() on an
> fd that was previously obtained from fileno(stderr) (this fd is saved
> early so that the write() call can work even if the FILE *stderr is
> overwritten with garbage).  Doing this is safe, AFAIK.

Yep, write() is safe. Gimp uses g_print() which is not really safe, but
then we call g_on_error_query() which definitely does a bit more than
what's allowed :)

>From glib/gerror.c:

/* 
 * MT safe ; except for g_on_error_stack_trace, but who wants thread safety 
 * then
 */

> In most cases, I ignore SIGPIPE (or I only increment a counter for
> debugging purposes) because the best way to deal with this is to check
> the return value of the write() or send() calls, or to check if a
> read() returns 0 later.

This is what Gimp does now. Still mysterious why it worked all the time
with SIGPIPE being fatal.

> Just my 0.02 Euro.  But you probably knew all of this already...

Almost all... But thanks a lot for writing this summary I've been to lazy to
write for years now :):):)

And... please torture the new code.

I was not able to crash gimp or make it spinning, even when sending
SIGSEGV's to plugins while they were heavily using the wire.

OTOH I'm almost sure that I'm again the one who introduced the worst
bug of the next release...

bye,
--Mitch

-
killall -SIGGIMP ps

Re: EPIPE

2000-05-10 Thread Raphael Quinet


On Wed, 10 May 2000, Michael Natterer <[EMAIL PROTECTED]> wrote:
> Michael Natterer wrote:
> > 
> > Austin Donnelly wrote:
> > >
> > > [ two mails i totally agree with ]
> > 
> > I'm about to commit some code which should bring the signal
> > stuff into a sane state. The ChangeLog entry is quite verbose
> > and should explain how I tortured the code.
> 
> Um, 1 minute later I found a bad bug. No commit today :(

I don't know if this is relevant (I haven't seen the code that you
wanted to commit), but here are some general comments about signals
and when they are usually triggered in programs similar to the Gimp...

- SIGHUP, SIGINT, SIGQUIT, SIGTERM: usually triggered by the user or
  by the system shutting down.  They can be delivered at any time.
- SIGPIPE: an attempt to write() or send() something on a socket has
  failed because the other party has closed the connection.  This
  signal is usually triggered from within the system call, which is
  often called from inside a high-level function such as printf().
  Since most versions of printf() are not re-entrant, it is usually
  a bad idea to call printf() or any stdio function in the signal
  handler.
- SIGSEGV: just say "Oops!" or "Eek!".  Some bug in the code has
  corrupted the memory.  Usually happens when trying to dereference a
  NULL pointer or a pointer that has been overwritten with garbage.
  This also happens quite often inside printf() or sprintf(), for
  example when you are printing some debugging messages (or error
  messages) and you did not think that some arguments may be NULL.
  For this reason, it is also a bad idea to call any stdio functions
  inside the handler for this signal.
- SIGBUS: mostly a variant of SIGSEGV.  Happens on many processors
  when you are trying to access some unaligned data.  Again, this is
  usually due to pointer corruption.
- SIGILL: one some processors that do not deliver SIGBUS in all cases,
  you can get a SIGILL if a pointer to a callback function was
  overwritten with garbage.  If the pointer is still referencing some
  area inside a code segment (so that you don't get a SIGSEGV) but not
  pointing to the start of a valid instruction, you will get the SIGILL.
  By the way, the Gimp does not catch this one.  Why?
- SIGFPE: usually comes from a division by 0, although some other
  errors (overflow, invalid operand) can also occur.  This is usually
  triggered while executing a floating-point operation, although some
  processors or OS's can delay the signal.
- SIGABRT: usually triggered by the application calling abort() or by a
  user who wants to get a core dump from a running process.  It can be
  caught by an application that wants to perform some specific cleanup
  tasks, but in most cases it should not be caught by a generic error
  handler.  I don't understand why the Gimp maps this to the generic
  gimp_fatal_error() function???
- SIGCHLD or SIGCLD: a child process died.  This signal can be
  delivered at any time.  Some systems do not provide a reliable way
  to know how many processes exited (if they do not support SA_NODEFER
  or if their waitpid() or wait3() calls are broken), so it is usually
  better to simply set a flag in the signal handler (without calling
  any wait*() function) and to check the status of the children outside
  the signal handler, until some wait*() function reports that there
  are no more dead processes.  For example:
while ((pid = waitpid (-1, &status, WNOHANG)) > 0)
  { ... /* check WIFEXITED(status) and other things */ }

In most of the applications that I wrote, the signal handlers do
nothing directly: they only set a flag that is checked by the main
loop (in an idle function for GTK+ apps, or after poll() or select()
for applications that use low-level calls).  I define one flag for
each signal (got_sigchld, got_sigterm, ...) and a master flag that
tells if any of the signal-specific flags have been set.  Sometimes I
also use counters instead of boolean flags, but on some systems the
counters are not reliable (especially if there is no SA_NODEFER) so
most of the time they are meaningless.

In one application that wanted to catch SIGSEGV, SIGBUS, SIGILL and
SIGFPE, I created a handler that uses a direct call to write() on an
fd that was previously obtained from fileno(stderr) (this fd is saved
early so that the write() call can work even if the FILE *stderr is
overwritten with garbage).  Doing this is safe, AFAIK.

In most cases, I ignore SIGPIPE (or I only increment a counter for
debugging purposes) because the best way to deal with this is to check
the return value of the write() or send() calls, or to check if a
read() returns 0 later.

Just my 0.02 Euro.  But you probably knew all of this already...

-Raphael

Re: EPIPE

2000-05-09 Thread Michael Natterer


Michael Natterer wrote:
> 
> Austin Donnelly wrote:
> >
> > [ two mails i totally agree with ]
> 
> I'm about to commit some code which should bring the signal
> stuff into a sane state. The ChangeLog entry is quite verbose
> and should explain how I tortured the code.

Um, 1 minute later I found a bad bug. No commit today :(

--Mitch

Re: EPIPE

2000-05-09 Thread Michael Natterer


Austin Donnelly wrote:
> 
> [ two mails i totally agree with ]

I'm about to commit some code which should bring the signal
stuff into a sane state. The ChangeLog entry is quite verbose
and should explain how I tortured the code.

Hopefully no -isms this time :)

bye,
--Mitch

Re: EPIPE

2000-05-09 Thread Austin Donnelly

On Monday, 8 May 2000, Michael Natterer wrote:

> Unfortunately this is not the reason why gimp dies on just any aborting
> child. Although I 100% agree that SIGPIPE being fatal is the wrong thing
> to do. I browsed CVS and Gimp is connecting SIGPIPE to on_signal() since
> the beginning of CVS time.

Ok, fair enough.  But gimp used to be able to survive its plugins
dying in the past.  I'm not sure when this was last working.

> There is also an infinite loop using 30% of my cpu time in all cases where
> a dying plugin does _not_ kill gimp.

Fvwm2 used to have a bug very similar to this.  It turned out to be
trying to do too much processing in some signal hander function.

> [Mitch on temp procs not working from plugins]

This seems to be a separate, probably unrelated, bug.

On Monday, 8 May 2000, Tim Mooney wrote:

> The SIGPIPE problem is because on_signal is currently treating it as a
> fatal signal (see the case on_signal in app/main.c).  The on_signal routine
> should probably be modified to not treat SIGPIPE as fatal.  That should fix
> the problem Austin is seeing (that others will no doubt see too).  Someone
> should investigate the handler in 1.1.19 or earlier, and see what was being
> done on SIGPIPE there.

On Monday, 8 May 2000, Tim Mooney wrote:

> So why is Austin observing the problem now?  I'm not sure.

I'm noting that 2-3 recent bug reports have been related to plugins
crashing (the main reason the bug was filed) but as a side effect, the
main gimp process is also dying.  I happened to look into one of these
very briefly, and noted the strange SIGPIPE -> on_signal() -> SIGSEGV
behaviour.

> It does seem
> like it's behaving as expected based on the code.  The thing to try would
> be to use a different signal handler for SIGPIPE, that doesn't terminate
> the gimp.  Perhaps just ignore the signal and let the plug-in protocol detect
> the problem (assuming it can?)?

What happens if we use SIGIGN as the handler for SIGPIPE.  That,
combined with using EPIPE from g_io_channel_{read,write}() should give
us everything we need, without needing the overly complex gimp_nanny()
stuff being proposed earlier.  Remember: KISS - Keep It Simple, Stupid.

On Monday, 8 May 2000, Michael Natterer wrote:

> Nope, it unfortunately has another reason. I can't explain why it used to
> work with SIGPIPE being fatal but the diffs of app/main.c show no semantical
> changes at all down to CVS version 1.1.

So, maybe we never used to get SIGPIPEs raised before.  Maybe we were
just lucky.  I still think we need to deal with them properly.

> On Monday, 8 May 2000, Tim Mooney wrote:
> > Austin is also correct that calling printf from the handler is probably
> > asking for trouble.  If a message must be written, some other method should
> > be found (write *is* ok from a signal handler, but won't using *that* be
> > fun...).
> 
> Hm, I guess printf from a signal handler which aborts the program should be
> totally safe

No it is not.  Consider a libc implementation where printf takes out a
mutex while appending to the stdio output buffer.  If the signal is
delivered while printf has this lock held, any attempt to call printf
again will deadlock.  You lose.  This situation may occur if you call
_any_ library code you didn't write, so the conclusion is that calling
someone else's code from a signal handler is a very bad idea, unless
you're guaranteed the code is fully re-entrant.  Section 10.6 of
Advanced Programming in the UNIX Environment (W. Richard Stevens)
lists the POSIX1 mandated re-entrant functions.  Printf, malloc etc
are not in the list, unsuprisingly.

On Monday, 8 May 2000, Michael Natterer wrote:

> We should probably ignore SIGPIPE totally and let a more sophisticated
> SIGCHLD handler do the work:
> 
> - record which processes have been started ("gimp_nanny()")
> - on SIGCHLD check if it crashed...
> - ...and somehow clean up the plugin's struct and io channels
>  (and show a proper error message)
> 
> The problem with this may be that we need a periodic function doing the cleanup
> (just like the shell prints it's "bla exited [SIGwhatever]" message before the
> prompt) because we cannot do it from the handler.
> 
> OTOH it should be ok to call this cleanup function from two places:
> 
> 1. whenever read/write from/to a plugin pipe fails.
> 2. in an idle function.
> 
> The "gimp_fork()" and signal handler stuff I proposed during the SIGCHLD
> discussion might so the job but it would be much nicer to fix it to work like
> before

SIGPIPE indicates a write(2) into the pipe when the plugin has died.
If SIGPIPE is being ignored (or the signal handler returns) then the
write(2) returns EPIPE.  If you read(2) from a pipe when the plugin
has died,

Re: EPIPE

2000-05-08 Thread Robert L Krawitz

   Date: Mon, 08 May 2000 20:22:27 +0200
   From: Michael Natterer <[EMAIL PROTECTED]>

   Unfortunately this is not the reason why gimp dies on just any aborting
   child. Although I 100% agree that SIGPIPE being fatal is the wrong thing
   to do. I browsed CVS and Gimp is connecting SIGPIPE to on_signal() since
   the beginning of CVS time.

Absolutely.  I get hammered by this on the print plugin every now and
again when I'm hacking on it and it segv's on me.

While we're on the subject of plugin interaction, is there any way
that hitting "cancel" could be a bit more graceful, to allow the
plugin a chance to clean up?  The print plugin doesn't get a chance to
clean up its lpr child when the user cancels, with the result that an
incomplete print file gets sent to the printer.  That's nasty.

-- 
Robert Krawitz <[EMAIL PROTECTED]>  http://www.tiac.net/users/rlk/

Tall Clubs International  --  http://www.tall.org/ or 1-888-IM-TALL-2
Member of the League for Programming Freedom -- mail [EMAIL PROTECTED]
Project lead for The Gimp Print --  http://gimp-print.sourceforge.net

"Linux doesn't dictate how I work, I dictate how Linux works."
--Eric Crampton

Re: EPIPE

2000-05-08 Thread Michael Natterer

Tim Mooney wrote:
> 
> I just looked at 1.1.19, and on_signal there was doing the same thing it
> is now: SIGPIPE caused a call to gimp_terminate.  Obviously the current code
> is based on the older code.
> 
> So why is Austin observing the problem now?  I'm not sure.  It does seem
> like it's behaving as expected based on the code.  The thing to try would
> be to use a different signal handler for SIGPIPE, that doesn't terminate
> the gimp.  Perhaps just ignore the signal and let the plug-in protocol detect
> the problem (assuming it can?)?

We should probably ignore SIGPIPE totally and let a more sophisticated
SIGCHLD handler do the work:

- record which processes have been started ("gimp_nanny()")
- on SIGCHLD check if it crashed...
- ...and somehow clean up the plugin's struct and io channels
 (and show a proper error message)

The problem with this may be that we need a periodic function doing the cleanup
(just like the shell prints it's "bla exited [SIGwhatever]" message before the
prompt) because we cannot do it from the handler.

OTOH it should be ok to call this cleanup function from two places:

1. whenever read/write from/to a plugin pipe fails.
2. in an idle function.

The "gimp_fork()" and signal handler stuff I proposed during the SIGCHLD
discussion might so the job but it would be much nicer to fix it to work like
before :)

bye,
--Mitch

Re: EPIPE

2000-05-08 Thread Michael Natterer


Tim Mooney wrote:
> 
> In regard to: Re: EPIPE, Michael Natterer said (at 8:22pm on May 8, 2000):
> 
> >Unfortunately this is not the reason why gimp dies on just any aborting
> >child. Although I 100% agree that SIGPIPE being fatal is the wrong thing
> >to do. I browsed CVS and Gimp is connecting SIGPIPE to on_signal() since
> >the beginning of CVS time.
> >
> >Mysteriously, as you mention, it worked before. I'm also pretty sure that
> >the current state of signal processing behaves exactly like before Garry
> >started to fix the SIGCHLD bug.
> 
> The current state of signal processing does *not* behave exactly as before:
> plug-in query works on alpha-dec-osf now.  Using sigaction instead of signal
> should improve things across the board, on all platforms.  signal() on HP-UX
> and Solaris was the SYSV signal, where the handler needed to be reset after
> every signal (and there was therefore a race condition), so even though major
> problems regarding signal handling were never reported on those platforms,
> they were lurking.

Ok, the handling of SIGCHLD has changed but SIGPIPE is handled as always in
Gimp and the call to gimp_signal_private() emulates the old signal()

> The SIGPIPE problem is because on_signal is currently treating it as a
> fatal signal (see the case on_signal in app/main.c).  The on_signal routine
> should probably be modified to not treat SIGPIPE as fatal.  That should fix
> the problem Austin is seeing (that others will no doubt see too).  Someone
> should investigate the handler in 1.1.19 or earlier, and see what was being
> done on SIGPIPE there.

Nope, it unfortunately has another reason. I can't explain why it used to
work with SIGPIPE being fatal but the diffs of app/main.c show no semantical
changes at all down to CVS version 1.1.

> Austin is also correct that calling printf from the handler is probably
> asking for trouble.  If a message must be written, some other method should
> be found (write *is* ok from a signal handler, but won't using *that* be
> fun...).

Hm, I guess printf from a signal handler which aborts the program should be
totally safe (while we should of course _not_ call the handler for SIGPIPE).

My current theory (TM) is that there are weird implications between SIGPIPE
and SIGCHLD (which we currently ignore) but I suspect this is total nonsense :)

bye,
--Mitch

Re: EPIPE

2000-05-08 Thread Tim Mooney


I said:

>The SIGPIPE problem is because on_signal is currently treating it as a
>fatal signal (see the case on_signal in app/main.c).  The on_signal routine
>should probably be modified to not treat SIGPIPE as fatal.  That should fix
>the problem Austin is seeing (that others will no doubt see too).  Someone
>should investigate the handler in 1.1.19 or earlier, and see what was being
>done on SIGPIPE there.

I just looked at 1.1.19, and on_signal there was doing the same thing it
is now: SIGPIPE caused a call to gimp_terminate.  Obviously the current code
is based on the older code.

So why is Austin observing the problem now?  I'm not sure.  It does seem
like it's behaving as expected based on the code.  The thing to try would
be to use a different signal handler for SIGPIPE, that doesn't terminate
the gimp.  Perhaps just ignore the signal and let the plug-in protocol detect
the problem (assuming it can?)?

Tim
-- 
Tim Mooney  [EMAIL PROTECTED]
Information Technology Services (701) 231-1076 (Voice)
Room 242-J1, IACC Building  (701) 231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164

Re: EPIPE

2000-05-08 Thread Tim Mooney

In regard to: Re: EPIPE, Michael Natterer said (at 8:22pm on May 8, 2000):

>Austin Donnelly wrote:
>> 
>> Current gimp (1.1.21) seems to have problems with recovering from any
>> plugin that dies.  Things start going wrong when it takes a SIGPIPE
>> while trying to write(read?) to the pipe to the plugin which is dead.
>> Rather than ignoring SIGPIPE, and collecting an EPIPE from the io
>> operation and using this to trigger dead plugin cleanup operations,
>> gimp currently treats SIGPIPE just like any other signal: it's fatal.
>> Unfortunately, while attempting to print out some error message or
>> other, gimp causes a segfault.  This might be due to non-reentrant
>> stdio libraries in use, I don't know.  According to POSIX, the only
>> thing you're allowed to do is read or write variables of type
>> sigatomic_t.  Calling libc funcitions (including printf()) sounds like
>> a recipe for disaster, especially with a non-reentrant libc.
>> 
>> This needs some more thought, and I don't have much time right now to
>> look into any more.  I'm pretty sure that plugins were correctly
>> cleaned up on their unexpected termination at some earlier stage.  The
>> whole point of plugins being separate processes is that a plugin
>> should be unable to cause the main gimp app to crash: if they can then
>> this is a fairly critical bug that should be fixed.
>
>Hi Austin,
>
>Unfortunately this is not the reason why gimp dies on just any aborting
>child. Although I 100% agree that SIGPIPE being fatal is the wrong thing
>to do. I browsed CVS and Gimp is connecting SIGPIPE to on_signal() since
>the beginning of CVS time.
>
>Mysteriously, as you mention, it worked before. I'm also pretty sure that
>the current state of signal processing behaves exactly like before Garry
>started to fix the SIGCHLD bug.

Actually, it was Austin's diagnosis (the hard part!) and initial code, with
Garry fleshing it out and polishing it up in Austin's abscense, and comments
from the peanut gallery (me and you, mainly).

The current state of signal processing does *not* behave exactly as before:
plug-in query works on alpha-dec-osf now.  Using sigaction instead of signal
should improve things across the board, on all platforms.  signal() on HP-UX
and Solaris was the SYSV signal, where the handler needed to be reset after
every signal (and there was therefore a race condition), so even though major
problems regarding signal handling were never reported on those platforms,
they were lurking.

The SIGPIPE problem is because on_signal is currently treating it as a
fatal signal (see the case on_signal in app/main.c).  The on_signal routine
should probably be modified to not treat SIGPIPE as fatal.  That should fix
the problem Austin is seeing (that others will no doubt see too).  Someone
should investigate the handler in 1.1.19 or earlier, and see what was being
done on SIGPIPE there.

Austin is also correct that calling printf from the handler is probably
asking for trouble.  If a message must be written, some other method should
be found (write *is* ok from a signal handler, but won't using *that* be
fun...).

Tim
-- 
Tim Mooney  [EMAIL PROTECTED]
Information Technology Services (701) 231-1076 (Voice)
Room 242-J1, IACC Building  (701) 231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164

Re: EPIPE

2000-05-08 Thread Michael Natterer

Austin Donnelly wrote:
> 
> Current gimp (1.1.21) seems to have problems with recovering from any
> plugin that dies.  Things start going wrong when it takes a SIGPIPE
> while trying to write(read?) to the pipe to the plugin which is dead.
> Rather than ignoring SIGPIPE, and collecting an EPIPE from the io
> operation and using this to trigger dead plugin cleanup operations,
> gimp currently treats SIGPIPE just like any other signal: it's fatal.
> Unfortunately, while attempting to print out some error message or
> other, gimp causes a segfault.  This might be due to non-reentrant
> stdio libraries in use, I don't know.  According to POSIX, the only
> thing you're allowed to do is read or write variables of type
> sigatomic_t.  Calling libc funcitions (including printf()) sounds like
> a recipe for disaster, especially with a non-reentrant libc.
> 
> This needs some more thought, and I don't have much time right now to
> look into any more.  I'm pretty sure that plugins were correctly
> cleaned up on their unexpected termination at some earlier stage.  The
> whole point of plugins being separate processes is that a plugin
> should be unable to cause the main gimp app to crash: if they can then
> this is a fairly critical bug that should be fixed.

Hi Austin,

Unfortunately this is not the reason why gimp dies on just any aborting
child. Although I 100% agree that SIGPIPE being fatal is the wrong thing
to do. I browsed CVS and Gimp is connecting SIGPIPE to on_signal() since
the beginning of CVS time.

Mysteriously, as you mention, it worked before. I'm also pretty sure that
the current state of signal processing behaves exactly like before Garry
started to fix the SIGCHLD bug.

There is also an infinite loop using 30% of my cpu time in all cases where
a dying plugin does _not_ kill gimp. This also happens if a plugin calls
gimp_quit() (which is the recomended way of aborting a plugin and it
definitely worked before, too).

I suspect there are some strange things going on in plug_in.c like
a forgotten plug_in_push() or the following (unrelated):

---
/*   plug_in_pop (); */

  plug_in_params_destroy (proc_run.params, proc_run.nparams, FALSE);

  old_recurse = plug_in->recurse;
  plug_in->recurse = TRUE;

/*   gtk_main (); */

/*   return_vals = plug_in_get_current_return_vals (proc_rec); */
---

which, toghether with libgimp/gimp.c:

---
  /* No longer a return message */
/*   proc_return.name = proc_run->name; */
/*   proc_return.nparams = nreturn_vals; */
/*   proc_return.params = (GPParam*) return_vals; */

/*   if (!gp_temp_proc_return_write (_writechannel, &proc_return)) */
/*  gimp_quit (); */
---

makes it impossible to call temp procs across plugins. I tried to use temp
procs in a naive way (i.e. how expected) and totally failed when hacking
the help browser's temp proc. The caller just doesn't get a return message
when calling a plugin's temp proc. At least this is my theory :)

[ no conclusion ]

starting gdb...

--Mitch

Re: EPIPE

2000-05-08 Thread Nick Lamb

On Mon, May 08, 2000 at 02:16:06PM +0100, Austin Donnelly wrote:
> cleaned up on their unexpected termination at some earlier stage.  The
> whole point of plugins being separate processes is that a plugin
> should be unable to cause the main gimp app to crash: if they can then
> this is a fairly critical bug that should be fixed.

s/unable to cause/able to prevent/

We don't do nearly enough checks to prevent plug-ins from killing Gimp
if they're badly written. A future version could, in theory, but it
seems better to assume that plug-in authors have good intentions.

However, minor accidents causing a crash in a plug-in should NOT kill
Gimp as you said. This is our strength over most PC apps when it comes
to integrating 3rd party plug-ins. This has been annoying me, but
I presumed that whoever's been tinkering with the signals knew what
they were doing...?

Nick.

EPIPE

2000-05-08 Thread Austin Donnelly


Current gimp (1.1.21) seems to have problems with recovering from any
plugin that dies.  Things start going wrong when it takes a SIGPIPE
while trying to write(read?) to the pipe to the plugin which is dead.
Rather than ignoring SIGPIPE, and collecting an EPIPE from the io
operation and using this to trigger dead plugin cleanup operations,
gimp currently treats SIGPIPE just like any other signal: it's fatal.
Unfortunately, while attempting to print out some error message or
other, gimp causes a segfault.  This might be due to non-reentrant
stdio libraries in use, I don't know.  According to POSIX, the only
thing you're allowed to do is read or write variables of type
sigatomic_t.  Calling libc funcitions (including printf()) sounds like
a recipe for disaster, especially with a non-reentrant libc.

This needs some more thought, and I don't have much time right now to
look into any more.  I'm pretty sure that plugins were correctly
cleaned up on their unexpected termination at some earlier stage.  The
whole point of plugins being separate processes is that a plugin
should be unable to cause the main gimp app to crash: if they can then
this is a fairly critical bug that should be fixed.

Austin

38 matches

Mail list logo