On Mon, 17 Jun 2013 16:00:21 +0200 Cedric Blancher wrote:
> On 17 June 2013 15:50, Glenn Fowler <[email protected]> wrote:
> > On Mon, 17 Jun 2013 14:59:38 +0200 Cedric Blancher wrote:
> >> I found a bug in ksh93's kill(1) implementation: It doesn't check for
> >> errors when kill(3p) and killpg(3p) are used.
> >> However posix says for kill(3p) and killpg(3p):
> >> RETURN VALUE
> >>        Upon successful completion, 0 shall be returned. Otherwise, -1
> >> shall be returned and errno set to indicate the error.
> >
> >> ERRORS
> >>        The kill() function shall fail if:
> >
> >>        EINVAL The value of the sig argument is an invalid or
> >> unsupported signal number.
> >
> >>        EPERM  The process does not have permission to send the signal
> >> to any receiving process.
> >
> >>        ESRCH  No process or process group can be found corresponding
> >> to that specified by pid.
> >
> >> So the two syscalls can fail, and ksh93 kill(1) should check this.

> >
> > can you point to the specific lines in the code that led you to this 
> > conclusion

> function job_kill():
>   1214                                  }
>   1215                                  else
>   1216                                          r = kill(pid,sig);
>   1217                                  if(r>=0 && !stopsig)
>   1218                                  {
>   1219                                          if(pw->p_flag&P_STOPPED)
>   1220                                                  pw->p_flag &=
> ~(P_STOPPED|P_SIGNALLED);
>   1221                                          if(sig)
>   1222                                                  kill(pid,SIGCONT);
>   1223                                  }
>   1224                          }
>   1225                          else
>   1226                          {
>   1227                                  if(qflag)
>   1228                                          goto no_sigqueue;
>   1229                                  if((r = killpg(-pid,sig))>=0
> && !stopsig)
>   1230                                  {
>   1231
> job_unstop(job_bypid(pw->p_pid));
>   1232                                          if(sig)

> None of them checks for errors from kill(pid,sig) or killpg() to print an 
> error.

"none" in the section of code you mention (job_kill() which is called by
b_kill() via job_walk()) translates to 3 job control specific instances: 

        kill(pid,SIGCONT);
        killpg(-pid,SIGCONT);
        kill(pw->p_pid,SIGCONT);

and if you look close they are encased in logic that already has done a 
kill/killpg
that does check

the other 4 instances are checked:

        r = kill(pid,sig);
        if((r = killpg(-pid,sig))>=0 && !stopsig)
        r = kill(pid,sig);
        r = killpg(-pid,sig);
        r = killpg(pid,sig);

> I noticed this when David reported that on Linux 3% of the USR1 and
> USR2 signals get lost, which indicates a kernel bug. This is the only
> place I could find which may be a culprit for 'silently' loosing
> signals.

> If this isn't the culprit then Linux has a bug.

> Ced
> -- 
> Cedric Blancher <[email protected]>
> Institute Pasteur

_______________________________________________
ast-developers mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-developers

Reply via email to