Re: sendsyslog file race

2017-03-27 Thread Mateusz Guzik
On Mon, Mar 27, 2017 at 6:09 PM, Alexander Bluhm 
wrote:

> On Mon, Mar 27, 2017 at 05:39:27PM +0200, Mateusz Guzik wrote:
> > The previous patch replaced multiple reads of the global var with just
> > one read and had the result stored in a local variable, which then is
> > read multiple times. Even though the compiler ended up emitting one read
> > of the global, it perfectly legal to emit several, thus the bug can still
> > reappear. This can be prevented as described above.
>
> It is not a problem when the compiler creates multiple read
> instructions for syslogf as long no function call is between them.
> Nothing else can change the content of syslogf in parallel, we are
> running with the big kernel lock.
>
> It would be a problem if the compiler would do a read from the
> global syslogf variable before and after a sleep.  But that would
> be illegal in C.


Agreed. I somehow brainfarted myself into not realising it really can't
reload the var across a func call which would be needed to give cpu to
someone else.

Sorry for the noise in that bit.

-- 
Mateusz Guzik 


Re: sendsyslog file race

2017-03-27 Thread Theo de Raadt
> On Mon, Mar 27, 2017 at 05:39:27PM +0200, Mateusz Guzik wrote:
> > The previous patch replaced multiple reads of the global var with just
> > one read and had the result stored in a local variable, which then is
> > read multiple times. Even though the compiler ended up emitting one read
> > of the global, it perfectly legal to emit several, thus the bug can still
> > reappear. This can be prevented as described above.
> 
> It is not a problem when the compiler creates multiple read
> instructions for syslogf as long no function call is between them.
> Nothing else can change the content of syslogf in parallel, we are
> running with the big kernel lock.
> 
> It would be a problem if the compiler would do a read from the
> global syslogf variable before and after a sleep.  But that would
> be illegal in C.
> 
> When we will unlock this code for multi processor, we need lock,
> memory barrier, per cpu memory, or whatever like you suggested.
> But that does not happen now.

This really has nothing to do with the biglock.

As the biglock gets pushed out of this layer, other locks will be
added to ensure that the VFS functions can be safely call, those will
satisfy the conditions.  There's no point creating needless
contortions.



Re: sendsyslog file race

2017-03-27 Thread Alexander Bluhm
On Mon, Mar 27, 2017 at 05:39:27PM +0200, Mateusz Guzik wrote:
> The previous patch replaced multiple reads of the global var with just
> one read and had the result stored in a local variable, which then is
> read multiple times. Even though the compiler ended up emitting one read
> of the global, it perfectly legal to emit several, thus the bug can still
> reappear. This can be prevented as described above.

It is not a problem when the compiler creates multiple read
instructions for syslogf as long no function call is between them.
Nothing else can change the content of syslogf in parallel, we are
running with the big kernel lock.

It would be a problem if the compiler would do a read from the
global syslogf variable before and after a sleep.  But that would
be illegal in C.

When we will unlock this code for multi processor, we need lock,
memory barrier, per cpu memory, or whatever like you suggested.
But that does not happen now.

bluhm



Re: sendsyslog file race

2017-03-27 Thread Mateusz Guzik
On Sun, Mar 26, 2017 at 10:04 PM, Alexander Bluhm 
wrote:

> On Sun, Mar 26, 2017 at 05:00:12PM +0200, Mateusz Guzik wrote:
> > The patch is somewhat incorrect, although from what I checked it happens
> > to generate the expected outcome in terms of assembly (the global pointer
> > read only once and then a local copy accessed several times). You either
> > need a linux-like READ_ONCE macros which casts through a volatile pointer
> > or mark the global as volatile to prevent the compiler from issuing
> > multiple reads in the future.
>
> Note that our OpenBSD kernel is still implemented with a big kernel
> lock in most places.  So multi processor thinking and locking is
> not necessary.  The execution order can only be interrupted by
> hardware interrups or explicitly rescheduling with tsleep(9).  Here
> the sleep is hidden in copyin(), mallocarray(), sosend(), malloc(),
> ktrgenio().
>
> Interrupts are not relevant, they never change syslogf.  As tsleep()
> is a function call, it automatically acts as compiler barrier.  So
> volatile is not needed.
>
>
The previous patch replaced multiple reads of the global var with just
one read and had the result stored in a local variable, which then is
read multiple times. Even though the compiler ended up emitting one read
of the global, it perfectly legal to emit several, thus the bug can still
reappear. This can be prevented as described above.


> > Remaining ses sof syslogf are also super fishy:
> > 1. logclose
> > 2. logioctl -> LIOCSFD
> > FRELE(syslogf, p);
>
> A few months ago I would have said FRELE() does not sleep.  I think
> this is currently still the case as we do not grab the netlock for
> socketpairs.  But we did for a while.
>
> As we are heading towards multi processor in kernel, I would suggest
> this diff.  It orders FREF() and FRELE() in a way that we only
> operate on refcounted global variables.  Although not necessary
> with the current sleeping points, I think it results in more robust
> code.


With aforementioned caveat ignored, the patch indeed fixes the problem.

I skimmed through and found few *sleep calls, but perhaps they cannot
end up being executed for the type of socket accepted here.

-- 
Mateusz Guzik 


Re: sendsyslog file race

2017-03-26 Thread Alexander Bluhm
On Sun, Mar 26, 2017 at 05:00:12PM +0200, Mateusz Guzik wrote:
> The patch is somewhat incorrect, although from what I checked it happens
> to generate the expected outcome in terms of assembly (the global pointer
> read only once and then a local copy accessed several times). You either
> need a linux-like READ_ONCE macros which casts through a volatile pointer
> or mark the global as volatile to prevent the compiler from issuing
> multiple reads in the future.

Note that our OpenBSD kernel is still implemented with a big kernel
lock in most places.  So multi processor thinking and locking is
not necessary.  The execution order can only be interrupted by
hardware interrups or explicitly rescheduling with tsleep(9).  Here
the sleep is hidden in copyin(), mallocarray(), sosend(), malloc(),
ktrgenio().

Interrupts are not relevant, they never change syslogf.  As tsleep()
is a function call, it automatically acts as compiler barrier.  So
volatile is not needed.

> Remaining ses sof syslogf are also super fishy:
> 1. logclose
> 2. logioctl -> LIOCSFD
> FRELE(syslogf, p);

A few months ago I would have said FRELE() does not sleep.  I think
this is currently still the case as we do not grab the netlock for
socketpairs.  But we did for a while.

As we are heading towards multi processor in kernel, I would suggest
this diff.  It orders FREF() and FRELE() in a way that we only
operate on refcounted global variables.  Although not necessary
with the current sleeping points, I think it results in more robust
code.

ok?

bluhm

Index: kern/subr_log.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/subr_log.c,v
retrieving revision 1.49
diff -u -p -r1.49 subr_log.c
--- kern/subr_log.c 24 Mar 2017 16:42:38 -  1.49
+++ kern/subr_log.c 26 Mar 2017 19:37:05 -
@@ -172,10 +172,12 @@ logopen(dev_t dev, int flags, int mode, 
 int
 logclose(dev_t dev, int flag, int mode, struct proc *p)
 {
+   struct file *fp;
 
-   if (syslogf)
-   FRELE(syslogf, p);
+   fp = syslogf;
syslogf = NULL;
+   if (fp)
+   FRELE(fp, p);
log_open = 0;
logsoftc.sc_state = 0;
return (0);
@@ -355,11 +357,11 @@ logioctl(dev_t dev, u_long com, caddr_t 
case LIOCSFD:
if ((error = suser(p, 0)) != 0)
return (error);
-   if ((error = getsock(p, *(int *)data, )) != 0)
+   fp = syslogf;
+   if ((error = getsock(p, *(int *)data, )) != 0)
return (error);
-   if (syslogf)
-   FRELE(syslogf, p);
-   syslogf = fp;
+   if (fp)
+   FRELE(fp, p);
break;
 
default:
Index: kern/uipc_syscalls.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_syscalls.c,v
retrieving revision 1.150
diff -u -p -r1.150 uipc_syscalls.c
--- kern/uipc_syscalls.c14 Feb 2017 09:46:21 -  1.150
+++ kern/uipc_syscalls.c26 Mar 2017 19:37:05 -
@@ -1146,8 +1146,8 @@ getsock(struct proc *p, int fdes, struct
return (EBADF);
if (fp->f_type != DTYPE_SOCKET)
return (ENOTSOCK);
-   *fpp = fp;
FREF(fp);
+   *fpp = fp;
 
return (0);
 }



Re: sendsyslog file race

2017-03-26 Thread Mateusz Guzik
On Fri, Mar 24, 2017 at 4:56 PM, Alexander Bluhm 
wrote:

> Hi,
>
> There is a race in dosendsyslog() which resulted in a crash on a
> 5.9 system.  sosend(syslogf->f_data, ...) was called with a NULL
> pointer.  So syslogf is not NULL, f_data is NULL and f_count is 1.
>
> The file structure is ref counted, but the global variable syslogf
> is not protected.  So it may change during sleep and dosendsyslog()
> possibly uses a different socket at each access.
>
> Solution is to access syslogf ony once, use a local copy, and do
> the ref counting there.
>

I'm sending this from the gmail web interface, so formatting may be screwed
up. Apologies in advance.

The patch is somewhat incorrect, although from what I checked it happens
to generate the expected outcome in terms of assembly (the global pointer
read only once and then a local copy accessed several times). You either
need a linux-like READ_ONCE macros which casts through a volatile pointer
or mark the global as volatile to prevent the compiler from issuing
multiple reads in the future.

Remaining ses sof syslogf are also super fishy:

1. logclose

if (syslogf)
FRELE(syslogf, p);
syslogf = NULL;

If this results in fdrop I presume it can block. But if that happens, the
global points to a defunct object.

2. logioctl -> LIOCSFD

if (syslogf)
FRELE(syslogf, p);
syslogf = fp;

This one will give double free and ref leaks.

Suggested fix is to xchg the pointer. While it is more expensive than
necessary for this case, it does not pose a problem. Also xchg is
blatantly obvious, while manual replacement would require explicit memory
barriers to be placed here.


sendsyslog file race

2017-03-24 Thread Alexander Bluhm
Hi,

There is a race in dosendsyslog() which resulted in a crash on a
5.9 system.  sosend(syslogf->f_data, ...) was called with a NULL
pointer.  So syslogf is not NULL, f_data is NULL and f_count is 1.

The file structure is ref counted, but the global variable syslogf
is not protected.  So it may change during sleep and dosendsyslog()
possibly uses a different socket at each access.

My crash happend during a reboot when init(8) is killing syslogd(8)
and some sort of super daemon tries to restart it constantly.
Although this design is questionable, it helps finding kernel bugs :-)

Solution is to access syslogf ony once, use a local copy, and do
the ref counting there.

ok?

bluhm

Index: kern/subr_log.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/subr_log.c,v
retrieving revision 1.48
diff -u -p -r1.48 subr_log.c
--- kern/subr_log.c 23 Jun 2016 15:41:42 -  1.48
+++ kern/subr_log.c 24 Mar 2017 15:31:49 -
@@ -409,14 +409,17 @@ dosendsyslog(struct proc *p, const char 
struct iovec *ktriov = NULL;
int iovlen;
 #endif
+   struct file *fp;
char pri[6], *kbuf;
struct iovec aiov;
struct uio auio;
size_t i, len;
int error;
 
-   if (syslogf)
-   FREF(syslogf);
+   /* Global variable syslogf may change during sleep, use local copy. */
+   fp = syslogf;
+   if (fp)
+   FREF(fp);
else if (!ISSET(flags, LOG_CONS))
return (ENOTCONN);
else {
@@ -467,8 +470,8 @@ dosendsyslog(struct proc *p, const char 
 #endif
 
len = auio.uio_resid;
-   if (syslogf) {
-   error = sosend(syslogf->f_data, NULL, , NULL, NULL, 0);
+   if (fp) {
+   error = sosend(fp->f_data, NULL, , NULL, NULL, 0);
if (error == 0)
len -= auio.uio_resid;
} else if (constty || cn_devvp) {
@@ -515,8 +518,8 @@ dosendsyslog(struct proc *p, const char 
free(ktriov, M_TEMP, iovlen);
}
 #endif
-   if (syslogf)
-   FRELE(syslogf, p);
+   if (fp)
+   FRELE(fp, p);
else
error = ENOTCONN;
return (error);