Re: sendsyslog file race
On Mon, Mar 27, 2017 at 6:09 PM, Alexander Bluhmwrote: > On Mon, Mar 27, 2017 at 05:39:27PM +0200, Mateusz Guzik wrote: > > The previous patch replaced multiple reads of the global var with just > > one read and had the result stored in a local variable, which then is > > read multiple times. Even though the compiler ended up emitting one read > > of the global, it perfectly legal to emit several, thus the bug can still > > reappear. This can be prevented as described above. > > It is not a problem when the compiler creates multiple read > instructions for syslogf as long no function call is between them. > Nothing else can change the content of syslogf in parallel, we are > running with the big kernel lock. > > It would be a problem if the compiler would do a read from the > global syslogf variable before and after a sleep. But that would > be illegal in C. Agreed. I somehow brainfarted myself into not realising it really can't reload the var across a func call which would be needed to give cpu to someone else. Sorry for the noise in that bit. -- Mateusz Guzik
Re: sendsyslog file race
> On Mon, Mar 27, 2017 at 05:39:27PM +0200, Mateusz Guzik wrote: > > The previous patch replaced multiple reads of the global var with just > > one read and had the result stored in a local variable, which then is > > read multiple times. Even though the compiler ended up emitting one read > > of the global, it perfectly legal to emit several, thus the bug can still > > reappear. This can be prevented as described above. > > It is not a problem when the compiler creates multiple read > instructions for syslogf as long no function call is between them. > Nothing else can change the content of syslogf in parallel, we are > running with the big kernel lock. > > It would be a problem if the compiler would do a read from the > global syslogf variable before and after a sleep. But that would > be illegal in C. > > When we will unlock this code for multi processor, we need lock, > memory barrier, per cpu memory, or whatever like you suggested. > But that does not happen now. This really has nothing to do with the biglock. As the biglock gets pushed out of this layer, other locks will be added to ensure that the VFS functions can be safely call, those will satisfy the conditions. There's no point creating needless contortions.
Re: sendsyslog file race
On Mon, Mar 27, 2017 at 05:39:27PM +0200, Mateusz Guzik wrote: > The previous patch replaced multiple reads of the global var with just > one read and had the result stored in a local variable, which then is > read multiple times. Even though the compiler ended up emitting one read > of the global, it perfectly legal to emit several, thus the bug can still > reappear. This can be prevented as described above. It is not a problem when the compiler creates multiple read instructions for syslogf as long no function call is between them. Nothing else can change the content of syslogf in parallel, we are running with the big kernel lock. It would be a problem if the compiler would do a read from the global syslogf variable before and after a sleep. But that would be illegal in C. When we will unlock this code for multi processor, we need lock, memory barrier, per cpu memory, or whatever like you suggested. But that does not happen now. bluhm
Re: sendsyslog file race
On Sun, Mar 26, 2017 at 10:04 PM, Alexander Bluhmwrote: > On Sun, Mar 26, 2017 at 05:00:12PM +0200, Mateusz Guzik wrote: > > The patch is somewhat incorrect, although from what I checked it happens > > to generate the expected outcome in terms of assembly (the global pointer > > read only once and then a local copy accessed several times). You either > > need a linux-like READ_ONCE macros which casts through a volatile pointer > > or mark the global as volatile to prevent the compiler from issuing > > multiple reads in the future. > > Note that our OpenBSD kernel is still implemented with a big kernel > lock in most places. So multi processor thinking and locking is > not necessary. The execution order can only be interrupted by > hardware interrups or explicitly rescheduling with tsleep(9). Here > the sleep is hidden in copyin(), mallocarray(), sosend(), malloc(), > ktrgenio(). > > Interrupts are not relevant, they never change syslogf. As tsleep() > is a function call, it automatically acts as compiler barrier. So > volatile is not needed. > > The previous patch replaced multiple reads of the global var with just one read and had the result stored in a local variable, which then is read multiple times. Even though the compiler ended up emitting one read of the global, it perfectly legal to emit several, thus the bug can still reappear. This can be prevented as described above. > > Remaining ses sof syslogf are also super fishy: > > 1. logclose > > 2. logioctl -> LIOCSFD > > FRELE(syslogf, p); > > A few months ago I would have said FRELE() does not sleep. I think > this is currently still the case as we do not grab the netlock for > socketpairs. But we did for a while. > > As we are heading towards multi processor in kernel, I would suggest > this diff. It orders FREF() and FRELE() in a way that we only > operate on refcounted global variables. Although not necessary > with the current sleeping points, I think it results in more robust > code. With aforementioned caveat ignored, the patch indeed fixes the problem. I skimmed through and found few *sleep calls, but perhaps they cannot end up being executed for the type of socket accepted here. -- Mateusz Guzik
Re: sendsyslog file race
On Sun, Mar 26, 2017 at 05:00:12PM +0200, Mateusz Guzik wrote: > The patch is somewhat incorrect, although from what I checked it happens > to generate the expected outcome in terms of assembly (the global pointer > read only once and then a local copy accessed several times). You either > need a linux-like READ_ONCE macros which casts through a volatile pointer > or mark the global as volatile to prevent the compiler from issuing > multiple reads in the future. Note that our OpenBSD kernel is still implemented with a big kernel lock in most places. So multi processor thinking and locking is not necessary. The execution order can only be interrupted by hardware interrups or explicitly rescheduling with tsleep(9). Here the sleep is hidden in copyin(), mallocarray(), sosend(), malloc(), ktrgenio(). Interrupts are not relevant, they never change syslogf. As tsleep() is a function call, it automatically acts as compiler barrier. So volatile is not needed. > Remaining ses sof syslogf are also super fishy: > 1. logclose > 2. logioctl -> LIOCSFD > FRELE(syslogf, p); A few months ago I would have said FRELE() does not sleep. I think this is currently still the case as we do not grab the netlock for socketpairs. But we did for a while. As we are heading towards multi processor in kernel, I would suggest this diff. It orders FREF() and FRELE() in a way that we only operate on refcounted global variables. Although not necessary with the current sleeping points, I think it results in more robust code. ok? bluhm Index: kern/subr_log.c === RCS file: /data/mirror/openbsd/cvs/src/sys/kern/subr_log.c,v retrieving revision 1.49 diff -u -p -r1.49 subr_log.c --- kern/subr_log.c 24 Mar 2017 16:42:38 - 1.49 +++ kern/subr_log.c 26 Mar 2017 19:37:05 - @@ -172,10 +172,12 @@ logopen(dev_t dev, int flags, int mode, int logclose(dev_t dev, int flag, int mode, struct proc *p) { + struct file *fp; - if (syslogf) - FRELE(syslogf, p); + fp = syslogf; syslogf = NULL; + if (fp) + FRELE(fp, p); log_open = 0; logsoftc.sc_state = 0; return (0); @@ -355,11 +357,11 @@ logioctl(dev_t dev, u_long com, caddr_t case LIOCSFD: if ((error = suser(p, 0)) != 0) return (error); - if ((error = getsock(p, *(int *)data, )) != 0) + fp = syslogf; + if ((error = getsock(p, *(int *)data, )) != 0) return (error); - if (syslogf) - FRELE(syslogf, p); - syslogf = fp; + if (fp) + FRELE(fp, p); break; default: Index: kern/uipc_syscalls.c === RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_syscalls.c,v retrieving revision 1.150 diff -u -p -r1.150 uipc_syscalls.c --- kern/uipc_syscalls.c14 Feb 2017 09:46:21 - 1.150 +++ kern/uipc_syscalls.c26 Mar 2017 19:37:05 - @@ -1146,8 +1146,8 @@ getsock(struct proc *p, int fdes, struct return (EBADF); if (fp->f_type != DTYPE_SOCKET) return (ENOTSOCK); - *fpp = fp; FREF(fp); + *fpp = fp; return (0); }
Re: sendsyslog file race
On Fri, Mar 24, 2017 at 4:56 PM, Alexander Bluhmwrote: > Hi, > > There is a race in dosendsyslog() which resulted in a crash on a > 5.9 system. sosend(syslogf->f_data, ...) was called with a NULL > pointer. So syslogf is not NULL, f_data is NULL and f_count is 1. > > The file structure is ref counted, but the global variable syslogf > is not protected. So it may change during sleep and dosendsyslog() > possibly uses a different socket at each access. > > Solution is to access syslogf ony once, use a local copy, and do > the ref counting there. > I'm sending this from the gmail web interface, so formatting may be screwed up. Apologies in advance. The patch is somewhat incorrect, although from what I checked it happens to generate the expected outcome in terms of assembly (the global pointer read only once and then a local copy accessed several times). You either need a linux-like READ_ONCE macros which casts through a volatile pointer or mark the global as volatile to prevent the compiler from issuing multiple reads in the future. Remaining ses sof syslogf are also super fishy: 1. logclose if (syslogf) FRELE(syslogf, p); syslogf = NULL; If this results in fdrop I presume it can block. But if that happens, the global points to a defunct object. 2. logioctl -> LIOCSFD if (syslogf) FRELE(syslogf, p); syslogf = fp; This one will give double free and ref leaks. Suggested fix is to xchg the pointer. While it is more expensive than necessary for this case, it does not pose a problem. Also xchg is blatantly obvious, while manual replacement would require explicit memory barriers to be placed here.
sendsyslog file race
Hi, There is a race in dosendsyslog() which resulted in a crash on a 5.9 system. sosend(syslogf->f_data, ...) was called with a NULL pointer. So syslogf is not NULL, f_data is NULL and f_count is 1. The file structure is ref counted, but the global variable syslogf is not protected. So it may change during sleep and dosendsyslog() possibly uses a different socket at each access. My crash happend during a reboot when init(8) is killing syslogd(8) and some sort of super daemon tries to restart it constantly. Although this design is questionable, it helps finding kernel bugs :-) Solution is to access syslogf ony once, use a local copy, and do the ref counting there. ok? bluhm Index: kern/subr_log.c === RCS file: /data/mirror/openbsd/cvs/src/sys/kern/subr_log.c,v retrieving revision 1.48 diff -u -p -r1.48 subr_log.c --- kern/subr_log.c 23 Jun 2016 15:41:42 - 1.48 +++ kern/subr_log.c 24 Mar 2017 15:31:49 - @@ -409,14 +409,17 @@ dosendsyslog(struct proc *p, const char struct iovec *ktriov = NULL; int iovlen; #endif + struct file *fp; char pri[6], *kbuf; struct iovec aiov; struct uio auio; size_t i, len; int error; - if (syslogf) - FREF(syslogf); + /* Global variable syslogf may change during sleep, use local copy. */ + fp = syslogf; + if (fp) + FREF(fp); else if (!ISSET(flags, LOG_CONS)) return (ENOTCONN); else { @@ -467,8 +470,8 @@ dosendsyslog(struct proc *p, const char #endif len = auio.uio_resid; - if (syslogf) { - error = sosend(syslogf->f_data, NULL, , NULL, NULL, 0); + if (fp) { + error = sosend(fp->f_data, NULL, , NULL, NULL, 0); if (error == 0) len -= auio.uio_resid; } else if (constty || cn_devvp) { @@ -515,8 +518,8 @@ dosendsyslog(struct proc *p, const char free(ktriov, M_TEMP, iovlen); } #endif - if (syslogf) - FRELE(syslogf, p); + if (fp) + FRELE(fp, p); else error = ENOTCONN; return (error);