Re: syslog() thread unsafety

2017-06-14 Thread Eugene Grosbein
15.06.2017 0:09, Konstantin Belousov wrote:

> On Wed, Jun 14, 2017 at 11:39:39PM +0700, Eugene Grosbein wrote:
>> 14.06.2017 21:12, Konstantin Belousov wrote:
>>
>>> If the issue is that mpd5 cancels logging thread, and this leaves the
>>> mutex in the locked state, the right solution is to establish a cleanup
>>> handler around the locked region.  Note that this can only work if the
>>> cancellation is in deferred mode, async mode is unsafe by definition.
>>>
>>> Try something like this, untested even a minimal bit.
>>
>> [skip]
>>
>> I've given it a spin with unpatched mpd5 and it seems to work just fine now.
>> I'm curious, should these two lines be swapped?
>>
>> +THREAD_LOCK();
>> +pthread_cleanup_push(syslog_cancel_cleanup, NULL);
>>
>> It seems it could be a race between another thread's pthread_cancel()
>> and pthread_cleanup_push() here.
> As I mentioned in my previous reply, you can only rely on some
> guarantees for the deferred cancellation mode. In this mode, only some
> calls are allowed to become cancellation points. In particular, the
> cancellation cannot happen between THREAD_LOCK() and push().

Forgot to mention that mpd5 does not change the mode to 
PTHREAD_CANCEL_ASYNCHRONOUS
from the default deferred.

> That is, popen() looks safe.
>> getlogin.c
>> lib/libc/stdio: findfp.c, fclose.c
> I already wrote about fclose() above, might be it indeed should grow the
> cleanup handler for the general case. I do not see any code which would
> need cancellation protection in findfp, what exactly do you mean there ?
> Same for getlogin().

That was just quick scan for THREAD_LOCK() :-)
I'm new to these details of pthreads.

>> Please consider committing the fix at least for syslog.c
> Confirm that it fixed your case, then I will commit the change.

Well, my stress test for mpd5 has been running for several hours
and no hang still. Without your patch, it locked in a minute,
so it seems the case is fixed, thanks!

Eugene Grosbein

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: 11.1-BETA1: lsof build failure

2017-06-14 Thread Peter

Larry Rosenman wrote:

Current lsof is 4.90M.


Ack, that does.



Larry
Sysutils/lsof maintainer

On 6/14/17, 8:13 AM, "Peter"  wrote:

 FYI, please check if reproducible and/or issue:


 Installed this from SVN & local build:

 11.1-BETA1 FreeBSD 11.1-BETA1 #0 r319858:319867M ... amd64

 Then tried to update lsof-4.90.f,8 and got this error:

 cc  -pipe -DNEEDS_BOOL_TYPEDEF -DHASTASKS -DHAS_PAUSE_SBT
 -DHASEFFNLINK=i_effnlink -DHASF_VNODE -DHAS_FILEDESCENT -DHAS_TMPFS
 -DHASWCTYPE_H -DHASSBSTATE -DHAS_KVM_VNODE -DHAS_UFS1_2
 -DHAS_VM_MEMATTR_T -DHAS_CDEV2PRIV -DHAS_NO_SI_UDEV -DHAS_SYS_SX_H
 -DHASFUSEFS -DHAS_ZFS -DHAS_V_LOCKF -DHAS_LOCKF_ENTRY -DHAS_NO_6PORT
 -DHAS_NO_6PPCB -DNEEDS_BOOLEAN_T -DHAS_SB_CCC -DHAS_FDESCENTTBL
 -DFREEBSDV=11000 -DHASFDESCFS=2 -DHASPSEUDOFS -DHASNULLFS -DHASIPv6
 -DHASUTMPX -DHAS_STRFTIME -DLSOF_VSTR="11.1-BETA1" -I/usr/src/sys -O2 -c
 dvch.c -o dvch.o
 --- dnode.o ---
 dnode.c:906:13: error: no member named 'i_dev' in 'struct inode'
  if (i->i_dev
  ~  ^
 dnode.c:916:27: error: no member named 'i_dev' in 'struct inode'
  dev = Dev2Udev((KA_T)i->i_dev);
   ~  ^
 2 errors generated.
 *** [dnode.o] Error code 1

 ___
 freebsd-stable@freebsd.org mailing list
 https://lists.freebsd.org/mailman/listinfo/freebsd-stable
 To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"



___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"



___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: syslog() thread unsafety

2017-06-14 Thread Konstantin Belousov
On Wed, Jun 14, 2017 at 11:39:39PM +0700, Eugene Grosbein wrote:
> 14.06.2017 21:12, Konstantin Belousov wrote:
> 
> > If the issue is that mpd5 cancels logging thread, and this leaves the
> > mutex in the locked state, the right solution is to establish a cleanup
> > handler around the locked region.  Note that this can only work if the
> > cancellation is in deferred mode, async mode is unsafe by definition.
> > 
> > Try something like this, untested even a minimal bit.
> 
> [skip]
> 
> I've given it a spin with unpatched mpd5 and it seems to work just fine now.
> I'm curious, should these two lines be swapped?
> 
> + THREAD_LOCK();
> + pthread_cleanup_push(syslog_cancel_cleanup, NULL);
> 
> It seems it could be a race between another thread's pthread_cancel()
> and pthread_cleanup_push() here.
As I mentioned in my previous reply, you can only rely on some
guarantees for the deferred cancellation mode. In this mode, only some
calls are allowed to become cancellation points. In particular, the
cancellation cannot happen between THREAD_LOCK() and push().

OTOH, if you configured the thread for async cancellation mode, then
cancellation can happen anywhere. If you called any non-signal-safe
function, any invariant can be broken. That is, the cancellation cleanup
is not supposed to be useful with async cancellation.  And you are not
supposed to call into stdio or syslog() while in async.

> 
> Anyway, we have several other places in the lib/ with similar code
> possibly missing pthread_cleanup_push():
> 
> lib/libc/gen: popen.c,
I doubt that popen() needs the cancellation protection.  The _close()
calls there are explicitely not the cancellation points, they are
direct syscall invocations, and are not subject to cancellation testing.

The only potentially problematic case could be the fclose() call in the locked
region.  But again, fclose(3) only potential cancellation point is the
fp->_close() call.  For fdopen()-ed FILEs, _close is set to the __sclose()
wrapper which only calls _close().

That is, popen() looks safe.

> getlogin.c
> lib/libc/stdio: findfp.c, fclose.c
I already wrote about fclose() above, might be it indeed should grow the
cleanup handler for the general case. I do not see any code which would
need cancellation protection in findfp, what exactly do you mean there ?
Same for getlogin().

> 
> Please consider committing the fix at least for syslog.c
Confirm that it fixed your case, then I will commit the change.
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: syslog() thread unsafety

2017-06-14 Thread Eugene Grosbein
14.06.2017 21:12, Konstantin Belousov wrote:

> If the issue is that mpd5 cancels logging thread, and this leaves the
> mutex in the locked state, the right solution is to establish a cleanup
> handler around the locked region.  Note that this can only work if the
> cancellation is in deferred mode, async mode is unsafe by definition.
> 
> Try something like this, untested even a minimal bit.

[skip]

I've given it a spin with unpatched mpd5 and it seems to work just fine now.
I'm curious, should these two lines be swapped?

+   THREAD_LOCK();
+   pthread_cleanup_push(syslog_cancel_cleanup, NULL);

It seems it could be a race between another thread's pthread_cancel()
and pthread_cleanup_push() here.

Anyway, we have several other places in the lib/ with similar code
possibly missing pthread_cleanup_push():

lib/libc/gen: popen.c, getlogin.c
lib/libc/stdio: findfp.c, fclose.c

Please consider committing the fix at least for syslog.c


___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: syslog() thread unsafety

2017-06-14 Thread Konstantin Belousov
On Wed, Jun 14, 2017 at 08:49:36PM +0700, Eugene Grosbein wrote:
> Hi!
> 
> Our [v]syslog() implementation in src/lib/libc/gen/syslog.c
> tries to be thread-safe and uses "syslog_mutex".
> 
> It may lock this mutex then call blocking system calls like sendto().
> If a thread owning this mutex is pthread_cancel()'d in process,
> the mutex stays UMUTEX_CONTESTED and every other thread calling [v]syslog() 
> deadlocks.
> 
> I can reproduce this with net/mpd5 daemon reliably within some seconds
> from the beginning of my stress test involving RADIUS so mpd5 runs 
> multi-threaded.
> 
> I've tried to wrap every mpd5's [v]syslog() call around with
> pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, ) and
> pthread_setcancelstate(oldstate, NULL) statements and this eliminates the 
> problem
> but that's not a solution in any way.
> 
> This problem seems to be root cause of multiple complaints
> about mpd5 being unstable under FreeBSD 9.x+ version (PR: 186114, 214482).

If the issue is that mpd5 cancels logging thread, and this leaves the
mutex in the locked state, the right solution is to establish a cleanup
handler around the locked region.  Note that this can only work if the
cancellation is in deferred mode, async mode is unsafe by definition.

Try something like this, untested even a minimal bit.

diff --git a/lib/libc/gen/syslog.c b/lib/libc/gen/syslog.c
index dfca67360fa..2c390efe95e 100644
--- a/lib/libc/gen/syslog.c
+++ b/lib/libc/gen/syslog.c
@@ -129,8 +129,8 @@ syslog(int pri, const char *fmt, ...)
va_end(ap);
 }
 
-void
-vsyslog(int pri, const char *fmt, va_list ap)
+static void
+vsyslog1(int pri, const char *fmt, va_list ap)
 {
int cnt;
char ch, *p;
@@ -151,13 +151,9 @@ vsyslog(int pri, const char *fmt, va_list ap)
 
saved_errno = errno;
 
-   THREAD_LOCK();
-
/* Check priority against setlogmask values. */
-   if (!(LOG_MASK(LOG_PRI(pri)) & LogMask)) {
-   THREAD_UNLOCK();
+   if (!(LOG_MASK(LOG_PRI(pri)) & LogMask))
return;
-   }
 
/* Set default facility if none specified. */
if ((pri & LOG_FACMASK) == 0)
@@ -167,10 +163,8 @@ vsyslog(int pri, const char *fmt, va_list ap)
tbuf_cookie.base = tbuf;
tbuf_cookie.left = sizeof(tbuf);
fp = fwopen(_cookie, writehook);
-   if (fp == NULL) {
-   THREAD_UNLOCK();
+   if (fp == NULL)
return;
-   }
 
/* Build the message. */
(void)time();
@@ -200,7 +194,6 @@ vsyslog(int pri, const char *fmt, va_list ap)
fmt_fp = fwopen(_cookie, writehook);
if (fmt_fp == NULL) {
fclose(fp);
-   THREAD_UNLOCK();
return;
}
 
@@ -285,10 +278,8 @@ vsyslog(int pri, const char *fmt, va_list ap)
 */
disconnectlog();
connectlog();
-   if (send(LogFile, tbuf, cnt, 0) >= 0) {
-   THREAD_UNLOCK();
+   if (send(LogFile, tbuf, cnt, 0) >= 0)
return;
-   }
/*
 * if the resend failed, fall through to
 * possible scenario 2
@@ -303,15 +294,11 @@ vsyslog(int pri, const char *fmt, va_list ap)
if (status == CONNPRIV)
break;
_usleep(1);
-   if (send(LogFile, tbuf, cnt, 0) >= 0) {
-   THREAD_UNLOCK();
+   if (send(LogFile, tbuf, cnt, 0) >= 0)
return;
-   }
}
-   } else {
-   THREAD_UNLOCK();
+   } else
return;
-   }
 
/*
 * Output the message to the console; try not to block
@@ -333,10 +320,25 @@ vsyslog(int pri, const char *fmt, va_list ap)
(void)_writev(fd, iov, 2);
(void)_close(fd);
}
+}
+
+static void
+syslog_cancel_cleanup(void *arg __unused)
+{
 
THREAD_UNLOCK();
 }
 
+void
+vsyslog(int pri, const char *fmt, va_list ap)
+{
+
+   THREAD_LOCK();
+   pthread_cleanup_push(syslog_cancel_cleanup, NULL);
+   vsyslog1(pri, fmt, ap);
+   pthread_cleanup_pop(1);
+}
+
 /* Should be called with mutex acquired */
 static void
 disconnectlog(void)
@@ -423,9 +425,11 @@ openlog_unlocked(const char *ident, int logstat, int 
logfac)
 void
 openlog(const char *ident, int logstat, int logfac)
 {
+
THREAD_LOCK();
+   pthread_cleanup_push(syslog_cancel_cleanup, NULL);
openlog_unlocked(ident, logstat, logfac);
-   THREAD_UNLOCK();
+   pthread_cleanup_pop(1);
 }
 
 
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To 

syslog() thread unsafety

2017-06-14 Thread Eugene Grosbein
Hi!

Our [v]syslog() implementation in src/lib/libc/gen/syslog.c
tries to be thread-safe and uses "syslog_mutex".

It may lock this mutex then call blocking system calls like sendto().
If a thread owning this mutex is pthread_cancel()'d in process,
the mutex stays UMUTEX_CONTESTED and every other thread calling [v]syslog() 
deadlocks.

I can reproduce this with net/mpd5 daemon reliably within some seconds
from the beginning of my stress test involving RADIUS so mpd5 runs 
multi-threaded.

I've tried to wrap every mpd5's [v]syslog() call around with
pthread_setcancelstate(PTHREAD_CANCEL_DISABLE, ) and
pthread_setcancelstate(oldstate, NULL) statements and this eliminates the 
problem
but that's not a solution in any way.

This problem seems to be root cause of multiple complaints
about mpd5 being unstable under FreeBSD 9.x+ version (PR: 186114, 214482).

Please take a look.

Eugene Grosbein
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: 11.1-BETA1: lsof build failure

2017-06-14 Thread Larry Rosenman
Current lsof is 4.90M. 

Larry
Sysutils/lsof maintainer

On 6/14/17, 8:13 AM, "Peter"  wrote:

FYI, please check if reproducible and/or issue:


Installed this from SVN & local build:

11.1-BETA1 FreeBSD 11.1-BETA1 #0 r319858:319867M ... amd64

Then tried to update lsof-4.90.f,8 and got this error:

cc  -pipe -DNEEDS_BOOL_TYPEDEF -DHASTASKS -DHAS_PAUSE_SBT 
-DHASEFFNLINK=i_effnlink -DHASF_VNODE -DHAS_FILEDESCENT -DHAS_TMPFS 
-DHASWCTYPE_H -DHASSBSTATE -DHAS_KVM_VNODE -DHAS_UFS1_2 
-DHAS_VM_MEMATTR_T -DHAS_CDEV2PRIV -DHAS_NO_SI_UDEV -DHAS_SYS_SX_H 
-DHASFUSEFS -DHAS_ZFS -DHAS_V_LOCKF -DHAS_LOCKF_ENTRY -DHAS_NO_6PORT 
-DHAS_NO_6PPCB -DNEEDS_BOOLEAN_T -DHAS_SB_CCC -DHAS_FDESCENTTBL 
-DFREEBSDV=11000 -DHASFDESCFS=2 -DHASPSEUDOFS -DHASNULLFS -DHASIPv6 
-DHASUTMPX -DHAS_STRFTIME -DLSOF_VSTR="11.1-BETA1" -I/usr/src/sys -O2 -c 
dvch.c -o dvch.o
--- dnode.o ---
dnode.c:906:13: error: no member named 'i_dev' in 'struct inode'
 if (i->i_dev
 ~  ^
dnode.c:916:27: error: no member named 'i_dev' in 'struct inode'
 dev = Dev2Udev((KA_T)i->i_dev);
  ~  ^
2 errors generated.
*** [dnode.o] Error code 1

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"



___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


11.1-BETA1: lsof build failure

2017-06-14 Thread Peter

FYI, please check if reproducible and/or issue:


Installed this from SVN & local build:

11.1-BETA1 FreeBSD 11.1-BETA1 #0 r319858:319867M ... amd64

Then tried to update lsof-4.90.f,8 and got this error:

cc  -pipe -DNEEDS_BOOL_TYPEDEF -DHASTASKS -DHAS_PAUSE_SBT 
-DHASEFFNLINK=i_effnlink -DHASF_VNODE -DHAS_FILEDESCENT -DHAS_TMPFS 
-DHASWCTYPE_H -DHASSBSTATE -DHAS_KVM_VNODE -DHAS_UFS1_2 
-DHAS_VM_MEMATTR_T -DHAS_CDEV2PRIV -DHAS_NO_SI_UDEV -DHAS_SYS_SX_H 
-DHASFUSEFS -DHAS_ZFS -DHAS_V_LOCKF -DHAS_LOCKF_ENTRY -DHAS_NO_6PORT 
-DHAS_NO_6PPCB -DNEEDS_BOOLEAN_T -DHAS_SB_CCC -DHAS_FDESCENTTBL 
-DFREEBSDV=11000 -DHASFDESCFS=2 -DHASPSEUDOFS -DHASNULLFS -DHASIPv6 
-DHASUTMPX -DHAS_STRFTIME -DLSOF_VSTR="11.1-BETA1" -I/usr/src/sys -O2 -c 
dvch.c -o dvch.o

--- dnode.o ---
dnode.c:906:13: error: no member named 'i_dev' in 'struct inode'
if (i->i_dev
~  ^
dnode.c:916:27: error: no member named 'i_dev' in 'struct inode'
dev = Dev2Udev((KA_T)i->i_dev);
 ~  ^
2 errors generated.
*** [dnode.o] Error code 1

___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"