[patch] Fix multithreaded snprintf

2007-09-06 Thread Brian Dessent

I tracked down the problem reported in
http://www.cygwin.com/ml/cygwin/2007-09/msg00120.html.  The crash was
occuring in pthread_mutex_lock, but that's a bit of a red herring.  The
real problem is that both newlib and Cygwin provide a
include/sys/stdio.h file, however they were out of sync with regard to
the _flockfile definition.

This comes about because vsnprintf() is implemented by creating a struct
FILE that represents the string buffer and then this is passed to the
standard vfprintf().  The 'flags' member of this FILE has the __SSTR
flag set to indicate that this is just a string buffer, and consequently
no locking should or can be performed; the lock member isn't even
initialized.

The newlib/libc/include/sys/stdio.h therefore has this:

#if !defined(_flockfile)
#ifndef __SINGLE_THREAD__
#  define _flockfile(fp) (((fp)-_flags  __SSTR) ? 0 :
__lock_acquire_recursive((fp)-_lock))
#else
#  define _flockfile(fp)
#endif
#endif

#if !defined(_funlockfile)
#ifndef __SINGLE_THREAD__
#  define _funlockfile(fp) (((fp)-_flags  __SSTR) ? 0 :
__lock_release_recursive((fp)-_lock))
#else
#  define _funlockfile(fp)
#endif
#endif

However, the Cygwin version of this header with the same name gets
preference and doesn't know to check the flags like this, and thus
unconditionally tries to lock the stream.  This leads ultimately to a
crash in pthread_mutex_lock because the lock member is just
uninitialized junk.

The attached patch fixes Cygwin's copy of the header and makes the
poster's testcase function as expected.  This only would appear in a
multithreaded program because the __cygwin_lock_* functions expand to
no-op in the case where there's only one thread.

Since this is used in a C++ file (syscalls.cc) I couldn't do the test ?
0 : func() idiom where void is the return type as that generates a
compiler error, so I use an 'if'.

Brian2007-09-06  Brian Dessent  [EMAIL PROTECTED]

* include/sys/stdio.h (_flockfile): Don't try to lock a FILE
that has the __SSTR flag set.
(_ftrylockfile): Likewise.
(_funlockfile): Likewise.


Index: include/sys/stdio.h
===
RCS file: /cvs/src/src/winsup/cygwin/include/sys/stdio.h,v
retrieving revision 1.6
diff -u -p -r1.6 stdio.h
--- include/sys/stdio.h 5 Feb 2006 20:30:24 -   1.6
+++ include/sys/stdio.h 6 Sep 2007 18:27:33 -
@@ -16,13 +16,16 @@ details. */
 
 #if !defined(__SINGLE_THREAD__)
 #  if !defined(_flockfile)
-#define _flockfile(fp) __cygwin_lock_lock ((_LOCK_T *)(fp)-_lock)
+#define _flockfile(fp) ({ if (!((fp)-_flags  __SSTR)) \
+  __cygwin_lock_lock ((_LOCK_T *)(fp)-_lock); })
 #  endif
 #  if !defined(_ftrylockfile)
-#define _ftrylockfile(fp) __cygwin_lock_trylock ((_LOCK_T *)(fp)-_lock)
+#define _ftrylockfile(fp) (((fp)-_flags  __SSTR) ? 0 : \
+  __cygwin_lock_trylock ((_LOCK_T *)(fp)-_lock))
 #  endif
 #  if !defined(_funlockfile)
-#define _funlockfile(fp) __cygwin_lock_unlock ((_LOCK_T *)(fp)-_lock)
+#define _funlockfile(fp) ({ if (!((fp)-_flags  __SSTR)) \
+  __cygwin_lock_unlock ((_LOCK_T *)(fp)-_lock); })
 #  endif
 #endif
 


Re: [patch] Fix multithreaded snprintf

2007-09-06 Thread Brian Dessent
Christopher Faylor wrote:

 Go ahead and check this in but could you add a comment indicating that
 this part of include/sys/stdio.h has to be kept in sync with newlib?

Done.

 Nice catch!

I wish I could say I caught this by inspection but it was only by single
stepping through python guts that it became apparent what was going on.

Brian


Re: [patch] Fix multithreaded snprintf

2007-09-06 Thread Christopher Faylor
On Thu, Sep 06, 2007 at 11:30:17AM -0700, Brian Dessent wrote:

I tracked down the problem reported in
http://www.cygwin.com/ml/cygwin/2007-09/msg00120.html.  The crash was
occuring in pthread_mutex_lock, but that's a bit of a red herring.  The
real problem is that both newlib and Cygwin provide a
include/sys/stdio.h file, however they were out of sync with regard to
the _flockfile definition.

This comes about because vsnprintf() is implemented by creating a struct
FILE that represents the string buffer and then this is passed to the
standard vfprintf().  The 'flags' member of this FILE has the __SSTR
flag set to indicate that this is just a string buffer, and consequently
no locking should or can be performed; the lock member isn't even
initialized.

The newlib/libc/include/sys/stdio.h therefore has this:

#if !defined(_flockfile)
#ifndef __SINGLE_THREAD__
#  define _flockfile(fp) (((fp)-_flags  __SSTR) ? 0 :
__lock_acquire_recursive((fp)-_lock))
#else
#  define _flockfile(fp)
#endif
#endif

#if !defined(_funlockfile)
#ifndef __SINGLE_THREAD__
#  define _funlockfile(fp) (((fp)-_flags  __SSTR) ? 0 :
__lock_release_recursive((fp)-_lock))
#else
#  define _funlockfile(fp)
#endif
#endif

However, the Cygwin version of this header with the same name gets
preference and doesn't know to check the flags like this, and thus
unconditionally tries to lock the stream.  This leads ultimately to a
crash in pthread_mutex_lock because the lock member is just
uninitialized junk.

The attached patch fixes Cygwin's copy of the header and makes the
poster's testcase function as expected.  This only would appear in a
multithreaded program because the __cygwin_lock_* functions expand to
no-op in the case where there's only one thread.

Since this is used in a C++ file (syscalls.cc) I couldn't do the test ?
0 : func() idiom where void is the return type as that generates a
compiler error, so I use an 'if'.

Thanks for the patch.

Go ahead and check this in but could you add a comment indicating that
this part of include/sys/stdio.h has to be kept in sync with newlib?

Nice catch!

cgf


Re: [patch] Fix multithreaded snprintf

2007-09-06 Thread Christopher Faylor
On Thu, Sep 06, 2007 at 11:53:02AM -0700, Brian Dessent wrote:
Christopher Faylor wrote:
 Nice catch!

I wish I could say I caught this by inspection but it was only by single
stepping through python guts that it became apparent what was going on.

Better you than me.  :-)

cgf


Re: [patch] inline __getreent in newlib

2007-09-06 Thread Christopher Faylor
On Thu, Sep 06, 2007 at 04:38:04PM -0700, Brian Dessent wrote:

I noticed today that all instances of _REENT in newlib go through a
function call to __getreent().  All this function does is get the value
of %fs:4 and subtract a fixed offset from it, so this seems rather
wasteful.  And we already have the required value of this offset
computed for us in tlsoffsets.h, so we have everything we need to
provide newlib with an inline version of this function, saving the
overhead of a function call.  It would obviously be cleaner to be able
to do:

#define __getreent() (_my_tls.local_clib)

...however this would require dragging all kinds of internal Cygwin
definitions into a newlib header and since we already have the required
offset in tlsoffsets.h we might as well just use that.  The attached
patch does this; the second part would obviously have to be approved by
the newlib maintainers, but I thought I'd see if there's any interest in
this idea first before bothering them.

I don't pretend to claim that this is a very scientific benchmark at
all, but there does seem to be a slight improvement especially in the
getc column which represents reading the whole 16MB file one byte at a
time, where this _REENT overhead would be most pronounced.

So, valid optimization or just complication?

Brian
2007-09-06  Brian Dessent  [EMAIL PROTECTED]

   * include/cygwin/config.h (__getreent): Define inline version.

I've always meant to investigate some way to turn the reent stuff into
a macro in the newlib library after doing that for cygwin.  I'm not
wild about using offsets like this but I can't think of any other way
to do it which didn't have the problems that you describe.

So, I guess I'll come down on the side of speed over clarity.  I'm sure
that Jeff won't mind your checking in the undef in newlib.  So, please
check in everything but, again, document heavily what you're doing with
the reent macro.

Thanks.

cgf


Re: [patch] inline __getreent in newlib

2007-09-06 Thread Brian Dessent

CC'd to newlib: I've checked in the attached change to
libc/reent/getreent.c as obvious, please let me know if it breaks
anything.

Christopher Faylor wrote:

 So, I guess I'll come down on the side of speed over clarity.  I'm sure
 that Jeff won't mind your checking in the undef in newlib.  So, please
 check in everything but, again, document heavily what you're doing with
 the reent macro.

Done.  I added the following comment to config.h to hopefully clarify
the situation:

/* The following provides an inline version of __getreent() for newlib,
   which will be used throughout the library whereever there is a _r
   version of a function that takes _REENT.  This saves the overhead
   of a function call for what amounts to a simple computation.
   
   The definition below is essentially equivalent to the one in cygtls.h
   (_my_tls.local_clib) however it uses a fixed precomputed
   offset rather than dereferencing a field of a structure.
   
   Including tlsoffets.h here in order to get this constant offset
   tls_local_clib is a bit of a hack, but the alternative would require
   dragging the entire definition of struct _cygtls (a large and complex
   Cygwin internal data structure) into newlib.  The machinery to
   compute these offsets already exists for the sake of gendef so
   we might as well just use it here.  */

Brian2007-09-06  Brian Dessent  [EMAIL PROTECTED]

* libc/reent/getreent.c: Allow for case where __getreent is
defined as a macro.

Index: libc/reent/getreent.c
===
RCS file: /cvs/src/src/newlib/libc/reent/getreent.c,v
retrieving revision 1.1
diff -u -p -r1.1 getreent.c
--- libc/reent/getreent.c   17 May 2002 23:39:37 -  1.1
+++ libc/reent/getreent.c   6 Sep 2007 23:13:10 -
@@ -3,6 +3,10 @@
 #include _ansi.h
 #include reent.h
 
+#ifdef __getreent
+#undef __getreent
+#endif
+
 struct _reent *
 _DEFUN_VOID(__getreent)
 {


[patch] inline __getreent in newlib

2007-09-06 Thread Brian Dessent

I noticed today that all instances of _REENT in newlib go through a
function call to __getreent().  All this function does is get the value
of %fs:4 and subtract a fixed offset from it, so this seems rather
wasteful.  And we already have the required value of this offset
computed for us in tlsoffsets.h, so we have everything we need to
provide newlib with an inline version of this function, saving the
overhead of a function call.  It would obviously be cleaner to be able
to do:

#define __getreent() (_my_tls.local_clib)

...however this would require dragging all kinds of internal Cygwin
definitions into a newlib header and since we already have the required
offset in tlsoffsets.h we might as well just use that.  The attached
patch does this; the second part would obviously have to be approved by
the newlib maintainers, but I thought I'd see if there's any interest in
this idea first before bothering them.

The following is the result of the iospeed output from the testsuite:
(units are ms elapsed as returned by GetTickCount, so smaller is better,
but note that the resolution here is at best 10ms.)

Before:
  - text -   binary 
lineszcr  getc fread fgets  getc fread fgets
 4 0  1906   110   656  189078   719
64 0  190694   218  190746   110
  4096 0  1922   125   172  23136263
 4 1  1438   203   640  189063   719
64 1  1891   109   219  19226394
  4096 1  193893   188  19227878

After:
  - text -   binary 
lineszcr  getc fread fgets  getc fread fgets
 4 0  1781   125   672  178262   703
64 0  1765   110   218  175062   109
  4096 0  179793   188  17667878
 4 1  1328   188   609  175062   719
64 1  1750   109   203  178147   109
  4096 1  1797   125   172  17666263

I don't pretend to claim that this is a very scientific benchmark at
all, but there does seem to be a slight improvement especially in the
getc column which represents reading the whole 16MB file one byte at a
time, where this _REENT overhead would be most pronounced.

So, valid optimization or just complication?

Brian2007-09-06  Brian Dessent  [EMAIL PROTECTED]

* include/cygwin/config.h (__getreent): Define inline version.


Index: include/cygwin/config.h
===
RCS file: /cvs/src/src/winsup/cygwin/include/cygwin/config.h,v
retrieving revision 1.5
diff -u -p -r1.5 config.h
--- include/cygwin/config.h 15 Nov 2003 17:04:10 -  1.5
+++ include/cygwin/config.h 6 Sep 2007 23:12:33 -
@@ -20,6 +20,9 @@ extern C {
 #define _CYGWIN_CONFIG_H
 
 #define __DYNAMIC_REENT__
+#include ../tlsoffsets.h
+extern char *_tlsbase __asm__ (%fs:4);
+#define __getreent() (struct _reent *)(_tlsbase + tls_local_clib)
 #define __FILENAME_MAX__ (260 - 1 /* NUL */)
 #define _READ_WRITE_RETURN_TYPE _ssize_t
 #define __LARGE64_FILES 1
2007-09-06  Brian Dessent  [EMAIL PROTECTED]

* libc/reent/getreent.c: Allow for case where __getreent is
defined as a macro.

Index: libc/reent/getreent.c
===
RCS file: /cvs/src/src/newlib/libc/reent/getreent.c,v
retrieving revision 1.1
diff -u -p -r1.1 getreent.c
--- libc/reent/getreent.c   17 May 2002 23:39:37 -  1.1
+++ libc/reent/getreent.c   6 Sep 2007 23:13:10 -
@@ -3,6 +3,10 @@
 #include _ansi.h
 #include reent.h
 
+#ifdef __getreent
+#undef __getreent
+#endif
+
 struct _reent *
 _DEFUN_VOID(__getreent)
 {