Re: [Mingw-w64-public] printf speedup [PATCH]

2015-04-01 Thread Mattias Engdegård
> patch is ok.  I am still a bit curious to learn about that high delay
> caused by getenv ().

It has been a while since I dissected it, but I believe getenv takes a lock,
which hurts multithreaded programs in particular. Even without a lock, getenv
still has to do a linear string search through the entire environment (because
the lookup typically fails).


--
Dive into the World of Parallel Programming The Go Parallel Website, sponsored
by Intel and developed in partnership with Slashdot Media, is your hub for all
things parallel software development, from weekly thought leadership blogs to
news, videos, case studies, tutorials and more. Take a look and join the 
conversation now. http://goparallel.sourceforge.net/
___
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public


Re: [Mingw-w64-public] printf speedup [PATCH]

2015-04-01 Thread Mattias Engdegård
On 1 Apr 2015, at 05:35, Dongsheng Song dongsheng.s...@gmail.com wrote:

> In my testing, getenv() is very fast.
>
> *) unset PRINTF_EXPONENT_DIGITS
>
> preheat 1 times, then perform 1000000 times (use 4.6 seconds)
> getenv cost: 4.6 us
>
> *) set PRINTF_EXPONENT_DIGITS=3
> preheat 1 times, then perform 1000000 times (use 3.41991 seconds)
> getenv: 3.41991 us

4 µs is a lot on a modern CPU, and an unacceptable overhead for a basic library 
function such as sprintf.




Re: [Mingw-w64-public] printf speedup [PATCH]

2015-04-01 Thread Kai Tietz
Hi Mattias,

patch is ok.  I am still a bit curious to learn about that high delay
caused by getenv ().

Kai

2015-04-01 5:35 GMT+02:00 Dongsheng Song dongsheng.s...@gmail.com:
> Hi Mattias,
>
> Could you share your micro-benchmark data?
>
> In my testing, getenv() is very fast.
>
> *) unset PRINTF_EXPONENT_DIGITS
>
> preheat 1 times, then perform 1000000 times (use 4.6 seconds)
> getenv cost: 4.6 us
>
> *) set PRINTF_EXPONENT_DIGITS=3
> preheat 1 times, then perform 1000000 times (use 3.41991 seconds)
> getenv: 3.41991 us
>
> My CPU is Core2 E6550 at 2.33 GHz.
>
> On Wed, Apr 1, 2015 at 10:26 AM, Dongsheng Song dongsheng.s...@gmail.com
> wrote:
>
>> Caching getenv() looks like a good idea!
>> Patch is OK for me.
>>
>> On Wed, Apr 1, 2015 at 4:16 AM, Mattias Engdegård matti...@acm.org
>> wrote:
>>
>>> The functions in the __mingw_printf family are very slow because of the
>>> getenv("PRINTF_EXPONENT_DIGITS") call that is made every time, even when
>>> that information isn't actually needed.
>>>
>>> Please consider this patch. It only calls getenv once, caching the result
>>> (as is traditionally done in libraries that use environment variables this
>>> way). It also only computes the minimum exponent digits when actually
>>> needed, at most once per format call.
>>>
>>> With this patch, __mingw_sprintf(buf, "x") goes from being several orders
>>> of magnitude slower than the MSVCRT sprintf, to about 66% faster. You don't
>>> see this kind of improvement every day.






Re: [Mingw-w64-public] printf speedup [PATCH]

2015-04-01 Thread Martin Mitáš


On 1 Apr 2015 at 10:13, Mattias Engdegård wrote:
> On 1 Apr 2015, at 05:35, Dongsheng Song dongsheng.s...@gmail.com wrote:
>
>> In my testing, getenv() is very fast.
>>
>> *) unset PRINTF_EXPONENT_DIGITS
>>
>> preheat 1 times, then perform 1000000 times (use 4.6 seconds)
>> getenv cost: 4.6 us
>>
>> *) set PRINTF_EXPONENT_DIGITS=3
>> preheat 1 times, then perform 1000000 times (use 3.41991 seconds)
>> getenv: 3.41991 us
>
> 4 µs is a lot on a modern CPU, and an unacceptable overhead for a basic
> library function such as sprintf.

If getenv() iterates over the complete environment in most cases (when
PRINTF_EXPONENT_DIGITS is not set), then the overhead in real-world programs is
probably much worse than a trivial test program is likely to show.

Each getenv() call evicts data from the CPU cache that is likely useful for the
ongoing computation and replaces it with a lot of junk (the complete process
environment).

Morous



Re: [Mingw-w64-public] printf speedup [PATCH]

2015-04-01 Thread Dongsheng Song
On Wed, Apr 1, 2015 at 4:33 PM, Martin Mitáš m...@morous.org wrote:



> On 1 Apr 2015 at 10:13, Mattias Engdegård wrote:
>> On 1 Apr 2015, at 05:35, Dongsheng Song dongsheng.s...@gmail.com wrote:
>>
>>> In my testing, getenv() is very fast.
>>>
>>> *) unset PRINTF_EXPONENT_DIGITS
>>>
>>> preheat 1 times, then perform 1000000 times (use 4.6 seconds)
>>> getenv cost: 4.6 us
>>>
>>> *) set PRINTF_EXPONENT_DIGITS=3
>>> preheat 1 times, then perform 1000000 times (use 3.41991 seconds)
>>> getenv: 3.41991 us
>>
>> 4 µs is a lot on a modern CPU, and an unacceptable overhead for a basic
>> library function such as sprintf.
>
> If getenv() iterates over the complete environment in most cases (when
> PRINTF_EXPONENT_DIGITS is not set), then the overhead in real-world programs
> is probably much worse than a trivial test program is likely to show.
>
> Each getenv() call evicts data from the CPU cache that is likely useful for
> the ongoing computation and replaces it with a lot of junk (the complete
> process environment).
>
> Morous


I'm certainly not opposed to your patch, but I am a bit curious why getenv()
slows down your program. We can always reduce a real-world program to a test
program to discuss the performance issue.


Re: [Mingw-w64-public] printf speedup [PATCH]

2015-03-31 Thread Dongsheng Song
Hi Mattias,

Could you share your micro-benchmark data?

In my testing, getenv() is very fast.

*) unset PRINTF_EXPONENT_DIGITS

preheat 1 times, then perform 1000000 times (use 4.6 seconds)
getenv cost: 4.6 us

*) set PRINTF_EXPONENT_DIGITS=3
preheat 1 times, then perform 1000000 times (use 3.41991 seconds)
getenv: 3.41991 us

My CPU is Core2 E6550 at 2.33 GHz.

On Wed, Apr 1, 2015 at 10:26 AM, Dongsheng Song dongsheng.s...@gmail.com
wrote:

> Caching getenv() looks like a good idea!
> Patch is OK for me.
>
> On Wed, Apr 1, 2015 at 4:16 AM, Mattias Engdegård matti...@acm.org
> wrote:
>
>> The functions in the __mingw_printf family are very slow because of the
>> getenv("PRINTF_EXPONENT_DIGITS") call that is made every time, even when
>> that information isn't actually needed.
>>
>> Please consider this patch. It only calls getenv once, caching the result
>> (as is traditionally done in libraries that use environment variables this
>> way). It also only computes the minimum exponent digits when actually
>> needed, at most once per format call.
>>
>> With this patch, __mingw_sprintf(buf, "x") goes from being several orders
>> of magnitude slower than the MSVCRT sprintf, to about 66% faster. You don't
>> see this kind of improvement every day.



#include <winsock2.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

#define NUMBER_PREHEAT  1
#define NUMBER_PERFORM  1000000

int main(int argc, char *argv[])
{
    int i;
    double t;
    LARGE_INTEGER freq, pc, pc2;

    QueryPerformanceFrequency(&freq);

    /* Warm up so that first-call effects are not measured. */
    for (i = 0; i < NUMBER_PREHEAT; i++) {
        getenv("PRINTF_EXPONENT_DIGITS");
    }

    QueryPerformanceCounter(&pc);
    for (i = 0; i < NUMBER_PERFORM; i++) {
        getenv("PRINTF_EXPONENT_DIGITS");
    }
    QueryPerformanceCounter(&pc2);

    /* Elapsed time in seconds. */
    t = (pc2.QuadPart - pc.QuadPart) / (double) freq.QuadPart;

    fprintf(stdout, "preheat %d times, then perform %d times (use %.5lf seconds)\n", NUMBER_PREHEAT, NUMBER_PERFORM, t);
    /* Seconds per call, converted to microseconds. */
    fprintf(stdout, "getenv cost: %.5lf us\n", t * 1000000.0 / NUMBER_PERFORM);

    return 0;
}


[Mingw-w64-public] printf speedup [PATCH]

2015-03-31 Thread Mattias Engdegård
The functions in the __mingw_printf family are very slow because of the 
getenv("PRINTF_EXPONENT_DIGITS") call that is made every time, even when that 
information isn’t actually needed.

Please consider this patch. It only calls getenv once, caching the result (as 
is traditionally done in libraries that use environment variables this way). It 
also only computes the minimum exponent digits when actually needed, at most 
once per format call.

With this patch, __mingw_sprintf(buf, "x") goes from being several orders of 
magnitude slower than the MSVCRT sprintf, to about 66% faster. You don’t see 
this kind of improvement every day.

--- mingw-w64-crt/stdio/mingw_pformat.c.orig	2015-03-31 20:27:41.000465100 +0200
+++ mingw-w64-crt/stdio/mingw_pformat.c	2015-03-31 21:42:05.990919500 +0200
@@ -171,12 +171,17 @@
 static
 int __pformat_exponent_digits( void )
 {
-  char *exponent_digits = getenv( "PRINTF_EXPONENT_DIGITS" );
-  return ((exponent_digits != NULL) && ((unsigned)(*exponent_digits - '0') < 3))
-    || (_get_output_format() & _TWO_DIGIT_EXPONENT)
-    ? 2
-    : 3
-    ;
+  /* Calling getenv is expensive; only do it once and cache the result. */
+  static int two_exp_digits_env = -1;
+  if (two_exp_digits_env == -1) {
+    const char *exponent_digits_env = getenv( "PRINTF_EXPONENT_DIGITS" );
+    two_exp_digits_env = exponent_digits_env != NULL
+      && (unsigned)(*exponent_digits_env - '0') < 3;
+  }
+  return (two_exp_digits_env || (_get_output_format() & _TWO_DIGIT_EXPONENT))
+    ? 2
+    : 3
+    ;
 }
 #else
 /*
@@ -1222,6 +1227,8 @@
 
   /* Ensure that this is at least as many as the standard requirement.
    */
+  if (stream->expmin == -1)
+    stream->expmin = PFORMAT_MINEXP;
   if( exp_width < stream->expmin )
     exp_width = stream->expmin;
 
@@ -1817,7 +1824,8 @@
     (wchar_t)(0),  /* leave it unspecified   */
     0,             /* zero output char count */
     max,           /* establish output limit */
-    PFORMAT_MINEXP /* exponent chars preferred   */
+    -1             /* exponent chars preferred;
+                      -1 means to be determined. */
   };
 
   format_scan: while( (c = *fmt++) != 0 )