Re: [Mingw-w64-public] printf speedup [PATCH]
> patch is ok. I am still a bit curious to learn about that high delay caused by getenv ().

It was a while since I dissected it, but I believe getenv takes a lock, which hurts multithreaded programs in particular. Even without a lock, getenv still needs to do a linear string search through the entire environment (because the lookup typically fails: the variable is usually unset).

--
Dive into the World of Parallel Programming The Go Parallel Website, sponsored by Intel and developed in partnership with Slashdot Media, is your hub for all things parallel software development, from weekly thought leadership blogs to news, videos, case studies, tutorials and more. Take a look and join the conversation now. http://goparallel.sourceforge.net/
___
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
Re: [Mingw-w64-public] printf speedup [PATCH]
On 1 Apr 2015, at 05:35, Dongsheng Song <dongsheng.s...@gmail.com> wrote:
> In my testing, getenv() is very fast.
>
> *) unset PRINTF_EXPONENT_DIGITS
> preheat 1 times, then perform 1000000 times (use 4.6 seconds)
> getenv cost: 4.6 us
>
> *) set PRINTF_EXPONENT_DIGITS=3
> preheat 1 times, then perform 1000000 times (use 3.41991 seconds)
> getenv: 3.41991 us

4 µs is a lot on a modern CPU, and an unacceptable overhead for a basic library function such as sprintf.
Re: [Mingw-w64-public] printf speedup [PATCH]
Hi Mattias,

patch is ok. I am still a bit curious to learn about that high delay caused by getenv ().

Kai

2015-04-01 5:35 GMT+02:00 Dongsheng Song <dongsheng.s...@gmail.com>:
> Hi Mattias,
>
> Could you share your micro benchmark data? In my testing, getenv() is very fast.
>
> *) unset PRINTF_EXPONENT_DIGITS
> preheat 1 times, then perform 1000000 times (use 4.6 seconds)
> getenv cost: 4.6 us
>
> *) set PRINTF_EXPONENT_DIGITS=3
> preheat 1 times, then perform 1000000 times (use 3.41991 seconds)
> getenv: 3.41991 us
>
> My CPU is Core2 E6550 at 2.33 GHz.
>
> On Wed, Apr 1, 2015 at 10:26 AM, Dongsheng Song <dongsheng.s...@gmail.com> wrote:
>> Caching getenv() looks like a good idea! Patch is OK for me.
>>
>> On Wed, Apr 1, 2015 at 4:16 AM, Mattias Engdegård <matti...@acm.org> wrote:
>>> The functions in the __mingw_printf family are very slow because of the getenv("PRINTF_EXPONENT_DIGITS") call that is made every time, even when that information isn't actually needed.
>>>
>>> Please consider this patch. It only calls getenv once, caching the result (as is traditionally done in libraries that use environment variables this way). It also only computes the minimum exponent digits when actually needed, at most once per format call.
>>>
>>> With this patch, __mingw_sprintf(buf, "x") goes from being several orders of magnitude slower than the MSVCRT sprintf, to about 66% faster. You don't see this kind of improvement every day.
Re: [Mingw-w64-public] printf speedup [PATCH]
On 1 Apr 2015 at 10:13, Mattias Engdegård wrote:
> On 1 Apr 2015, at 05:35, Dongsheng Song <dongsheng.s...@gmail.com> wrote:
>> In my testing, getenv() is very fast.
>>
>> *) unset PRINTF_EXPONENT_DIGITS
>> preheat 1 times, then perform 1000000 times (use 4.6 seconds)
>> getenv cost: 4.6 us
>>
>> *) set PRINTF_EXPONENT_DIGITS=3
>> preheat 1 times, then perform 1000000 times (use 3.41991 seconds)
>> getenv: 3.41991 us
>
> 4 µs is a lot on a modern CPU, and an unacceptable overhead for a basic library function such as sprintf.

If I assume getenv() iterates over the complete environment in most cases (when PRINTF_EXPONENT_DIGITS is not set), then the overhead is probably much worse in real-world programs, which a trivial test program is unlikely to capture. getenv() will evict data from the CPU cache that is likely useful for the ongoing computation and replace it with a lot of junk (the complete process environment).

Morous
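To make the "iterates over the complete environment" point concrete, here is a minimal sketch of a linear environment lookup. It is an illustration only, not the MSVCRT implementation: lookup_env is a hypothetical helper, it takes the environment as an explicit array, it does no locking, and it compares names case-sensitively (a simplification; Windows treats variable names case-insensitively).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the MSVCRT getenv: look up `name` in a
   NULL-terminated array of "NAME=value" strings. Returns a pointer to
   the value, or NULL if the name is absent. */
static const char *lookup_env(const char *const *env, const char *name)
{
    size_t len;

    assert(env != NULL && name != NULL);
    len = strlen(name);

    for (; *env != NULL; env++) {
        /* A match needs the full name followed directly by '='. */
        if (strncmp(*env, name, len) == 0 && (*env)[len] == '=')
            return *env + len + 1;
    }

    /* Getting here means every entry was visited, so a miss is the
       worst case for this kind of search. */
    return NULL;
}
```

When the variable is unset, which is the common case for PRINTF_EXPONENT_DIGITS, a search of this shape touches every entry before giving up, so the cost grows with the size of the process environment.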
Re: [Mingw-w64-public] printf speedup [PATCH]
On Wed, Apr 1, 2015 at 4:33 PM, Martin Mitáš <m...@morous.org> wrote:
> On 1 Apr 2015 at 10:13, Mattias Engdegård wrote:
>> On 1 Apr 2015, at 05:35, Dongsheng Song <dongsheng.s...@gmail.com> wrote:
>>> In my testing, getenv() is very fast.
>>>
>>> *) unset PRINTF_EXPONENT_DIGITS
>>> preheat 1 times, then perform 1000000 times (use 4.6 seconds)
>>> getenv cost: 4.6 us
>>>
>>> *) set PRINTF_EXPONENT_DIGITS=3
>>> preheat 1 times, then perform 1000000 times (use 3.41991 seconds)
>>> getenv: 3.41991 us
>>
>> 4 µs is a lot on a modern CPU, and an unacceptable overhead for a basic library function such as sprintf.
>
> If I assume getenv() iterates over the complete environment in most cases (when PRINTF_EXPONENT_DIGITS is not set), then the overhead is probably much worse in real-world programs, which a trivial test program is unlikely to capture. getenv() will evict data from the CPU cache that is likely useful for the ongoing computation and replace it with a lot of junk (the complete process environment).
>
> Morous

I'm certainly not opposed to your patch, but I am a bit curious why getenv() slows down your program. We can always reduce a real-world program to a test program to discuss the performance issue.
Re: [Mingw-w64-public] printf speedup [PATCH]
Hi Mattias,

Could you share your micro benchmark data? In my testing, getenv() is very fast.

*) unset PRINTF_EXPONENT_DIGITS
preheat 1 times, then perform 1000000 times (use 4.6 seconds)
getenv cost: 4.6 us

*) set PRINTF_EXPONENT_DIGITS=3
preheat 1 times, then perform 1000000 times (use 3.41991 seconds)
getenv: 3.41991 us

My CPU is Core2 E6550 at 2.33 GHz.

On Wed, Apr 1, 2015 at 10:26 AM, Dongsheng Song <dongsheng.s...@gmail.com> wrote:
> Caching getenv() looks like a good idea! Patch is OK for me.
>
> On Wed, Apr 1, 2015 at 4:16 AM, Mattias Engdegård <matti...@acm.org> wrote:
>> The functions in the __mingw_printf family are very slow because of the getenv("PRINTF_EXPONENT_DIGITS") call that is made every time, even when that information isn't actually needed.
>>
>> Please consider this patch. It only calls getenv once, caching the result (as is traditionally done in libraries that use environment variables this way). It also only computes the minimum exponent digits when actually needed, at most once per format call.
>>
>> With this patch, __mingw_sprintf(buf, "x") goes from being several orders of magnitude slower than the MSVCRT sprintf, to about 66% faster. You don't see this kind of improvement every day.
#include <winsock2.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

#define NUMBER_PREHEAT 1
#define NUMBER_PERFORM 1000000

int main(int argc, char *argv[])
{
    int i;
    double t;
    LARGE_INTEGER freq, pc, pc2;

    QueryPerformanceFrequency(&freq);

    /* Warm-up calls, excluded from the timing. */
    for (i = 0; i < NUMBER_PREHEAT; i++) {
        getenv("PRINTF_EXPONENT_DIGITS");
    }

    QueryPerformanceCounter(&pc);
    for (i = 0; i < NUMBER_PERFORM; i++) {
        getenv("PRINTF_EXPONENT_DIGITS");
    }
    QueryPerformanceCounter(&pc2);

    /* Elapsed time in seconds. */
    t = (pc2.QuadPart - pc.QuadPart) / (double) freq.QuadPart;

    fprintf(stdout, "preheat %d times, then perform %d times (use %.5lf seconds)\n",
            NUMBER_PREHEAT, NUMBER_PERFORM, t);
    /* Convert seconds to microseconds and average per call. */
    fprintf(stdout, "getenv cost: %.5lf us\n", t * 1000000.0 / NUMBER_PERFORM);
    return 0;
}
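For a sense of scale, the per-call times reported above can be converted to clock cycles at the stated 2.33 GHz. This is a rough sketch: us_to_cycles is a hypothetical helper, and it assumes the core runs at a constant nominal clock rate for the whole measurement.

```c
#include <assert.h>

/* Hypothetical helper: convert a duration in microseconds to clock
   cycles at a given clock rate in GHz. One GHz is 1000 cycles per
   microsecond. */
static double us_to_cycles(double us, double ghz)
{
    assert(us >= 0.0 && ghz > 0.0);
    return us * ghz * 1000.0;
}
```

Under that assumption, us_to_cycles(4.6, 2.33) is about 10,700 cycles per getenv call with the variable unset, and us_to_cycles(3.41991, 2.33) is about 8,000 cycles with it set.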
[Mingw-w64-public] printf speedup [PATCH]
The functions in the __mingw_printf family are very slow because of the getenv("PRINTF_EXPONENT_DIGITS") call that is made every time, even when that information isn't actually needed.

Please consider this patch. It only calls getenv once, caching the result (as is traditionally done in libraries that use environment variables this way). It also only computes the minimum exponent digits when actually needed, at most once per format call.

With this patch, __mingw_sprintf(buf, "x") goes from being several orders of magnitude slower than the MSVCRT sprintf, to about 66% faster. You don't see this kind of improvement every day.

--- mingw-w64-crt/stdio/mingw_pformat.c.orig	2015-03-31 20:27:41.000465100 +0200
+++ mingw-w64-crt/stdio/mingw_pformat.c	2015-03-31 21:42:05.990919500 +0200
@@ -171,12 +171,17 @@
 static
 int __pformat_exponent_digits( void )
 {
-  char *exponent_digits = getenv( "PRINTF_EXPONENT_DIGITS" );
-  return ((exponent_digits != NULL) && ((unsigned)(*exponent_digits - '0') < 3))
-    || (_get_output_format() & _TWO_DIGIT_EXPONENT)
-    ? 2
-    : 3
-    ;
+  /* Calling getenv is expensive; only do it once and cache the result. */
+  static int two_exp_digits_env = -1;
+  if (two_exp_digits_env == -1) {
+    const char *exponent_digits_env = getenv( "PRINTF_EXPONENT_DIGITS" );
+    two_exp_digits_env = exponent_digits_env != NULL
+      && (unsigned)(*exponent_digits_env - '0') < 3;
+  }
+  return (two_exp_digits_env || (_get_output_format() & _TWO_DIGIT_EXPONENT))
+    ? 2
+    : 3
+    ;
 }
 #else
 /*
@@ -1222,6 +1227,8 @@
 
   /* Ensure that this is at least as many as the standard requirement.
    */
+  if (stream->expmin == -1)
+    stream->expmin = PFORMAT_MINEXP;
   if( exp_width < stream->expmin )
     exp_width = stream->expmin;
 
@@ -1817,7 +1824,8 @@
     (wchar_t)(0),	/* leave it unspecified */
     0,	/* zero output char count */
     max,	/* establish output limit */
-    PFORMAT_MINEXP	/* exponent chars preferred */
+    -1	/* exponent chars preferred;
+	   -1 means to be determined. */
   };
 
   format_scan: while( (c = *fmt++) != 0 )
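The caching idea in the first hunk can be tried outside the CRT with a small standalone sketch. The names below (two_digit_exponent_from_env, lookup_count) are hypothetical and exist only for this illustration; the counter is instrumentation to show that getenv runs once, and is not part of the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Instrumentation for this sketch only: counts real getenv calls. */
static int lookup_count = 0;

/* Hypothetical standalone version of the cached lookup. Returns 1 if
   PRINTF_EXPONENT_DIGITS is set and starts with '0', '1' or '2',
   otherwise 0. */
static int two_digit_exponent_from_env(void)
{
    /* -1 means "not looked up yet"; 0 or 1 is the cached answer. */
    static int cached = -1;

    if (cached == -1) {
        const char *s = getenv("PRINTF_EXPONENT_DIGITS");
        lookup_count++;
        /* The unsigned cast makes characters below '0' wrap to large
           values, so only '0', '1' and '2' pass the test. */
        cached = s != NULL && (unsigned)(*s - '0') < 3;
    }

    assert(cached == 0 || cached == 1);
    return cached;
}
```

Two trade-offs of this pattern are worth keeping in mind. A change to the environment variable after the first call goes unnoticed, and the sketch makes no attempt to synchronize the first lookup: two threads racing on it would both call getenv and store the same value.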