Re: [proto] proto performance
> MacBook Pro, 10.6.6, Core 2 Duo
>
>                      ProtoContext  ProtoTransform  ProtoLambda  Loop
> GCC 4.2.1 (Apple)  : 5.3565438     5.3721942       126.38458    1.3657978
> GCC 4.4.5          : 1.8878364     1.8845548       70.056237    0.942303
> GCC 4.5.2          : 1.8840608     1.889619        1.2806688    1.0589558
> GCC 4.6.0 (2/5/11) : 1.8854768     1.8834438       1.278347     1.2345208
> CLANG 2.9 (125472) : 5.455976      5.4627628       3.825104     1.2330524
>
> Now, removing the ((noinline)), gives (in the same order)
>
> GCC 4.2.1 (Apple)  : 4.1448478     5.3795842       126.53211    1.3215378
> GCC 4.4.5          : 1.2505956     1.2500816       69.409665    0.7198288
> GCC 4.5.2          : 0.596143      0.7213138       0.71969283   0.7211534
> GCC 4.6.0 (2/5/11) : 1.2942638     1.4324828       0.646147     0.6632324
> CLANG 2.9 (125472) : 1.2975226     1.2966478       1.3849834    1.2452362

Interesting results. I have done a similar test for loops (for, while, with/without pointers) and obtained similar results. Everything depends on the compiler. I think the order of the above numbers will drastically change if the expression is small, like x3 = x1 + 2.0 * x2.

> I'm not sure how meaningful this second set of numbers is. If the evaluation
> functions are inlined, the compiler can realize that evaluating them
> num_of_steps times is unnecessary since the data isn't changing between
> iterations. It then (I believe) optimizes out certain parts of the loop in
> certain cases.

Maybe it would be better to evaluate something with the increment assign operator, x3 += x1 + 2.0 * x2.

___
proto mailing list
proto@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/proto
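A minimal sketch of the += suggestion above, written as a plain loop rather than a proto expression. The function name accumulate_steps and its signature are made up for illustration and are not taken from the posted test code. Because x3 is both read and written, every pass depends on the previous one, which should make it harder for the compiler to treat the repeated evaluation as redundant.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the posted test code): applies
// x3 += x1 + 2.0 * x2 element-wise, num_of_steps times.
// x3 changes on every pass, so pass k depends on pass k-1.
void accumulate_steps(std::vector<double>& x3,
                      const std::vector<double>& x1,
                      const std::vector<double>& x2,
                      std::size_t num_of_steps)
{
    // The inputs must cover every element of x3.
    assert(x1.size() >= x3.size());
    assert(x2.size() >= x3.size());

    for (std::size_t step = 0; step < num_of_steps; ++step)
        for (std::size_t i = 0; i < x3.size(); ++i)
            x3[i] += x1[i] + 2.0 * x2[i];
}
```

Printing or otherwise using x3 after the timed section is still advisable, so the result is observably needed.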
Re: [proto] proto performance
Not sure what happened to those tables. I'll try again.

MacBook Pro, 10.6.6, Core 2 Duo

                     ProtoContext  ProtoTransform  ProtoLambda  Loop
GCC 4.2.1 (Apple)  : 5.3565438     5.3721942       126.38458    1.3657978
GCC 4.4.5          : 1.8878364     1.8845548       70.056237    0.942303
GCC 4.5.2          : 1.8840608     1.889619        1.2806688    1.0589558
GCC 4.6.0 (2/5/11) : 1.8854768     1.8834438       1.278347     1.2345208
CLANG 2.9 (125472) : 5.455976      5.4627628       3.825104     1.2330524

Now, removing the ((noinline)), gives (in the same order)

GCC 4.2.1 (Apple)  : 4.1448478     5.3795842       126.53211    1.3215378
GCC 4.4.5          : 1.2505956     1.2500816       69.409665    0.7198288
GCC 4.5.2          : 0.596143      0.7213138       0.71969283   0.7211534
GCC 4.6.0 (2/5/11) : 1.2942638     1.4324828       0.646147     0.6632324
CLANG 2.9 (125472) : 1.2975226     1.2966478       1.3849834    1.2452362

Nate
Re: [proto] proto performance
On Feb 20, 2011, at 4:43 AM, Joel Falcou wrote:
> On 20/02/11 12:41, Eric Niebler wrote:
>> On 2/20/2011 6:40 PM, Joel Falcou wrote:
>>> On 20/02/11 12:31, Karsten Ahnert wrote:
>>>> It is amazing that the proto expression is faster than the naive one.
>>>> The compiler must really love the way proto evaluates an expression.
>>> I still don't really know why. Usual speed-up in our use cases here
>>> ranges from 10 to 50%.
>> That's weird.
>
> Well, for me it's weird in the good way so I don't complain. Old versions
> of nt2 had cases where we were thrice as fast as the same vector+iterator
> based code ...

To explore the issue further I modified the originally posted test code (see http://pastebin.com/1Vr9BkPP). The modifications include a transform-based evaluator, a lambda-expression-based example, and some attributes to keep the evaluation functions from being inlined.

First, the numbers (average after 5 iterations of the main loop). All compilation done with -O3 against Boost 1.45.

MacBook Pro, 10.6.6, Core 2 Duo

                     ProtoContext  ProtoTransform  ProtoLambda  Loop
GCC 4.2.1 (Apple)  : 5.3565438     5.3721942       126.38458    1.3657978
GCC 4.4.5          : 1.8878364     1.8845548       70.056237    0.942303
GCC 4.5.2          : 1.8840608     1.889619        1.2806688    1.0589558
GCC 4.6.0 (2/5/11) : 1.8854768     1.8834438       1.278347     1.2345208
CLANG 2.9 (125472) : 5.455976      5.4627628       3.825104     1.2330524

Now, removing the ((noinline)), gives (in the same order)

GCC 4.2.1 (Apple)  : 4.1448478     5.3795842       126.53211    1.3215378
GCC 4.4.5          : 1.2505956     1.2500816       69.409665    0.7198288
GCC 4.5.2          : 0.596143      0.7213138       0.71969283   0.7211534
GCC 4.6.0 (2/5/11) : 1.2942638     1.4324828       0.646147     0.6632324
CLANG 2.9 (125472) : 1.2975226     1.2966478       1.3849834    1.2452362

I'm not sure how meaningful this second set of numbers is. If the evaluation functions are inlined, the compiler can realize that evaluating them num_of_steps times is unnecessary since the data isn't changing between iterations. It then (I believe) optimizes out certain parts of the loop in certain cases.

A lot of the additional code came from Eric's cpp-next articles.

Nate
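For readers without the pastebin at hand, here is a rough sketch of what keeping an evaluation function from being inlined can look like. It assumes the ((noinline)) above refers to the GCC/Clang __attribute__((noinline)) spelling. The macro BENCH_NOINLINE and the functions eval_naive and run_steps are hypothetical stand-ins, not the names used in the posted code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// GCC and Clang accept this attribute spelling. On other compilers the
// macro expands to nothing, so the function may be inlined there.
#if defined(__GNUC__) || defined(__clang__)
#  define BENCH_NOINLINE __attribute__((noinline))
#else
#  define BENCH_NOINLINE
#endif

// Hypothetical stand-in for one evaluation function: x3 = x1 + 2.0 * x2.
BENCH_NOINLINE
void eval_naive(std::vector<double>& x3,
                const std::vector<double>& x1,
                const std::vector<double>& x2)
{
    assert(x1.size() >= x3.size());
    assert(x2.size() >= x3.size());
    for (std::size_t i = 0; i < x3.size(); ++i)
        x3[i] = x1[i] + 2.0 * x2[i];
}

// The loop that would be timed: the same evaluation, num_of_steps times.
// With eval_naive kept out of line, the optimizer cannot see through the
// call by inlining it into this loop.
void run_steps(std::vector<double>& x3,
               const std::vector<double>& x1,
               const std::vector<double>& x2,
               std::size_t num_of_steps)
{
    for (std::size_t step = 0; step < num_of_steps; ++step)
        eval_naive(x3, x1, x2);
}
```

The GCC documentation notes that noinline alone does not stop calls to side-effect-free functions from being optimized away. The function here writes through x3, so that caveat should not apply, but it is worth keeping in mind when adapting the sketch.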
Re: [proto] proto performance
On 20/02/11 12:41, Eric Niebler wrote:
> On 2/20/2011 6:40 PM, Joel Falcou wrote:
>> On 20/02/11 12:31, Karsten Ahnert wrote:
>>> It is amazing that the proto expression is faster than the naive one.
>>> The compiler must really love the way proto evaluates an expression.
>> I still don't really know why. Usual speed-up in our use cases here
>> ranges from 10 to 50%.
> That's weird.

Well, for me it's weird in the good way so I don't complain. Old versions of nt2 had cases where we were thrice as fast as the same vector+iterator based code ...
Re: [proto] proto performance
On 2/20/2011 6:40 PM, Joel Falcou wrote:
> On 20/02/11 12:31, Karsten Ahnert wrote:
>> It is amazing that the proto expression is faster than the naive one.
>> The compiler must really love the way proto evaluates an expression.
>
> I still don't really know why. Usual speed-up in our use cases here
> ranges from 10 to 50%.

That's weird.

-- 
Eric Niebler
BoostPro Computing
http://www.boostpro.com
Re: [proto] proto performance
On 20/02/11 12:31, Karsten Ahnert wrote:
> It is amazing that the proto expression is faster than the naive one.
> The compiler must really love the way proto evaluates an expression.

I still don't really know why. Usual speed-up in our use cases here ranges from 10 to 50%.
Re: [proto] proto performance
On 02/20/2011 12:08 PM, Joel Falcou wrote:
> On 20/02/11 12:03, Karsten Ahnert wrote:
>> On 02/20/2011 12:02 PM, Joel Falcou wrote:
>>> On 20/02/11 11:55, Karsten Ahnert wrote:
>>>> On 02/20/2011 11:57 AM, Eric Niebler wrote:
>>>> It is gcc 4.4 on a 64-bit machine. Of course, I compile with -O3.
>>> Ding! Welcome to the gcc-4.4 64-bit compiler hellfest.
>>> Try 4.5; 4.4 64-bit can't inline for whatever reason.
>> Great, I tried with gcc 4.5 and the proto part is now around 5-10
>> percent faster. Thank you.
>
> We banged our heads for weeks on this issue earlier until we found some
> dubious bug report in gcc bugzilla flagged as nofix :/
> Seems the 4.5 branch solved it somehow.

It is amazing that the proto expression is faster than the naive one. The compiler must really love the way proto evaluates an expression.

> You can also try compiling with 4.4 using -m32

-- 
Dr. Karsten Ahnert
Ambrosys GmbH - Gesellschaft für Management komplexer Systeme
Geschwister-Scholl-Str. 63a
D-14471 Potsdam
Tel: +4917682001688
Fax: +493319791300

Ambrosys GmbH - Gesellschaft für Management komplexer Systeme
Limited liability company (Gesellschaft mit beschränkter Haftung)
Registered office: Geschwister-Scholl-Str. 63a, 14471 Potsdam
Register court: Amtsgericht Potsdam, HRB 21228 P
Managing directors: Dr. Karsten Ahnert, Dr. Markus Abel
Re: [proto] proto performance
On 20/02/11 12:03, Karsten Ahnert wrote:
> On 02/20/2011 12:02 PM, Joel Falcou wrote:
>> On 20/02/11 11:55, Karsten Ahnert wrote:
>>> On 02/20/2011 11:57 AM, Eric Niebler wrote:
>>> It is gcc 4.4 on a 64-bit machine. Of course, I compile with -O3.
>> Ding! Welcome to the gcc-4.4 64-bit compiler hellfest.
>> Try 4.5; 4.4 64-bit can't inline for whatever reason.
> Great, I tried with gcc 4.5 and the proto part is now around 5-10
> percent faster. Thank you.

We banged our heads for weeks on this issue earlier until we found some dubious bug report in gcc bugzilla flagged as nofix :/
Seems the 4.5 branch solved it somehow.

You can also try compiling with 4.4 using -m32
Re: [proto] proto performance
On 02/20/2011 12:02 PM, Joel Falcou wrote:
> On 20/02/11 11:55, Karsten Ahnert wrote:
>> On 02/20/2011 11:57 AM, Eric Niebler wrote:
>> It is gcc 4.4 on a 64-bit machine. Of course, I compile with -O3.
>
> Ding! Welcome to the gcc-4.4 64-bit compiler hellfest.
> Try 4.5; 4.4 64-bit can't inline for whatever reason.

Great, I tried with gcc 4.5 and the proto part is now around 5-10 percent faster. Thank you.

-- 
Dr. Karsten Ahnert
Re: [proto] proto performance
On 20/02/11 11:55, Karsten Ahnert wrote:
> On 02/20/2011 11:57 AM, Eric Niebler wrote:
> It is gcc 4.4 on a 64-bit machine. Of course, I compile with -O3.

Ding! Welcome to the gcc-4.4 64-bit compiler hellfest.
Try 4.5; 4.4 64-bit can't inline for whatever reason.
Re: [proto] proto performance
On 20/02/11 11:57, Eric Niebler wrote:
> On 2/20/2011 5:52 PM, Joel Falcou wrote:
>> 1/ How do you measure performance? Anything which is not the median of
>> 1-5K runs is meaningless.
> You can see how he measures it in the code he posted.

I clicked send too fast :p

>> 2/ Don't use contexts; transforms are usually better optimized by compilers.
> That really shouldn't matter.

Well, in our tests it does. At least back in gcc 4.4.

>> 3/ Are you using gcc on a 64-bit system? On this configuration a gcc
>> bug prevents proto from being inlined.
> Naive question: are you actually compiling with optimizations on? -O3
> -DNDEBUG? And are you sure the compiler isn't lifting the whole thing
> out of the loop, since the computation is the same with each iteration?

Oh yeah, I forgot these.

On my machine (mac osx dual core intel with g++4-5) I get a 25% speed-up with proto ...
Re: [proto] proto performance
On 02/20/2011 11:57 AM, Eric Niebler wrote:
> On 2/20/2011 5:52 PM, Joel Falcou wrote:
>> 1/ How do you measure performance? Anything which is not the median of
>> 1-5K runs is meaningless.
>
> You can see how he measures it in the code he posted.
>
>> 2/ Don't use contexts; transforms are usually better optimized by compilers.
>
> That really shouldn't matter.
>
>> 3/ Are you using gcc on a 64-bit system? On this configuration a gcc
>> bug prevents proto from being inlined.
>
> Naive question: are you actually compiling with optimizations on? -O3
> -DNDEBUG? And are you sure the compiler isn't lifting the whole thing
> out of the loop, since the computation is the same with each iteration?

It is gcc 4.4 on a 64-bit machine. Of course, I compile with -O3.

-- 
Dr. Karsten Ahnert
Re: [proto] proto performance
On 2/20/2011 5:52 PM, Joel Falcou wrote:
> 1/ How do you measure performance? Anything which is not the median of
> 1-5K runs is meaningless.

You can see how he measures it in the code he posted.

> 2/ Don't use contexts; transforms are usually better optimized by compilers.

That really shouldn't matter.

> 3/ Are you using gcc on a 64-bit system? On this configuration a gcc
> bug prevents proto from being inlined.

Naive question: are you actually compiling with optimizations on? -O3 -DNDEBUG? And are you sure the compiler isn't lifting the whole thing out of the loop, since the computation is the same with each iteration?

-- 
Eric Niebler
BoostPro Computing
http://www.boostpro.com
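One way to guard against the lifting asked about above is to make each iteration differ and to consume every result. The sketch below is only an illustration under those assumptions. The function checksum_over_steps is hypothetical and does not appear in the posted code. It perturbs one input element per step and folds one output element into a checksum that the caller should print or otherwise use.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical benchmark body: evaluates x3 = x1 + 2.0 * x2 once per step,
// changing x1[0] before each evaluation so no two steps are identical,
// and accumulating x3[0] so every step's result is actually needed.
double checksum_over_steps(std::vector<double>& x1,
                           const std::vector<double>& x2,
                           std::size_t num_of_steps)
{
    assert(!x1.empty());
    assert(x2.size() >= x1.size());

    std::vector<double> x3(x1.size(), 0.0);
    double checksum = 0.0;
    for (std::size_t step = 0; step < num_of_steps; ++step) {
        x1[0] = static_cast<double>(step);   // input now depends on the step
        for (std::size_t i = 0; i < x3.size(); ++i)
            x3[i] = x1[i] + 2.0 * x2[i];
        checksum += x3[0];                   // result feeds the return value
    }
    return checksum;
}
```

The extra store and add perturb the measurement slightly, so the same harness should wrap every variant being compared.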
Re: [proto] proto performance
1/ How do you measure performance? Anything which is not the median of 1-5K runs is meaningless.

2/ Don't use contexts; transforms are usually better optimized by compilers.

3/ Are you using gcc on a 64-bit system? On this configuration a gcc bug prevents proto from being inlined.
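A minimal sketch of the median-of-many-runs measurement from point 1/, assuming C++11 or later and std::chrono::steady_clock as the timer. The names median and median_seconds are made up for this illustration, and the posted test code may measure differently.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Median of a non-empty sample set. Taken by value because nth_element
// reorders the elements.
double median(std::vector<double> samples)
{
    assert(!samples.empty());
    const std::size_t mid = samples.size() / 2;
    std::nth_element(samples.begin(), samples.begin() + mid, samples.end());
    const double upper = samples[mid];
    if (samples.size() % 2 == 1)
        return upper;
    // Even count: the lower middle is the largest element left of mid.
    const double lower =
        *std::max_element(samples.begin(), samples.begin() + mid);
    return 0.5 * (lower + upper);
}

// Runs work() `runs` times and returns the median wall time in seconds.
template <typename F>
double median_seconds(F&& work, std::size_t runs)
{
    assert(runs > 0);
    std::vector<double> samples;
    samples.reserve(runs);
    for (std::size_t r = 0; r < runs; ++r) {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double>(stop - start).count());
    }
    return median(samples);
}
```

A call such as median_seconds(kernel, 1000), with kernel being whatever callable wraps the expression under test, would match the 1-5K runs mentioned above. The median is less sensitive than the mean to the occasional run disturbed by the OS scheduler.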