On Tue, Sep 20, 2016 at 11:29 AM, Carsten Haitzler <ras...@rasterman.com> wrote:
> On Tue, 20 Sep 2016 08:27:25 -0300 Gustavo Sverzut Barbieri
> <barbi...@gmail.com> said:
>
>> >  static inline void *
>> > @@ -104,12 +101,11 @@ eina_array_foreach(Eina_Array *array, Eina_Each_Cb
>> > cb, void *fdata) Eina_Bool ret = EINA_TRUE;
>> >
>> >     EINA_ARRAY_ITER_NEXT(array, i, data, iterator)
>> > -     if (cb(array, data, fdata) != EINA_TRUE)
>> > -       {
>> > -         ret = EINA_FALSE;
>> > -         break;
>> > -       }
>> > -
>> > +     {
>> > +        if (cb(array, data, fdata) == EINA_TRUE) continue;
>> > +        ret = EINA_FALSE;
>> > +        break;
>> > +     }
>> >     return ret;
>> >  }
>>
>>
>> for these wouldn't EINA_LIKELY/UNLIKELY be more clear on what is the
>> expected/preference to the user AND the compiler. What you did is
>> likely to work for one compiler but not the other, if branch
>> prediction preference is difference.
>
> no. likely/unlikely dont do anything useful since 486 architecture days. back
> then the cmp would always predict either true or false (i dont remember) so 
> the
> compiler SHOULD adjust the compare so the more likely outcome is what the cpu
> will then just blindly run while waiting for the result. its is not true 
> anymore
> and these really do nothing useful.

it should have the same impact, but admittedly I did not test to guarantee.



>> Also, with clearlinux they use AutoFDO
>> (https://gcc.gnu.org/wiki/AutoFDO), likely in future most distros
>> should use that, thus this kind of micro-optimization is not that
>> helpful.
>
> incorrect. see my mail reply to cedric. i sped up eoid lookup by almost 3x by
> doing this. its a valid micro optimization. compilers are not that smart. they
> do put the if () content code right where it is in the c, ie right next to the
> compare and after it. i checked the asm output ... or well i was reading the
> asm output while profiling and noticed this... and the whole 2-3x speedup came
> from doing these micro-optimizations based on real code output by a very 
> recent
> compiler.

Did you check this AutoFDO? It's a profile based optimization, it will
collect data of the usage and re-compile using that as input.

I've heard of impressive gains. I recall yocto-project guys saying
that bitbake (their core tool) was much faster if python was compiled
with that... I recall Ubuntu or Fedora already did it.

The issue is more on cross-compilation, since you can't run the tests
to then recompile.



>> Given that we already run software we just compiled in the build tree
>> we could even integrate autofdo to the build system (like compile -
>> benchmark - compile)?
>
> not so easy. but as above. i dropped eoid lookup from ~6-7% to 2.5% of cpu 
> time
> just through goto's and moving "rarely accessed paths" out of the way of the
> code as the compiler LIERALLY would produce asm like the c code is written.
>
> a_stuff();
> if (x) {
>   b_stuff();
> }
> c_stuff();
>
> literally the asm here has
>
> a_stuff asm
> cmp x
> jne after
> b_stuff asm
> after:
> c_stuff asm
>
> with -O3 -march=native on gcc 6.1
>
> it is totally worth it. TOTALLY.

ok, I get that. But most of the cases above were returns... which are
very small and will not change the code that much. Not to say that
since these are static inline functions, when inlining the compiler
will remove checks if they were previously evaluated.



>> > diff --git a/src/lib/eina/eina_inline_hash.x
>> > b/src/lib/eina/eina_inline_hash.x index ab87960..114b584 100644
>> > --- a/src/lib/eina/eina_inline_hash.x
>> > +++ b/src/lib/eina/eina_inline_hash.x
>> > @@ -33,11 +33,13 @@ eina_hash_djb2(const char *key, int len)
>> >     unsigned int hash_num = 5381 ^ eina_seed;
>> >     const unsigned char *ptr;
>> >
>> > -   if (!key) return 0;
>> > -   for (ptr = (unsigned char *)key; len; ptr++, len--)
>> > -     hash_num = ((hash_num << 5) + hash_num) ^ *ptr; /* hash * 33 ^ c */
>> > -
>> > -   return (int)hash_num;
>> > +   if (key)
>> > +     {
>> > +        for (ptr = (unsigned char *)key; len; ptr++, len--)
>> > +          hash_num = ((hash_num << 5) + hash_num) ^ *ptr; /* hash * 33 ^ c
>> > */
>> > +        return (int)hash_num;
>> > +     }
>> > +   return 0;
>> >  }
>>
>>
>> for these, which are error handling, EINA_UNLIKELY() is much, much
>> better to read and doesn't change the whole code because of 1 line.
>
> been there. done that. EINA_UNLIKELY had no effect. see above.

in this example it's adding one instruction. That's not impacting the
cacheline as you say. If the error handling code was much bigger, I'd
agree.


>> >  static inline Eina_Bool
>> >  eina_lock_new(Eina_Lock *mutex)
>> >  {
>> [...]
>> > +   return _eina_lock_new(mutex, EINA_FALSE);
>> >  }
>>
>> since you did not had a symbol before, why not just introduce one
>> without the "_"?
>
> as per my commit. i was tossing up but then i would have to change the
> prototypes so i went for the least intrusive version for now. i was on the
> plane 30kft up not sleeping on my laptop... :)

guessed it was something like that which produced this patchset...
really looks like "i'm bored at the airplane"


-- 
Gustavo Sverzut Barbieri
--------------------------------------
Mobile: +55 (16) 99354-9890

------------------------------------------------------------------------------
_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel

Reply via email to