On Wed, Nov 4, 2015 at 12:38 AM, Carsten Haitzler <[email protected]> wrote:
> On Sun, 1 Nov 2015 22:22:47 -0200 Felipe Magno de Almeida
> <[email protected]> said:
>
>> OK,
>>
>> So, I tried to take a stab at it during the weekend.
>>
>> I think all the optimizations are actually hurting performance. I
>> wanted to test removing eo_do and the whole machinery for stacks
>> etc., and just use _eo_obj_pointer_get. However, for some reason,
>> mixins and composites stopped working and I don't have much time to
>> investigate.
>
> but... did you get any perf numbers?
Unfortunately no, not enough time. Just adding the object to each
function already took a lot of time. Unfortunately this is just a
guess, but if we are going to have this much trouble, we should at
least prove that eo_do really brings any benefit. Otherwise we might
just be doing pessimizations instead of optimizations, which seems
_very_ likely to me: by trying to be faster than C++, an impossible
goal given our requirements, we end up running even more code.
Besides, we have Eolian generation to help us with optimizations. For
example, maybe we should look into devirtualization of function calls
instead of caching results in TLS stacks with multiple allocations (a
rough sketch of what I mean is in the P.S. at the end of this mail).

> because what you propose is that now we have to do the eoid -> obj
> ptr lookup via table every call and we can't batch and share within
> an eo_do chunk.

The caching is more expensive because of the synchronization. The
lookup is a hash-like lookup, so it should be faster than TLS, which
does a hash-like lookup itself, plus more work on top.

>> I think this test should be done. After all, if we write a lock-free
>> table_ids data structure, then we would be _much_ better off than
>> using TLS and allocations all over the place.
>
> indeed lock-free tables might help too but we can use spinlocks on
> those which is ok as table accesses for read or write should be
> extremely short lived. that's MUCH better than tls.

Spinlocks are not lock-free. But even that is likely to be faster than
the amount of code we currently run in order to optimize.

[snip]

> this means the oid lookup each and every time... that's not pretty.
> :( thus... perf numbers? sure - atm we do a lot of
> eo_do(obj, func1()); i.e. only 1 func - we don't use the eo_do
> batches much yet...

Unfortunately I won't have time for that, and yes, it looks bad on
paper. However, the table is likely to stay in cache if we design it
correctly, and the stack maintenance is not cheap either: it requires
allocations from time to time. We could probably do a table lookup in
much less than 100 instructions and a dozen data accesses (see the
P.P.S. at the end of this mail), and if we can make it really
lock-free we will have very little synchronization overhead. I don't
think we can do the same with eo_do, which _requires_ us to go through
some kind of global state to fetch the object we are calling (which is
what creates our synchronization problems).

>> I think that eo_do is very likely hurting performance. So we should
>> at least prove that it does give better performance before we start
>> using macros all over the place, which will be necessary to avoid
>> some TLS.
>
> my actual profiling shows it's the call resolve and fetching of the
> class data (scope data) that really are costing a lot. those are the
> huge things - and those have to be done one way or the other. so
> dropping eo_do doesn't help at all here.

It does help if we kill eo_do and start optimizing exactly those
paths. Right now, just making Eo work correctly is not an easy task.
Unfortunately I won't be able to prove it either way and will only be
able to get back to this by the end of November. However, if we do not
freeze the Eo interface right now, we would have more time to bring
data to the discussion, or someone else might be willing to try.

>> Best regards,
>> --
>> Felipe Magno de Almeida

Kind regards,
--
Felipe Magno de Almeida
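
P.S.: To make the devirtualization point more concrete, here is a
rough, self-contained sketch of what I mean. None of the names below
are the real Eo/Eolian API; they are made up purely to show the shape
of the code a generator could emit.

  #include <stdio.h>

  /* --- hypothetical types, purely for illustration --- */
  typedef struct _Object Object;
  typedef struct _Class { void (*size_set)(Object *o, int w, int h); } Class;
  struct _Object { const Class *klass; int w, h; };

  static void _rect_size_set(Object *o, int w, int h) { o->w = w; o->h = h; }
  static const Class RECT_CLASS = { _rect_size_set };

  /* what every call pays with dynamic dispatch: resolve the function
   * through the class at run time (plus, today, the TLS call-stack
   * bookkeeping done around it) */
  static void size_set_dynamic(Object *o, int w, int h)
  {
     o->klass->size_set(o, w, h);
  }

  /* what generated code could emit when the concrete class is known
   * at the call site: a direct, inlinable call - no resolve at run
   * time */
  static void size_set_direct(Object *o, int w, int h)
  {
     _rect_size_set(o, w, h);
  }

  int main(void)
  {
     Object r = { &RECT_CLASS, 0, 0 };
     size_set_dynamic(&r, 10, 20);
     size_set_direct(&r, 30, 40);
     printf("%dx%d\n", r.w, r.h); /* prints 30x40 */
     return 0;
  }

Of course this only applies where the generator can actually prove the
concrete type at the call site (e.g. right after construction), so it
is not a general replacement for dynamic dispatch, just a way to avoid
the resolve cost on the hot, statically-known paths.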

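P.P.S.: And here is a rough sketch of the kind of eoid -> obj pointer
table lookup I have in mind. Again, the names, the id layout and the
fixed-size table are made up for illustration; a real implementation
would grow the table dynamically and needs extra care around slot
reuse and reclamation to stay safe for concurrent readers.

  #include <stdatomic.h>
  #include <stdint.h>

  /* hypothetical id layout: low bits = slot index, high bits = generation */
  #define SLOT_BITS 16
  #define SLOT_MASK ((1u << SLOT_BITS) - 1)

  typedef struct
  {
     _Atomic(void *)  ptr;        /* object pointer, NULL while the slot is free */
     _Atomic uint32_t generation; /* bumped every time the slot is reused */
  } Slot;

  static Slot table[1u << SLOT_BITS];

  /* writer: publish the pointer first, then the generation, so a
   * reader that sees the new generation also sees the pointer */
  static uint32_t
  id_register(uint32_t slot, uint32_t gen, void *obj)
  {
     atomic_store_explicit(&table[slot].ptr, obj, memory_order_relaxed);
     atomic_store_explicit(&table[slot].generation, gen, memory_order_release);
     return (gen << SLOT_BITS) | slot;
  }

  /* reader: a mask, a shift, two loads and a compare - a handful of
   * instructions, no lock, and hot slots stay in cache */
  static void *
  obj_from_id(uint32_t id)
  {
     uint32_t slot = id & SLOT_MASK;
     uint32_t gen  = id >> SLOT_BITS;

     if (atomic_load_explicit(&table[slot].generation,
                              memory_order_acquire) != gen)
       return NULL; /* stale or invalid id */
     return atomic_load_explicit(&table[slot].ptr, memory_order_relaxed);
  }

  int main(void)
  {
     int dummy;
     uint32_t id = id_register(1, 1, &dummy);
     return obj_from_id(id) == &dummy ? 0 : 1;
  }

This is what I mean by "much less than 100 instructions": the happy
path is a couple of loads and a compare. The hard part is making slot
reuse safe without locks (re-checking the generation after use, or
deferring reuse), and that is where the real lock-free work would be.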