Amen!

On Wed, Sep 7, 2016 at 1:13 AM, Carsten Haitzler <ras...@rasterman.com> wrote:
> On Wed, 7 Sep 2016 08:13:54 +0900 Carsten Haitzler (The Rasterman)
> <ras...@rasterman.com> said:
>
> > > Assuming it's what we talked about on IRC, I'm happy with it. It's
> > > essentially an implementation of what I wanted from the beginning (Eo
> > > being single threaded and if you want multi-thread, do something like
> > > socketpair()) with the added coolness of enforcing that IDs are unique
> > > per thread. We'll probably need to change the error messages about ID
> > > not found to maybe check the other threads if the ID is valid there so
> > > we can say something like: "You tried accessing object X from thread Y,
> > > but it's actually an object of thread Z. DENIED! Read more about
> > > foobar().".
> > >
> > > My only concern, as always, is performance. I know you said locks are
> > > slower than TLS, but while we only needed to lock on write, now we use
> > > TLS on both read and write, so it is a concern.
> > >
> > > Hope it pans out well. I'm looking forward to seeing benchmarks (and of
> > > course, tests with threads etc.). :)
> >
> > well it's just as slow as the spinlocks... as long as i have the optional
> > locking support. literally the if (tdata->shared) or if (domain == shared)
> > then lock+unlock ... just that if alone causes it to go from 2.5% to 5%.
> > without the if we are back to almost pre-locking speed (a bit worse. w vs
> > 2.5%) plus all the goodies.
> >
> > so choices:
> >
> > 1. drop the idea of shared objects at all and you HAVE to send an obj
> >    (socketpair like above plus fd passing where ownership transfers).
> > 2. find some way to optimize this... this is not easy. i've tried a few so
> >    far. i'll try some more, but i suspect i'll hit a wall.
> > 3. give up and accept the fate of the lock!
> >
> > i dislike #3. #1 is the frontrunner BUT i won't go there until #2 is well
> > beaten into a pulp. i actually had an idea in bed that .... may help. i
> > noticed the asm output/dump matching the c code seems to have a LOT of the
> > actual if handling interspersed with the code and maybe just maybe all the
> > if's and their child code cause an l1 cache miss to l2 since the cachelines
> > are not packed with instructions we execute, so moving all the error
> > handling to one big error block at the end might be an idea... another idea
> > i have as to the cost of THESE if's is the speculative execution. the cpu
> > might be executing BOTH branches ready to throw away one of them ... but
> > since one branch is a spinlock this causes a huge cpu stall anyway as it
> > snoops caches with other cpu's etc. and so everything suffers as if we
> > actually called the lock anyway. if this is true... UGH. PAAAAIN.
> > scratching head right now as to a solution.
>
> well well well... my ideas of the l1 cacheline + instruction fetching were
> right! i'm down to 2.7% WITH optional spinlocks. how to fix this? move all
> slow-path error handling to a bunch of goto's at the end of the function (one
> label + return NULL per error type). with gotos this moved the error handling
> from inline in memory to "at the end of the function in a block" and
> presto... speedup. well well well. i'll be. i never considered this before.
> mental note. keep in mind the instruction cache + cachelines.
>
> they say "don't use goto". begone ye pesky comp sci theorists... goto is
> AWESOME! seriously. this is a major thing. we have TONNES of if's in our
> code. TONNES of them, often for handling odd cases that sometimes happen. if
> the handling is really anything more than "return" or "break" or "continue"
> or a simple op (i++, i=10, i--, etc.) AND it's not a common path
> (EINA_UNLIKELY) it is TOTALLY worth putting the case handling code in a
> separate code block. there is a real measurable speedup. in real life. no
> callgrind or virtual cpu stuff.. with real memory and real caches and so on.
> well well well.
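For anyone skimming the thread, here is a minimal sketch of the shape described above: the hot path is only compare-and-branch, and each rare case jumps to its own label (handling plus return NULL) in one block at the end of the function. Everything in it is made up for illustration and is not the actual Eo code: the Entry table, entry_lookup(), last_error, and the UNLIKELY macro, which only stands in for EINA_UNLIKELY so the sketch builds without EFL. Whether this layout helps in a given function depends on the compiler and the CPU, so it is worth measuring on real hardware as was done above.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for Eina's EINA_UNLIKELY so this builds without EFL
 * (uses the GCC/Clang __builtin_expect builtin when available). */
#if defined(__GNUC__)
# define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
# define UNLIKELY(x) (x)
#endif

#define TABLE_SIZE 4

/* Hypothetical object table; NOT the real Eo data structures. */
typedef struct {
    int generation;     /* 0 means the slot is unused */
    const char *name;
} Entry;

static Entry table[TABLE_SIZE] = {
    { 1, "button" }, { 0, NULL }, { 3, "window" }, { 0, NULL }
};

enum { ERR_NONE, ERR_RANGE, ERR_UNUSED, ERR_STALE };

static int last_error; /* records which rare path was taken, if any */

static const Entry *
entry_lookup(int index, int generation)
{
    const Entry *e;

    /* Hot path: each check is a single test plus a jump. */
    if (UNLIKELY(index < 0 || index >= TABLE_SIZE)) goto err_range;
    e = &table[index];
    if (UNLIKELY(e->generation == 0)) goto err_unused;
    if (UNLIKELY(e->generation != generation)) goto err_stale;
    last_error = ERR_NONE;
    return e;

    /* Cold block: one label + return NULL per error type. */
err_range:
    last_error = ERR_RANGE;
    fprintf(stderr, "lookup: index %d is out of range\n", index);
    return NULL;
err_unused:
    last_error = ERR_UNUSED;
    fprintf(stderr, "lookup: slot %d is unused\n", index);
    return NULL;
err_stale:
    last_error = ERR_STALE;
    fprintf(stderr, "lookup: slot %d has generation %d, not %d\n",
            index, e->generation, generation);
    return NULL;
}
```

Calling entry_lookup(0, 1) stays on the hot path and returns the "button" entry, while entry_lookup(9, 1) takes the err_range label and returns NULL.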
>
> i hereby BLESS the goto and encourage its use for any "rare case handling"
> as above. (actually the code is more readable too!) :)
>
>
> --
> ------------- Codito, ergo sum - "I code, therefore I am" --------------
> The Rasterman (Carsten Haitzler)    ras...@rasterman.com
>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> enlightenment-devel mailing list
> enlightenment-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel