On Sat, 3 Sep 2016 19:38:42 -0300 Gustavo Sverzut Barbieri <[email protected]> said:
> > So here is an idea. I checked. A TLS lookup for any var is 1/5th the cost
> > or so of a lock+unlock. We could use __thread and this is FREE (no cost -
> > well it's a mov only), but this is only free in binaries not shared
> > libraries, so let's talk TLS which is much cheaper than a lock anyway. Why
> > TLS?
> >
> > 1. It will drop out cost above a lot.
> > 2. We can remove several other locks in EO too cutting some more costs.
> > 3. We can REALLY CHEAPLY enforce the rule of "you may not access an object
> > outside its owning thread". Because every thread has its own EOID table,
> > the EOID will be local to that thread only. Looking it up in another thread
> > is looking up an "invalid" EOID. In fact we can make this pretty much
> > always fail by using some domain bits in EOID like i mentioned above (steal
> > some from table entries and/or generation count). Now literally stuff will
> > FAIL and not magically work 99% of the time if you disobey these rules and
> > access a rectangle object or a timer from another thread that doesn't own
> > them as the objects literally are not in the local thread EOID table. The
> > thread cannot see them (well is very unlikely to see them).
>
> this is a very good approach and the side-effect is a nice one :-)
>
> given that we do not propose or use lots of multi-thread, particularly
> multiple threads competing for the same object, this looks the best
> path forward.

yeah.. BUT we get lots of noise from people wanting to use multiple threads
with efl. we have ways (ecore_main_loop_async_call/sync_call,
ecore_main_loop_begin/end, ecore_thread with "result" cb's in mainloop etc.).
the way their mindset works, from what i have gathered, is that all efl funcs,
be it widgets, timers, whatever, can be used anywhere in any thread and all
will be magically fine. that's nigh impossible, especially with a mainloop and
ui. especially ui.
ui would end up half-rendering a ui with the widgets half configured if a
thread was still setting up something when a render pass came in. thus the
"render when idle" method. the begin/end method above allows you to basically
sync with mainloop from a thread, take a lock, do your stuff and get out again.
to avoid glitches like above we'd need this anyway, so we have it. we should be
using more threads - though mostly inside efl to make an op async or faster so
blocking time is shorter, but we still need to support people. we cannot
support the way they think threads should be done, but with the begin/end we
get as close as is actually sanely possible.

so while we don't use a lot of threads, others do. that is one of the design
aspects for efl interfaces making the mainloop an object. the idea is you could
have different threads with different loops. there will always be a MAIN loop
(in the thread with main() called), BUT why not have other loops in threads
that can communicate to/from the main loop? this would make these people above
much happier. also this "split eo objects" means they could use non-ui objects
pretty easily in these threads - eg efl.net and thus a local EOID table would
make a lot of sense.

> > 2. This thread init sets up an initial generation count at a random value so
> > generation counts can't easily be in sync and maybe swizzles the order
> > EOID's are found/allocated and so on to minimize ID "sameness".
>
> random is bad, maybe we can come up with some way to guarantee it will
> not collide with the threads in use?

only the domain bits would guarantee that. generation count start value is
irrelevant really, i was just thinking to make it random to avoid 2 threads
having the same creation patterns and thus being likely to have the same gen
count AND the same table entry allocation.

> > 4.
> > We make it a clear rule that threads cannot access objects outside of
> > those that they created UNLESS:
> > 4.1 An object is explicitly SENT from one thread to another (we can do this
> > later but if this is done, the object must have a refcount of 1 only, no
> > parent, no children, no objects referenced in keys, weak refs to/from this
> > object etc.). We can release the EOID entry in thread 1, but not call
> > destructor and free object memory, send the POINTER to thread 2, and here a
> > new EOID local to that thread is allocated and that pointer adopted.
>
> Not sure this is good. AFAIR we have some cases where we start a
> working thread, do something with that object in that thread, then
> send it back to the main thread to be used. This used to be cheap, now
> it won't.

ummm we actually didn't allow objects at all to work this way WITHOUT a
begin/end. you could send DATA and have that worker calculate data, do i/o etc.
then send result back to mainloop to implement on objects (or do it directly
with a begin/end section).

without the ability to send you can never transfer an object from thread to
thread. my thoughts on this were for messaging or for setting up message
"pipes" with objects. one at each end. th1 creates 2 objects (2 ends of a comm
pipe - 2 way like a socket), then sends one end to th2. th2 gets a "i have a
new object" callback in its loop, discovers that it's the other end of that
pipe and now that's an object locally in its eoid table. the objects are bound
internally like a socketpair/pipe are in the kernel. we'd need to be able to
send objects to do this. we'd have to have the internals work if either end was
deleted while the other lives etc. there would be limitations. but it'd allow
you to set up multiple comms pipes from any thread to any other and each thread
can just release its end when it's done.

i was also thinking of object sending for inter-thread ipc.
send a message from th1 to th2 and there can be an optional object as a
payload. it has to be simple (like no children etc.) BUT this would then be
possible. yes. you have to release from the eoid table on one end and alloc in
the eoid table on the other. not cheap/free but better than spinlocks. :)

> can't we just flag it somehow and for those we spinlock? Objects are
> thread-private unless they are efl_add_multithread()? then you start
> with the spinlocks for that eoid.

this gets more complex. you do not know if the obj needs locking or not and
every entry/exit point needs to check if it needs it then do a lock/unlock.
we'd add LOTS of code we don't even have right now to every eo base class
method and every class you inherit too. you have an issue with locks on objects
too with cb's - you have to unlock in a cb then re-lock on return from calling
the cb. it adds a lot of code.

if we allocate a bit in eoid to know if it's a sharable object (thus needs
locks), we still have an issue that if we have different local eoid tables
there will be an id CLASH where the id can exist in both tables (ignore the
"shareable and needs locking" bit). you would have to synchronise all eo tables
and their content (but leave foreign content as NULL in the leaf nodes to avoid
being able to access it). this will mean we still need locks on the EOID table
AND making the tables sync now will raise costs as you have to "stop all
threads and sync" on every alloc/release of a table id. a local EOID space that
just ignores all others until some is mapped in with a "stop that thread"
assumption or the explicit "adopt a new obj ptr into your eoid table" would be
far cheaper as it puts these costs only in those places for those objects/cases
and not the other 99% :) object sending is far less code, and i think it'll be
rare to send objects compared to all other eo transactions.
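
A rough sketch of the release-then-adopt step described above may help. It is
not EO code: Eoid_Table, eoid_alloc(), eoid_lookup(), eoid_release() and
obj_send() are made-up names, the [generation:16][index:16] id layout is an
assumption, and two plain tables stand in for the per-thread TLS tables. The
restrictions mentioned above (refcount of 1, no parent, no children) are not
checked here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 64

/* Hypothetical per-thread id table: object pointer plus generation per slot. */
typedef struct {
    void    *ptr[TABLE_SIZE];
    uint16_t gen[TABLE_SIZE];
} Eoid_Table;

typedef uint32_t Eoid;  /* assumed layout: [gen:16][index:16], 0 = invalid */

/* Put obj into the first free slot and return a fresh id (0 if table full). */
static Eoid eoid_alloc(Eoid_Table *t, void *obj)
{
    assert(obj != NULL);
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        if (t->ptr[i] == NULL) {
            t->ptr[i] = obj;
            t->gen[i]++;
            if (t->gen[i] == 0) t->gen[i] = 1;  /* keep id 0 reserved */
            return ((Eoid)t->gen[i] << 16) | i;
        }
    }
    return 0;
}

/* Resolve an id in ONE table only; ids from other tables or stale ids fail. */
static void *eoid_lookup(const Eoid_Table *t, Eoid id)
{
    uint32_t i = id & 0xffff;
    if (id == 0 || i >= TABLE_SIZE) return NULL;
    if (t->gen[i] != (uint16_t)(id >> 16)) return NULL;
    return t->ptr[i];
}

/* Drop the table entry but keep the object alive (no destructor, no free). */
static void *eoid_release(Eoid_Table *t, Eoid id)
{
    void *obj = eoid_lookup(t, id);
    if (obj) t->ptr[id & 0xffff] = NULL;
    return obj;
}

/* "Send": release in the source table, adopt the same pointer in the
 * destination table under a new id. In a real setup the pointer would travel
 * in a message between threads; here both tables are touched directly. */
static Eoid obj_send(Eoid_Table *from, Eoid id, Eoid_Table *to)
{
    void *obj = eoid_release(from, id);
    if (!obj) return 0;
    return eoid_alloc(to, obj);
}
```

After obj_send() the old id no longer resolves in the sender's table and the
object is only reachable through the new id in the receiver's table. Note that
with two identical tables the old and new id values can be numerically equal,
which is the "sameness" that the random generation start and the domain bits
are meant to reduce.
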

> In that sense we could even use that same bit to the eo operations in
> the object itself, the obj would have a mutex on its own and all
> calls/events would be guarded by that... kind of a "@synchronized" in
> other languages.

you would need synced EOID tables. you COULD send an object WITHOUT releasing
at the other end. it now has 2 EOID's that refer to it. this is something i
think we could do later and THEN when a SHARED object is deleted you need to
know all tables it's shared between, know all the EOID's that map to it, then
message those other threads to release their eoid refs and when the last one is
released the obj is actually deleted. likely we would need to still have a
master owning thread with the others having a shared eoid "view". we can have a
bit in the eoid table (no need to use an EOID bit in the ID) to know that your
entry is a shared one and someone else has the master copy (in the master table
another bit to know that this obj is now shared out and other threads have a
ref and you need to wait for them all to message you, release their thread
refs, then when those are at 0, you can call some callback to tell the master
thread that owned/created it that everyone is done and it can then release its
ref). this ALSO means all eo methods have to use the mutex you describe above.
this is a LOT MORE work as i said to have a totally shared object and i think
we can do this later without breaking api/abi and just internally, but it's too
much work for the moment.

> > 7. It's an EO (and EFL)-wide rule that you should not make threadsafe
> > objects because EO just won't support it - you have to explicitly send
> > objects around or do a begin/end of another thread to look at its objects
> > (and then that is limited to a thread of a different domain - we have 4 so
> > not bad).
>
> it's a reasonable rule, we can remove the "@synchronized" thing from
> above if it's not easy to implement.
> But if we could easily add that,
> then we can drop the rule and extend. Initially I'd go with your
> proposed rule.

the main reason it's not easy is the need to lock at every eo func/method
invoke, unlock on every exit point in the method, and to unlock at every
callback call within this locked state and lock again on return from the cb.
that has to go everywhere. :(

at the moment #7 is actually our rule for ALL "objects" in efl except for a
few, and the only owning thread is mainloop. so it's less limited. we have the
issue of main loop begin/end where i have a proposed solution here with
domains, and i just realised ecore_thread (and some class functions i
think...) which has functions to send feedback, check if you are cancelled
etc. - we could use the EOID sending if we had eo equivalents, but we don't. we
could keep ecore_thread legacy only like we do now and design something new.
that'd be the way to go i think.

> > When you are in a begin/end section and you see 2 EOID tables, when you
> > CREATE a new object... which one does it go into? Remember that when you
> > CALL a method on an obj it may go create objects internally too. How can
> > you determine which to use? You should be able to access both without
> > creating without issue with domains as above. You could delete fine since
> > an object knows which table it belongs to in the current thread context
> > based on domain number. They will be different. You can't bind a foreign
> > domain in if it matches yours - it'll fail. But creation is special.
> >
> > One option... if you create WITH a parent passed, the child must go into the
> > same domain automatically. Operations mixing domains in an object tree
> > should fail. What about other cases? Create a bare object with no parent...
> > you add as a child later. How to choose which domain it goes into? Local or
> > foreign? Maybe there is a context you can switch that is in your TLS that
> > tells you which to use (local or foreign table).
> > If we have a push/pop
> > setup it'd be nice, but it's easy to get wrong. An explicit call to create
> > with foreign and eo_add is local? So eo_foreign_add() uses the foreign
> > domain (if adopted at the time, and if not it will either fail or just use
> > local domain then). Worth thinking about.
>
> Raster, you lost me here... I guess you have too much in your mind and
> assumed it was clear, at least it's not clear to me what you meant...
> and I read these 2 paragraphs a couple of times :-)

oh... yeah. sorry. :) ummm we have the begin/end thing right? you have a
mainloop and a thread. the thread can call "ecore_main_loop_begin()" and this
will sync with mainloop and STOP the mainloop at a safe point, then the func
will return and any code you now run is "assumed to be in the mainloop
context". you can mess with ui and create timers and everything, until
"ecore_main_loop_end()" which releases this lock and lets the main loop
continue on. the idea is every efl loop will have these begin/end methods that
let you sync and lock out the loop and PRETEND to be that loop for a hopefully
small section of code that for example updates the ui with data you have
locally, then releases the loop again to keep running.

*IF* we use TLS then during this period where you pretend to be another loop
... you STILL CANNOT see the other loop's objects because they live in that
thread's TLS data. right? so how to solve this? we can just move over the tls
pointer from mainloop to thread temporarily, then do your stuff, then release.
fine. during this block you CANNOT access your "local" objects at all because
your whole EOID namespace switched. they will be mainloop EOID's not yours.

can we solve this? yes we can! EO has 2 EOID pointers in TLS. 1 is "local". the
other is "foreign". 99% of the time foreign is NULL. when you do a begin,
foreign then gets the ptr for the "local" EOID table of the thread you are
doing begin on.. so stop, block other thread and continue.
it is now SAFE to continue as there is no contention as only 1 thread is
working on this data at all.

but how do you know an EOID is your local one or a foreign one? solution,
allocate 2 bits in the EOID that is like a "thread id" but i am calling it a
domain ID. your domain for your thread MUST be different to the one you are
doing begin on otherwise this cannot work. so i would make the mainloop ALWAYS
have domain 0, and other threads can choose (with the default being 1, and
expecting the threads then to only do begin/end on mainloop and no other
threads, but we have 2 more values (2, 3) that can be used for other threads
then a thread in domain 1 can do begin/end on one in domain 2 or 3 etc.). this
domain value (2 bits, value 0 to 3), will let us know to look in the local
table OR the foreign table for the object. we know the local domain id and if
domain in EOID == local id, look in local table OR OTHERWISE look in the
foreign table (we can just make it an array of 4 items - one per domain slot
and look in that slot. when you begin() on another thread it puts that thread's
LOCAL table into your slot for that domain locally so now everything can be
accessed).

NOW we can access BOTH our local objects and the objects of the "foreign"
thread we have done begin and end on. in fact if we use the above array any
single thread can begin() on up to 3 other threads at any time and access
everything. the issue is with creation of new things. where do they go?

> To summarize my understanding with your restrictions: you create the
> object in a thread OR send it to a thread. Then when you create, the
> domain and all are all set to the current thread. Parent needs to be

well i'm asking.. what domain should created stuff belong to when you have
multiple domains mapped into your thread.

> used, thus of course it's only valid in that thread.
> If you send the
> object, some special machinery would remove its availability in the
> current thread and create a new one in the secondary thread... (which
> is more complex when we think about children, what to do... and even
> more complex if we think about other, non Eo resources that may happen
> to cause problems, like imagine you send a Efl.Net.Dialer.Http to a
> secondary thread, CURL will barf).

well i wasn't thinking of sending at all. just needing to decide WHICH domain a
created object belongs to. once in a domain it has to stay there. same with
children, parents etc.

> Maybe we should focus more on easily communicating between two main
> loops/threads? That way you do not need to pass objects and hit the
> above complexity. All you do is to send information, and on the
> target thread you do the actions, like create the object.

as above - messaging between loops is on the cards. being able to send objects
that REPRESENT some complex piece of data would be nice. imagine a simple
"database" object where you can query by key, row, column, path etc. - like an
sql object for example (urgh ok not sql but you get the point) and this object
is a database object which is really just an obj representing a big backend
store of data you can read/write. you want another thread to access data?
create db object, send it over. that's the kind of thing i think sending should
be used for.

> > This buys us REALLY NICE "thread safety" in the way that objects are just
> > not allowed to span threads. They must be explicitly sent over and thus
> > ownership (and EOID value) changes, or you must explicitly do a begin/end
> > on another thread and adopt its ID table into your local space as a
> > "foreign" table that allows you definitely "read only" access easily, even
> > the ability to modify and delete, but just creation is tricky.
> > This really
> > will clear up lots of mistakes we have been seeing from code that uses
> > threads and does "bad things" that happen to work 99% of the time then fail
> > oddly 1% of the time. We don't have to write "is this my thread" checking
> > code in every method because the design will do that mostly for us as a
> > side-effect of TLS and normal EOID checking. This also buys us simplicity
> > when dealing with objects as we can assume the nice old fashioned way of
> > "no need to lock or consider threads - it will not be an issue", and it
> > buys us a good speedup vs what we have now. It does mean another bit of a
> > re-jigging of eo internals and we need to add some API's to be able to do
> > begin/end etc. and we can later add object sending.
>
> reading this and thinking about clear multi-thread cases makes me
> think that we need easier communication more than sharing.

we HAVE to handle the begin/end case. we can't support our legacy api
otherwise. it's a SPECIAL case in that the other thread will be paused, so it's
sharing with no locks, but we HAVE to do it to retain begin/end. i first just
thought of a single bit "mainloop vs everyone else", but i realized that that
is too limiting. 2 bits would make it far better. same idea though. :)

> ecore-con: we need thread to asynchronously resolve names. You do not
> need the actual object to do that, send a string, return a struct with
> the return of getaddrinfo().

oh this is internal threads. not even mentioning that here. :) ecore_con deals
with that internally. it could be a thread or an async resolve etc. - the point
being that it's not exposed. :)

> evas: we need thread to compute what's to be rendered and paint the
> pixels. information sent is what changed and if it was done.
>
> image loading... video decoding...

same for all the above. internals. i'm talking exposed use of threads "in an
app" :)

> I guess the ecore_thread does most of that, if not we can extend a
> little bit.
> But none of them would need to send objects.

you do need to EXPOSE objects though with begin/end. for sending - see above db
object example, or the set up endpoints for communications within a process
etc.

> > I'm rather happy with this kind of direction. We never have to make eo
> > objects thread safe or create a special eo thread safe base class. Ever.
> > You want to talk from thread to thread, then we can have endpoint objects
> > that get created with one object on one end of the msg pipe (like a
> > socketpair() or pipe()), and a different object on the other end (create in
> > one place then send one end to another thread? The object internals hook
> > them up via pipes or threadqueues?). This makes currently "incorrect and
> > dangerous code" fail early instead of 1% of the time. It catches issues
> > fast. It gets us speedups. This potentially impacts promises too, but
> > probably in a good way. This also affects bindings - looking at JS/Lua
> > specifically. If Lua had a threading model right now it'd be to have 1
> > luastate per thread, but this means we can't share objects... this means
> > eo's model is the same and you would have to detach an object from one
> > thread (luastate) and make one appear on the other end. I'm also happy that
> > I think this solves a disagreement on how to do threading. I know we have
> > to somehow and not just stick our heads in the sand. This solves it. It
> > gives a clear optimal model that is relatively robust and efficient.
>
> agreed, but as I said I don't think passing objects will be used, we
> can skip that complexity.

sending is rather simple actually, thus why i mention it. and it can be used
for comms setup. like one end does a bind+listen, the other a connect. one
thread sets up an "incoming comms endpoint add" callback on let's say their
loop object and when another thread "connects" they get a cb with a new obj
representing their end of that comms pipe.
:) that is one way to do arbitrary async point to point comms setup. and
sending db objects around where you use this as "large datastores" and share 1
place at a time, but zero copy is kind of nice and works nicely with bindings.
think of the db object as a higher level version of sending a void * around. :)

> > The downsides are the extra work, the need to add API's to init eo per
> > thread, need to check for objects on thread shutdown (need the TLS free
> > func to do sanity checking for still-alive objects), the need to add all
> > this foreign vs. local table stuff for begin+end and then to hook that in,
> > and the likely need to add at least object sending, and the messaging
> > infra. Also it will use a bit more memory if you use eo across threads and
> > actually create objects in other threads. As well any mempools and other
> > things that we currently share across threads may need to become per-thread.
> >
> > So ... with that. Questions, comments, queries, devils-advocate. Find issues
> > with this. Wrap your head around it. This is very important as this impacts
> > many things subtly and some directly and clearly.
>
> I'm very happy with the TLS side effect of blocking calls from
> multiple threads AND speeding up due to lack of locking.
>
> But I don't think sending object to other threads will be that useful,
> if we make it easier to communicate between threads, just communicate and ask
> that thread to create it for you.

the sending is simple. far far far far more complex is the begin/end stuff.
shared objects are insanely hard to do right and even more work. we HAVE to do
begin/end to maintain api compatibility anyway. so doing a send is a walk in
the park in comparison and has uses FOR the inter-thread communication - like
payload of messages can be a db object for example.

> We could do a special thread communication primitive that sends the
> object internal data, creating a new EOID there... but then we have
but then we have thats EXACTLY what i was thinking of for object sending. it keep the object ptr alive, not calling destructor. it releases the EOID in thread 1, sends ptr to thread 2, it "adopts" the ptr and allocates a new EOID for it. presto. objetct moved threads. the cost is an EOID release and an EOID alloc and otherwise an ipc "pointer" from thread to thread. if you are messaging already this last part is free. the problem is parent, child etc. objects of the one you sent, thus the restrictions on reference count, parent/children etc. > children, then we have non EFL resources like CURL. So at least a well children - disallowed. curl - does it matter? well ok it matters if the data the object is carrying like let's say curl data, is unable to work outside a specific thread. then this is not possible. so that is a good point - this is why it probably makes sense to have maybe a senable() method in eo base that returns true by default (we will make eo base sendable), but any class on top that cannot be send (eg internal data like curl cant move from one thread to another), then it overrides and returns false. design point - only ever override to then return false. never flip it back to true again because a parent class cant be sent. this would limit sending to a few specific objects. we could make it false by default and only enable if you know for sure you and all your parent classes can be sent, but how do you KNOW unless the parent class already returns true? :) > class information saying "never send me to another thread". If we > block sending objects, the constructor could just check if another > thread was used and refuse to create... but if it's created, EO needs > to help by checking that. yup. i'm thinking sendable() method returning true/false. > OTOH I do think that we could use just one bit (instead of multiple > domains as you said) that means "use the global EOID table guarded > with spinlocks". urgh. we could do that. 
BUT you'd need more than 1 bit. for begin/end you need to be able to see 2 eoid
tables at once. one of them would have to be global, one then private. if
mainloop is private then by definition EVERY thread must use global. this is
bad. BUT with domains maybe 0 is private, 1 is global, 2 and 3 are 2 more
private domains as i described. there still is an issue - global ids then mean
people THINK objects are threadsafe and thus they have to be made threadsafe...
and back to the above for that. :)

> Whenever it's feasible or desired to offer some "@synchronized" for
> Eo, like objects created with that bit would have an internal mutex
> and all access would be guarded by it.. need to be careful with the
> deadlocks if methods are calling others, the lock would be already
> acquired... so maybe a recursive mutex?

yeah - i know. this is the pain. recursive mutex also works for not worrying
about unlock on calling a function that exits your "frame" into a child frame
(callback call or any other func/method). i am not sure if recursive mutexes
are portable. it seems to work on most *nix's - not sure on openbsd, and then
windows seems to have them. we COULD have eo actually do the lock/unlock on
method call if the obj is a lockable obj. if we have a global eoid table then
objects in this table will have to be lockable like this. we need to decide if
this is a good path or not. sending is cheap if you do not go back and forth a
LOT. shared with mutexes is better if you do. i actually like the idea of a
specific domain that is shareable, with others private. private == more
performance, but we need different domains like above to SEE multiple private
domains and those EOID tables for begin/end. but the question remains - when
you then do eo_add() which does it belong to? private or sharable (or any other
domain if they exist)?
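
For reference, the domain-bit lookup discussed in this mail could look roughly
like the sketch below. Everything in it is an assumption made for illustration:
the names (Table, thread_init(), foreign_begin(), foreign_end(), table_add(),
add_local(), lookup()), the placement of the 2 domain bits at the top of a
32-bit id, and in particular the choice that newly created objects always land
in the local domain, which is exactly the question left open above. Stopping
the other thread at a safe point is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DOMAINS      4            /* 2 domain bits -> values 0..3 */
#define DOMAIN_SHIFT 30           /* assumed: top 2 bits of a 32-bit id */
#define SLOT_MASK    0x3fffffffu
#define SLOTS        16

typedef uint32_t Eoid;            /* 0 is reserved as "invalid" */

typedef struct {
    unsigned domain;              /* 0 = mainloop by convention */
    void    *obj[SLOTS];
} Table;

/* Per-thread view: one table pointer per domain, NULL when not mapped in. */
static _Thread_local Table   *tables[DOMAINS];
static _Thread_local unsigned local_domain;

static void thread_init(Table *own)
{
    assert(own->domain < DOMAINS);
    local_domain = own->domain;
    tables[own->domain] = own;
}

/* Map another thread's table into our view. A real begin() would first stop
 * that thread at a safe point; that part is left out of this sketch. */
static int foreign_begin(Table *other)
{
    if (other->domain == local_domain || tables[other->domain]) return -1;
    tables[other->domain] = other;
    return 0;
}

static void foreign_end(Table *other)
{
    if (other->domain != local_domain && tables[other->domain] == other)
        tables[other->domain] = NULL;
}

/* Store obj in a table and build an id carrying that table's domain bits. */
static Eoid table_add(Table *t, void *obj)
{
    for (uint32_t i = 0; t && i < SLOTS; i++) {
        if (!t->obj[i]) {
            t->obj[i] = obj;
            return ((Eoid)t->domain << DOMAIN_SHIFT) | (i + 1);
        }
    }
    return 0;
}

/* Assumption: creation always targets the local domain. */
static Eoid add_local(void *obj)
{
    return table_add(tables[local_domain], obj);
}

/* The domain bits pick the slot in the per-thread array; an unmapped domain
 * simply fails, which is the "cannot see it" behaviour described above. */
static void *lookup(Eoid id)
{
    Table   *t = tables[id >> DOMAIN_SHIFT];
    uint32_t n = id & SLOT_MASK;
    if (!t || n == 0 || n > SLOTS) return NULL;
    return t->obj[n - 1];
}
```

A worker in domain 1 cannot resolve a domain 0 id until it has done
foreign_begin() on the mainloop table, and loses access again after
foreign_end(). A shareable global domain would be one more slot in the same
array, with locking added around its table.
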

> --
> Gustavo Sverzut Barbieri
> --------------------------------------
> Mobile: +55 (16) 99354-9890
>
> ------------------------------------------------------------------------------
> _______________________________________________
> enlightenment-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
>

--
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    [email protected]
