Re: [E-devel] RFC: EOID + Threads + TLS proposal

The Rasterman Sun, 04 Sep 2016 21:46:07 -0700

On Sun, 4 Sep 2016 20:26:14 -0300 Gustavo Sverzut Barbieri <barbi...@gmail.com>
said:


> >> given that we do not propose or use lots of multi-thread, particularly
> >> multiple threads concurring for the same object, this looks the best
> >> path forward.
> >
> > yeah.. BUT we get lots of noise from people wanting to use multiple threads
> > with efl. we have ways (ecore_main_loop_async_call/sync_call,
> > ecore_main_loop_begin/end, ecore_thread with "result" cb's in mainloop
> > etc.).
> >
> > the way their mindset works, that i have gathered, is all efl funcs be it
> > widgets, timers, whatever can be used anywhere in any thread and all will be
> > magically fine. that's nigh impossible especially with a mainloop and ui.
> > especially ui. ui would end up half-rendering a ui with the widgets half
> > configured if a thread was still setting up something when a render pass
> > came in. thus the "render when idle" method. the begin/end method above
> > allows you to basically sync with mainloop from thread, take a lock, do
> > your stuff and get out again. to avoid glitches like above we'd need this
> > anyway, so we have it.
> >
> > we should be using more threads - though mostly inside efl to make an op
> > async or faster so blocking time is shorter, but we still need to support
> > people. we cannot support the way they think threads should be done, but
> > with the begin/end we get as close as is actually sanely possible. so while
> > we don't use a lot of threads, others do. that is one of the design aspects
> > for efl interfaces making the mainloop and object. the idea is you could
> > have different threads with different loops. there will always be a MAIN
> > loop (in the thread with main() called), BUT why not have other loops in
> > threads that can communicate to/from the main loop? this would make these
> > people above much happier. also this "split eo objects" means they could
> > use non-ui objects pretty easily in these threads - eg efl.net and thus a
> > local EOID table would make a lot of sense.
> 
> I get all of this and this is because I'm saying we should rather
> focus on the communication part instead of the MT aspect of objects.

communication is on the plans for efl loop etc. - but we have to do this so it
can work across bindings too. that means abstracting stuff, and that's what eo
is for.  :) anyway...

> My perspective on this changed a bit after taking a look at Golang.
> Most people don't get main loops and being called back, most try to
> solve this by spawning threads... but it doesn't solve, just adds to
> the problem.

you may have a really good point here. when i first saw this i went "oh ooooh
that's how that works. that's neat! a loop that runs for me and multiplexes
everything that needs calling. that's so much neater than a massive switch/if
+then+else etc. table and it can be extended without continually hacking away at
this call table! nice!" at least for general loops. for callbacks it made
perfect sense. "pass this function's ptr in here and it's stored for later and
then the machine will call whatever this pointer points to when needed". it's
functional programming without all the fuss. well i never thought of it that
way, but i see that now.

i don't get it why this concept is so hard for people ... but you have a point.
but here is the catch. we can't change efl, and imho having a mix of designs is
worse than having a single harder one. but none of this really FORCES a loop on
everyone - but we do need it for the mainloop no matter what. we are callback
heavy. that's who we are and changing design will hurt more than help as we can't
do it all at once and it'll break even more than eo+interfaces do. it means
starting again basically.

but i'm taking this as a mental note - we need to talk about what callbacks are
much more. that a function can be an "object" or piece of data you store and
access. how loops work and how they are really a call dispatcher etc. - maybe
need to document this more and more.
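
to make the "a loop is really a call dispatcher" point concrete, here is a
minimal sketch in plain c. every name in it (Handler, Dispatcher,
dispatcher_add, dispatcher_run_once, count_cb) is made up for illustration and
is not efl api. it only shows that a callback is a piece of data (function
pointer + context) that gets stored now and called later:

```c
#include <assert.h>
#include <stddef.h>

/* A callback is just data: a function pointer plus the context handed back. */
typedef void (*Callback)(void *data);

typedef struct {
   Callback func;
   void    *data;
} Handler;

#define MAX_HANDLERS 8

/* Hypothetical dispatcher: a fixed table of stored callbacks. */
typedef struct {
   Handler handlers[MAX_HANDLERS];
   size_t  count;
} Dispatcher;

/* Store the pointer for later. Returns 0 on success, -1 if the table is full. */
int
dispatcher_add(Dispatcher *d, Callback func, void *data)
{
   assert(d && func);
   if (d->count >= MAX_HANDLERS) return -1;
   d->handlers[d->count].func = func;
   d->handlers[d->count].data = data;
   d->count++;
   return 0;
}

/* One "loop iteration": call everything registered, in registration order.
 * Returns how many callbacks were called. */
size_t
dispatcher_run_once(const Dispatcher *d)
{
   size_t i;

   assert(d);
   for (i = 0; i < d->count; i++)
     d->handlers[i].func(d->handlers[i].data);
   return d->count;
}

/* Example callback: bumps the int counter passed in as context. */
void
count_cb(void *data)
{
   int *counter = data;

   (*counter)++;
}
```

adding a new thing to call is one dispatcher_add() and the loop body never
changes, which is the whole advantage over a hand-maintained switch table.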

> After lots of thinking I believe most people struggle to partition
> their problem into multiple functions and segment their work. One of
> the reasons is that most start programming in procedural, most
> algorithms are procedural, etc.
> 
> There is no "context", no partitioning and no preemption... just do
> what you have to do, like a busy wait reading from db, search, sort,
> paint... When you add threads, you solve some of these issues, but get
> the damn preemption to add to your nightmares.

well there is a context - the stack. but they don't think of it that way. :)

> Go solved this by incorporating channels and select into their
> language. It's super simple to send information and it's
> batch-friendly from the POV of users programming it.
> 
> So my suggestion is to focus on communication and maybe offer a way to
> convert back from a main loop to a "select" for non-main threads. In a
> worker thread you could do something like in go:
> 
>      select {
>         case data :=  <- data_source1:
>             do_something_1();
>         case data :=  <- data_source2:
>             do_something_2();
>     }
> 
> In C we could do:
> 
>      ew = efl_event_waiter_new();
>      efl_event_waiter_add(ew, data_source1, DATA_SOURCE_EVENT_1, some_id1);
>      efl_event_waiter_add(ew, data_source2, DATA_SOURCE_EVENT_2, some_id2);
>      switch (efl_event_waiter_wait(ew, &data)) {
>          case some_id1:
>             do_something_1();
>             break;
>          case some_id2:
>             do_something_2();
>             break;
>     }
>     efl_event_waiter_del(ew);

a massive design change, but here - orthogonal to the tls for eoid's, being
able to map them in thread to thread etc.

> it would essentially run an internal main loop on the worker thread,
> connect Eo events on each source (in the main thread) and when they
> happen, return the Efl_Event in "&data" (options on this below).

we've been talking of running a REAL mainloop on threads - not all as you'd
have to explicitly do this, but it'd be highly encouraged. so from some thread
to do:

th2 = efl_thread_new
  (myloop,
   efl_event_callback_add(EFL_EVENT_THREAD_RUN, _thread_main, NULL),
   efl_event_callback_add(EFL_EVENT_DEL, _thread_end, NULL),
   efl_event_callback_add(EFL_EVENT_THREAD_DATA, _thread_data, NULL));

and efl thread object LOCALLY will represent a remote thread. the
EFL_EVENT_THREAD_RUN event is SPECIAL as it would be intercepted and the cb
used as the thread main function, and the data passed to it. the object passed
in would not be the same thread eoid but a NEW one created in the other
thread's TLS eoid area which represents some kind of "other end view" of the
thread. so that thread if it wants to send data to whoever has the thread
handle (the thread that created it) then calls like efl_thread_send(myself,
"blah"); and this data message arrives at the spawning thread as a data event
or something like this. you should be able to send the other way too.

the issue with this here is that this merges thread and loop. if you explicitly
create a loop in the _thread_main then you have to bind this loop to the thread
for i/o - i am not sure how this should be done yet. so either merge or bind.

anyway this is off-topic now. :) to send eoid's you'd need a comms channel to
send a header down with "this is a new object for you to create an eoid for"
and a pointer. that's it. the other end's eo could handle it. it'd have to hook
into the thread or loop infra to do this.
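
a rough sketch of that "header + pointer" hand-over, assuming a toy id table
in place of the real eoid tables and leaving out the actual transport between
threads. Msg, Id_Table, table_adopt, table_release, object_send and
object_receive are all hypothetical names, not existing eo calls:

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 16

/* Header telling the receiver what the pointer means. */
typedef enum { MSG_DATA, MSG_NEW_OBJECT } Msg_Type;

typedef struct {
   Msg_Type type;
   void    *ptr;
} Msg;

/* Toy stand-in for a per-thread id table: slot index == local id. */
typedef struct {
   void *slots[TABLE_SIZE];
} Id_Table;

/* Give obj a local id in this table. Returns the id, or -1 if full. */
int
table_adopt(Id_Table *t, void *obj)
{
   int i;

   assert(t && obj);
   for (i = 0; i < TABLE_SIZE; i++)
     {
        if (!t->slots[i])
          {
             t->slots[i] = obj;
             return i;
          }
     }
   return -1;
}

/* Drop the id entry WITHOUT destroying the object; returns the raw pointer. */
void *
table_release(Id_Table *t, int id)
{
   void *obj;

   assert(t);
   if ((id < 0) || (id >= TABLE_SIZE)) return NULL;
   obj = t->slots[id];
   t->slots[id] = NULL;
   return obj;
}

/* Sender side: release the local id and wrap the pointer in a message. */
Msg
object_send(Id_Table *from, int id)
{
   Msg m = { MSG_NEW_OBJECT, table_release(from, id) };

   return m;
}

/* Receiver side: adopt the pointer under a new local id, or -1 on failure. */
int
object_receive(Id_Table *to, const Msg *m)
{
   assert(to && m);
   if ((m->type != MSG_NEW_OBJECT) || (!m->ptr)) return -1;
   return table_adopt(to, m->ptr);
}
```

after object_send() the old id is dead in the sending thread, so only one
table ever maps the object at a time and no lock on the object is needed.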

> However this is easy for these people to map to how they learnt to
> program. There is no "void *", no callbacks. It's an alternative to
> OOP where to solve this they force you to inherit from a class (that
> its internal data serves as "void *context") and instead of callbacks
> they call your method. (which is something good when the language
> helps you, but in C it's a PITA, maybe we can also offer the OOP as an
> option by using override on a bare object?).
> 
> Note I'm not suggesting people use this to write the whole EFL itself,
> but could be a way to map the communication for people used to batch
> programming and let them do these stuff into secondary threads.
> 
> Options to implement the above:
> 
>  - efl_event_waiter_add() finds the thread owning the object, adds an
> event callback with a proxy function. The function would pause that
> thread and wakeup the secondary thread (efl_event_waiter_wait()),
> until the next efl_event_waiter_wait() or efl_event_waiter_del().
> 
>  - explicitly create a "channel" for an object on its owning thread.
> This channel adds an event callback, when it's activated it will
> report to all listeners. Then on the secondary thread you would
> receive pre-created channel, no need to find owner thread. Everything
> else would be the same, pause object thread and wakeup the worker.
> 
>  - one of the above, but instead of pause, serialize the event-info
> and send to thread. With Eolian we can generate these
> serializers/duplicate, maybe add as callback to the event description
> structure.
> 
>  - in addition let a channel to be created explicitly to send random
> information without emitting an event callback. Like create a channel
> of integers, when you want to send an int just channel_send(ch,
> &myint).
> 
>  - timeouts would come as channels as well
> 
> Remember this is to help those not used to callbacks or segment their
> code, they can write their stuff in a busy-loop or even do as a batch
> programming (instead of putting it inside a while(), wait on different
> channels at different time, then return).
> 
> Not solved in this description is how to actuate on the objects of the
> main thread. Suppose a GUI, the main thread have some buttons (events
> would be handled as channels per above), but then you want to change a
> label.
> 
> Given raster's TLS solution, objects would be invisible/unreachable to
> the thread. Safe, but not that usable. People would have to send back
> to the other thread what to do, this is cumbersome.

yes. that's why we have begin/end :) it's "give me a window into the other
thread please so i can mess about and do things with it safely then close the
window" :)

> If we go with a "@synchronized" approach, we must serialize the event
> callbacks if we dispatch callbacks as locked -- otherwise deadlocks --
> or we must unlock before calling back, which is cumbersome and brings
> problems.
> 
> Another option is to extend the channel to be an object proxy. When
> you call a method on the proxy, it would cooperate with the channel
> that did the "pause" of the object thread, run the method there and
> return values. If the main thread is not being paused (info was
> serialized), then we would ecore_main_loop_begin() - call - end().
> 
> Anyway, I do not want to hijack the thread purpose. My point is that
> users of threads and some patterns we do not use have a reason, I
> truly believe the reason is the one stated above. If you agree, then
> we should help them to get their work done the way they like... and
> this would impact how we're optimizing Eo. But picture this in your
> mind, most people want:
> 
>      thread-1: // main loop + gui, invisible to the user
> 
>      thread-2: // check what to do from UI
>          while (running) {
>             wait one of {
>               timeout 1s: update_clock();
> 
>               start button pressed:
>                   do_some_sequential_action_in_thread3();
>                   ui_progress_pulse_start();
> 
>               stop pressed:
>                   cancel_some_sequential_action_in_thread3();
>                   ui_progress_pulse_stop();
>           }
>          }
> 
>      thread-3: // something that would be tedious with cbs, simple in batch
>          header = read_from_net(2) // blocking
>          if (header.bla)
>               header_bla = read_from_net(8)
>          if (header.ble)
>               header_ble = read_from_net(12)
> 
>          ui_label_set(header.blo)
> 
> currently we're the opposite of that, since we're modeled after the
> single thread main loop pattern. We request people to connect 3
> callbacks (one for timer, 2 for buttons) + recreate the
> algorithm/parser to get data from network to get unknown bytes of data
> from the net. Most people don't know how to do that, and it's a real
> PITA to do it right :-)

yup. thus begin/end. you can sit and do blocking i/o if you want. begin/end
work from any thread anywhere right now. i dont see we should change that. that
allows simple code like

   read(network, buffer, 1024);
   begin();
   set_text(obj1, buffer);
   set_text(obj2, buffer + 512);
   end();

that is almost as simple as we can make it. as long as you realize begin is
expensive and may require waiting many ms.

> >> > 4. We make it a clear rule that threads cannot access objects outside of
> >> > those that they created UNLESS:
> >> > 4.1 An object is explicitly SENT from one thread to another (we can do
> >> > this later but if this is done, the object must have a refcount of 1
> >> > only, no parent, no children, no objects referenced in keys, weak refs
> >> > to/from this object etc.). We can release the EOID entry in thread 1,
> >> > but not call destructor and free object memory, send the POINTER to
> >> > thread 2, and here a new EOID local to that thread is allocated and that
> >> > pointer adopted.
> >>
> >> Not sure this is good. AFAIR we have some cases where we start a
> >> working thread, do something with that object in that thread, then
> >> send it back to the main thread to be used. This used to be cheap, now
> >> it won't.
> >
> > ummm we actually didn't allow objects at all to work this way WITHOUT a
> > begin/end. you could send DATA and have that worker calculate data, do i/o
> > etc. then send result back to mainloop to implement on objects (or do it
> > directly with a begin/end section). without the ability to send you can
> > never transfer an object from thread to thread.
> 
> We didn't allow or block. Thus if you used an object exclusively on a
> single thread, there were no issues. Or data structures, like using a
> binbuf to store some data. You  could use an object as well, given it
> doesn't depend on some shared resource.

well our documentation explicitly says efl is not threadsafe except for eina,
and a few ecore calls related to threading. it's near the beginning of our docs
for elm and on e.org too. :) you actually couldn't really use an object only in
one thread - we had nothing that did that. all ui was tied to canvas and
mainloop etc. ecore_con, timer, ... everything was. :) except ecore_thread and
a few calls (not creation but checking/feedback send), the sync_call,
async_call and begin/end, and eina of course "where sensible" eg messing with
the same eina_list without a lock around it from 2 threads - bad. but
hidden/invisible stuff like mempool allocs and stringshare data etc. should be
threadsafe.

> > my thoughts on this were for messaging or for setting up message "pipes"
> > with objects. one at each end. th1 creates 2 objects (2 ends of a comm pipe
> > - 2 way like a socket), then sends one end to th2. th2 gets a "i have a new
> > object" callback in its loop, discovers that it's the other end of that
> > pipe and now that's an object locally in its eoid table. the objects are
> > bound internally like a socketpair/pipe are in the kernel. we'd need to be
> > able to send objects to do this. we'd have to have the internals work if
> > either end was deleted while the other lives etc. there would be
> > limitations. but it'd allow you to set up multiple comms pipes from any
> > thread to any other and each thread can just release its end when its done.
> 
> but is this for any object? Like I get an Elm_Window and make it work
> like that? Or is it a specific object for communication?

no no no. hell no! not any object. very limited. eg objects with zero parents,
no children, refcount of 1 (not reference by another obj somewhere), etc. etc.
- so you can only send some simple special case objects designed for sending. a
sendable() method returning true/false would do. :)
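
as a sketch only, such a check could look like the following. the Obj struct
and its fields (including the "transferable" flag marking a class as designed
for sending) are assumptions made up for this example and do not mirror the
real eo object layout:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down object header. */
typedef struct Obj Obj;
struct Obj {
   int    refcount;        /* strong references held on this object */
   Obj   *parent;          /* NULL when the object has no parent */
   size_t child_count;     /* number of children */
   size_t weak_ref_count;  /* weak references to or from this object */
   bool   transferable;    /* class was designed to be sent between threads */
};

/* True only for simple, self-contained objects designed for sending. */
bool
obj_sendable(const Obj *o)
{
   assert(o);
   return o->transferable
       && (o->refcount == 1)
       && (o->parent == NULL)
       && (o->child_count == 0)
       && (o->weak_ref_count == 0);
}
```

anything tied into an object tree (a window, a widget, something referenced
elsewhere) fails the check and stays in the thread that created it.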

> > i was also thinking of object sending for inter-thread ipc. send a message
> > from th1 to th2 and there can be an optional object as a payload. it has to
> > be simple (like no children etc.) BUT this would then be possible. yes. you
> > have to release from the eoid table on one end and alloc in the eoid table
> > on the other. not cheap/free but better than spinlocks. :)
> 
> again, send a message is an explicit "send this info there"? Or is it
> "call a method that results in a message being sent"?

it's explicit.

> >> can't we just flag it somehow and for those we spinlock? Objects are
> >> thread-private unless they are efl_add_multithread()? then you start
> >> with the spinlocks for that eoid.
> >
> > this gets more complex. you do not know if the obj needs locking or not and
> > every entry/exit point needs to check if it needs it then do a lock/unlock.
> > we'd add LOTs of code we don't even have right now to every eo base class
> > method and every class you inherit too. you have an issue with locks on
> > objects too with cb's - you have to unlock in a cb then re-lock on return
> > from calling the cb. it adds a lot of code.
> 
> I was thinking of doing it with less fine-grained locks. If non-shared
> (thread private), then execute the function "X", like it does now. If
> shared, then it would essentially lock-call X-unlock. Even when
> dispatching the CB, since you're calling in the same thread, there is
> no need to release the lock, if it's marked as recursive, you can take
> it as many times as needed from that thread.  Problems would result if
> a secondary thread is triggered and would result in a deadlock. But
> since this would be repeatable, users would have a consistent behavior
> and it would always break, so they would have to do something else,
> like defer the action.

that's an invisible begin/end and they are costly. expect to be able to manage
like 1k begin/ends per second to another thread. they could have many ms of
delay on a begin.

> > if we allocate a bit in eoid to know if its a sharable object (thus needs
> > locks), we still have an issue that if we have different local eoid tables
> > there will be an id CLASH where the id can exist in both tables (ignore the
> > "shareable and needs locking" bit). you would have to synchronise all eo
> > tables and their content (but leave foreign content as NULL in the leaf
> > nodes to avoid being able to access it). this will mean still need locks on
> > the EOID table AND making the tables sync now will raise costs as you have
> > to "stop all threads and sync" on every alloc/release of a table id.
> 
> not sure i get you or you get me.
> 
> What I mean is:
> 
>  - if shared-bit is set: use ANOTHER EOID table, one that is global
> and protected by locks;
> 
>  - if shared-bit is not set: use a TLS EOID table that is exclusive to
> a given thread.

that was the domain bits. but i used 2. we could assign domain 0 == mainloop, 1
== shared, 2 is the default for any thread and 3 is a spare you can switch to
if you want to act as a go-between for other threads and/or mainloop etc.

> IOW you do what we're doing right now IF ONLY IF bit is set. Otherwise
> we use the TLS version that needs no lock.

similar but you just have a tls struct:

typedef struct {
  unsigned char local_domain;
  unsigned char current_domain;
  Eoid_Data *table[4];
} Eo_Tls_Data;

__thread Eo_Tls_Data eo_tls_data = {0, 0, {NULL, NULL, NULL, NULL}};

so you just strip off the domain and use

eo_tls_data.table[domain]->xxxx;

the domain comes from the 2 domain bits in the eoid. when you CREATE a new
object you use eo_tls_data.current_domain as the domain to use UNLESS the
parent object is non-NULL and then you use the domain from the parent obj. i
was thinking that maybe an api to set current domain for these situations, or a
push/pop with a "reset" and maybe begin/end will auto switch current domain to
the "foreign imported" domain and switch back to local on end. so a begin other
than a sync with the target thread loop is:


// do regular begin here
eo_tls_data.current_domain = foreign->local_domain;
eo_tls_data.table[foreign->local_domain] =
  foreign->table[foreign->local_domain];

presto! we can now access objects from the foreign domain and created objects
go into foreign domain (unless explicitly switched) then on end before
releasing foreign thread we do:

eo_tls_data.table[foreign->local_domain] = NULL;
eo_tls_data.current_domain = eo_tls_data.local_domain;
// do regular end here
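
putting the pieces together as a self-contained sketch: strip the domain bits
off the id, pick the table from the tls array, and bind/unbind a foreign table
around begin/end. the id layout (top 2 bits of a 32-bit id), the flat Id_Table
and the names id_make, id_lookup, domain_begin and domain_end are assumptions
for illustration only. the real eoid encoding and tables are more involved,
and the actual "stop the other thread" sync is left out here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DOMAIN_COUNT 4u                 /* 2 domain bits -> 4 slots */
#define DOMAIN_SHIFT 30                 /* assumption: top 2 bits of the id */
#define INDEX_MASK   ((1u << DOMAIN_SHIFT) - 1u)
#define TABLE_SIZE   64u

typedef struct {
   void *entries[TABLE_SIZE];
} Id_Table;

typedef struct {
   unsigned char local_domain;
   unsigned char current_domain;        /* domain new objects would go into */
   Id_Table     *table[DOMAIN_COUNT];
} Tls_Data;

/* One copy per thread (GCC/Clang __thread extension). */
__thread Tls_Data tls_data;

uint32_t
id_make(unsigned domain, uint32_t index)
{
   assert(domain < DOMAIN_COUNT);
   return ((uint32_t)domain << DOMAIN_SHIFT) | (index & INDEX_MASK);
}

/* Resolve an id; NULL if its domain is not mapped into this thread. */
void *
id_lookup(uint32_t id)
{
   Id_Table *t = tls_data.table[id >> DOMAIN_SHIFT];
   uint32_t index = id & INDEX_MASK;

   if ((!t) || (index >= TABLE_SIZE)) return NULL;
   return t->entries[index];
}

/* Set up this thread's own domain and table. */
void
thread_setup(unsigned local_domain, Id_Table *local)
{
   assert(local_domain < DOMAIN_COUNT);
   tls_data.local_domain = (unsigned char)local_domain;
   tls_data.current_domain = (unsigned char)local_domain;
   tls_data.table[local_domain] = local;
}

/* After the (omitted) sync with the foreign thread: map its table in.
 * Fails with -1 if the domain clashes with ours or is already bound. */
int
domain_begin(unsigned foreign_domain, Id_Table *foreign)
{
   if (foreign_domain >= DOMAIN_COUNT) return -1;
   if (foreign_domain == tls_data.local_domain) return -1;
   if (tls_data.table[foreign_domain]) return -1;
   tls_data.table[foreign_domain] = foreign;
   tls_data.current_domain = (unsigned char)foreign_domain;
   return 0;
}

/* Before releasing the foreign thread: unmap and switch back to local. */
void
domain_end(unsigned foreign_domain)
{
   assert(foreign_domain < DOMAIN_COUNT);
   assert(foreign_domain != tls_data.local_domain);
   tls_data.table[foreign_domain] = NULL;
   tls_data.current_domain = tls_data.local_domain;
}
```

outside a begin/end window an id from another domain simply resolves to NULL,
so foreign objects are unreachable without any lock being taken on lookup.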

> > a local EOID space that just ignores all others until some is mapped in
> > with a "stop that thread" assumption or the explicit "adopt a new obj ptr
> > into your eoid table" would be far cheaper as it puts these costs only in
> > those places for those objects/cases and not the other 99% :)
> >
> > object sending is far less code, and i think it'll be rare to send objects
> > compared to all other eo transactions.
> 
> I think it will be super-rare, to the point we shouldn't even bother
> with the resulting complexities (children, non-EFL resources like
> CURL... etc). :-)

i don't think we have to deal with them - just limit sending to sendable
objects. :)

> >> In that sense we could even use that same bit to the eo operations in
> >> the object itself, the obj would have a mutex on its own and all
> >> calls/events would be guarded by that... kinda of a "@synchronized" in
> >> other languages.
> >
> > you would need synced EOID tables. you COULD send an object WITHOUT
> > releasing at the other end. it now has 2 EOID's that refer to it. this is
> > something i think we could do later and THEN when a SHARED object is
> > deleted you need to know all tables its shared between, know all the EOID's
> > that map to it, then message those other threads to release their eoid refs
> > and when the last one is released the obj is actually deleted. likely we
> > would need to still have a master owning thread with the others having a
> > shared eoid "view". we can have a bit in the eoid table (no need to use an
> > EOID bit in the ID) to know that your entry is a shared one and someone
> > else has the master copy (in the master table another bit to know that this
> > obj is now shared out and other threads have a ref and you need to wait for
> > them all to message you, release their thread refs, then when those are at
> > 0, you can call some callback to tell the master thread that owned/created
> > it that everyone is done and it can then release its ref). this ALSO means
> > all eo methods have to use the mutex above you describe. this is a LOT MORE
> > work as i said to have a totally shared object and i think we can do this
> > later without breaking api/abi and just internally, but it's too much work
> > for the moment.
> 
> maybe I'm being too naïve, but in my understanding it's about getting
> all Eo.h and making it check the bit, if set, lock using object's
> mutex, call the actual function, then unlock. That's what
> @synchronized do in other languages :-)

you need recursive locks. then it's doable. as i said i'm warming to the idea
of shareable objects that are threadsafe etc. - you do not want ui etc. to be
like this but explicitly allocated objects that have extra lock overhead then
yes - sure.
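
for reference, a minimal sketch of that lock-call-unlock pattern with a
recursive lock, written against plain posix threads and not efl's own lock
wrappers. Shared_Obj, shared_obj_init, shared_obj_value_set/get and record_cb
are hypothetical names for illustration. the point is that a callback fired
while the lock is held can call back into the same object from the same
thread without deadlocking:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct Shared_Obj Shared_Obj;
typedef void (*Obj_Cb)(Shared_Obj *obj);

struct Shared_Obj {
   pthread_mutex_t lock;       /* recursive: same thread may re-lock */
   int             value;
   Obj_Cb          on_change;  /* called with the lock still held */
};

/* Returns 0 on success, or a pthread error code. */
int
shared_obj_init(Shared_Obj *o)
{
   pthread_mutexattr_t attr;
   int err;

   assert(o);
   err = pthread_mutexattr_init(&attr);
   if (err) return err;
   err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
   if (!err) err = pthread_mutex_init(&o->lock, &attr);
   pthread_mutexattr_destroy(&attr);
   o->value = 0;
   o->on_change = NULL;
   return err;
}

int
shared_obj_value_get(Shared_Obj *o)
{
   int v;

   pthread_mutex_lock(&o->lock);
   v = o->value;
   pthread_mutex_unlock(&o->lock);
   return v;
}

void
shared_obj_value_set(Shared_Obj *o, int v)
{
   pthread_mutex_lock(&o->lock);
   o->value = v;
   /* No unlock/re-lock dance around the callback: the lock is recursive. */
   if (o->on_change) o->on_change(o);
   pthread_mutex_unlock(&o->lock);
}

/* Example callback that re-enters the object while the lock is held. */
int last_seen;

void
record_cb(Shared_Obj *o)
{
   last_seen = shared_obj_value_get(o);
}
```

with a non-recursive mutex the shared_obj_value_get() inside record_cb would
block forever on the lock its own thread already holds.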

> >> > 7. It's an EO (and EFL)-wide rule that you should not make threadsafe
> >> > objects because EO just won't support it - you have to explicitly send
> >> > objects around or do a begin/end of another thread to look at its
> >> > objects (and then that is limited to a thread of a different domain - we
> >> > have 4 so not bad).
> >>
> >> it's a reasonable rule, we can remove the "@synchronized" thing from
> >> above if it's not easy to implement. But if we could easily add that,
> >> then we can drop the rule and extend. Initially I'd go with your
> >> proposed rule.
> >
> > the main reason it's not easy is the need to lock at every eo func/method
> > invoke, unlock on every exit point in the method, and to unlock at every
> > callback call within this locked state and lock again on return from the cb.
> > that has to go everywhere. :(
> 
> make the function an "_internal" or "_locked". Then you do not need to
> chase all returns... just call it guarded by locks, very simple :-)

well if you do it inside the obj. if its in eo itself then you dont have to,
and recursive mutex's solve callbacks BUT they don't work in efl atm.

> I'm not sold that we'd need to unlock before calling back the user,

if you arent recursive, you have to.

> just use recursive mutex that allows the same thread to lock multiple
> times, so when the user calls a method from the callback, it wouldn't
> deadlock.
> 
> 
> > at the moment #7 is actually our rule for ALL "objects" in efl except for a
> > few, and the only owning thread is mainloop. so its less limited. we have
> > the issue of main loop begin/end where i have a proposed solution here with
> > domains, and i just realised ecore_thread (and some class functions i
> > think...) which has functions to send feedback, check if you are cancelled
> > etc. - we could use the EOID sending if we had eo equivalents, but we
> > don't. we could keep ecore_thread legacy only like we do now and design
> > something new. that'd be the way to go i think.
> 
> ok
> 
> 
> >> > When you are in a begin/end section and you see 2 EOID tables, when you
> >> > CREATE a new object... which one does it go into? Remember that when you
> >> > CALL a method on an obj it may go create objects internally too. How can
> >> > you determine which to use? You should be able to access both without
> >> > creating without issue with domains as above. You could delete fine since
> >> > an object knows which table it belongs to in the current thread context
> >> > based on domain number. They will be different. You can't bind a foreign
> >> > domain in if it matches yours - it'll fail. But creation is special.
> >> >
> >> > One option... if you create WITH a parent passed, the child must go into
> >> > the same domain automatically. Operations mixing domains in an object
> >> > tree should fail. What about other cases? Create a bare object with no
> >> > parent... you add as a child later. How to choose which domain it goes
> >> > into? Local or foreign? Maybe there is a context you can switch that is
> >> > in your TLS that tells you which to use (local or foreign table). If we
> >> > have a push/pop setup it'd be nice, but it's easy to get wrong. An
> >> > explicit call to create with foreign and eo_add is local? So
> >> > eo_foreign_add() uses the foreign domain (if adopted at the time, and if
> >> > not it will either fail or just use local domain then). Worth thinking
> >> > about.
> >>
> >> Raster, you lost me here... I guess you have too much in your mind and
> >> assumed it was clear, at least it's not clear to me what you meant...
> >> and I read these 2 paragraphs couple of times :-)
> >
> > oh... yeah. sorry. :) ummm we have the begin/end thing right?
> >
> > you have mainloop + thread.
> >
> > thread can call "ecore_main_loop_begin()" and this will sync with mainloop
> > and STOP the mainloop at a safe point, then the func will return and any
> > code you now run is "assumed to be in the mainloop context". you can mess
> > with ui and create timers and everything, until "ecore_main_loop_end()"
> > which releases this lock and lets the main loop continue on.
> 
> ahhh.... THAT begin/end. ecore_main_loop_*...

yeah... THAT one. that is what is requiring access to another domain. we could
do it that you lose access to your local domain, but that'd be far too painful
i think. :(

> > the idea is every efl loop will have these begin/end methods that let u sync
> > and lock out the loop and PRETEND to be that loop for a hopefully small
> > section of code that for example updates the ui with data you have locally,
> > then releases the loop again to keep running.
> >
> > *IF* we use TLS then during this period where you pretend to be another
> > loop ... you STILL CANNOT see the other loops objects because they live in
> > that thread's TLS data. right? so how to solve this? we can just move over
> > the tls pointer from mainloop to thread temporarily, then do your stuff,
> > then release. fine. during this block you CANNOT access your "local"
> > objects at all because your whole EOID namespace switched. they will be
> > mainloop EOID's not yours. can we solve this?
> >
> > yes we can! EO has 2 EOID pointers in TLS. 1 is "local". the other is
> > "foreign". 99% of the time foreign is NULL. when you do a begin, foreign
> > then gets the ptr for the "local" EOID table of the thread you are doing
> > begin on.. so stop, block other thread and continue. it is now SAFE to
> > continue as there is no contention as only 1 thread is working on this data
> > at all.
> >
> > but how do you know an EOID is your local one or a foreign one? solution,
> > allocate 2 bits in the EOID that is like a "thread id" but i am calling it a
> > domain ID. your domain for your thread MUST be different to the one you are
> > doing begin on otherwise this cannot work. so i would make the mainloop
> > ALWAYS have domain 0, and other threads can choose (with the default being
> > 1, and expecting the threads then to only do begin/end on mainloop and no
> > other threads, but we have 2 more values (2, 3) that can be used for other
> > threads then a thread in domain 1 can do begin/end on one in domain 2 or 3
> > etc.).
> >
> > this domain value (2 bits, value 0 to 3), will let us know to look in the
> > local table OR the foreign table for the object. we know the local domain
> > id and if domain in EOID == local id, look in local table OR OTHERWISE look
> > in the foreign table (we can just make it an array of 4 items - one per
> > domain slot and look in that slot. when you begin() on another thread it
> > puts that threads LOCAL table into your slot for that domain locally so now
> > everything can be accessed).
> >
> > NOW we can access BOTH our local objects and the objects of the "foreign"
> > thread we have done begin and end on. in fact if we use the above array any
> > single thread can begin() on up to 3 other threads at any time and access
> > everything. the issue is with creation of new things. where do they go?
> 
> My idea is simpler than that, see above: 1 bit: "shared"
> 
>  - shared=1. If so, use a global table (non-TLS), guarded with locks;
>  - shared=0, use a TLS, no locks.
> 
> If to use a second bit, I'd do the per-object mutex for all operations
> as I described above. But it's an addition, nothing to worry now.

but that means everything becomes shared and we have to put mutexes in eo for
every object - as basically everything we have belongs to the mainloop - the 3%
of cost still is there as its all objects in the ui.

> >> To summarize my understanding with your restrictions: you create the
> >> object in a thread OR send it to a thread. Then when you create, the
> >> domain and all are all set to the current thread. Parent needs to be
> >
> > well i'm asking.. what domain should created stuff belong to when you have
> > multiple domains mapped into your thread.
> 
> as above. If you create it with the bit set, then you use the global
> table (non-TLS), guarded with locks. If you do not, then it's in the
> TLS of the current thread.

yeah. you are thinking shared by default (well all current objects). i'm not.
i'm thinking private/local tls only by default with explicit mapping of other
domains into your view on a begin, release on end. sending of objects to handle
some cases where an obj needs to change hands from thread to thread over its
life (do not send it back and forth all the time as that will be costly), and
now i'm thinking shared objects make sense as an alternative to sending.
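
a rough sketch of that model in plain c, just to make it concrete. everything named here (Id_Table, thread_adopt(), view_begin(), view_end(), id_add(), id_lookup()) and the bit layout (domain in the top 2 bits of a 32bit id) is made up for illustration and is not eo api. it assumes, as with begin/end, that the owner of a mapped table is paused while it is mapped, which is why there are no locks anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DOMAIN_COUNT 4u   /* one slot per 2-bit domain value */
#define DOMAIN_SHIFT 30   /* assumption: domain lives in the top 2 bits */
#define INDEX_MASK   ((1u << DOMAIN_SHIFT) - 1u)
#define TABLE_SIZE   64u

typedef uint32_t Eo_Id;

typedef struct {
   unsigned domain;            /* domain this table belongs to */
   unsigned next;              /* next free index; 0 is reserved so id 0 stays invalid */
   void    *objs[TABLE_SIZE];  /* index -> object pointer, NULL if unused */
} Id_Table;

/* per-thread view: which table (if any) is visible in each domain slot */
static _Thread_local Id_Table *view[DOMAIN_COUNT];
/* the table new objects go into: private/local by default */
static _Thread_local Id_Table *local;

static void table_init(Id_Table *t, unsigned domain)
{
   assert(domain < DOMAIN_COUNT);
   t->domain = domain;
   t->next = 1;
   for (unsigned i = 0; i < TABLE_SIZE; i++) t->objs[i] = NULL;
}

static Eo_Id table_add(Id_Table *t, void *obj)
{
   if (t->next >= TABLE_SIZE) return 0;
   unsigned idx = t->next++;
   t->objs[idx] = obj;
   return ((Eo_Id)t->domain << DOMAIN_SHIFT) | idx;
}

/* make 'own' this thread's private table */
static void thread_adopt(Id_Table *own)
{
   local = own;
   view[own->domain] = own;
}

/* map another thread's table into our view; fails if that slot is taken */
static int view_begin(Id_Table *foreign)
{
   if (view[foreign->domain]) return -1;
   view[foreign->domain] = foreign;
   return 0;
}

/* release the mapping again; never unmaps our own table */
static void view_end(Id_Table *foreign)
{
   if (foreign != local && view[foreign->domain] == foreign)
     view[foreign->domain] = NULL;
}

/* creation always lands in the local table */
static Eo_Id id_add(void *obj)
{
   return local ? table_add(local, obj) : 0;
}

/* the domain bits pick the slot; an unmapped slot means the id is not visible */
static void *id_lookup(Eo_Id id)
{
   Id_Table *t = view[id >> DOMAIN_SHIFT];
   unsigned idx = id & INDEX_MASK;
   if (!t || idx == 0 || idx >= TABLE_SIZE) return NULL;
   return t->objs[idx];
}
```

so an id from another domain resolves to NULL until you view_begin() on its table, and stops resolving again after view_end(). id_add() never touches a mapped table, which is one possible answer to "where do new things go".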

> If you want to create in another Thread/TLS, then you send a message
> to that thread and let it create it there. Thus why I'm saying to
> focus on communication, to make this simpler.

that's exactly what i was thinking of sending :)

> Doing the creation within ecore_main_loop_{begin,end} doesn't change
> things at all.
> 
> 
> 
> >> Maybe we should focus more on easily communicating between two main
> >> loops/threads? That way you do not need to pass objects and hit the
> >> above complexity. All you do is send information, and on the
> >> target thread you do the actions, like create the object.
> >
> > as above - messaging between loops is on the cards. being able to send
> > objects that REPRESENT some complex piece of data would be nice. imagine a
> > simple "database" object where you can query by key, row, column, path etc.
> > - like an sql object for example (urgh ok not sql but you get the point)
> > and this object is a database object which is really just an obj
> > representing a big backend store of data you can read/write. you want
> > another thread to access data? create db object, send it over. that's the
> > kind of thing i think sending should be used for.
> 
> But here the underlying library may not accept being used from
> multiple threads. As it's the case for CURL. Usually this is the case
> :-/

yup. thus can't be sent. :)

> >> reading this and thinking about clear multi-thread cases makes me
> >> think that we need easier communication more than sharing.
> >
> > we HAVE to handle the begin/end case. we can't support our legacy api
> > otherwise. it's a SPECIAL case in that the other thread will be paused, so
> > it's sharing with no locks, but we HAVE to do it to retain begin/end. i
> > first just thought of a single bit "mainloop vs everyone else", but i
> > realized that that is too limiting. 2 bits would make it far better. same
> > idea though. :)
> 
> as per above, what's the limitation? it would mean legacy API would
> create with the shared bit set, would do what we're doing now.

that is our disagreement. i'm thinking NOT shared by default. :)

> Internal widgets and stuff like that would still benefit from
> thread-private/no-locks if we wish so (unless we return the widget,
> then we'd have to use the same bit as the parent).
> 
> Worst case is what we have now :-) Best case is lock-free for those
> that care about performance!
> 
> However see my earlier comment about why people use threads and why we
> should improve that before/while reworking Eo internals.

well the reasons i've seen are different. it's like you get paid per thread you
use. :)

> [...]
> 
> >> children, then we have non EFL resources like CURL. So at least a
> >
> > well children - disallowed. curl - does it matter? well ok it matters if the
> > data the object is carrying like let's say curl data, is unable to work
> > outside a specific thread. then this is not possible. so that is a good
> > point - this is why it probably makes sense to have maybe a sendable()
> > method in eo base that returns true by default (we will make eo base
> > sendable), but any class on top that cannot be sent (eg internal data like
> > curl can't move from one thread to another), then it overrides and returns
> > false. design point - only ever override to then return false. never flip
> > it back to true again because a parent class can't be sent. this would limit
> > sending to a few specific objects. we could make it false by default and
> > only enable if you know for sure you and all your parent classes can be
> > sent, but how do you KNOW unless the parent class already returns true? :)
> 
> ok, at least something like "sendable" must be defined so we can know
> if it would work or blow.
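
a minimal sketch of how that "sendable" check could look, in plain c. the Klass struct and klass_sendable() are hypothetical, not eo. one assumption on top of the mail: instead of trusting every class to follow the "only ever override to false" rule, the check walks the whole parent chain so a single false anywhere wins, and a child that tries to flip it back to true is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Klass Klass;
struct Klass {
   const char  *name;
   const Klass *parent;       /* NULL for the base class */
   bool (*sendable)(void);    /* NULL means "no override, inherit" */
};

static bool yes(void) { return true; }
static bool no(void)  { return false; }

/* base is sendable by default */
static const Klass base_class       = { "Base",       NULL,        yes  };
/* a plain data object: no override, inherits sendable */
static const Klass db_class         = { "Db",         &base_class, NULL };
/* wraps thread-bound data (like a curl handle): opts out */
static const Klass curl_class       = { "Curl",       &base_class, no   };
/* breaks the rule by trying to flip it back to true */
static const Klass curl_child_class = { "Curl_Child", &curl_class, yes  };

/* an object can be sent only if no class in its chain says false */
static bool klass_sendable(const Klass *k)
{
   assert(k != NULL);
   for (; k; k = k->parent)
     if (k->sendable && !k->sendable()) return false;
   return true;
}
```

with this, klass_sendable(&db_class) is true, while both curl_class and curl_child_class come out false.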
> 
> 
> >> OTOH I do think that we could use just one bit (instead of multiple
> >> domains as you said) that means "use the global EOID table guarded
> >> with spinlocks".
> >
> > urgh. we could do that. BUT you'd need more than 1 bit. for begin/end you
> > need to be able to see 2 eoid tables at once. one of them would have to be
> > global, one then private. if mainloop is private then by definition EVERY
> > thread must use global. this is bad.
> >
> > BUT with domains maybe 0 is private, 1 is global, 2 and 3 are 2 more private
> > domains as i described. there still is an issue - global ids then mean
> > people THINK objects are threadsafe and thus they have to be made
> > threadsafe... and back to the above for that. :)
> 
> I still fail to see these problems... See my comments above and if you
> still think there are problems, describe some example with comments on
> how/where it would be a problem.

as per irc. :)

> >> Whenever it's feasible or desired to offer some "@synchronized" for
> >> Eo, like objects created with that bit would have an internal mutex
> >> and all access would be guarded by it.. need to be careful with the
> >> deadlocks if methods are calling others, the lock would be already
> >> acquired... so maybe a recursive mutex?
> >
> > yeah - i know. this is the pain. recursive mutex also works for not worrying
> > about unlock on calling a function that exits your "frame" into a child
> > frame (callback call or any other func/method). i am not sure if recursive
> > mutexes are portable. it seems to work on most *nix's - not sure on
> > openbsd, and then windows seems to have them. we COULD have eo actually do
> > the lock/unlock on method call if the obj is a lockable obj. if we have a
> > global eoid table then objects in this table will have to be lockable like
> > this. we need to decide if this is a good path or not. sending is cheap if
> > you do not go back and forth a LOT. shared with mutexes is better if you do.
> 
> I guess you can do recursive mutexes everywhere these days, Python and
> other languages use them to do their synchronized stuff.

well eina doesn't support recursive locks atm. need to investigate why.
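
for reference, this is roughly what a recursive per-object lock looks like with plain posix threads (not eina). Locked_Obj and the obj_* functions are made-up names for the sketch. the point is that obj_outer() can call obj_inner() while already holding the lock, without deadlocking on itself.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_RECURSIVE under strict -std modes */
#endif
#include <assert.h>
#include <pthread.h>

typedef struct {
   pthread_mutex_t lock;   /* recursive: the owning thread may re-lock it */
   int             value;
} Locked_Obj;

/* returns 0 on success, a pthread error code otherwise */
static int locked_obj_init(Locked_Obj *o)
{
   pthread_mutexattr_t attr;
   int err = pthread_mutexattr_init(&attr);
   if (err) return err;
   err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
   if (!err) err = pthread_mutex_init(&o->lock, &attr);
   pthread_mutexattr_destroy(&attr);
   o->value = 0;
   return err;
}

static void obj_lock(Locked_Obj *o)
{
   int err = pthread_mutex_lock(&o->lock);
   assert(err == 0);
   (void)err;
}

static void obj_unlock(Locked_Obj *o)
{
   int err = pthread_mutex_unlock(&o->lock);
   assert(err == 0);
   (void)err;
}

/* a "method" that takes the object lock itself */
static void obj_inner(Locked_Obj *o)
{
   obj_lock(o);
   o->value++;
   obj_unlock(o);
}

/* a "method" that calls other methods while holding the lock */
static int obj_outer(Locked_Obj *o)
{
   int v;
   obj_lock(o);
   obj_inner(o);   /* re-locks on the same thread: fine with a recursive mutex */
   obj_inner(o);
   v = o->value;
   obj_unlock(o);
   return v;
}
```

with a default (non-recursive) mutex the nested obj_lock() in obj_inner() would deadlock or be undefined behaviour, which is the pain described above.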

> > i actually like the idea of a specific domain that is shareable, with others
> > private. private == more performance, but we need different domains like
> > above to SEE multiple private domains and those EOID tables for begin/end.
> >
> > but the question remains - when you then do eo_add() which does it belong
> > to (private or sharable or any other domain if they exist)?
> 
> efl_add() -> private
> efl_mt_add() -> shared
> 
> efl_mt_is() -> check eoid bit
> 
> efl_event_callback_call(o, ev, info) {
>    if (efl_mt_is(o))
>       real_ptr = efl_global_eoid_table_get(o);
>    else
>       real_ptr = efl_local_eoid_table_get(o);
> 
>    if (!real_ptr) {
>       ERR("%p is not a valid object", o);
>       return;
>    }
> 
>    if (efl_synchronized_is(o)) // if we opt to use that extra bit
>       mutex_lock(real_ptr->mutex);
> 
>    _efl_event_callback_call(real_ptr, ev, info);
> 
>    if (efl_synchronized_is(o)) // if we opt to use that extra bit
>       mutex_unlock(real_ptr->mutex);
> 
>    // cleanup
> }
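
for comparison, a compilable sketch of the one-bit variant in the pseudocode above. none of these names are efl api: id_add(), id_mt_add(), id_is_shared(), id_resolve() and event_call() are stand-ins, the shared bit being the top bit of a 32bit id is an assumption, and the global table is guarded by a simple c11 spinlock. the optional per-object "synchronized" mutex is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define SHARED_BIT (1u << 31)         /* assumption: top bit marks a shared id */
#define INDEX_MASK (SHARED_BIT - 1u)
#define TABLE_SIZE 64u

typedef uint32_t Eo_Id;

/* shared=1: one global table, every access takes the lock */
static void       *global_table[TABLE_SIZE];
static unsigned    global_next = 1;   /* index 0 reserved so id 0 is invalid */
static atomic_flag global_lock = ATOMIC_FLAG_INIT;

/* shared=0: one table per thread, no locks */
static _Thread_local void    *local_table[TABLE_SIZE];
static _Thread_local unsigned local_next = 1;

static void spin_lock(atomic_flag *l)
{
   while (atomic_flag_test_and_set_explicit(l, memory_order_acquire)) { }
}

static void spin_unlock(atomic_flag *l)
{
   atomic_flag_clear_explicit(l, memory_order_release);
}

/* private object: lands in the calling thread's table */
static Eo_Id id_add(void *obj)
{
   if (local_next >= TABLE_SIZE) return 0;
   local_table[local_next] = obj;
   return local_next++;
}

/* shared object: lands in the global table, under the lock */
static Eo_Id id_mt_add(void *obj)
{
   Eo_Id id = 0;
   spin_lock(&global_lock);
   if (global_next < TABLE_SIZE)
     {
        global_table[global_next] = obj;
        id = SHARED_BIT | global_next++;
     }
   spin_unlock(&global_lock);
   return id;
}

static int id_is_shared(Eo_Id id) { return (id & SHARED_BIT) != 0; }

/* pick the table by the shared bit; NULL if the id is not valid */
static void *id_resolve(Eo_Id id)
{
   unsigned idx = id & INDEX_MASK;
   void *obj;
   if (idx == 0 || idx >= TABLE_SIZE) return NULL;
   if (!id_is_shared(id)) return local_table[idx];
   spin_lock(&global_lock);
   obj = global_table[idx];
   spin_unlock(&global_lock);
   return obj;
}

/* sample callback for the usage below: treats the object as an int counter */
static void bump(void *obj) { ++*(int *)obj; }

/* returns 1 if the callback ran, 0 if the id did not resolve */
static int event_call(Eo_Id id, void (*cb)(void *obj))
{
   void *real_ptr = id_resolve(id);
   if (!real_ptr) return 0;
   cb(real_ptr);
   return 1;
}
```

note the lock here only protects the table lookup, not the object. making the object itself safe to call from two threads is the separate "synchronized" question.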
> 
> 
> 
> -- 
> Gustavo Sverzut Barbieri
> --------------------------------------
> Mobile: +55 (16) 99354-9890
> 
> ------------------------------------------------------------------------------
> _______________________________________________
> enlightenment-devel mailing list
> enlightenment-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel


-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    ras...@rasterman.com

