On Mon, 05 Mar 2018 17:25:09 -0500 Cedric Bail <ced...@ddlm.me> said:
> On March 5, 2018 6:00 AM, Gustavo Sverzut Barbieri <barbi...@gmail.com> wrote:
>
> <snip>
>
> > Well, I already mentioned this to you in irc, but replying here just
> > to make my point:
> >
> > I think the design is upside down; trying to make life easier at one
> > point resulted in a mess on the other side.
> >
> > okay, call it a task, make it work for both processes and threads, in
> > the hope of facilitating "switch from threads to process, and vice
> > versa", but we're getting the worst part of each it seems.
> >
> > ie: most thread usage is about sending shared memory back and forth;
> > after all, the benefit of threads is just that: shared memory. What do
> > we get? argv + int, useless for real life threads.
> >
> > solution? "stdio" for threads... you can send/receive I/O, done with
> > pipes, like it would be for processes.
> >
> > I really, really don't like that design. To me threads should be
> > exposed as such, with pointer argument and result... if you wish to
> > make it easier, you could "box" it some way, like eina_value... or
> > just allow the traditional "void *pointer". But that's fundamental:
> > say for Python bindings to use threads, you want to send a PyObject
> > pointer to the thread, and that object will contain the information to
> > call back the user (the user's Python-callable). The same goes for
> > most usages, like sending file info to be opened/decoded, etc.
>
> I do second you completely, Gustavo. I have been raising my discontent
> with this design for a long time now and fully agree with your point
> here.
>
> > thread I/O, while useful in some cases, I don't see why it should
> > always exist. It should be optional, and as such it should be created
> > explicitly.
>
> Agreed.
>
> > Last but not least, I'd expose the stack just like in the OS, to avoid
> > confusion:
> >
> > - Application ("the binary") -> Processes -> Threads -> main loop
> >
> > you can present "getters" that make it easier to access (ie: main loop
> > "get" application, in addition to "get parent", which would return the
> > thread).
> >
> > But mixing stuff "in the hope to make it easier" does not make it
> > easier, it just makes things more complicated... ALSO note that the
> > developer who would use this kind of API is "not the average
> > developer"; those don't mess with such low level primitives. The
> > developer who is likely to use these primitives will say "what the
> > fuck is this? I'm used to processes and threads, these guys just
> > messed it up".
> >
> > I'm in favor of interfaces for things that are the same, so if the
> > methods in process and threads share the same concept, behavior and
> > parameters, make them an interface... when switching between process
> > and threads one doesn't need to "sed" everything. However, the
> > constructors are definitely NOT the same concept, behavior or
> > parameters, and thus not part of the interface.
>
> I would add that this thread model doesn't satisfy any of the use cases
> we ourselves have. For example:
>
> - Ector: uses a thread to offload CPU and memory heavy tasks to another
>   thread. The simple change to the BFL has slowed us down. Now, what if
>   we have to serialize/unserialize everything? I have already placed my
>   bet on the result.

ector using eo was and is a mistake. it's a performance sensitive part of
rendering and going through eo to do everything is bad/wrong. the
alternative is even slower eo with more locking on a global table, or no
locking, which is what we originally had, and then we had mysterious
random crashes instead. hooray! don't use eo for ector. simple. it's
inappropriate.
> - Evas async renderer: same case as Ector. Nobody might remember, but at
>   the beginning it was really slower, because of the memory copy that
>   was done, and that copy had to be removed. If we were to use this new
>   infrastructure, it would lead us back to a slower solution.

eh? it was barely noticeable. and i've been very clear that the i/o using
a pipe is an IMPLEMENTATION DETAIL. it could use shared buffers, but they
still require a copy into the shared buffer in userspace, then out again.
something like a thread queue, when it doesn't do copies, is about 1.6x
the speed of a pipe for high volume messages in and out. i've done the
benchmarks. arm seems to have a hiccup that shouldn't be the case and is
maybe a kernel bug. i shall bring numbers to my aid:

https://phab.enlightenment.org/P170
https://phab.enlightenment.org/P168

a pipe is 60% slower than a zero-copy thread queue (x86); on arm it's 8x
slower, and as kernel devs told me "that shouldn't be the case", so it
seems to be a bug. the io interface can't work with zero copy, because
zero copy requires allocating your buffers ON the thread queue to avoid a
copy, not passing them in like the io iface does. so you'd add a copy
into the thread queue anyway... which i would suspect would nuke even
more of the delta between them. it's barely worth talking about in terms
of performance, especially for low volume. for high volume perhaps the
implementation can change, but it's unlikely to make a big difference.
it doesn't change the DESIGN though, so i'm not sure what you are talking
about here in terms of pipes being so massively slower than a zero-copy
thread queue. they're not. the thread queue is a bit faster *IF* you can
make use of zero copy. the io interface cannot. you provide slices with
buffers to it, not the other way around.

> - Eio: we offload the io and the content generation to another thread.
>   Even unnecessary function calls do significantly slow us down while
>   listing a directory. That is why we are grouping things in arrays, to
>   avoid those function calls. I do not expect things to get better with
>   the above proposal.

i don't see how they are related. eio can still use raw threads and
pipes. it certainly cannot use ecore_thread if it wants to be called
outside of the mainloop. so let's say it does use efl.thread/task etc. to
manage these back-end workers. where are all those extra function calls
that are not already designed into the efl.io interface? (you have to
listen to a can_read changed cb, then check if you can read, then call
read - these all go in and out of eo. the read is a syscall, as it would
generally need to be, and can read any amount of data it likes in 1 read
call.) then how else do you propose to do i/o? what alternate api for i/o
would then be consistent through efl? remember that this means changing
efl.net too.
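to make that read dance concrete, here is roughly what it looks like in c
as i understand the current efl.io reader interface (a sketch only - the
names are from memory, so check the headers; handle_payload() is a
made-up consumer and the include is abbreviated):

  #include <Efl.h> /* approximate umbrella include; adjust to your tree */

  /* fires whenever the reader's can_read property flips */
  static void
  _can_read_changed(void *data EINA_UNUSED, const Efl_Event *event)
  {
     Eo *reader = event->object;
     char buf[4096];

     /* drain whatever is available right now */
     while (efl_io_reader_can_read_get(reader))
       {
          Eina_Rw_Slice slice = { .len = sizeof(buf), .mem = buf };

          if (efl_io_reader_read(reader, &slice)) break; /* non-zero = error */
          handle_payload(slice.mem, slice.len); /* slice.len = bytes read */
       }
  }

  /* hook it up once on any object implementing Efl.Io.Reader: */
  efl_event_callback_add(obj, EFL_IO_READER_EVENT_CAN_READ_CHANGED,
                         _can_read_changed, NULL);
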
> - Efl_Net: the design of the interfaces allows for moving an object to
>   another loop, but because we can't transfer a socket to another loop,
>   we can't use that. It means that we cannot do any load balancing from
>   the main loop.

actually the efl.net design never seems to have cared about moving
objects between loops. but you could set up different service listener
objects or clients in different threads and loops and have them work in
parallel, in theory. of course the current implementation in efl.net
doesn't allow for this and can only work on the main loop.

> We can use Efl.Io to redirect traffic to another thread, but it will
> always require a wake up and work in the main loop.

that is how efl.net is currently designed and implemented. you'd need to
find a way to move an efl.net client and its fd from loop to loop, and
that's kind of nasty. i wouldn't even bother with that, as i don't think
it's an issue that will actually matter any time soon.

> If it doesn't solve our own problems, I am not convinced it will be
> solving anyone's problems.

well, where did WE ever have an issue where we needed to move ecore_con
client fd handling out of the main loop due to performance issues? it
never happened. efl.thread solves many problems *I* have had with
ecore_thread. it is now 2-way. i have a guaranteed loop on the other side
to handle incoming messages. i mean, just look at the ecore animator. i
have to set up a pipe to have input to it and a select/epoll there to
listen to both the device interrupt input and requests from the main
loop. i had to hand-write an ad-hoc loop implementation. i've also
designed around ecore_thread and its limitations in rage. the above is
actually addressing issues i have actually had and worked around. now i
have threads with symmetrical i/o in and out that run a loop listening
for i/o nicely in an event handling loop. i don't have to send messages
to the main loop to set up an ecore timer to time out and then message
back to a thread... which i have done. i have also moved things between
threads and processes before in rage. so this is fully addressing issues
i have absolutely had. i do not like the io interface - it leads to far
more code than i like, but it is what we have.

forgetting IMPLEMENTATIONS, this is a lot like
ecore_thread_feedback_run() (see the first sketch below), but:

1. handling feedback is optional (just don't set an event handler for it)
2. you can continually send data to the thread, unlike with feedback
3. it doesn't try to use an existing worker and place the job on a queue,
   so it's equivalent to "true" for try_no_queue - it can't be blocked by
   other workers filling up the queue. that is actually by far my most
   used option and usage of ecore_thread, so for me that covers the
   majority of use cases
4. it gives you a full loop in the thread itself, which never was
   possible with ecore_thread, so there's no need to call
   ecore_thread_check() regularly to see if you're cancelled yet, or to
   write your own select/epoll loop, which i've had to do before
5. it's not limited to mainloop -> thread worker. it can do this at any
   level. threads can spawn threads and also talk to their children etc.

so i am not sure how you can object totally to this... it's what we have
and then some on top, and i can point to examples of how i've worked
around the old limitations. if it's "omg pipes are so slow", please show
me numbers that prove they are so slow, AND please show me how they are
not just an internal implementation detail that can be changed/replaced
without affecting the design - EXCEPT for efl.io, as i mentioned above,
not allowing zero copy with a thread queue because the io doesn't
allocate the buffers. so if pipes were/are so slow, they can be replaced,
but first prove that it makes a real impact. and if you have large
volumes of data... you can write POINTERS down the pipe to the data you
want to work with and not the data itself... that works just fine (second
sketch below). i could add a thread pool behind this to avoid creating
new threads too, but there isn't one now as i'm keeping the
implementation simple. args handling is a tiny bit of code compared to
implementing a thread pool. but again - a thread pool doesn't require the
design to change. ...
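for reference, a minimal sketch of the ecore_thread_feedback_run()
pattern the list above compares against (do_some_work(), use_result() and
_spawn() are placeholders, not real efl api):

  #include <Ecore.h>

  /* worker thread function: it has no loop of its own, so it must poll
   * ecore_thread_check() by hand to notice cancellation (point 4) */
  static void
  _heavy(void *data, Ecore_Thread *th)
  {
     while (!ecore_thread_check(th))
       {
          void *result = do_some_work(data); /* placeholder workload */
          ecore_thread_feedback(th, result); /* one-way: worker -> main */
       }
  }

  /* runs in the mainloop, once per feedback message (point 2: there is
   * no equivalent channel going the other way, mainloop -> worker) */
  static void
  _notify(void *data, Ecore_Thread *th, void *msg)
  {
     use_result(msg); /* placeholder consumer */
  }

  static void _end(void *data, Ecore_Thread *th)    { /* worker done */ }
  static void _cancel(void *data, Ecore_Thread *th) { /* cancelled */ }

  /* called from mainloop code; EINA_TRUE = try_no_queue, i.e. always
   * spin up a fresh thread rather than queueing behind other workers */
  static void
  _spawn(void *my_data)
  {
     ecore_thread_feedback_run(_heavy, _notify, _end, _cancel,
                               my_data, EINA_TRUE);
  }
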
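and the "pointers down the pipe" trick is just plain posix, nothing
efl-specific - a sketch, with My_Work/make_work()/process() as made-up
stand-ins for your own payload type and handlers:

  #include <unistd.h>

  static int fd[2]; /* pipe(fd) once at setup time */

  /* producer thread: only the POINTER (one machine word) crosses the
   * pipe; the payload stays in memory shared by both threads, so there
   * is no per-byte copy of the data itself */
  static void
  send_work(My_Work *work)
  {
     write(fd[1], &work, sizeof(work));
  }

  /* consumer thread: */
  static void
  receive_work(void)
  {
     My_Work *incoming;

     if (read(fd[0], &incoming, sizeof(incoming)) == (ssize_t)sizeof(incoming))
       process(incoming); /* consumer owns the payload now */
  }
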
in the meantime i've added IN and OUT ptr data, so these get passed - you
could have used the io streams to pass ptrs (any number of them), but ok,
it seems this wasn't realised, so i'll pass a ptr in and out too. so you
can use this if you want, or string args. i've also added call() and
call_sync() to call functions in a child thread or in the parent
thread/loop automatically. it works both ways. far, far nicer than what
ecore_thread offered, and it saves writing a command interpreter on
either side of the io, as the func + data ptr are the command and data
payload to run. and, awesomely, you can reply with a call the same way,
symmetrically (beware... don't reply to a sync call with a sync call...
boom. deadlock - nothing i think can really be done about that here, BUT
i have made sync calls able to return a void ptr, so the sync caller gets
that void ptr as a return value and you can reply this way).

to me this cleans up a lot of hackery around ecore_thread and the
singleton ecore mainloop. it makes a lot of stuff so, so, so much easier.
it makes it easier to move things from thread to process and back (and
i've done that). it makes i/o symmetrical with efl.net as well as threads
and exe's. the only things it's really inefficient at are impl details
that can be changed. e.g. use socketpairs, not pipes. multiplex control
and io down the same socketpair. perhaps use a side-channel data queue
for the i/o data and sockets only for signalling. use a thread pool. all
these are internal implementation details.

> Cedric

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
Carsten Haitzler - ras...@rasterman.com