On Fri, 12 Jan 2018 19:48:15 +0000 Mike Blumenkrantz <michael.blumenkra...@gmail.com> said:
> First, thanks for taking the time to provide numbers and details to support
> your feedback. I realize that it may seem like a burden to you, but it's
> hard for us developers to engage in a debate against vague handwaving in
> the absence of any real data. I've replied to your points inlined for
> better readability, and I hope that this will address most of the feedback
> that you've given towards the current direction of development.
>
> On Fri, Jan 12, 2018 at 9:45 AM Carsten Haitzler <ras...@rasterman.com>
> wrote:
>
> > On Fri, 12 Jan 2018 13:48:46 +0000 Stephen Houston
> > <smhousto...@gmail.com> said:
> >
> > > Nobody said anything about GL.
> >
> > you want gadgets to be separate external processes (one per gadget) and
> > i covered that.
> >
> > *IF* you want gl the cost goes up even more. if you don't use gl then
> > you have a problem of having to render on cpu and transfer that data to
> > gpu accessible memory (or make the gpu have to read slower memory when
> > doing rendering as a source buffer). it's generally far better to keep
> > it all in gpu video memory rather than do this, if it's possible to do.
> > thus it'd probably be good to have gadgets render with the gpu.
> >
> > but i covered both cases. all the numbers below do split gl vs non-gl.
> > the end is the same. still a LARGE amount of overhead for a process.
>
> Both of these cases are solved by rendering the buffers (both gl and
> software) directly in the outer compositor; see
> https://phab.enlightenment.org/T6592 on the gadgets workboard for this task
> which is already nearing completion.

that doesn't change anything. some work somewhere has to either copy the
data to video memory on every change OR the gpu will likely have to
composite and thus access data over a bus (e.g. over the pci bus). if it's
an embedded system all memory is equal, but subsurfaces are unlikely to
help because you'll not have enough hw layers to assign to a host of
gadgets.
there will generally be maybe 2-5 ... maybe on a good day 8 or 10 layers,
and these are far better used for application windows (e.g. separating
desktop, active/focused window or window being moved/dragged, etc. etc.).
it's still a gadget rendering to a buffer THEN that buffer having to be
copied again ... always. at all times.

> > > On Fri, Jan 12, 2018, 6:02 AM Carsten Haitzler <ras...@rasterman.com>
> > > wrote:
> > >
> > > > On Thu, 11 Jan 2018 17:13:22 +0000 Mike Blumenkrantz
> > > > <michael.blumenkra...@gmail.com> said:
> > > >
> > > > > Thanks for your valuable input! We'll take it into consideration.
> > > > > If you have feedback for the developer team in the future, please
> > > > > be sure to include technical data to support your comments so that
> > > > > we can evaluate it fairly.
> > > >
> > > > why is it that when i read the above it's just dripping in sarcasm.
> > > > i smell it dripping between every line. the snideness just oozes
> > > > from it.
> > > >
> > > > now i wasted a good 30 mins redoing numbers below that i already
> > > > pretty much summarized with "don't use lots of processes" as i've
> > > > spent years looking at this data, but obviously my history,
> > > > experience etc. mean little to you. this is stuff you should have
> > > > done before choosing this path. you obviously didn't given "please
> > > > provide technical data" above that you seemingly don't have.
> > > >
> > > > if you have a gadget as a window then that window must have a buffer
> > > > and that buffer consumes memory where otherwise no buffer would
> > > > exist. did that even need to be justified? just do the math. a
> > > > 40x40@32bpp gadget will consume at least ~6k of ram and if gl
> > > > rendering is involved that'd be 12 or 18k or so. maybe more. that's
> > > > a start.
>
> See above comments regarding proxy rendering which would avoid duplication
> of buffers.
i was not talking about duplication of buffers. the buffer is used
directly. what i'm saying is that there is an enforced, 100% of the time,
intermediate buffer to contain all of the gadget. this is in addition to
the source image data (as malloced memory or as textures) e.g. an image of
a battery that is then drawn to a backbuffer/shm buffer/whatever... this
backbuffer is the extra buffer. you'll be swapping around 2 or 3 of these
all the time. that's the extra required buffer.

> > > > now let me add up the dirty pages just from loaded .so files (dirty
> > > > pages are private per-process pages) for just a single efl process
> > > > like enlightenment_askpass that has the same linking as a gadget:
> > > > that's about 1800kb of dirty pages PER PROCESS that is overhead.
> > > > (1872kb to be exact). pmap -x. use it. it's a rough tool but does
> > > > the job for this topic.
>
> Quicklaunch solves this; https://phab.enlightenment.org/T6395 on the
> gadgets workboard. To address your criticism of quicklaunch in your other
> reply, it's a functional technology which can be leveraged for this
> purpose. Whether it is a "royal pain" for you personally is not technically
> relevant since you are not actively contributing to the gadget project.

quicklaunch has lots of issues with fd's as it has to keep some but not
others. quicklaunch has broken things and then, in order to fix it, it has
steadily been made less and less efficient, sharing less and less with the
parent process (i.e. less being allocated/created before the fork). it
totally doesn't work with eina debug or threads either. any threads created
prior to fork are lost and have to be re-created. this leads to more and
more fragile code handling this corner case.

> > > > now if you want accelerated rendering with gl... that'd need a gl
> > > > context and gl dependencies loaded. dirty pages loaded from .so's
> > > > goes up to 4764kb ...
> > > >
> > > > now heap memory usage when using gl goes up too. gl contexts and so
> > > > on (heap meaning via malloc etc.) and this goes up from 4.9m to 8.8m
> > > > just to have acceleration. this will multiply by the number of
> > > > processes.
>
> While I agree that you have a great point regarding GL memory usage here,
> there's no purpose in using hardware acceleration for the vast majority of
> gadgets. The render size is very small and the content seldom changes, so
> software rendering would be a clear win over using hardware.

i actually disagree here. given growing dpi's, larger gadgets like
ibar-style things have far more pixels to render... even worse, ibar isn't
doable as an external process gadget because it relies on getting client
window objects to display in the popup. these are only doable in x11 if you
implement a mini compositor/redirector of your own to get the pixmaps and
so on, and in wayland are totally inaccessible, so portability-wise this
fails here. pager is another example. but ok. let's say that you have a
choice: internal module for these things and external process for others.
the larger the gadget, the more likely you want to use hardware rendering.
this also applies to saving memory. if 5 gadgets all have an "arrow up"
icon that comes from the same theme, they will all load and decode their
own private copy of it in memory, then if e is using that too it will have
it in a texture too.

> > > > now there is also the overhead that every process loading an edj
> > > > file has to have its own malloced data structures for that edj file.
> > > > if you look at detailed massif data you'll find that edje + efreet +
> > > > fontconfig + openssl will malloc privately per process just to load
> > > > the theme, shared efreet cache files and cached fontconfig data etc.
> > > > a total of about 1639kb.
> > > >
> > > > let me do the math for you:
> > > >
> > > > 1639 + 1872 (if software) = 3511kb per process overhead just for
> > > > what i mentioned. it's more with each process's own stack and other
> > > > things also loaded. but let's call it 3.5m.
> > > >
> > > > now if we want gl acceleration that goes up by ANOTHER 3.9 + 2.8m =
> > > > 6.7m. so 10m per process overhead if you want to use acceleration to
> > > > render and not the cpu. at least as a new stand-alone process.
> > > >
> > > > for a more coarse measurement, just do the simple one: run free; run
> > > > enlightenment_askpass, then free again (with system otherwise
> > > > stable):
> > > >
> > > > without gl mem usage goes up by 10848k. with gl it goes up by
> > > > 21772k. this is nvidia drivers btw here.
> > > >
> > > > there's also all the added overhead now of having to launch all the
> > > > processes that are gadgets and process launching is not cheap. it's
> > > > slow and costly - especially all the post-process work like loading
> > > > edj files, decompressing the same data multiple times etc. etc. ...
>
> This is interesting, thanks for sharing the numbers here on memory usage of
> edje files. These files can be cached in the compositor process and shared
> using quicklaunch to avoid opening them multiple times, and this should
> mitigate any increased memory usage across multiple instances of gadgets.

actually loading edje files requires evas. evas creates threads. threads
don't survive over forks. thus loading edje files in a parent, then forking
and expecting it all to work ... is not going to work. how many more
threads do you have to fix up to be able to "restart after being terminated
with no notice"? in the child the fork effectively terminates them, and any
thread handles or variables indicating a thread is alive (for example a
thread handle that was never joined) will be incorrect.
shutting down all possible threads first before a fork, then starting them
all up again is an unsustainable amount of work.

> > > > are you telling me you didn't do any design evaluation or numbers
> > > > before deciding on a path and you need me to do this for you? i have
> > > > done these kinds of numbers again and again over the years and they
> > > > consistently show the same thing:
> > > >
> > > > 1. every process costs. especially if it's complex with lots of
> > > > libraries and data sets. it's 100's of kb just for the simplest
> > > > basic libc using process stretching into multiple mb for those using
> > > > larger toolkits. and that's just for common shared data that isn't
> > > > specific to that process.
> > > > 2. opengl adds a mountain of overhead on top per process.
> > > >
> > > > most of these are very hard to get rid of and designing something
> > > > that actively works against "the way things are" is a very poor way
> > > > of doing things. especially to everything enlightenment is about,
> > > > like being efficient.
>
> I've been charitable and allowed for the highest level of technical
> competency when evaluating your proposals from this thread to others who
> have privately come to me with questions about them. I've similarly been
> very civil with my initial reply, even if you chose not to read it as such.
> I would appreciate it if, when you come to provide your feedback, you could
> extend the same courtesies to the developers who have already spent
> considerable time and resources implementing this.

your mails do not read as being civil. they read as being hugely sarcastic
and precisely the opposite. cedric thinks quicklaunch is a viable solution.
i actually wrote it. i have come around to the position that it is not, for
many technical reasons.
complexity with sharing fd's (or not), problems in sharing memory (shm
segments don't get COW'd but a fork kills assumptions on them - e.g. that
your mapping of that segment is seen by only you and not also a copy of
you). it has major problems with threads, which we are using more and more
of, etc. etc. ... if we have to write efl with the assumption that "at any
point someone might fork AND continue to execute without immediately
exec()ing away", then, as an example, any lock held by a thread that would
have unlocked it will now remain locked forever in the child (deadlock).
we're going to have an endless chain of fork bugs here.

that means every thread, every fd, every mem segment and any libs we rely
on that also use these might be a problem. be it libpulse and any fd's it
has (like sockets connected to the pulse server), x11 connection sockets
(this is a major pain), wayland client socket connections, and then perhaps
pipes or other things libs like gstreamer might have internally, not to
mention any currently running threads that can't sensibly survive a fork...
with having to clean them all up, a quicklaunch system just is not sane.
it's beating a dead horse. and this comes from the guy who wrote it and
thought it a good idea at the time.

i'd also have appreciated at least a high level heads up "we're going this
way" discussion from you or anyone involved beforehand. when you do
something quietly in a back room, then decide to launch it as "the new way
to do x", then you need to expect someone is probably going to disagree
with it. and i do. i have my reasons. and i know quicklaunch is going to
end in a bucket of tears. i have shed those tears myself and have now
gotten over it. i've spent enough time fixing it and finally reducing it to
do as little as possible in the parent to avoid issues. it is currently
broken and it'll remain so pretty much forever.
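
to make the fork + threads + locks problem above concrete, here is a small
sketch in python (not efl code, and the function name is made up for this
mail). it assumes a posix system. it starts a worker thread that holds a
lock, forks without exec()ing, and has the child report what it sees: the
worker is gone, but the lock it held was copied in the locked state. note
that python's runtime fixes up its own thread bookkeeping after a fork; a
c library would have to do the equivalent by hand for every thread and lock
it owns.

```python
import os
import threading
import warnings


def fork_with_worker_thread():
    """Fork while a worker thread is alive and holding a lock.

    Returns (alive_in_parent, alive_in_child, child_got_lock).
    """
    lock = threading.Lock()
    holding = threading.Event()
    release = threading.Event()

    def worker():
        with lock:              # the worker owns the lock across the fork
            holding.set()
            release.wait()

    thread = threading.Thread(target=worker)
    thread.start()
    holding.wait()              # make sure the lock is held before forking
    alive_in_parent = thread.is_alive()

    read_fd, write_fd = os.pipe()
    with warnings.catch_warnings():
        # Python 3.12+ warns when forking a multi-threaded process,
        # which is exactly the hazard being demonstrated here.
        warnings.simplefilter("ignore", DeprecationWarning)
        pid = os.fork()
    if pid == 0:
        # Child: only the forking thread was copied. The worker no longer
        # exists, so nobody will ever release the lock it was holding.
        got_lock = lock.acquire(blocking=False)
        report = f"{int(thread.is_alive())} {int(got_lock)}"
        os.write(write_fd, report.encode())
        os._exit(0)

    # Parent: collect the child's report, then let the worker finish.
    os.close(write_fd)
    alive, got = os.read(read_fd, 32).decode().split()
    os.close(read_fd)
    os.waitpid(pid, 0)
    release.set()
    thread.join()
    return alive_in_parent, bool(int(alive)), bool(int(got))
```

a blocking acquire in the child would simply hang, which is the "remains
locked forever" case described above.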
> > > > > On Thu, Jan 11, 2018 at 3:41 AM Carsten Haitzler
> > > > > <ras...@rasterman.com> wrote:
> > > > >
> > > > > > On Wed, 10 Jan 2018 20:38:27 +0000 Stephen Houston
> > > > > > <smhousto...@gmail.com> said:
> > > > > >
> > > > > > i'm going to put my take on this: i think this is bad. it flies
> > > > > > in the face of everything e was built to do. to be efficient.
> > > > > > having a process per gadget is horribly inefficient. this is
> > > > > > precisely what helps bloat out gnome and kde memory footprints
> > > > > > and what has kept e lean.
> > > > > >
> > > > > > while avoiding e crashing due to a bad gadget is a good thing,
> > > > > > the following is just a bad way to do it.
> > > > > >
> > > > > > 2 other alternatives:
> > > > > >
> > > > > > 1. have a SINGLE "gadget process" to hold all gadgets and load
> > > > > > modules into this process, so we don't have 20 processes, but
> > > > > > instead have just e + gadget process.
>
> This is an interesting proposal, and there's nothing in the current
> infrastructure model which would prevent implementing a monolithic gadget
> process. Whether doing so is a good idea is outside the scope of this
> thread--originally created to inform about documentation of the existing
> sandbox gadget infrastructure.

well it's a discussion to have. i proposed better solutions because if
things go the way of "we now have 10 or 20 gadget processes" and all the
costs that come with it (as i said - i don't think quicklaunch will cut
it), then this would be a far better compromise IMHO. you get stability of
the core wm and if an issue happens, at worst a dozen or whatever gadgets
flicker off/on and reset (as opposed to a single one). the main wm will
keep marching on and the user will be largely unaffected.
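
as a rough back-of-envelope for "10 or 20 gadget processes", here is a
small python sketch that only multiplies out the per-process figures
already quoted in this thread. the function and constant names are made up
for illustration, and "m" is treated as 1000kb, so take the results as
estimates and not measurements.

```python
# Per-process figures quoted earlier in this thread, in kb.
DIRTY_SO_KB = 1872           # dirty pages from loaded .so files (software)
THEME_HEAP_KB = 1639         # edje + efreet + fontconfig + openssl heap
GL_EXTRA_KB = 3900 + 2800    # "3.9 + 2.8m" extra with gl, 1m taken as 1000kb


def overhead_kb(processes, use_gl=False):
    """Estimated fixed overhead for a number of stand-alone gadget processes."""
    per_process = DIRTY_SO_KB + THEME_HEAP_KB
    if use_gl:
        per_process += GL_EXTRA_KB
    return processes * per_process


for count in (1, 10, 20):
    print(count, overhead_kb(count), overhead_kb(count, use_gl=True))
```

by the same figures a single gadget-holder process would pay the fixed part
about once instead of once per gadget, which is the whole argument for it.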
the current design as i read it, with an environment variable, seems
directly designed for a single gadget per process.

> > > > > > 2. remote ui. gadgets could be written in any language that can
> > > > > > do stdio. they echo/printf commands to e to create objects and
> > > > > > change their state, and e "echos" back on stdin events and
> > > > > > things the gadget should know. your gadget could be a stripped
> > > > > > down super-lean basic libc executable. python. shell script.
> > > > > > anything that can do stdio. while this still has 1 process per
> > > > > > gadget - it's for the back-end only and this should make them
> > > > > > far leaner than with a full ui.
> > > > > >
> > > > > > i actually wanted to spend time on #2 but am busy with efl atrm.
> > > > > > #2 would be great because it'd massively lower the work needed
> > > > > > to quickly make a gadget of your own. write it in python or
> > > > > > shell quickly and dirtily and life would be easy.
>
> There's also nothing in the current infrastructure model which would
> prevent this, and there is similarly nothing preventing you from adding
> extensions to gadget sandboxing to provide these facilities if you so
> decided. Such a proposal is, however, also outside the scope of this thread
> and the existing sandbox gadget infrastructure.

well where was the original design discussion for this? i don't remember
ever seeing it.

> > > > > > > Hello,
> > > > > > >
> > > > > > > I would like to point everyone to a new wiki page we have that
> > > > > > > details developing gadgets for Enlightenment using the sandbox
> > > > > > > method.
> > > > > > >
> > > > > > > https://www.enlightenment.org/develop/e/sandbox_gadgets
> > > > > > >
> > > > > > > Feel free to use the guide to create cool new gadgets for E
> > > > > > > and make sure you contribute back feedback about the guide so
> > > > > > > we can improve it as needed.
> > > > > > >
> > > > > > > Thanks!
> > > > > > > Stephen (okra)
>
> To address further comments that you've made in other replies:
>
> > and in what cases does this actually matter? if you want to sandbox
> > every module/gadget then i think that is going far too far. my take is
> > that a module/gadget is a trusted tool that is integrated with your
> > desktop. thus the above doesn't count. and if you really do want to
> > isolate then using something like elua and a private lua state per
> > gadget would work best as then it doesn't mean we have to duplicate the
> > entire process N times, only a fairly minimal lua state (in comparison
> > to the rest behind it).
>
> I can appreciate how much you value the longstanding Enlightenment
> development model of using modules, but requiring that every gadget be a
> module significantly raises the barrier to entry for both new developers
> and existing developers who want to contribute new gadgets. External
> modules effectively require users to accept an EULA in order to
> activate--sandboxed gadgets do not. There is still nothing preventing
> anyone from using modules to develop gadgets in the way that you've
> suggested.

i never said "keep things as they are". the linux kernel also complains
about being tainted by modules. same idea. we ask users to once-off approve
a "tainted module". that's not exactly problematic. but i did NOT say keep
things as they are, which is what you imply. something like elua and lua
modules would easily be something that i don't think would warrant that
(for example).

> > will be incredibly rare to want to do this (sandbox your gadgets).
> > as i mentioned above. for cases where security is an issue they should
> > probably split into a very simple back-end that is very locked down and
> > a front-end to talk to it and authenticate etc. the back-end maybe at
> > most would request an auth itself (e.g. may execute
> > enlightenment_askpass or have a ui of its own though this makes it more
> > complex, or have a passwd asking tool of its own it directly runs etc.
> > etc.).
>
> Certainly this is a viewpoint that some people may have, but I think it's
> important to look at the sandboxing infrastructure from another
> perspective: the purpose of this, beyond providing increased security and
> stability, is to make it significantly easier for people to contribute to
> the Enlightenment desktop. Writing a module and using a specific gadget API
> is more difficult than writing a simple EFL-based application. Forcing
> developers to only use C, or requiring that there be language-specific
> infrastructure/bindings for the Enlightenment APIs is similarly more
> difficult than letting developers use whatever language they want.

what i proposed with stdin/out actually would allow far more languages to
be used than what you are doing with this process-per-gadget system. far,
far, far more. also development would be simpler as there is no library or
toolkit to link to or import or sdk etc. beyond echo/printf/read/scanf.
either way it's theoretical as it doesn't exist. but "one process per
gadget" is going to end up with a huge bloat (see above about quicklaunch -
even without quicklaunch you drop any sharing of resources for
widgets/images/whatever loaded later). a "gadget holder" process might be a
decent compromise, but then gadgets are modules again. the e module
interface was not designed FOR gadgets. it was designed to be super generic
for anything. thus it's a lot of work to do a gadget on top.
gadget-specific modules with specific api's for just that could be simpler.
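
to show what the stdin/stdout idea could look like from the gadget side,
here is a minimal python sketch. nothing like this exists in e today: the
command verbs (`create`, `set`), the events (`tick`, `quit`) and the
`run_gadget` name are all invented purely for illustration. the only thing
it relies on is reading lines and printing lines.

```python
def run_gadget(events, out):
    """Hypothetical stdio gadget: print commands, react to event lines.

    `events` is any iterable of text lines (stdin in a real process) and
    `out` is any writable text stream (stdout in a real process).
    """
    # Ask the host to create a text object and give it an initial value.
    # These verbs are made up; a real protocol would define its own.
    print("create text clock", file=out)
    print("set clock text 00:00", file=out)
    out.flush()

    for line in events:
        parts = line.split()
        if not parts:
            continue                      # ignore blank lines
        if parts[0] == "tick" and len(parts) == 2:
            # Host told us the time changed: update the object's state.
            print(f"set clock text {parts[1]}", file=out)
            out.flush()
        elif parts[0] == "quit":
            break                         # host asked us to exit
```

as a stand-alone gadget this would be run as
`run_gadget(sys.stdin, sys.stdout)` after `import sys`. the same loop is a
few lines of shell with `read` and `echo`, which is the point: no toolkit
or bindings on the gadget side.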
a gadget holder per language (one for c/c++ and lua via elua, one for
python, etc.) that "sourced" its gadgets as bits of script and kept them
merged could ultimately work out as a workable compromise. at least the
cost is only per language and not per gadget.

> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> enlightenment-devel mailing list
> enlightenment-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
>

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
Carsten Haitzler - ras...@rasterman.com