On Wed, 14 Oct 2015 18:56:29 +0100 Tom Hacohen <[email protected]> said:
hmm to be more clear, i'm doing this for general education for developers and
users so people are aware of this in future when doing development as this
stuff does matter. by the sheer size of the change in dirty pages (2mb+) you
can see that it matters a lot.
eolian generates lots of structures in the c files to describe classes and
methods. like Type x = { ... }; since these are compiled in structures - they
get mapped at runtime and used like all code and global variables. like most
code, this SHOULD have been mapped once for the entire system and shared
between all processes thus the cost is paid only once, and paid only "on
demand" ie if pages have to be read - then they are loaded from disk for real.
they can also be thrown out when not used for a while because they can be
easily paged back in. throwing out costs nothing. ie - they should have behaved
like all good read-only mapped pages.
but eo WROTE to this memory at runtime - it wrote the SAME thing in every
process and what it wrote modified SOME bytes but not all. so you had maybe 1
byte in 100 modified. most of it was read-only but some was read-write. this
means that to write a few bytes an entire 4k page had to be duplicated and made
private per process (copy on write). this was a "hidden" allocation at runtime
that things like massif don't pick up as the kernel silently duplicated these
modified pages. the solution - separate out the writable data into their own
packed arrays, and keep read-only separate and/or stop writing to some of the
data if you can avoid it. :)
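
to make the split concrete, here is a minimal sketch in c. it is NOT the actual eo/eolian code: Op_Desc, op_descs, op_ids, op_ids_assign, op_id_get and op_name_get are all made-up names for illustration. the descriptor table is const and (my own assumption, to keep it free of load-time relocations) holds no pointers, and the only thing written at runtime is one small packed array of ids.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical op description. Everything in it is known at compile time
 * and it contains no pointers, so the table can be const and sit in a
 * read-only section that all processes share. */
typedef struct {
    char name[16];  /* method name */
    char doc[32];   /* short documentation string */
} Op_Desc;

static const Op_Desc op_descs[] = {
    { "constructor", "set up the object" },
    { "destructor",  "tear down the object" },
    { "color_set",   "set the object color" },
};

#define OP_COUNT (sizeof(op_descs) / sizeof(op_descs[0]))

/* The only per-process state: one small id per op, packed together so
 * runtime writes touch as few pages as possible. */
static unsigned int op_ids[OP_COUNT];

/* Assign runtime ids starting at base; returns the number of ops set. */
size_t op_ids_assign(unsigned int base)
{
    assert(base > 0); /* 0 is reserved to mean "no such op" */
    for (size_t i = 0; i < OP_COUNT; i++)
        op_ids[i] = base + (unsigned int)i;
    return OP_COUNT;
}

/* Runtime id for the op at idx, or 0 if idx is out of range. */
unsigned int op_id_get(size_t idx)
{
    return (idx < OP_COUNT) ? op_ids[idx] : 0;
}

/* Name of the op at idx (read from the const table), or NULL. */
const char *op_name_get(size_t idx)
{
    return (idx < OP_COUNT) ? op_descs[idx].name : NULL;
}
```

calling op_ids_assign(100) once at init writes only to op_ids. the descriptor table is never written, so its pages can stay clean and shared.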
now ... a topic digression.
there is a wider reason i want to bring this up now. every lib we have has some
global vars we write to. that means every lib has AT LEAST 1 page of memory
that is modified PER PROCESS. that means our mem footprint for these globals
scales by number of libs AND number of processes. we have too many libraries. i
am DEAD SET AGAINST adding more library .so's to efl. we have to do the
REVERSE. we have to REDUCE them. every lib has not just a single dirty page for
these globals, some have a few of them. so we're likely wasting about 30-60kb
PER process. because of the massive number of libs we have. this does not
include the fact that mappings are at BEST page aligned. that means if we have
4132 bytes to load, we actually consume 8192 (2 pages). so every mapping comes
with a rounding "overhead". a quick count shows we have about 105 mappings in
memory .. just for efl library .so's. if on average we use half of a rounded
page - that's 200kb of wasted memory ... just for rounding of mappings. every
efl lib has at least 3 mappings from the .so. add on the modules (i count 8)
and that's another 24 mappings - so just under 50kb. sure - most of this is
read-only and thus is a single cost across a whole system. but keep in mind
that on the lower end of things we want efl to run on systems with 32m of ram.
wasting 200-250k ... matters when you have only 32m.
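
the rounding arithmetic above can be sketched in c like this. the function names are made up for illustration, and the 4096 byte page size is an assumption (a real program would ask sysconf(_SC_PAGESIZE)). the "half a page wasted per mapping" figure is just the rough average used above, not a measurement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { PAGE_BYTES = 4096 }; /* assumed page size */

/* Bytes actually consumed once a mapping is rounded up to whole pages. */
size_t page_round_up(size_t bytes)
{
    assert(bytes <= SIZE_MAX - PAGE_BYTES); /* guard against overflow */
    return (bytes + PAGE_BYTES - 1) / PAGE_BYTES * PAGE_BYTES;
}

/* Bytes lost to rounding for a single mapping. */
size_t page_round_waste(size_t bytes)
{
    return page_round_up(bytes) - bytes;
}

/* Estimated total waste if every mapping loses half a page on average. */
size_t avg_rounding_waste(size_t mappings)
{
    return mappings * (PAGE_BYTES / 2);
}
```

with these, page_round_up(4132) gives 8192, avg_rounding_waste(105) gives 215040 bytes (about 210kb) and avg_rounding_waste(24) gives 49152 bytes (48kb), which lines up with the rough numbers above.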
so to that effect in future we MUST merge libs. MUST merge modules. this does
not mean we have to rename functions, but we need to build differently. my take
for NOW is that we need to do this:
lib...
efl, eina, eo, emile, eet, ecore, eio
-> efl
ecore_input, ecore_input, ecore_audio, elocation, ecore_con, ecore_ipc,
ecore_imf, efreet, efreet_trash, efreet_mime, eldbus, ecore_file
-> efl_sys
ecore_drm, ecore_x, ecore_fb, ecore_wayland, ecore_input_evas, ecore_imf_evas,
ecore_evas, evas, edje, emotion, ector, ephysics
-> efl_gfx
ethumb_client
-> stay for now (be replaced in future with a non-dbus file "preview"
infra that doesn't require a central daemon - things like SMACK make ethumb
pretty pointless, so move it more to a fork+execced child slave process to do
the heavy lifting so permissions are inherited from the parent).
eeze
-> stay for now (be replaced with a high level device abstraction that
doesn't look like or depend on linux sysfs at all)
elementary
-> stay
for modules things are harder. ecore_evas and evas should merge really. the
module elm makes for edje should be removed and done via api to register the
module instead of loaded. prefs module in elm - tbd. imf modules are hard to
merge. so we can get rid of 2 modules there.
so at 3 mappings per lib, and we go from 8 modules to 6, and from 35 libraries
to 6 (without breaking api or abi - we use symlinks to keep old .so lib names
in place) we save 31 * 3 mappings (every mapping is a cost) so 93 mappings
saved. that's 186kb saved... just by generating different binaries (.so
binaries). the bonus is also that compiler can link-time optimize better within
each .so, startup is faster as there is less seeking and fewer syscalls to set
up LOTs of mappings. even better - with larger .so's we can feasibly use jumbo
pages (2mb pages) which cuts down overhead a LOT. sure - doesn't save memory
but the closer to a 2mb boundary we are the less it would waste.
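
for anyone checking the numbers, here is the savings estimate as a small c sketch. the function names are made up, and i am assuming the "31" is the 29 libraries removed (35 -> 6) plus the 2 modules removed (8 -> 6), each costing 3 mappings, with half a 4096 byte page lost per mapping as before.

```c
#include <assert.h>
#include <stddef.h>

enum {
    PAGE_BYTES = 4096,       /* assumed page size */
    MAPPINGS_PER_OBJECT = 3  /* mappings per .so, as counted above */
};

/* Mappings that disappear when libraries and modules are merged. */
size_t mappings_saved(size_t libs_before, size_t libs_after,
                      size_t mods_before, size_t mods_after)
{
    assert(libs_after <= libs_before && mods_after <= mods_before);
    size_t objects_removed = (libs_before - libs_after)
                           + (mods_before - mods_after);
    return objects_removed * MAPPINGS_PER_OBJECT;
}

/* Rounding waste avoided, assuming half a page lost per mapping. */
size_t rounding_bytes_saved(size_t mappings)
{
    return mappings * (PAGE_BYTES / 2);
}
```

mappings_saved(35, 6, 8, 6) comes out at 93, and rounding_bytes_saved(93) is 190464 bytes, i.e. 186kb.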
the point being - without breaking anything we can be far leaner and faster.
this DOES lead to packaging issues though now as you can't ship wayland
separately to x11 support - you need to build efl pkgs in varieties. x11 only,
x11 + wl, or wl only. though at this point you are forced to do that for
elementary anyway so it's no WORSE than we currently are.
in the end i want to see ALL the x, wayland, windows, cocoa, fb, drm etc. stuff
all merged into one module per "system". we need to have a SINGLE system
abstraction that covers low level (upower and friends in ecore) and higher
level (ui/gfx). this would be a goal for efl 2 imho.
... so comments? keep in mind that this is a direction we pretty much have to
go, we're really just going to argue the details, unless someone can convince
me otherwise. i've spent a LOT of time thinking about this... :)
> Oops, I only sent the commit to Carsten in private, didn't CC the ML. :P
> Commit e2344b9b9ef664c8a251ef21d1afabee0b8326fd in the EFL is the relevant one.
>
> The problem was that a lot of information was written into RW memory pages,
> so the OS couldn't share those pages across applications, which means
> memory usage would be much bigger than it should be. This commit fixes this
> by shifting some things around, and marking the relevant data structures as
> "const" (read only), so they would be placed in a RO section.
>
> I hope that clarifies things. Let me know if you need more info.
>
> --
> Tom.
>
> On Wed, Oct 14, 2015 at 6:29 PM, Amitesh Singh <[email protected]>
> wrote:
>
> > Hello Tom,
> >
> >
> > On Wed, Oct 14, 2015 at 6:41 PM, Tom Hacohen <[email protected]> wrote:
> >
> > > On 28/09/15 18:43, Tom Hacohen wrote:
> > >
> > >> On 10/07/15 04:27, Carsten Haitzler wrote:
> > >>
> > >>> it has come to my attention that we have a bit of a memory bloat issue
> > >>> with eo.
> > >>>
> > >>> and that has to do with eo at runtime doing things like:
> > >>>
> > >>> _eo_class_funcs_set()
> > >>>
> > >>> read it. loop over all ops in the op desc table (array) for the class
> > >>> then go MODIFYING the array - setting pointers
> > >>>
> > >>> op_desc->op = op_id;
> > >>> +
> > >>> op_desc->op = api_desc->op;
> > >>> op_desc->doc = api_desc->doc;
> > >>> etc.
> > >>>
> > >>> now here is the rub. across a LOT of classes, we will have 50, 100+ of
> > >>> these arrays. each one will maybe span 1-2 or maybe 3 pages of ram, this
> > >>> is RW memory coming from the lib and every single app goes duplicating
> > >>> this then modifying these pointers to be THE SAME THING in each process.
> > >>>
> > >>> first problem - the modified bits vs static bits are spread out. thus we
> > >>> may modify only 1 or 2 ptrs per item in the array, but the whole array
> > >>> spans a fair bit of memory and thus... we end up modifying multiple pages
> > >>> of memory .. per library etc. - this adds up quickly. maybe 40-80kb per
> > >>> process. then this multiplies by the number of processes we have using
> > >>> efl. that's costly.
> > >>>
> > >>> we need to fix/adjust this so it can/is set up at COMPILE TIME with a
> > >>> bunch of consts. eolian needs to take care of this.
> > >>>
> > >>> catch - eo allows runtime dynamic overriding of a class. you can
> > >>> literally generate it and modify it as you please at runtime based on
> > >>> current code paths. we still would like this to work. so we need to have
> > >>> 2 paths. 1 where we pre-fill a class at compile-time all const'd out,
> > >>> then to override at runtime - duplicate to non-const version then modify
> > >>> at will.
> > >>>
> > >>> the catch here is - to do these fixes will mean breaking eo abi. i'm
> > >>> looking at it now and i can't see an alternative there.
> > >>>
> > >>> so first - we need to hold off on calling eo stable. then fix this. this
> > >>> doesn't stop interfaces work - it just means we have extra to do.
> > >>>
> > >>>
> > >> Finally got around to it. Fixed now. Please take a look and tell me if
> > >> you see anything bad still.
> > >>
> > >
> > > OK, wow. This has been very annoying. :P
> > > It's definitely fixed, though for some reason all of my tests to verify
> > > this have failed until now. Now everything finally shows the correct
> > > results. More specifically, I can see the improvement in clean vs dirty
> > > pages when running: pmap -p -XX `pidof elementary_test`.
> > > I attached the full pmap output from before and after my fix, just an
> > > example for elementary, private clean and private dirty:
> > >          Clean  Dirty
> > > Before:      0   2336
> > > After:    2224      0
> > >
> > > An obvious win.
> > >
> > > Thanks again to Carsten for noticing this. This is something that slipped
> > > in Eo2 (was OK with Eo1) and almost went unnoticed.
> > >
> > This is great. :) And I am sure fixing this has been fun. ;)
> > I am curious to know how you fixed it. Whenever you get time, please do
> > list out the relevant commits, so that I can relate things.
> > I did some reverse engineering before (during college days) so I can pick
> > this up quickly. :)
> >
> > > --
> > > Tom.
> > >
> > >
> > >
> > ------------------------------------------------------------------------------
> > >
> > > _______________________________________________
> > > enlightenment-devel mailing list
> > > [email protected]
> > > https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
> > >
> > >
--
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler) [email protected]