I see that this has been pushed and is already being used despite some objections being raised? I guess I probably missed IRC discussions.
The reasoning for needing this sounds like we should probably just use
jemalloc (
https://www.facebook.com/notes/facebook-engineering/scalable-memory-allocation-using-jemalloc/480222803919/
), which does more, has more people actively devoted to improving and
developing it, and has been widely tested and benchmarked, so we know it
will have the effect we want while also reducing our maintenance and
development overhead.

On Fri, Nov 4, 2016 at 10:08 AM Carsten Haitzler <[email protected]> wrote:

> On Fri, 4 Nov 2016 10:18:33 -0200 Gustavo Sverzut Barbieri
> <[email protected]> said:
>
> > On Thu, Nov 3, 2016 at 9:27 PM, Carsten Haitzler <[email protected]> wrote:
> > > On Thu, 3 Nov 2016 11:24:14 -0200 Gustavo Sverzut Barbieri
> > > <[email protected]> said:
> > >
> > >> I guessed mempool and eina_trash did that
> > >
> > > nah - mempool i don't think has a "purgatory" for pointers.
> > > they are released back into the pool.
> >
> > well, it could... OTOH it's just for "empty blocks", since if it's in
> > a mempool that has memory blocks and they're still in use, it will
> > just flag as unused.
> >
> > also, it simplifies bookkeeping of the memory if they are all of the
> > same size, like you said Eina_List, it knows the size of each entry,
> > thus just needs to mark each position that is usable, not try to
> > allocate based on size or similar -- much more efficient.
>
> yah. that's what mempool does... but it doesn't have 2 states for an
> allocation. it doesn't have "in use", "freed but not able to be reused
> yet" and "free and able to be re-used". it just has one: in use or not.
>
> > > trash is actually a cache for storing ptrs but it never actually
> > > frees anything. it doesn't know how to. you have to manually clean
> > > trash yourself and call some kind of free func when you do the
> > > clean. trash doesn't store free funcs at all.
> >
> > I don't see why it couldn't.
> but it doesn't, and eina_trash is all static inlines with structs exposed
> so we'd break struct definition, memory layout and api to do this. if an
> eina_trash is exposed from a lib compiled against efl 1.18 against other
> code compiled against 1.19 - it'd break. even worse eina_trash is a
> singly linked list so walking through it is scattered through memory,
> thus basically likely a cache miss each time.
>
> > but I find this is trying to replace malloc's internal structures,
> > which is not so nice. As you know, malloc implementation can
> > postpone/defer actual flushes, it's not 1:1 with brk() and munmap()
> > since like our mempools the page or stack may have used bits that
> > prevent that being given back to the kernel.
>
> i know. but it's out of our control. we can't change what and how malloc
> does this. we can't do smarter overwrite detection. malloc has options
> for filling freed memory with a pattern - but it will do it to any sized
> allocation. 1 byte or 1 gigabyte. with a custom implementation WE can
> decide eg only fill in up to 256 bytes as this is what might be used for
> small objects/list nodes but leave big allocations untouched, or ...
> only fill in the FIRST N bytes of an allocation with a pattern. if the
> pattern has been overwritten between submission to a free queue AND when
> it is actually freed then we have a bug in code somewhere scribbling
> over freed memory. at least we know it and know what to be looking for.
> malloc is far more limited in this way.
>
> also we can defer freeing until when WE want. e.g. after having gone
> idle and we would otherwise sleep. malloc really doesn't have any way to
> do this nicely. it's totally non-portable, libc specific (eg glibc) etc.
> and even then very "uncontrollable". a free queue of our own is portable
> AND controllable.
>
> > what usually adds overhead are mutexes and the algorithms trying to
> > find an empty block...
> > if we say freeq/trash are TLS/single-thread,
> > then we could avoid the mutex (but see malloc(3) docs on how they try
> > to minimize that contention), but adding a list of entries to look for
> > a free spot is likely worse than malloc's own tuned algorithm.
>
> no no. i'm not talking about making a CACHE of memory blocks. simply a
> fifo. put a ptr on the queue with a free func. it sits there for some
> time and then something walks this from beginning to end actually
> freeing. e.g. once we have reached an idle sleep state. THEN the frees
> really happen. once on the free queue there is no way off. you are
> freed. or to be freed. only a question of when.
>
> if there is buggy code that does something like:
>
> x = malloc(10);
> x[2] = 10;
> free(x);
> y = malloc(10);
> y[2] = 10;
> x[2] = 5;
>
> ... there is a very good chance y is a recycled pointer - same mem
> location as x. when we do x[2] = 5 we overwrite y[2] with 5 even tho it
> now should be 10. yes. valgrind can catch these... but you HAVE to catch
> them while running. maybe it only happens in certain logic paths. yes.
> coverity sometimes can find these too through static analysis. but not
> always. and then there are the cases where this behaviour is split
> across 2 different projects. one is efl, the other is some 3rd party
> app/binary that does something bad. the "y" malloc is in efl. the x one
> is in an app. the app now scribbles over memory owned by efl. this is
> bad. so efl now crashes with corrupt data structures and we can never
> fix this at all as the app is a 3rd party project simply complaining
> that a crash is happening in efl.
>
> we can REDUCE these issues by ensuring the x pointer is not recycled so
> aggressively by having a free queue. have a few hundred or a few
> thousand pointers sit on that queue for a while and HOPE this means the
> buggy code will write to this memory while it's still allocated but not
> in use...
> thus REDUCING the bugs/crashes at the expense of latency on freeing
> memory. it doesn't fix the bug but it mitigates the worst side effects.
>
> of course i'd actually like to replace all our allocations with our own
> special allocator that keeps pointers and allocations used in efl
> separated out into different domains. e.g. eo can have a special "eo
> object data" domain and all eo object data is allocated from here.
> pointers from here can never be recycled for a strdup() or a general
> malloc() or an eina_list_append (that already uses a mempool anyway),
> etc. - the idea being that it's HARDER to accidentally stomp over a
> completely unrelated data structure because pointers are not recycled
> from the same pool. e.g. efl will have its own pool of memory and at
> least if pointers are re-used, they are re-used only within that
> domain/context. if we are even smarter we can start using 32bit pointers
> on 64bit by returning unsigned ints that are an OFFSET into a single 4gb
> mmaped region. even better, bit-shifting could give us 16 or 32 or even
> 64gb of available address space for these allocations if we force
> alignment to 4, 8 or 16 bytes (probably a good idea). so you access such
> ptrs with:
>
> #define P(dom, ptr) \
>   ((void *)(((unsigned char *)((dom)->base)) + (((size_t)(ptr)) << 4)))
>
> so as long as you KNOW the domain it comes from you can compress
> pointers down to 1/2 the size ... even 1/4 the size and use 16bit
> ptrs... like above. (that would give you 1mb of memory space per domain
> so for smallish data sets might be useful). this relies on you knowing
> in advance the domain source and getting this right. we can still do
> full ptrs too. but this would quarantine memory and pointers from each
> other (libc vs efl) and help isolate bugs/problems.
>
> but this is a hell of a lot more work. it needs a whole malloc
> implementation. i'm not talking about that. far simpler.
> a queue of pointers to free at a future point. not a cache. not trash.
> not to then be dug out and re-used. that is the job of the free func and
> its implementation to worry about, whether it's free() or some other
> free function. only put memory into the free queue you can get some
> sensible benefits out of. it's voluntary. just replace the existing free
> call with one that queues the ptr, free func and size - for example. do
> it 1 place at a time - totally voluntary. doesn't hurt to do it.
>
> --
> ------------- Codito, ergo sum - "I code, therefore I am" --------------
> The Rasterman (Carsten Haitzler) [email protected]
>
> ------------------------------------------------------------------------------
> Developer Access Program for Intel Xeon Phi Processors
> Access to Intel Xeon Phi processor-based developer platforms.
> With one year of Intel Parallel Studio XE.
> Training and support from Colfax.
> Order your platform today. http://sdm.link/xeonphi
> _______________________________________________
> enlightenment-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
