On Mon, 29 Aug 2016 08:33:15 +0200 marcel-hollerb...@t-online.de said: > On Sun, Aug 28, 2016 at 02:25:25AM +0900, Carsten Haitzler wrote: > > On Sat, 27 Aug 2016 14:15:31 +0200 marcel-hollerb...@t-online.de said: > > > > > On Sat, Aug 27, 2016 at 09:59:51AM +0900, Carsten Haitzler wrote: > > > > On Fri, 26 Aug 2016 17:34:25 +0200 marcel-hollerb...@t-online.de said: > > > > > > > > > Hello, > > > > > > > > > > On Mon, Aug 22, 2016 at 08:04:09PM -0700, Carsten Haitzler wrote: > > > > > > raster pushed a commit to branch master. > > > > > > > > > > > > http://git.enlightenment.org/core/efl.git/commit/?id=561f8eaa8faebe32b9a03cf9674c147cf0d6d9ab > > > > > > > > > > > > commit 561f8eaa8faebe32b9a03cf9674c147cf0d6d9ab > > > > > > Author: Carsten Haitzler (Rasterman) <ras...@rasterman.com> > > > > > > Date: Tue Aug 23 11:59:37 2016 +0900 > > > > > > > > > > > > efreet - save about 240-300k or so of memory used by efreet mime > > > > > > > > > > > > so efreet mime was loading a bunch of mime type info files, > > > > > > parsing them on startup and allocating memory to store all this > > > > > > mime info - globs, mimetype strings and more. all a big waste of > > > > > > memory as its allocated on the heap per process where its the SAME > > > > > > data files loaded every time. > > > > > > > > > > > > so make an efreet mime cache file and a tool to create it from > > > > > > mime files. mmap this file with all the hashes/strings in it so all > > > > > > that data is mmaped once in memory and shared between all processes > > > > > > and it is only paged in on demand - as actually read/needed so if > > > > > > your process doesnt need to know about mime stuff.. it wont touch it > > > > > > anyway. > > > > > > this saves about 240-300k or so of memory in my tests. this has > > > > > > not covered the mime MAGIC files which still consume memory and are > > > > > > on the heap. this is more complex so it will take more time to come > > > > > > up with a nice file format for the data that is nicely mmaped etc. > > > > > > > > > > > > @optimize > > > > > > --- > > > > > > > > > > [snip] > > > > > > > > > > > + > > > > > > EAPI int > > > > > > efreet_mime_init(void) > > > > > > { > > > > > > @@ -194,14 +386,15 @@ efreet_mime_init(void) > > > > > > } > > > > > > > > > > > > efreet_mime_endianess = efreet_mime_endian_check(); > > > > > > - > > > > > > - monitors = eina_hash_string_superfast_new(EINA_FREE_CB > > > > > > (ecore_file_monitor_del)); - > > > > > > efreet_mime_type_cache_clear(); > > > > > > > > > > > > + _efreet_mimedb_update(); > > > > > > + > > > > > > if (!efreet_mime_init_files()) > > > > > > goto unregister_log_domain; > > > > > > > > > > > > + _efreet_mime_update_func = _efreet_mimedb_update; > > > > > > + > > > > > > return _efreet_mime_init_count; > > > > > > > > > > > > unregister_log_domain: > > > > > > @@ -228,6 +421,9 @@ efreet_mime_shutdown(void) > > > > > > if (--_efreet_mime_init_count != 0) > > > > > > return _efreet_mime_init_count; > > > > > > > > > > > > + _efreet_mimedb_shutdown(); > > > > > > + _efreet_mime_update_func = NULL; > > > > > > + > > > > > > efreet_mime_icons_debug(); > > > > > > > > > > > > IF_RELEASE(_mime_inode_symlink); > > > > > > @@ -241,10 +437,7 @@ efreet_mime_shutdown(void) > > > > > > IF_RELEASE(_mime_application_octet_stream); > > > > > > IF_RELEASE(_mime_text_plain); > > > > > > > > > > > > - IF_FREE_LIST(globs, efreet_mime_glob_free); > > > > > > IF_FREE_LIST(magics, efreet_mime_magic_free); > > > > > > - IF_FREE_HASH(monitors); > > > > > > - IF_FREE_HASH(wild); > > > > > > IF_FREE_HASH(mime_icons); > > > > > > eina_log_domain_unregister(_efreet_mime_log_dom); > > > > > > _efreet_mime_log_dom = -1; > > > > > > @@ -387,11 +580,10 @@ efreet_mime_magic_type_get(const char *file) > > > > > > EAPI const char * > > > > > > efreet_mime_globs_type_get(const char *file) > > > > > > { > > > > > > - Eina_List *l; > > > > > > - Efreet_Mime_Glob *g; > > > > > > char *sl, *p; > > > > > > - const char *s; > > > > > > - char *ext, *mime; > > > > > > + const char *s, *mime; > > > > > > + char *ext; > > > > > > + unsigned int i, n; > > > > > > > > > > > > EINA_SAFETY_ON_NULL_RETURN_VAL(file, NULL); > > > > > > > > > > > > @@ -406,25 +598,27 @@ efreet_mime_globs_type_get(const char *file) > > > > > > while (p) > > > > > > { > > > > > > p++; > > > > > > - if (p && (mime = eina_hash_find(wild, p))) return > > > > > > mime; > > > > > > + if (p && (mime = _efreet_mimedb_extn_find(p))) return > > > > > > mime; p = strchr(p, '.'); > > > > > > } > > > > > > } > > > > > > > > > > > > - /* Fallback to the other globs if not found */ > > > > > > - EINA_LIST_FOREACH(globs, l, g) > > > > > > + // Fallback to the other globs if not found > > > > > > + n = _efreet_mimedb_glob_count(); > > > > > > + for (i = 0; i < n; i++) > > > > > > { > > > > > > - if (efreet_mime_glob_match(file, g->glob)) > > > > > > - return g->mime; > > > > > > + s = _efreet_mimedb_glob_get(i); > > > > > > + if (efreet_mime_glob_match(file, s)) > > > > > > + return _efreet_mimedb_glob_mime_get(i); > > > > > > > > > > The upper line is introducing a problem. > > > > > > > > > > Before the return was a stringshare, now it is not anymore. > > > > > I am not so sure if it is a good idea to fix it with > > > > > > > > the api docs do not say it is a stringshare string, thus relying on > > > > this is a violation of api contract ... for exactly the kind of > > > > optimizations i've done here. efreet returns lots of strings. all > > > > except 2 are NOT defined to be stringshare strings. :) this is because > > > > freet gets a lot of strings DIRECTLY from mmaped cache files for > > > > desktop files, icon search databases and so on. this is "regular how > > > > efreet works" stuff. so it's also not a design pattern in efreet. in > > > > fact the design pattern has ben the opposite. efreet has been returning > > > > strings to mmaped files for a huge number of it's apis for a long time. > > > > but to specs... > > > > > > Problem is more, look at the two last commits in enlightenment, until > > > right now people uesd it and it perfectly worked. Also its (in my > > > opinion) much nicer to do "if (mime1 == mime2) { ..." than doing > > > "if (!strcmp(mime1 == mime2)" (also in terms of performance). > > > > doesn't matter. it's not int he contract. if you happen to > > > > int x = malloc(sizeof(int)); > > free(x); > > *x = 10; > > > > and it works one day then starts crashing after an update... who is at > > fault? should libc just make this work anyway and not sometimes crash? > > > > it's not int he contract. it's an assumption that was neither explicit nor > > following any design patterns in efreet. > > > > the cost of having such behaviour is 240k of ram PER PROCESS. that's a > > decent sized chunk ii haven't fixed up the magic stuff yet to use a shared > > cache. that's on my list. > > > > > Also if there is absolutly no assertion that a stringshare is returned, > > > why are then even static strings stringshared in efreet_mime.c? > > > > because it's quite possible that that same string gets duplicated then in an > > app or in a widget and so on and having multiple copies of it would be > > silly. > > He? Can you tell me why this argument applies for static strings but not > to the mimetypes of the glob ? I can also display them in a widget app > or anywhere.
widgets almost everywhere internally stringshare_add BECAUSE the widget keeps a copy. then edje keeps a copy with stringshare_add then evas text object makes another copy.. they all use stringshares to avoid duplicating a string half a dozen or dozens of times. i wrote the original stringsdhare afetr some memory profiling and i found lots of 1 byte allocations... they were "" strings. like 100 or so. then like 4 or 5 copies of full file paths of like 50-150 chars, and duplicates of things like "OK" "Cancel" and so on many times as widget, edje, text et.c all make their own copies... so everyone uses stringhsare to avoid duplicates. the strings coming FROM efreet_mime are RARELY displayed in ui thus not duplicated that often, but they may be, so my making it a stringshare to begin with you reduce the possible memory footprint. that is the POINT of a stringshare. i wrote the original after profiling. the point was to SAVE MEMORY. the fact you can do str1 == str2 is just a side-effect of stringshare, but it was never ever the primary purpose. the problem is there are 100's of kb of strings in RAM per process duplicated again and again per process and MOST are not used except temporarily in looking up some icon or wondering what kind of mimetype this file is. they are mirroring a "static on disk" resource (that rarely changes) and so a shared db makes a lot more sense and it is what the rest of efreet does. > > > > > > > > > > from the api specs: > > > > > > > > /** > > > > * @param file The file to check the mime type > > > > * @return Mime type as a string. > > > > * @brief Retrieve the mime type of a file using globs > > > > */ > > > > EAPI const char *efreet_mime_globs_type_get(const char *file); > > > > > > > > > > I know, problem is (maybe its just me), i often look at the code before > > > using the function. Its often that functions are not really good > > > documented. So looking at the function itself gives you a deeper > > > understanding of the function, and if you then see that even static > > > strings are returned as stringshares, you 100% go for "Okay this thing > > > returns stringshares". > > > > you didn't look at all the other code in efreet then to figure out that that > > efreet_mime was an aberration and broken in terms of being bloated and not > > having a shared cache backing it that's mmaped. > > > > an api is defined by its contract - that's types in api as well as > > documentation. making assumptions that are more than the most conservative > > take of this is not right. most of the time we DON'T document that we return > > stringshares because we want to have the freedom to change in future. the > > fact that it's NOT DOCUMENTED actually is a warning "make no assumptions of > > this string beyond the fact that you don't have to free it as its const > > char * and that you should not write to it..." if it was INTENDED that you > > CAN ASSUMING its a stringshare, then it should be documented so. otherwise > > never make the assumption. it's like assuming the above malloc/free thing. > > just because you read the code for libc and note that memory freed is kept > > around for a while until it may be really released and the above is safe, > > does not make it safe. > > > > > > does not document to say it's a stringshare and does not return > > > > Eina_Stringshare * which would also say that. :) > > > > > > > > > return eina_stringshare_add(_efreet_mimedb_glob_mime_get(i)); > > > > > > > > hmm. no. this would be a leak. the code before returns a string DIRECTLY > > > > from an internal hash. if we return a stringshare_add to avoid a leak > > > > the caller would have to stringshare_del when done. if the caller did > > > > not call stringshare_del like old code would do the refcount of this > > > > string would forever increase every use and thus basically be unable to > > > > be removed from memory if no longer needed/used. even worse it'd > > > > continue until the reference counter in stringshare loops over to like > > > > 1 then someone dos a stringshare_del on that string and everyone's > > > > pointers to this stringshare are not invalid and next access they crash > > > > or something bad... because the counter now has no meaning as it looped > > > > over because someone just kept reffing a string blindly. :) > > > > > > > > > So i am letting you know this was here :) > > > > > > > > i know. but code depending on it being a stringshare is "wrong". :) > > > > documented nowhere to guarantee this. :) > > > > > > But worked the last decade like this ... AND efreet_mime.c is pretty > > > much build to return stringshares everywhere, even static strings. > > > > as above. you made an assumption that was utterly invalid to make. > > > > > > > > > Greetings > > > > > bu5hm4n > > > > > > > > > > > > > > > ------------------------------------------------------------------------------ > > > > > _______________________________________________ > > > > > enlightenment-devel mailing list > > > > > enlightenment-devel@lists.sourceforge.net > > > > > https://lists.sourceforge.net/lists/listinfo/enlightenment-devel > > > > > > > > > > > > > > > > > -- > > > > ------------- Codito, ergo sum - "I code, therefore I am" -------------- > > > > The Rasterman (Carsten Haitzler) ras...@rasterman.com > > > > > > > > > > ------------------------------------------------------------------------------ > > > _______________________________________________ > > > enlightenment-devel mailing list > > > enlightenment-devel@lists.sourceforge.net > > > https://lists.sourceforge.net/lists/listinfo/enlightenment-devel > > > > > > > > > -- > > ------------- Codito, ergo sum - "I code, therefore I am" -------------- > > The Rasterman (Carsten Haitzler) ras...@rasterman.com > > > -- ------------- Codito, ergo sum - "I code, therefore I am" -------------- The Rasterman (Carsten Haitzler) ras...@rasterman.com ------------------------------------------------------------------------------ _______________________________________________ enlightenment-devel mailing list enlightenment-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/enlightenment-devel