On Sun, Aug 28, 2016 at 02:25:25AM +0900, Carsten Haitzler wrote:
> On Sat, 27 Aug 2016 14:15:31 +0200 marcel-hollerb...@t-online.de said:
> 
> > On Sat, Aug 27, 2016 at 09:59:51AM +0900, Carsten Haitzler wrote:
> > > On Fri, 26 Aug 2016 17:34:25 +0200 marcel-hollerb...@t-online.de said:
> > > 
> > > > Hello,
> > > > 
> > > > On Mon, Aug 22, 2016 at 08:04:09PM -0700, Carsten Haitzler wrote:
> > > > > raster pushed a commit to branch master.
> > > > > 
> > > > > http://git.enlightenment.org/core/efl.git/commit/?id=561f8eaa8faebe32b9a03cf9674c147cf0d6d9ab
> > > > > 
> > > > > commit 561f8eaa8faebe32b9a03cf9674c147cf0d6d9ab
> > > > > Author: Carsten Haitzler (Rasterman) <ras...@rasterman.com>
> > > > > Date:   Tue Aug 23 11:59:37 2016 +0900
> > > > > 
> > > > >     efreet - save about 240-300k or so of memory used by efreet mime
> > > > >     
> > > > >     so efreet mime was loading a bunch of mime type info files, 
> > > > > parsing
> > > > >     them on startup and allocating memory to store all this mime info 
> > > > > -
> > > > >     globs, mimetype strings and more. all a big waste of memory as its
> > > > >     allocated on the heap per process where its the SAME data files
> > > > > loaded every time.
> > > > >     
> > > > >     so make an efreet mime cache file and a tool to create it from 
> > > > > mime
> > > > >     files. mmap this file with all the hashes/strings in it so all 
> > > > > that
> > > > >     data is mmaped once in memory and shared between all processes and
> > > > > it is only paged in on demand - as actually read/needed so if your
> > > > >     process doesnt need to know about mime stuff.. it wont touch it
> > > > > anyway. 
> > > > >     this saves about 240-300k or so of memory in my tests. this has 
> > > > > not
> > > > >     covered the mime MAGIC files which still consume memory and are on
> > > > > the heap. this is more complex so it will take more time to come up
> > > > > with a nice file format for the data that is nicely mmaped etc.
> > > > >     
> > > > >     @optimize
> > > > > ---
> > > > 
> > > > [snip]
> > > > 
> > > > > +
> > > > >  EAPI int
> > > > >  efreet_mime_init(void)
> > > > >  {
> > > > > @@ -194,14 +386,15 @@ efreet_mime_init(void)
> > > > >       }
> > > > >  
> > > > >     efreet_mime_endianess = efreet_mime_endian_check();
> > > > > -
> > > > > -   monitors = eina_hash_string_superfast_new(EINA_FREE_CB
> > > > > (ecore_file_monitor_del)); -
> > > > >     efreet_mime_type_cache_clear();
> > > > >  
> > > > > +   _efreet_mimedb_update();
> > > > > +
> > > > >     if (!efreet_mime_init_files())
> > > > >       goto unregister_log_domain;
> > > > >  
> > > > > +   _efreet_mime_update_func = _efreet_mimedb_update;
> > > > > +
> > > > >     return _efreet_mime_init_count;
> > > > >  
> > > > >  unregister_log_domain:
> > > > > @@ -228,6 +421,9 @@ efreet_mime_shutdown(void)
> > > > >     if (--_efreet_mime_init_count != 0)
> > > > >       return _efreet_mime_init_count;
> > > > >  
> > > > > +   _efreet_mimedb_shutdown();
> > > > > +   _efreet_mime_update_func = NULL;
> > > > > +
> > > > >     efreet_mime_icons_debug();
> > > > >  
> > > > >     IF_RELEASE(_mime_inode_symlink);
> > > > > @@ -241,10 +437,7 @@ efreet_mime_shutdown(void)
> > > > >     IF_RELEASE(_mime_application_octet_stream);
> > > > >     IF_RELEASE(_mime_text_plain);
> > > > >  
> > > > > -   IF_FREE_LIST(globs, efreet_mime_glob_free);
> > > > >     IF_FREE_LIST(magics, efreet_mime_magic_free);
> > > > > -   IF_FREE_HASH(monitors);
> > > > > -   IF_FREE_HASH(wild);
> > > > >     IF_FREE_HASH(mime_icons);
> > > > >     eina_log_domain_unregister(_efreet_mime_log_dom);
> > > > >     _efreet_mime_log_dom = -1;
> > > > > @@ -387,11 +580,10 @@ efreet_mime_magic_type_get(const char *file)
> > > > >  EAPI const char *
> > > > >  efreet_mime_globs_type_get(const char *file)
> > > > >  {
> > > > > -   Eina_List *l;
> > > > > -   Efreet_Mime_Glob *g;
> > > > >     char *sl, *p;
> > > > > -   const char *s;
> > > > > -   char *ext, *mime;
> > > > > +   const char *s, *mime;
> > > > > +   char *ext;
> > > > > +   unsigned int i, n;
> > > > >  
> > > > >     EINA_SAFETY_ON_NULL_RETURN_VAL(file, NULL);
> > > > >  
> > > > > @@ -406,25 +598,27 @@ efreet_mime_globs_type_get(const char *file)
> > > > >          while (p)
> > > > >            {
> > > > >               p++;
> > > > > -             if (p && (mime = eina_hash_find(wild, p))) return mime;
> > > > > +             if (p && (mime = _efreet_mimedb_extn_find(p))) return
> > > > > mime; p = strchr(p, '.');
> > > > >            }
> > > > >       }
> > > > >  
> > > > > -   /* Fallback to the other globs if not found */
> > > > > -   EINA_LIST_FOREACH(globs, l, g)
> > > > > +   // Fallback to the other globs if not found
> > > > > +   n = _efreet_mimedb_glob_count();
> > > > > +   for (i = 0; i < n; i++)
> > > > >       {
> > > > > -        if (efreet_mime_glob_match(file, g->glob))
> > > > > -          return g->mime;
> > > > > +        s = _efreet_mimedb_glob_get(i);
> > > > > +        if (efreet_mime_glob_match(file, s))
> > > > > +          return _efreet_mimedb_glob_mime_get(i);
> > > > 
> > > > The upper line is introducing a problem.
> > > > 
> > > > Before the return was a stringshare, now it is not anymore.
> > > > I am not so sure if it is a good idea to fix it with 
> > > 
> > > the api docs do not say it is a stringshare string, thus relying on this 
> > > is
> > > a violation of api contract ... for exactly the kind of optimizations i've
> > > done here. efreet returns lots of strings. all except 2 are NOT defined to
> > > be stringshare strings. :) this is because freet gets a lot of strings
> > > DIRECTLY from mmaped cache files for desktop files, icon search databases
> > > and so on. this is "regular how efreet works" stuff. so it's also not a
> > > design pattern in efreet. in fact the design pattern has ben the opposite.
> > > efreet has been returning strings to mmaped files for a huge number of 
> > > it's
> > > apis for a long time. but to specs...
> > 
> > Problem is more, look at the two last commits in enlightenment, until
> > right now people uesd it and it perfectly worked. Also its (in my opinion) 
> > much nicer to do "if (mime1 == mime2) { ..." than doing 
> > "if (!strcmp(mime1 == mime2)" (also in terms of performance).
> 
> doesn't matter. it's not int he contract. if you happen to
> 
> int x = malloc(sizeof(int));
> free(x);
> *x = 10;
> 
> and it works one day then starts crashing after an update... who is at fault?
> should libc just make this work anyway and not sometimes crash?
> 
> it's not int he contract. it's an assumption that was neither explicit nor
> following any design patterns in efreet.
> 
> the cost of having such behaviour is 240k of ram PER PROCESS. that's a decent
> sized chunk ii haven't fixed up the magic stuff yet to use a shared cache.
> that's on my list.
> 
> > Also if there is absolutly no assertion that a stringshare is returned,
> > why are then even static strings stringshared in efreet_mime.c?
> 
> because it's quite possible that that same string gets duplicated then in an
> app or in a widget and so on and having multiple copies of it would be silly.

He? Can you tell me why this argument applies for static strings but not
to the mimetypes of the glob ? I can also display them in a widget app
or anywhere.

> 
> > > 
> > > from the api specs:
> > > 
> > > /**
> > >  * @param file The file to check the mime type
> > >  * @return Mime type as a string.
> > >  * @brief Retrieve the mime type of a file using globs
> > >  */
> > > EAPI const char *efreet_mime_globs_type_get(const char *file);
> > >
> > 
> > I know, problem is (maybe its just me), i often look at the code before
> > using the function. Its often that functions are not really good
> > documented. So looking at the function itself gives you a deeper
> > understanding of the function, and if you then see that even static
> > strings are returned as stringshares, you 100% go for "Okay this thing
> > returns stringshares".
> 
> you didn't look at all the other code in efreet then to figure out that that
> efreet_mime was an aberration and broken in terms of being bloated and not
> having a shared cache backing it that's mmaped.
> 
> an api is defined by its contract - that's types in api as well as
> documentation. making assumptions that are more than the most conservative 
> take
> of this is not right. most of the time we DON'T document that we return
> stringshares because we want to have the freedom to change in future. the fact
> that it's NOT DOCUMENTED actually is a warning "make no assumptions of this
> string beyond the fact that you don't have to free it as its const char * and
> that you should not write to it..." if it was INTENDED that you CAN ASSUMING
> its a stringshare, then it should be documented so. otherwise never make the
> assumption. it's like assuming the above malloc/free thing. just because you
> read the code for libc and note that memory freed is kept around for a while
> until it may be really released and the above is safe, does not make it safe.
> 
> > > does not document to say it's a stringshare and does not return
> > > Eina_Stringshare * which would also say that. :)
> > > 
> > > > return eina_stringshare_add(_efreet_mimedb_glob_mime_get(i));
> > > 
> > > hmm. no. this would be a leak. the code before returns a string DIRECTLY
> > > from an internal hash. if we return a stringshare_add to avoid a leak the
> > > caller would have to stringshare_del when done. if the caller did not call
> > > stringshare_del like old code would do the refcount of this string would
> > > forever increase every use and thus basically be unable to be removed from
> > > memory if no longer needed/used. even worse it'd continue until the
> > > reference counter in stringshare loops over to like 1 then someone dos a
> > > stringshare_del on that string and everyone's pointers to this stringshare
> > > are not invalid and next access they crash or something bad... because the
> > > counter now has no meaning as it looped over because someone just kept
> > > reffing a string blindly. :)
> > > 
> > > > So i am letting you know this was here :)
> > > 
> > > i know. but code depending on it being a stringshare is "wrong". :)
> > > documented nowhere to guarantee this. :)
> > 
> > But worked the last decade like this ... AND efreet_mime.c is pretty
> > much build to return stringshares everywhere, even static strings.
> 
> as above. you made an assumption that was utterly invalid to make.
> > > 
> > > > Greetings
> > > >    bu5hm4n
> > > >  
> > > > 
> > > > ------------------------------------------------------------------------------
> > > > _______________________________________________
> > > > enlightenment-devel mailing list
> > > > enlightenment-devel@lists.sourceforge.net
> > > > https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
> > > > 
> > > 
> > > 
> > > -- 
> > > ------------- Codito, ergo sum - "I code, therefore I am" --------------
> > > The Rasterman (Carsten Haitzler)    ras...@rasterman.com
> > > 
> > 
> > ------------------------------------------------------------------------------
> > _______________________________________________
> > enlightenment-devel mailing list
> > enlightenment-devel@lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
> > 
> 
> 
> -- 
> ------------- Codito, ergo sum - "I code, therefore I am" --------------
> The Rasterman (Carsten Haitzler)    ras...@rasterman.com
> 

------------------------------------------------------------------------------
_______________________________________________
enlightenment-devel mailing list
enlightenment-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel

Reply via email to