Re: [Evolution-hackers] Folder summaries with mmap() (version nine -- fixes a leak)
On Tue, 2006-07-11 at 13:17 +0200, Philip Van Hoof wrote: This is a version (version eight) that actually works (afaics). This version (version nine) fixes a leak when new messages arrive. However. The implementation/architecture of Camel requires that new messages get transformed into CamelMessageInfo instances and added to the CamelSummaryInfo structure. I think it would be better to simply a.s.a.p. write it to the summary file and at the end of getting the new messages, create a new CamelSummaryInfo structure with the new message info instances ... using the mmap technique. Currently it's obviously consuming more memory when new messages arrive (compared to the mmap technique). In Evolution, such a folder never gets closed. So it will for ever keep using this memory structure (in stead of the information using the mmap). Doing this would require appending to the summary file, closing the summary object instance (which closes the mmap), reopening it when receiving new messages completed and letting the mmap code parse that into message-info instances. Note .. starting from add_message_from_data (in camel-imap-folder.c), there's a leak. I think the leak is at camel_folder_summary_info_new_from_message and more specific the message_info_new_from_header. The leak is ~200 kb for a folder of ~10,000 headers http://pvanhoof.be/files/camel_folder_summary_with_mmap_fixes09.diff There's lots of decoding and copying happening there. So it's not going to be easy to spot it. Therefore I'm re-adding the evolution-patches mailing list. Because the mailing lists are slow, I'm also adding some people that might have a direct interest in this. Please review this patch carefully and multiple times. Please do ask me questions about it (if needed). Please do test it extensively. Please do launch your evolution with valgrind massif and also under gdb. In case a crash happens in the ..decode_uint32 method , 'up' twice and 'print *s' so that you know the filename and other interesting infor- mation. Also check s-filepos to know the current file position and for example check variables like 'count' and 'len' after doing one 'up'. You can also find a copy here. I will upload new versions to the same location (and will increase the version number in the filename). http://pvanhoof.be/files/camel_folder_summary_with_mmap_fixes08.diff -- Philip Van Hoof, software developer at x-tend home: me at pvanhoof dot be gnome: pvanhoof at gnome dot org work: vanhoof at x-tend dot be http://www.pvanhoof.be - http://www.x-tend.be ? .broken-date-parser.c.swp ? camel-mime-tables.c Index: camel-file-utils.c === RCS file: /cvs/gnome/evolution-data-server/camel/camel-file-utils.c,v retrieving revision 1.17 diff -u -p -r1.17 camel-file-utils.c --- camel-file-utils.c 2 Jun 2006 00:52:29 - 1.17 +++ camel-file-utils.c 11 Jul 2006 14:34:18 - @@ -105,6 +105,46 @@ camel_file_util_decode_uint32 (FILE *in, } +unsigned char* +camel_file_util_mmap_decode_uint32 (unsigned char *start, guint32 *dest, gboolean is_string) +{ + guint32 value = 0, rlen; + int v; + unsigned char *retval; + + /* until we get the last byte, keep decoding 7 bits at a time */ + +while ( ((v = *start++) 0x80) == 0 ) { +value |= v; +value = 7; +} + + rlen = value | (v 0x7f); + retval = start; + + if (is_string) { +#ifdef DEVEL + if (rlen == 1) { + rlen = 0; + } else { + unsigned int slen = strlen ((char*)retval)+1; + if (slen != rlen) { +printf (Problem in fileformat. String was %d, should be %d for %s %02x at %d\n, rlen, slen, retval, *retval, retval); +rlen = slen; + } + } +#else + if (rlen == 1) + rlen = 0; +#endif + + } + + + *dest = rlen; + return retval; +} + /** * camel_file_util_encode_fixed_int32: * @out: file to output to @@ -167,7 +207,7 @@ int \ camel_file_util_decode_##type(FILE *in, type *dest) \ { \ type save = 0; \ - int i = sizeof(type) - 1; \ + int i = sizeof(type) -1; \ int v = EOF; \ \ while (i = 0 (v = fgetc (in)) != EOF) { \ @@ -181,6 +221,24 @@ camel_file_util_decode_##type(FILE *in, } +#define MMAP_DECODE_T(type)\ +unsigned char* \ +camel_file_util_mmap_decode_##type(unsigned char *start, type *dest) \ +{ \ + type save = 0; \ + int i = sizeof(type) - 1; \ + int v; \ + \ +while (i = 0) {\ + v = *start++;\ + save |= ((type)v) (i * 8); \ + i--; \ + } \ + *dest = save; \ + return start; \ +} + + /** * camel_file_util_encode_time_t: * @out: file to output to @@ -202,6 +260,7 @@ CFU_ENCODE_T(time_t) * Return value: 0 on success, -1 on error. **/ CFU_DECODE_T(time_t) +MMAP_DECODE_T(time_t) /** * camel_file_util_encode_off_t: @@ -225,6 +284,7 @@ CFU_ENCODE_T(off_t) * Return value: 0 on success, -1 on failure. **/ CFU_DECODE_T(off_t)
Re: [Evolution-hackers] Folder summaries with mmap() (version nine -- fixes a leak)
On Tue, 2006-07-11 at 16:47 +0200, Philip Van Hoof wrote: On Tue, 2006-07-11 at 13:17 +0200, Philip Van Hoof wrote: This is a version (version eight) that actually works (afaics). This version (version nine) fixes a leak when new messages arrive. This version (version ten) fixes a few strdup()'s, free()'s and other stuff. Carefully look at the comments of the providers. I had to disable some things that would otherwise corrupt the mmap()ed memory block. These things probably need some refactoring thoughts. However. I'm sending this using an Evolution that is running from LD_LIBRARY_PATH=/opt/camel/lib. So far Evolution hasn't crashed and it's still writing summary new files for all my folders ;-). http://pvanhoof.be/files/camel_folder_summary_with_mmap_fixes10.diff -- Philip Van Hoof, software developer at x-tend home: me at pvanhoof dot be gnome: pvanhoof at gnome dot org work: vanhoof at x-tend dot be http://www.pvanhoof.be - http://www.x-tend.be ? camel-mime-tables.c Index: camel-file-utils.c === RCS file: /cvs/gnome/evolution-data-server/camel/camel-file-utils.c,v retrieving revision 1.17 diff -u -p -r1.17 camel-file-utils.c --- camel-file-utils.c 2 Jun 2006 00:52:29 - 1.17 +++ camel-file-utils.c 11 Jul 2006 19:50:11 - @@ -105,6 +105,46 @@ camel_file_util_decode_uint32 (FILE *in, } +unsigned char* +camel_file_util_mmap_decode_uint32 (unsigned char *start, guint32 *dest, gboolean is_string) +{ + guint32 value = 0, rlen; + int v; + unsigned char *retval; + + /* until we get the last byte, keep decoding 7 bits at a time */ + +while ( ((v = *start++) 0x80) == 0 ) { +value |= v; +value = 7; +} + + rlen = value | (v 0x7f); + retval = start; + + if (is_string) { +#ifdef DEVEL + if (rlen == 1) { + rlen = 0; + } else { + unsigned int slen = strlen ((char*)retval)+1; + if (slen != rlen) { +printf (Problem in fileformat. String was %d, should be %d for %s %02x at %d\n, rlen, slen, retval, *retval, retval); +rlen = slen; + } + } +#else + if (rlen == 1) + rlen = 0; +#endif + + } + + + *dest = rlen; + return retval; +} + /** * camel_file_util_encode_fixed_int32: * @out: file to output to @@ -167,7 +207,7 @@ int \ camel_file_util_decode_##type(FILE *in, type *dest) \ { \ type save = 0; \ - int i = sizeof(type) - 1; \ + int i = sizeof(type) -1; \ int v = EOF; \ \ while (i = 0 (v = fgetc (in)) != EOF) { \ @@ -181,6 +221,24 @@ camel_file_util_decode_##type(FILE *in, } +#define MMAP_DECODE_T(type)\ +unsigned char* \ +camel_file_util_mmap_decode_##type(unsigned char *start, type *dest) \ +{ \ + type save = 0; \ + int i = sizeof(type) - 1; \ + int v; \ + \ +while (i = 0) {\ + v = *start++;\ + save |= ((type)v) (i * 8); \ + i--; \ + } \ + *dest = save; \ + return start; \ +} + + /** * camel_file_util_encode_time_t: * @out: file to output to @@ -202,6 +260,7 @@ CFU_ENCODE_T(time_t) * Return value: 0 on success, -1 on error. **/ CFU_DECODE_T(time_t) +MMAP_DECODE_T(time_t) /** * camel_file_util_encode_off_t: @@ -225,6 +284,7 @@ CFU_ENCODE_T(off_t) * Return value: 0 on success, -1 on failure. **/ CFU_DECODE_T(off_t) +MMAP_DECODE_T(off_t) /** * camel_file_util_encode_size_t: @@ -248,6 +308,7 @@ CFU_ENCODE_T(size_t) * Return value: 0 on success, -1 on failure. **/ CFU_DECODE_T(size_t) +MMAP_DECODE_T(size_t) /** @@ -269,11 +330,22 @@ camel_file_util_encode_string (FILE *out if ((len = strlen (str)) 65536) len = 65536; + + if (len==0) len=-1; if (camel_file_util_encode_uint32 (out, len+1) == -1) return -1; - if (len == 0 || fwrite (str, len, 1, out) == 1) - return 0; + + if (len != 0) + { + if (fwrite (str, len, 1, out) == 1) + { + if (fputc ('\0', out) == -1) +return -1; + return 0; + } + } + return -1; } @@ -291,7 +363,8 @@ int camel_file_util_decode_string (FILE *in, char **str) { guint32 len; - register char *ret; + register char *ret, c; + long a; if (camel_file_util_decode_uint32 (in, len) == -1) { *str = NULL; @@ -311,7 +384,14 @@ camel_file_util_decode_string (FILE *in, return -1; } + a = ftell (in); + + if ((c = fgetc (in)) != '\0') + /* If this is the old format, oeps .. we are so sorry */ + fseek (in, a, SEEK_SET); + ret[len] = 0; + *str = ret; return 0; } Index: camel-file-utils.h === RCS file: /cvs/gnome/evolution-data-server/camel/camel-file-utils.h,v retrieving revision 1.11 diff -u -p -r1.11 camel-file-utils.h --- camel-file-utils.h 10 Jan 2006 07:56:46 - 1.11 +++ camel-file-utils.h 11 Jul 2006 19:50:12 - @@ -46,6 +46,7 @@ int camel_file_util_encode_fixed_int32 ( int camel_file_util_decode_fixed_int32 (FILE *in, gint32 *); int camel_file_util_encode_uint32
Re: [Evolution-hackers] Folder summaries with mmap() (version nine -- fixes a leak)
On Wed, 2006-07-12 at 09:12 +0530, Harish Krishnaswamy wrote: On Tue, 2006-07-11 at 20:33 +0200, Philip Van Hoof wrote: On Tue, 2006-07-11 at 23:14 +0530, Harish Krishnaswamy wrote: ps. I think we should do a in-depth tech-meeting about this stuff. What do you think, Harish? I have a few ideas on reducing memory usage when Evolution is online (if Evolution is online, summary information of new messages is still stored in memory, and not immediately reloaded using mmap) -- this makes Evolution consume (a lot) more memory when its online, when using the patch (as much as the current implementation). Sure, Philip. We could discuss it on the weekly IRC meeting (Wednesday 1000 UTC - #evolution-meet) - though I do not think it is convenient for fejj, if we wishes to join. We can work out a separate meeting and post it to e-h later too. I am more positive about the *real* effectiveness of this meeting unlike the gtk-mozembed episode as we are talking with CODE and there are concrete achievables at an arm's length. :-). Count me in. During such a meeting I'm going to propose actual Camel architectural changes. They will affect mostly internal changes (like how to deal with the CamelFolderSummary and CamelMessageInfo(Base) type from the providers). The external API will probably change but not afaics in mail/ something that would affect the current Evolution code a lot. I would mainly do the adding of messages to the CamelFolderSummary very different. I wouldn't store it using malloc (nor strdup nor pstring_add) in CamelMessageInfo(Base) instances. I would simply append it to the summary file and at the end of scanning .. launch my mmap code to parse the offsets (of also the new items) out of the mmap()ed memory region. This is very different from the current technique, where all is simply added and the data itself is stored in strdup, malloc and pstring_add memory. And sometimes (when?) dumped to the summary file. But where the CamelFolderSummary instance isn't very often reloaded (which would give it the chance to use my mmap code). This is why when being online, Evolution still consumes +220M. Yet when being offline, Evolution consumes ~30M (I have 112 folders, a lot of them with more than 10,000 headers -- mailing lists --). ps. My original Evolution consumes ~380M -- Philip Van Hoof, software developer at x-tend home: me at pvanhoof dot be gnome: pvanhoof at gnome dot org work: vanhoof at x-tend dot be http://www.pvanhoof.be - http://www.x-tend.be ? camel-mime-tables.c Index: camel-file-utils.c === RCS file: /cvs/gnome/evolution-data-server/camel/camel-file-utils.c,v retrieving revision 1.17 diff -p -u -r1.17 camel-file-utils.c --- camel-file-utils.c 2 Jun 2006 00:52:29 - 1.17 +++ camel-file-utils.c 12 Jul 2006 00:16:11 - @@ -105,6 +105,54 @@ camel_file_util_decode_uint32 (FILE *in, } +unsigned char* +camel_file_util_mmap_decode_uint32 (unsigned char *start, guint32 *dest, gboolean is_string) +{ + guint32 value = 0, rlen; + int v; +#if DEBUG + int a=0; +#endif + unsigned char *retval; + + /* until we get the last byte, keep decoding 7 bits at a time */ + +while ( ((v = *start++) 0x80) == 0 ) { +value |= v; +value = 7; +#if DEBUG + a++; + if (a 5) + return NULL; +#endif +} + + rlen = value | (v 0x7f); + retval = start; + + if (is_string) { +#ifdef DEVEL + if (rlen == 1) { + rlen = 0; + } else { + unsigned int slen = strlen ((char*)retval)+1; + if (slen != rlen) { +printf (Problem in fileformat. String was %d, should be %d for %s %02x at %d\n, rlen, slen, retval, *retval, retval); +rlen = slen; + } + } +#else + if (rlen == 1) + rlen = 0; +#endif + + } + + + *dest = rlen; + return retval; +} + /** * camel_file_util_encode_fixed_int32: * @out: file to output to @@ -167,7 +215,7 @@ int \ camel_file_util_decode_##type(FILE *in, type *dest) \ { \ type save = 0; \ - int i = sizeof(type) - 1; \ + int i = sizeof(type) -1; \ int v = EOF; \ \ while (i = 0 (v = fgetc (in)) != EOF) { \ @@ -181,6 +229,24 @@ camel_file_util_decode_##type(FILE *in, } +#define MMAP_DECODE_T(type)\ +unsigned char* \ +camel_file_util_mmap_decode_##type(unsigned char *start, type *dest) \ +{ \ + type save = 0; \ + int i = sizeof(type) - 1; \ + int v; \ + \ +while (i = 0) {\ + v = *start++;\ + save |= ((type)v) (i * 8); \ + i--; \ + } \ + *dest = save; \ + return start; \ +} + + /** * camel_file_util_encode_time_t: * @out: file to output to @@ -202,6 +268,7 @@ CFU_ENCODE_T(time_t) * Return value: 0 on success, -1 on error. **/ CFU_DECODE_T(time_t) +MMAP_DECODE_T(time_t) /** * camel_file_util_encode_off_t: @@ -225,6 +292,7 @@ CFU_ENCODE_T(off_t) * Return value: 0 on success, -1 on failure. **/
Re: [Evolution-hackers] Folder summaries with mmap() (version nine -- fixes a leak)
On Tue, 2006-07-11 at 23:14 +0530, Harish Krishnaswamy wrote: I'm removing the patches mailing list ;-) On Tue, 2006-07-11 at 16:47 +0200, Philip Van Hoof wrote: Currently it's obviously consuming more memory when new messages arrive (compared to the mmap technique). In Evolution, such a folder never gets closed. So it will for ever keep using this memory structure (in stead of the information using the mmap). I expect the patch to improve things as they stand now and will run it through some testing to get the hard numbers. Just curious to know if you whetted this change with some heavy-duty search folders at your end. They seem to skew the observations thus far. The hardest test that I did with it was sorting huge folders using the Subject and From fields. I didn't yet get Evolution stable enough to do even more aggressive tests like, indeed, making huge vfolders and searches. But after seeing the sorting results, I'm almost confident it will probably be as fast as in-real-memory. The big difference with huge searches (over all folders) will be that all summary files will be hit. So it's basically a does mmap scale to folder-count amount of mmap's?-question. The kernel, by default, pages 16 pages ahead of time. You can increase this using posix_madvise (which isn't platform independent, in case tml is listening). In random mode, it will not page ahead of time at all. This will be a nice option for mobile devices with very very very few memory (but very very very slow sorting and searching). ps. I think we should do a in-depth tech-meeting about this stuff. What do you think, Harish? I have a few ideas on reducing memory usage when Evolution is online (if Evolution is online, summary information of new messages is still stored in memory, and not immediately reloaded using mmap) -- this makes Evolution consume (a lot) more memory when its online, when using the patch (as much as the current implementation). -- Philip Van Hoof, software developer at x-tend home: me at pvanhoof dot be gnome: pvanhoof at gnome dot org work: vanhoof at x-tend dot be http://www.pvanhoof.be - http://www.x-tend.be ___ Evolution-hackers mailing list Evolution-hackers@gnome.org http://mail.gnome.org/mailman/listinfo/evolution-hackers