[Evolution-hackers] Loading really large E-mails on devices with not enough Vm
This is what happens if you try to open a truly large E-mail on a device that has not as much memory available: Is there something we can do about this? Can we change the MIME parsing algorithm to be less memory demanding for example? Note that GArray is not really very sparse with memory once you start having a really large array. Perhaps we can in stead change this to a normal pointer array of a fixed size (do we know the size before we start parsing, so that we can allocate an exact size in stead, perhaps?) 0x40cc4e74 *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:67 1 0x40cc6450 *__GI_abort () at abort.c:88 2 0x40c32ba8 IA__g_logv (log_domain=0x40c5a954 "GLib", log_level=1112474896, format=0x40c6129c "%s: failed to allocate %lu bytes", args1=0x424f0abc) at gmessages.c:502 3 0x40c32be8 IA__g_log (log_domain=0x0, log_level=4973, format=0x40c6129c "%s: failed to allocate %lu bytes") at gmessages.c:522 4 0x40c320c8 IA__g_realloc (mem=0x0, n_bytes=16777216) at gmem.c:172 5 0x40c097f4 g_array_maybe_expand (array=0x4161dc08, len=4973) at garray.c:339 6 0x40c09a38 IA__g_array_append_vals (farray=0x4161dc08, data=0x440e80, len=4071) at garray.c:132 7 0x40c0a384 IA__g_byte_array_append (array=0x4161dc08, data=0x136d , len=1086722716) at garray.c:653 8 0x402a07d4 camel_mime_part_construct_content_from_parser (dw=0x293f78, mp=0x50d710) at camel-mime-part-utils.c:71 9 0x402a13e0 construct_from_parser (mime_part=0x293f78, mp=0x50d710) at camel-mime-part.c:968 10 0x402a148c camel_mime_part_construct_from_parser (mime_part=0x293f78, mp=0x50d710) at camel-mime-part.c:996 11 0x402a7ba8 construct_from_parser (multipart=0x2f4e90, mp=0x50d710) at camel-multipart.c:577 12 0x402a7c8c camel_multipart_construct_from_parser (multipart=0x2f4e90, mp=0x50d710) at camel-multipart.c:609 13 0x402a081c camel_mime_part_construct_content_from_parser (dw=0x57a028, mp=0x50d710) at camel-mime-part-utils.c:122 14 0x402a13e0 construct_from_parser (mime_part=0x57a028, mp=0x50d710) at camel-mime-part.c:968 15 0x4029ea24 construct_from_parser (dw=0x0, mp=0x50d710) at camel-mime-message.c:597 16 0x402a148c camel_mime_part_construct_from_parser (mime_part=0x57a028, mp=0x50d710) at camel-mime-part.c:996 17 0x402a14c6 construct_from_stream (dw=0x57a028, s=0x571378) at camel-mime-part.c:1012 18 0x40296070 camel_data_wrapper_construct_from_stream (data_wrapper=0x57a028, stream=0x571378) at camel-data-wrapper.c:270 19 0x414e410c maildir_get_message (folder=0x57bf28, uid=0x294270 "1200956392.3060_5.Nokia-N810-03-11", type=1086722716, param=1086722752, ex=0x424f0ce8) at camel-maildir-folder.c:276 20 0x4025281e camel_folder_get_message (folder=0x57bf28, uid=0x294270 "1200956392.3060_5.Nokia-N810-03-11", type=CAMEL_FOLDER_RECEIVE_ANY_OR_FULL, param=-1, ex=0x414e40b9) at camel-folder.c:1191 21 0x401e3c94 -- Philip Van Hoof, freelance software developer home: me at pvanhoof dot be gnome: pvanhoof at gnome dot org http://pvanhoof.be/blog http://codeminded.be ___ Evolution-hackers mailing list Evolution-hackers@gnome.org http://mail.gnome.org/mailman/listinfo/evolution-hackers
[Evolution-hackers] A different implementation of imap_rescan
Hi there, The imap_rescan function of the default IMAP provider of Evolution will in case at the front of the IMAP Mailbox items got expunged, more or less start marking all of the locally cached summary info as invalid. (note. summary in this context means the envelope information of each item that we have cached locally in a file called "summary" and that, when loaded, loads into a integer-based index. The sequence number on the IMAP server minus one must correspond to that index for each item) As a result will the entire summary be re-retrieved. This is one of the many reasons why Evolution's IMAP feels slow for the users: they removed something and suddenly the next operation takes a very long time. That's the summary being re-retrieved ... sometimes. Luckily wont most people delete old items, so it wont occur often for most people's IMAP operations. Nevertheless it can occur. I figured it's a bit too-aggressive about this (it = the code). It can also assume that only a few items got removed, keep how many got removed, and shift the table's indexes according to the amount of removed items. If then we still see a significant mismatch, we can still nonetheless mark as invalid, remove and perform a re-retrieval of the summary. I do this in Tinymail's camel-lite code, and now ported it to upstream camel in the patch that I attached. I have not tested this porting and note that I do quite a lot of things different in Tinymail (it supports condstore, which is heavily related to this code). But for code review this might be interesting to post to you guys, so here it goes ... -- Philip Van Hoof, freelance software developer home: me at pvanhoof dot be gnome: pvanhoof at gnome dot org http://pvanhoof.be/blog http://codeminded.be Index: camel-imap-folder.c === --- camel-imap-folder.c (revision 8395) +++ camel-imap-folder.c (working copy) @@ -664,6 +664,7 @@ GArray *removed; gboolean ok; CamelFolderChangeInfo *changes = NULL; + gint tr = 0; imap_folder->need_rescan = FALSE; @@ -732,16 +733,17 @@ * from the summary. */ removed = g_array_new (FALSE, FALSE, sizeof (int)); - for (i = 0; i < summary_len && new[i].uid; i++) { - info = camel_folder_summary_index (folder->summary, i); + for (i = 0; i + tr < summary_len && new[i].uid; i++) { + info = camel_folder_summary_index (folder->summary, i + tr); iinfo = (CamelImapMessageInfo *)info; if (strcmp (camel_message_info_uid (info), new[i].uid) != 0) { + tr++; camel_message_info_free(info); seq = i + 1; g_array_append_val (removed, seq); i--; - summary_len--; + /* summary_len--; */ continue; } @@ -762,6 +764,7 @@ camel_message_info_free(info); g_free (new[i].uid); + new[i].uid = NULL; } if (changes) { @@ -772,6 +775,7 @@ seq = i + 1; /* Free remaining memory. */ + i = 0; while (i < summary_len && new[i].uid) g_free (new[i++].uid); g_free (new); ___ Evolution-hackers mailing list Evolution-hackers@gnome.org http://mail.gnome.org/mailman/listinfo/evolution-hackers
Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm
On Sat, 2008-01-26 at 13:44 +0100, Philip Van Hoof wrote: > This is what happens if you try to open a truly large E-mail on a device > that has not as much memory available: > > Is there something we can do about this? Can we change the MIME parsing > algorithm to be less memory demanding for example? > > Note that GArray is not really very sparse with memory once you start > having a really large array. Perhaps we can in stead change this to a > normal pointer array of a fixed size (do we know the size before we > start parsing, so that we can allocate an exact size in stead, perhaps?) eh, why would you change it to a GPtrArray? It doesn't hold pointers, it holds message part content. Unfortunately we don't know the size ahead of time. I suppose you could use a custom byte array allocator so that you can force it to grow by larger chunks or something, dunno. The way GMime handles this is by not loading content into RAM, but that may be harder to do with Camel, especially in the mbox case. Jeff ___ Evolution-hackers mailing list Evolution-hackers@gnome.org http://mail.gnome.org/mailman/listinfo/evolution-hackers
Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm
On Sat, 2008-01-26 at 22:12 -0500, Jeffrey Stedfast wrote: > On Sat, 2008-01-26 at 13:44 +0100, Philip Van Hoof wrote: > > This is what happens if you try to open a truly large E-mail on a device > > that has not as much memory available: > > > > Is there something we can do about this? Can we change the MIME parsing > > algorithm to be less memory demanding for example? > > > > Note that GArray is not really very sparse with memory once you start > > having a really large array. Perhaps we can in stead change this to a > > normal pointer array of a fixed size (do we know the size before we > > start parsing, so that we can allocate an exact size in stead, perhaps?) > > eh, why would you change it to a GPtrArray? It doesn't hold pointers, it > holds message part content. > > Unfortunately we don't know the size ahead of time. > > I suppose you could use a custom byte array allocator so that you can > force it to grow by larger chunks or something, dunno. > > > The way GMime handles this is by not loading content into RAM, but > that may be harder to do with Camel, especially in the mbox case. er, I should probably explain this: - writing the code should be relatively easy to do, but in the mbox case, the mbox may end up getting expunged or rewritten for some other reason which may cause problems, not sure how that would work. I think in Maildir, as long as the fd remains open, the file won't actually disappear after an unlink() until the fd gets closed, so that might work out ok assuming you can spare the fd (which might be the other problem with Evolution?). Jeff ___ Evolution-hackers mailing list Evolution-hackers@gnome.org http://mail.gnome.org/mailman/listinfo/evolution-hackers
Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm
Something like the attached patch might work, tho it is untested. If this doesn't work, then I suspect the problem is that the seek position might get changed out from under the mime parser (assuming it is using either a CamelStreamFs or an fd). Note that camel_stream_fs_new_with_fd[_and_bounds]() calls lseek() on the fd passed in. >From the dup() man page: After a successful return from dup() or dup2(), the old and new file descriptors may be used interchangeably. They refer to the same open file description (see open(2)) and thus share file offset and file sta‐ tus flags; for example, if the file offset is modified by using lseek(2) on one of the descriptors, the offset is also changed for the other. So my guess is that this will break the parser :( It might break in the stream case as well, you'd have to follow the code paths a bit to know for sure. For instance, even if creating the seekable substream doesn't perform an underlying seek on the original stream, setting it in a data wrapper might call camel_stream_reset() which /might/ do an lseek() on the source fs stream. Not an insurmountable problem to solve, but it does make things a little more difficult and possibly touchy. Jeff On Sat, 2008-01-26 at 22:48 -0500, Jeffrey Stedfast wrote: > On Sat, 2008-01-26 at 22:12 -0500, Jeffrey Stedfast wrote: > > On Sat, 2008-01-26 at 13:44 +0100, Philip Van Hoof wrote: > > > This is what happens if you try to open a truly large E-mail on a device > > > that has not as much memory available: > > > > > > Is there something we can do about this? Can we change the MIME parsing > > > algorithm to be less memory demanding for example? > > > > > > Note that GArray is not really very sparse with memory once you start > > > having a really large array. Perhaps we can in stead change this to a > > > normal pointer array of a fixed size (do we know the size before we > > > start parsing, so that we can allocate an exact size in stead, perhaps?) > > > > eh, why would you change it to a GPtrArray? It doesn't hold pointers, it > > holds message part content. > > > > Unfortunately we don't know the size ahead of time. > > > > I suppose you could use a custom byte array allocator so that you can > > force it to grow by larger chunks or something, dunno. > > > > > > The way GMime handles this is by not loading content into RAM, but > > that may be harder to do with Camel, especially in the mbox case. > > er, I should probably explain this: > > - writing the code should be relatively easy to do, but in the mbox > case, the mbox may end up getting expunged or rewritten for some other > reason which may cause problems, not sure how that would work. > > I think in Maildir, as long as the fd remains open, the file won't > actually disappear after an unlink() until the fd gets closed, so that > might work out ok assuming you can spare the fd (which might be the > other problem with Evolution?). > > Jeff > > > ___ > Evolution-hackers mailing list > Evolution-hackers@gnome.org > http://mail.gnome.org/mailman/listinfo/evolution-hackers Index: ChangeLog === --- ChangeLog (revision 8425) +++ ChangeLog (working copy) @@ -1,3 +1,8 @@ +2008-01-26 Jeffrey Stedfast <[EMAIL PROTECTED]> + + * camel-mime-part-utils.c (simple_data_wrapper_construct_from_parser): + If possible, keep the content on disk. + 2008-01-24 Matthew Barnes <[EMAIL PROTECTED]> * camel-object.c (camel_object_cast): Index: camel-mime-part-utils.c === --- camel-mime-part-utils.c (revision 8425) +++ camel-mime-part-utils.c (working copy) @@ -57,25 +57,47 @@ static void simple_data_wrapper_construct_from_parser (CamelDataWrapper *dw, CamelMimeParser *mp) { + GByteArray *buffer = NULL; + CamelStream *stream; + off_t start, end; + int fd = -1; + size_t len; char *buf; - GByteArray *buffer; - CamelStream *mem; - size_t len; - + d(printf ("simple_data_wrapper_construct_from_parser()\n")); - - /* read in the entire content */ - buffer = g_byte_array_new (); + + if (!(stream = camel_mime_parser_stream (mp))) + fd = camel_mime_parser_fd (mp); + else if (!CAMEL_IS_SEEKABLE_SUBSTREAM (stream)) + stream = NULL; + + if ((stream || fd != -1) && (start = camel_mime_parser_tell (mp)) != -1) { + /* we can keep content on disk */ + } else { + /* need to load content into memory */ + buffer = g_byte_array_new (); + } + while (camel_mime_parser_step (mp, &buf, &len) != CAMEL_MIME_PARSER_STATE_BODY_END) { - d(printf("appending o/p data: %d: %.*s\n", len, len, buf)); - g_byte_array_append (buffer, (guint8 *) buf, len); + if (buffer != NULL) { + d(printf("appending o/p data: %d: %.*s\n", len, len, buf)); + g_byte_array_append (buffer, (guint8 *) buf, len); + } } - - d(printf("message part kept in memory!\n")); - - m