Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm

2008-03-04 Thread Philip Van Hoof

Hey Jeffrey,

I did some experimenting, and with this change it seems to work:

http://tinymail.org/trac/tinymail/changeset/3462

I had to record the value of folder_tell at the exact point where the
parser state is CAMEL_MIME_PARSER_STATE_HEADER. With that, it worked.

I tested this against, for example, a 32 MB test E-mail that is floating
around the Lemonade test servers, one of 7 MB, and various simpler
E-mails. Both big E-mails mostly had image attachments.

Tinymail has a mimepart viewer that uses a pixbuf loader, and it
succeeded just fine in loading the images.

I have a message open that is 40246320 bytes in size; this is my VmRSS
for Tinymail's demoui:

VmRSS: 16428 kB

Those 16 MB are probably data in the GdkPixbuf and the summary.
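
As an aside on how a figure like this can be obtained: on Linux, the kernel
reports the resident set size in /proc/self/status. The following is a
minimal sketch, assuming a Linux /proc filesystem; read_vmrss_kb and
kb_to_mib are illustrative names, not part of Tinymail or Camel.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns the resident set size of the calling process in kB, or -1 if
 * /proc/self/status is unavailable (for example on non-Linux systems). */
long read_vmrss_kb(void)
{
    FILE *fp = fopen("/proc/self/status", "r");
    char line[256];
    long kb = -1;

    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        if (strncmp(line, "VmRSS:", 6) == 0) {
            /* the value follows the label, e.g. "VmRSS:\t   16428 kB" */
            if (sscanf(line + 6, "%ld", &kb) != 1)
                kb = -1;
            break;
        }
    }
    fclose(fp);
    return kb;
}

/* Converts a kB figure as printed by the kernel to MiB. */
double kb_to_mib(long kb)
{
    assert(kb >= 0);
    return kb / 1024.0;
}
```

For the value above, kb_to_mib(16428) comes to a little over 16, which is
where the "16 MB" estimate comes from.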

To reproduce: connect to lemonade.andrew.cmu.edu:143 as testuser1 / pass1
and pick the largest mail on that server (~40 MB).


On Tue, 2008-03-04 at 18:40 +0100, Philip Van Hoof wrote:
> On Sat, 2008-01-26 at 23:22 -0500, Jeffrey Stedfast wrote:
> > Something like the attached patch might work, tho it is untested.
> 
> I had to change 
> 
>   else if (!CAMEL_IS_SEEKABLE_SUBSTREAM (stream))
> 
> into
> 
>   else if (!CAMEL_IS_SEEKABLE_STREAM (stream))
> 
> I don't know why you were testing for substream, as substream provides
> no extra functionality that seems to be related here ...
> 
> > So my guess is that this will break the parser :(
> > 
> > It might break in the stream case as well, you'd have to follow the code
> > paths a bit to know for sure. For instance, even if creating the
> > seekable substream doesn't perform an underlying seek on the original
> > stream, setting it in a data wrapper might call camel_stream_reset()
> > which /might/ do an lseek() on the source fs stream.
> 
> The problem with the patch is that it makes each MIME part's data start
> at the headers, instead of at the actual content.
> 
> I tried determining the "start" right after the first call to
> camel_mime_parser_step but that just resulted in start == end.
> 
> 
> > Not an insurmountable problem to solve, but it does make things a little
> > more difficult and possibly touchy.
> 
> > 
> > 
> > On Sat, 2008-01-26 at 22:48 -0500, Jeffrey Stedfast wrote:
> > > On Sat, 2008-01-26 at 22:12 -0500, Jeffrey Stedfast wrote:
> > > > On Sat, 2008-01-26 at 13:44 +0100, Philip Van Hoof wrote:
> > > > > This is what happens if you try to open a truly large E-mail on a device
> > > > > that does not have much memory available:
> > > > > 
> > > > > Is there something we can do about this? Can we change the MIME parsing
> > > > > algorithm to be less memory-demanding, for example?
> > > > > 
> > > > > Note that GArray is not very sparing with memory once you start
> > > > > having a really large array. Perhaps we can instead change this to a
> > > > > normal pointer array of a fixed size (do we know the size before we
> > > > > start parsing, so that we can allocate an exact size instead, perhaps?)
> > > > 
> > > > eh, why would you change it to a GPtrArray? It doesn't hold pointers, it
> > > > holds message part content.
> > > > 
> > > > Unfortunately we don't know the size ahead of time.
> > > > 
> > > > I suppose you could use a custom byte array allocator so that you can
> > > > force it to grow by larger chunks or something, dunno.
> > > >
> > > >
> > > > The way GMime handles this is by not loading content into RAM, but 
> > > > that may be harder to do with Camel, especially in the mbox case.
> > > 
> > > er, I should probably explain this:
> > > 
> > > - writing the code should be relatively easy to do, but in the mbox
> > > case, the mbox may end up getting expunged or rewritten for some other
> > > reason which may cause problems, not sure how that would work.
> > > 
> > > I think in Maildir, as long as the fd remains open, the file won't
> > > actually disappear after an unlink() until the fd gets closed, so that
> > > might work out ok assuming you can spare the fd (which might be the
> > > other problem with Evolution?).
> > > 
> > > Jeff
> > > 
> > > 
> > > ___
> > > Evolution-hackers mailing list
> > > Evolution-hackers@gnome.org
> > > http://mail.gnome.org/mailman/listinfo/evolution-hackers
-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
http://pvanhoof.be/blog
http://codeminded.be






Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm

2008-03-04 Thread Philip Van Hoof

On Sat, 2008-01-26 at 23:22 -0500, Jeffrey Stedfast wrote:
> Something like the attached patch might work, tho it is untested.

I had to change 

else if (!CAMEL_IS_SEEKABLE_SUBSTREAM (stream))

into

else if (!CAMEL_IS_SEEKABLE_STREAM (stream))

I don't know why you were testing for substream, as substream provides
no extra functionality that seems to be related here ...

> So my guess is that this will break the parser :(
> 
> It might break in the stream case as well, you'd have to follow the code
> paths a bit to know for sure. For instance, even if creating the
> seekable substream doesn't perform an underlying seek on the original
> stream, setting it in a data wrapper might call camel_stream_reset()
> which /might/ do an lseek() on the source fs stream.

The problem with the patch is that it makes each MIME part's data start
at the headers, instead of at the actual content.

I tried determining the "start" right after the first call to
camel_mime_parser_step but that just resulted in start == end.
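
The offset problem can be illustrated without Camel at all. A minimal
sketch, assuming the raw bytes of a part are available in memory: the
content begins after the empty line that ends the header block, so that is
the offset a data wrapper would need as its start. find_body_offset is a
hypothetical helper, not a Camel function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Camel: returns the offset of the first
 * content byte of a raw MIME part, i.e. the byte after the empty line that
 * terminates the header block. Returns len if no empty line is found. */
size_t find_body_offset(const char *part, size_t len)
{
    size_t i;

    assert(part != NULL);

    /* an empty first line means there are no headers at all */
    if (len >= 1 && part[0] == '\n')
        return 1;
    if (len >= 2 && part[0] == '\r' && part[1] == '\n')
        return 2;

    for (i = 0; i + 1 < len; i++) {
        if (part[i] != '\n')
            continue;
        if (part[i + 1] == '\n')
            return i + 2;       /* LF LF ends the headers */
        if (i + 2 < len && part[i + 1] == '\r' && part[i + 2] == '\n')
            return i + 3;       /* (CR)LF CR LF ends the headers */
    }
    return len;
}
```

Using offset 0 of the part as the start of the on-disk range gives the
"data starts at the headers" symptom described above; the returned offset
is where the content itself begins.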


> Not an insurmountable problem to solve, but it does make things a little
> more difficult and possibly touchy.

> 
> 
> On Sat, 2008-01-26 at 22:48 -0500, Jeffrey Stedfast wrote:
> > On Sat, 2008-01-26 at 22:12 -0500, Jeffrey Stedfast wrote:
> > > On Sat, 2008-01-26 at 13:44 +0100, Philip Van Hoof wrote:
> > > > This is what happens if you try to open a truly large E-mail on a device
> > > > that does not have much memory available:
> > > > 
> > > > Is there something we can do about this? Can we change the MIME parsing
> > > > algorithm to be less memory-demanding, for example?
> > > > 
> > > > Note that GArray is not very sparing with memory once you start
> > > > having a really large array. Perhaps we can instead change this to a
> > > > normal pointer array of a fixed size (do we know the size before we
> > > > start parsing, so that we can allocate an exact size instead, perhaps?)
> > > 
> > > eh, why would you change it to a GPtrArray? It doesn't hold pointers, it
> > > holds message part content.
> > > 
> > > Unfortunately we don't know the size ahead of time.
> > > 
> > > I suppose you could use a custom byte array allocator so that you can
> > > force it to grow by larger chunks or something, dunno.
> > >
> > >
> > > The way GMime handles this is by not loading content into RAM, but 
> > > that may be harder to do with Camel, especially in the mbox case.
> > 
> > er, I should probably explain this:
> > 
> > - writing the code should be relatively easy to do, but in the mbox
> > case, the mbox may end up getting expunged or rewritten for some other
> > reason which may cause problems, not sure how that would work.
> > 
> > I think in Maildir, as long as the fd remains open, the file won't
> > actually disappear after an unlink() until the fd gets closed, so that
> > might work out ok assuming you can spare the fd (which might be the
> > other problem with Evolution?).
> > 
> > Jeff
> > 
> > 
-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
http://pvanhoof.be/blog
http://codeminded.be






Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm

2008-01-27 Thread Philip Van Hoof

On Sun, 2008-01-27 at 11:27 -0500, Jeffrey Stedfast wrote:
> On Sun, 2008-01-27 at 13:44 +0100, Philip Van Hoof wrote:
> > This is very strange, though. It looks like stream=0x0 but the
> > mime-parser's stream ain't NULL.
> 
> that just means the stream the parser is using is not a subclass of
> CamelSeekableSubstream

The parser was parsing one that was opened by the maildir implementation
of camel_folder_get_message. So it's a file in a maildir, so I guess we
can make a stream for that file inherit CamelSeekableSubstream, right?


-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
http://pvanhoof.be/blog
http://codeminded.be






Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm

2008-01-27 Thread Jeffrey Stedfast

On Sun, 2008-01-27 at 13:44 +0100, Philip Van Hoof wrote:
> This is very strange, though. It looks like stream=0x0 but the
> mime-parser's stream ain't NULL.

that just means the stream the parser is using is not a subclass of
CamelSeekableSubstream

Jeff




Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm

2008-01-27 Thread Philip Van Hoof
This is very strange, though. It looks like stream=0x0 but the
mime-parser's stream ain't NULL.

(gdb) print buffer
$1 = (GByteArray *) 0x80e4dc0
(gdb) print stream
$2 = (CamelStream *) 0x0
(gdb) print *mp
$3 = {parent = {klass = 0x80def80, hooks = 0x0, ref_count = 1, flags = 0}, priv = 0x8272770}
(gdb) print *mp->priv
$4 = {state = CAMEL_MIME_PARSER_STATE_BODY, outbuf = 0x827e800 "Content-Transfer-Encoding: quoted-printable", 
  outptr = 0x827e800 "Content-Transfer-Encoding: quoted-printable", outend = 0x827ec00 "", fd = -1, stream = 0x826ab10, ioerrno = 0, 
  realbuf = 0x827ec08 "", 
  inbuf = 0x827ec88 "[binary data, garbled in the archive]", 
  inptr = 0x827fc5c "[binary data, garbled in the archive]", 
  inend = 0x827fc88 "\n", atleast = 0, seek = 413944216, unstep = 0, midline = 1, scan_from = 0, scan_pre_from = 0, eof = 0, 
  start_of_from = -1, start_of_boundary = 11048, start_of_headers = 11092, header_start = -1, filterid = 1, filters = 0x0, 
  parts = 0x828c210}
(gdb) print *mp->priv->stream
$5 = {parent_object = {klass = 0x80ffdc8, hooks = 0x0, ref_count = 2, flags = 0}, eos = 0}
(gdb) 



#define _PRIVATE(o) (((CamelMimeParser *)(o))->priv)

CamelStream *
camel_mime_parser_stream (CamelMimeParser *parser)
{
	struct _header_scan_state *s = _PRIVATE (parser);

	return s->stream;
}

Maybe it's not a CamelSeekableSubstream? Else would parent_stream not be F?

(gdb) print * (CamelSeekableSubstream *)mp->priv->stream
$7 = {parent_object = {parent_object = {parent_object = {klass = 0x80ffdc8, hooks = 0x0, ref_count = 2, flags = 0}, eos = 0}, 
    position = 413948312, bound_start = 0, bound_end = -1, some_stack = '\0' }, parent_stream = 0x16}
(gdb) 


On Sun, 2008-01-27 at 13:38 +0100, Philip Van Hoof wrote:
> Looks like the GByteArray is still being created.
> 
> (gdb) break camel-mime-part-utils.c:82
> Breakpoint 2 at 0xb6dd541e: file camel-mime-part-utils.c, line 82.
> (gdb) delete 1
> (gdb) cont
> Continuing.
> 
> Breakpoint 2, simple_data_wrapper_construct_from_parser (dw=0xb3f02800, mp=0x827bcd0) at camel-mime-part-utils.c:82
> 82  if (buffer != NULL) {
> (gdb) print buffer
> $1 = (GByteArray *) 0x80e4dc0
> (gdb) 
> 
> 
> Breakpoint 1, camel_mime_parser_step (parser=0x827bcd0, databuffer=0xb4882f3c, datalength=0xb4882f40) at camel-mime-parser.c:610
> 610 struct _header_scan_state *s = _PRIVATE (parser);
> (gdb) bt
> #0  camel_mime_parser_step (parser=0x827bcd0, databuffer=0xb4882f3c, datalength=0xb4882f40) at camel-mime-parser.c:610
> #1  0xb6dd5456 in simple_data_wrapper_construct_from_parser (dw=0xb3f02800, mp=0x827bcd0) at camel-mime-part-utils.c:81
> #2  0xb6dd55e9 in camel_mime_part_construct_content_from_parser (dw=0x824e8c8, mp=0x827bcd0) at camel-mime-part-utils.c:127
> #3  0xb6dd6ff1 in construct_from_parser (mime_part=0x824e8c8, mp=0x827bcd0) at camel-mime-part.c:968
> #4  0xb6dd70af in camel_mime_part_construct_from_parser (mime_part=0x824e8c8, mp=0x827bcd0) at camel-mime-part.c:996
> #5  0xb6de1aab in construct_from_parser (multipart=0x8246f80, mp=0x827bcd0) at camel-multipart.c:577
> #6  0xb6de1bea in camel_multipart_construct_from_parser (multipart=0x8246f80, mp=0x827bcd0) at camel-multipart.c:609
> #7  0xb6dd5681 in camel_mime_part_construct_content_from_parser (dw=0x8254570, mp=0x827bcd0) at camel-mime-part-utils.c:144
> #8  0xb6dd6ff1 in construct_from_parser (mime_part=0x8254570, mp=0x827bcd0) at camel-mime-part.c:968
> #9  0xb6dd1de4 in construct_from_parser (dw=0x8254570, mp=0x827bcd0) at camel-mime-message.c:597
> #10 0xb6dd70af in camel_mime_part_construct_from_parser (mime_part=0x8254570, mp=0x827bcd0) at camel-mime-part.c:996
> #11 0xb6dd7122 in construct_from_stream (dw=0x8254570, s=0x826ab10) at camel-mime-part.c:1012
> #12 0xb6dc2f63 in camel_data_wrapper_construct_from_stream (data_wrapper=0x8254570, stream=0x826ab10) at camel-data-wrapper.c:270
> #13 0xb60fbd97 in maildir_get_message (folder=0x80def28, uid=0x8269dd0 "1192085835.11467_1.evergrey", 
> 
> 
> [EMAIL PROTECTED]:~/Current/mailtests/md/spam1/cur$ ls -alh 1192085835.11467_1.evergrey\!2\,SH 
> -rw-r--r-- 1 pvanhoof pvanhoof 401M 2008-01-27 13:28 1192085835.11467_1.evergrey!2,SH
> [EMAIL PROTECTED]:~/Current/mailtests/md/spam1/cur$ 
> 
> 
> 
> On Sat, 2008-01-26 at 23:22 -0500, Jeffrey Stedfast wrote:
> > Something like the attached patch might work, tho it is untested.
> > 
> > If this doesn't work, then I suspect the problem is that the seek
> > position might get changed out from under the mime parser (assuming it
> > is using either a CamelStreamFs or an fd).
> > 
> > Note that camel_stream_fs_new_with_fd[_and_bounds]() calls lseek() on
> > the fd passed in.
> > 
> > From the dup() man page:
> > 
> >After  a  successful  return f

Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm

2008-01-27 Thread Philip Van Hoof
Looks like the GByteArray is still being created.

(gdb) break camel-mime-part-utils.c:82
Breakpoint 2 at 0xb6dd541e: file camel-mime-part-utils.c, line 82.
(gdb) delete 1
(gdb) cont
Continuing.

Breakpoint 2, simple_data_wrapper_construct_from_parser (dw=0xb3f02800, mp=0x827bcd0) at camel-mime-part-utils.c:82
82  if (buffer != NULL) {
(gdb) print buffer
$1 = (GByteArray *) 0x80e4dc0
(gdb) 


Breakpoint 1, camel_mime_parser_step (parser=0x827bcd0, databuffer=0xb4882f3c, datalength=0xb4882f40) at camel-mime-parser.c:610
610 struct _header_scan_state *s = _PRIVATE (parser);
(gdb) bt
#0  camel_mime_parser_step (parser=0x827bcd0, databuffer=0xb4882f3c, datalength=0xb4882f40) at camel-mime-parser.c:610
#1  0xb6dd5456 in simple_data_wrapper_construct_from_parser (dw=0xb3f02800, mp=0x827bcd0) at camel-mime-part-utils.c:81
#2  0xb6dd55e9 in camel_mime_part_construct_content_from_parser (dw=0x824e8c8, mp=0x827bcd0) at camel-mime-part-utils.c:127
#3  0xb6dd6ff1 in construct_from_parser (mime_part=0x824e8c8, mp=0x827bcd0) at camel-mime-part.c:968
#4  0xb6dd70af in camel_mime_part_construct_from_parser (mime_part=0x824e8c8, mp=0x827bcd0) at camel-mime-part.c:996
#5  0xb6de1aab in construct_from_parser (multipart=0x8246f80, mp=0x827bcd0) at camel-multipart.c:577
#6  0xb6de1bea in camel_multipart_construct_from_parser (multipart=0x8246f80, mp=0x827bcd0) at camel-multipart.c:609
#7  0xb6dd5681 in camel_mime_part_construct_content_from_parser (dw=0x8254570, mp=0x827bcd0) at camel-mime-part-utils.c:144
#8  0xb6dd6ff1 in construct_from_parser (mime_part=0x8254570, mp=0x827bcd0) at camel-mime-part.c:968
#9  0xb6dd1de4 in construct_from_parser (dw=0x8254570, mp=0x827bcd0) at camel-mime-message.c:597
#10 0xb6dd70af in camel_mime_part_construct_from_parser (mime_part=0x8254570, mp=0x827bcd0) at camel-mime-part.c:996
#11 0xb6dd7122 in construct_from_stream (dw=0x8254570, s=0x826ab10) at camel-mime-part.c:1012
#12 0xb6dc2f63 in camel_data_wrapper_construct_from_stream (data_wrapper=0x8254570, stream=0x826ab10) at camel-data-wrapper.c:270
#13 0xb60fbd97 in maildir_get_message (folder=0x80def28, uid=0x8269dd0 "1192085835.11467_1.evergrey", 


[EMAIL PROTECTED]:~/Current/mailtests/md/spam1/cur$ ls -alh 1192085835.11467_1.evergrey\!2\,SH 
-rw-r--r-- 1 pvanhoof pvanhoof 401M 2008-01-27 13:28 1192085835.11467_1.evergrey!2,SH
[EMAIL PROTECTED]:~/Current/mailtests/md/spam1/cur$ 



On Sat, 2008-01-26 at 23:22 -0500, Jeffrey Stedfast wrote:
> Something like the attached patch might work, tho it is untested.
> 
> If this doesn't work, then I suspect the problem is that the seek
> position might get changed out from under the mime parser (assuming it
> is using either a CamelStreamFs or an fd).
> 
> Note that camel_stream_fs_new_with_fd[_and_bounds]() calls lseek() on
> the fd passed in.
> 
> From the dup() man page:
> 
>    After a successful return from dup() or dup2(), the old and new file
>    descriptors may be used interchangeably. They refer to the same open
>    file description (see open(2)) and thus share file offset and file
>    status flags; for example, if the file offset is modified by using
>    lseek(2) on one of the descriptors, the offset is also changed for the
>    other.
> 
> So my guess is that this will break the parser :(
> 
> It might break in the stream case as well, you'd have to follow the code
> paths a bit to know for sure. For instance, even if creating the
> seekable substream doesn't perform an underlying seek on the original
> stream, setting it in a data wrapper might call camel_stream_reset()
> which /might/ do an lseek() on the source fs stream.
> 
> Not an insurmountable problem to solve, but it does make things a little
> more difficult and possibly touchy.
> 
> Jeff
> 
> 
> 
> On Sat, 2008-01-26 at 22:48 -0500, Jeffrey Stedfast wrote:
> > On Sat, 2008-01-26 at 22:12 -0500, Jeffrey Stedfast wrote:
> > > On Sat, 2008-01-26 at 13:44 +0100, Philip Van Hoof wrote:
> > > > This is what happens if you try to open a truly large E-mail on a device
> > > > that does not have much memory available:
> > > > 
> > > > Is there something we can do about this? Can we change the MIME parsing
> > > > algorithm to be less memory-demanding, for example?
> > > > 
> > > > Note that GArray is not very sparing with memory once you start
> > > > having a really large array. Perhaps we can instead change this to a
> > > > normal pointer array of a fixed size (do we know the size before we
> > > > start parsing, so that we can allocate an exact size instead, perhaps?)
> > > 
> > > eh, why would you change it to a GPtrArray? It doesn't hold pointers, it
> > > holds message part content.
> > > 
> > > Unfortunately we don't know the size ahead of time.
> > > 
> > > I suppose you could use a custom byte array allocator so that you can
> > > force it to grow by larger chunks or something, dunno.
> > >
> > >
> > > The way GMim

Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm

2008-01-27 Thread Philip Van Hoof
We'll try this, and if it works for all mails that we wanted to test,
I'll let you know.

Thanks a lot! Adding Modest's project manager in CC

On Sat, 2008-01-26 at 23:22 -0500, Jeffrey Stedfast wrote:
> Something like the attached patch might work, tho it is untested.
> 
> If this doesn't work, then I suspect the problem is that the seek
> position might get changed out from under the mime parser (assuming it
> is using either a CamelStreamFs or an fd).
> 
> Note that camel_stream_fs_new_with_fd[_and_bounds]() calls lseek() on
> the fd passed in.
> 
> From the dup() man page:
> 
>    After a successful return from dup() or dup2(), the old and new file
>    descriptors may be used interchangeably. They refer to the same open
>    file description (see open(2)) and thus share file offset and file
>    status flags; for example, if the file offset is modified by using
>    lseek(2) on one of the descriptors, the offset is also changed for the
>    other.
> 
> So my guess is that this will break the parser :(
> 
> It might break in the stream case as well, you'd have to follow the code
> paths a bit to know for sure. For instance, even if creating the
> seekable substream doesn't perform an underlying seek on the original
> stream, setting it in a data wrapper might call camel_stream_reset()
> which /might/ do an lseek() on the source fs stream.
> 
> Not an insurmountable problem to solve, but it does make things a little
> more difficult and possibly touchy.
> 
> Jeff
> 
> 
> 
> On Sat, 2008-01-26 at 22:48 -0500, Jeffrey Stedfast wrote:
> > On Sat, 2008-01-26 at 22:12 -0500, Jeffrey Stedfast wrote:
> > > On Sat, 2008-01-26 at 13:44 +0100, Philip Van Hoof wrote:
> > > > This is what happens if you try to open a truly large E-mail on a device
> > > > that does not have much memory available:
> > > > 
> > > > Is there something we can do about this? Can we change the MIME parsing
> > > > algorithm to be less memory-demanding, for example?
> > > > 
> > > > Note that GArray is not very sparing with memory once you start
> > > > having a really large array. Perhaps we can instead change this to a
> > > > normal pointer array of a fixed size (do we know the size before we
> > > > start parsing, so that we can allocate an exact size instead, perhaps?)
> > > 
> > > eh, why would you change it to a GPtrArray? It doesn't hold pointers, it
> > > holds message part content.
> > > 
> > > Unfortunately we don't know the size ahead of time.
> > > 
> > > I suppose you could use a custom byte array allocator so that you can
> > > force it to grow by larger chunks or something, dunno.
> > >
> > >
> > > The way GMime handles this is by not loading content into RAM, but 
> > > that may be harder to do with Camel, especially in the mbox case.
> > 
> > er, I should probably explain this:
> > 
> > - writing the code should be relatively easy to do, but in the mbox
> > case, the mbox may end up getting expunged or rewritten for some other
> > reason which may cause problems, not sure how that would work.
> > 
> > I think in Maildir, as long as the fd remains open, the file won't
> > actually disappear after an unlink() until the fd gets closed, so that
> > might work out ok assuming you can spare the fd (which might be the
> > other problem with Evolution?).
> > 

-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
http://pvanhoof.be/blog
http://codeminded.be






Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm

2008-01-26 Thread Jeffrey Stedfast
Something like the attached patch might work, tho it is untested.

If this doesn't work, then I suspect the problem is that the seek
position might get changed out from under the mime parser (assuming it
is using either a CamelStreamFs or an fd).

Note that camel_stream_fs_new_with_fd[_and_bounds]() calls lseek() on
the fd passed in.

From the dup() man page:

   After a successful return from dup() or dup2(), the old and new file
   descriptors may be used interchangeably. They refer to the same open
   file description (see open(2)) and thus share file offset and file
   status flags; for example, if the file offset is modified by using
   lseek(2) on one of the descriptors, the offset is also changed for the
   other.

So my guess is that this will break the parser :(
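
The shared-offset behaviour quoted from the man page is easy to confirm in
isolation. A minimal sketch using only POSIX calls; the function name and
the temp-file path template are made up for this illustration, and nothing
here is Camel code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a temp file of `size` bytes, dup()s its descriptor, seeks the
 * COPY to `seek_to`, and returns the offset then observed on the ORIGINAL
 * descriptor. Returns -1 on any failure. */
off_t offset_seen_by_original_after_dup_seek(off_t size, off_t seek_to)
{
    char path[] = "/tmp/dup-offset-XXXXXX";
    off_t seen = -1;
    int fd, copy;

    assert(size >= 0 && seek_to >= 0);

    if ((fd = mkstemp(path)) == -1)
        return -1;
    unlink(path);   /* the file stays usable until fd is closed */

    if (ftruncate(fd, size) == 0 && (copy = dup(fd)) != -1) {
        if (lseek(copy, seek_to, SEEK_SET) == seek_to)
            seen = lseek(fd, 0, SEEK_CUR);   /* query without moving */
        close(copy);
    }
    close(fd);
    return seen;
}
```

The original descriptor reports the position that was set through the
copy, which is exactly the situation that could pull the seek position out
from under the parser.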

It might break in the stream case as well, you'd have to follow the code
paths a bit to know for sure. For instance, even if creating the
seekable substream doesn't perform an underlying seek on the original
stream, setting it in a data wrapper might call camel_stream_reset()
which /might/ do an lseek() on the source fs stream.

Not an insurmountable problem to solve, but it does make things a little
more difficult and possibly touchy.

Jeff



On Sat, 2008-01-26 at 22:48 -0500, Jeffrey Stedfast wrote:
> On Sat, 2008-01-26 at 22:12 -0500, Jeffrey Stedfast wrote:
> > On Sat, 2008-01-26 at 13:44 +0100, Philip Van Hoof wrote:
> > > This is what happens if you try to open a truly large E-mail on a device
> > > that does not have much memory available:
> > > 
> > > Is there something we can do about this? Can we change the MIME parsing
> > > algorithm to be less memory-demanding, for example?
> > > 
> > > Note that GArray is not very sparing with memory once you start
> > > having a really large array. Perhaps we can instead change this to a
> > > normal pointer array of a fixed size (do we know the size before we
> > > start parsing, so that we can allocate an exact size instead, perhaps?)
> > 
> > eh, why would you change it to a GPtrArray? It doesn't hold pointers, it
> > holds message part content.
> > 
> > Unfortunately we don't know the size ahead of time.
> > 
> > I suppose you could use a custom byte array allocator so that you can
> > force it to grow by larger chunks or something, dunno.
> >
> >
> > The way GMime handles this is by not loading content into RAM, but 
> > that may be harder to do with Camel, especially in the mbox case.
> 
> er, I should probably explain this:
> 
> - writing the code should be relatively easy to do, but in the mbox
> case, the mbox may end up getting expunged or rewritten for some other
> reason which may cause problems, not sure how that would work.
> 
> I think in Maildir, as long as the fd remains open, the file won't
> actually disappear after an unlink() until the fd gets closed, so that
> might work out ok assuming you can spare the fd (which might be the
> other problem with Evolution?).
> 
> Jeff
> 
> 
Index: ChangeLog
===
--- ChangeLog	(revision 8425)
+++ ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2008-01-26  Jeffrey Stedfast  <[EMAIL PROTECTED]>
+
+	* camel-mime-part-utils.c (simple_data_wrapper_construct_from_parser):
+	If possible, keep the content on disk.
+
 2008-01-24  Matthew Barnes  <[EMAIL PROTECTED]>
 
 	* camel-object.c (camel_object_cast):
Index: camel-mime-part-utils.c
===
--- camel-mime-part-utils.c	(revision 8425)
+++ camel-mime-part-utils.c	(working copy)
@@ -57,25 +57,47 @@
 static void
 simple_data_wrapper_construct_from_parser (CamelDataWrapper *dw, CamelMimeParser *mp)
 {
+	GByteArray *buffer = NULL;
+	CamelStream *stream;
+	off_t start, end;
+	int fd = -1;
+	size_t len;
 	char *buf;
-	GByteArray *buffer;
-	CamelStream *mem;
-	size_t len;
-
+	
 	d(printf ("simple_data_wrapper_construct_from_parser()\n"));
-
-	/* read in the entire content */
-	buffer = g_byte_array_new ();
+	
+	if (!(stream = camel_mime_parser_stream (mp)))
+		fd = camel_mime_parser_fd (mp);
+	else if (!CAMEL_IS_SEEKABLE_SUBSTREAM (stream))
+		stream = NULL;
+	
+	if ((stream || fd != -1) && (start = camel_mime_parser_tell (mp)) != -1) {
+		/* we can keep content on disk */
+	} else {
+		/* need to load content into memory */
+		buffer = g_byte_array_new ();
+	}
+	
 	while (camel_mime_parser_step (mp, &buf, &len) != CAMEL_MIME_PARSER_STATE_BODY_END) {
-		d(printf("appending o/p data: %d: %.*s\n", len, len, buf));
-		g_byte_array_append (buffer, (guint8 *) buf, len);
+		if (buffer != NULL) {
+			d(printf("appending o/p data: %d: %.*s\n", len, len, buf));
+			g_byte_array_append (buffer, (guint8 *) buf, len);
+		}
 	}
-
-	d(printf("message part kept in memory!\n"));
-
-	m

Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm

2008-01-26 Thread Jeffrey Stedfast
On Sat, 2008-01-26 at 22:12 -0500, Jeffrey Stedfast wrote:
> On Sat, 2008-01-26 at 13:44 +0100, Philip Van Hoof wrote:
> > This is what happens if you try to open a truly large E-mail on a device
> > that does not have much memory available:
> > 
> > Is there something we can do about this? Can we change the MIME parsing
> > algorithm to be less memory-demanding, for example?
> > 
> > Note that GArray is not very sparing with memory once you start
> > having a really large array. Perhaps we can instead change this to a
> > normal pointer array of a fixed size (do we know the size before we
> > start parsing, so that we can allocate an exact size instead, perhaps?)
> 
> eh, why would you change it to a GPtrArray? It doesn't hold pointers, it
> holds message part content.
> 
> Unfortunately we don't know the size ahead of time.
> 
> I suppose you could use a custom byte array allocator so that you can
> force it to grow by larger chunks or something, dunno.
>
>
> The way GMime handles this is by not loading content into RAM, but 
> that may be harder to do with Camel, especially in the mbox case.

er, I should probably explain this:

- writing the code should be relatively easy to do, but in the mbox
case, the mbox may end up getting expunged or rewritten for some other
reason which may cause problems, not sure how that would work.

I think in Maildir, as long as the fd remains open, the file won't
actually disappear after an unlink() until the fd gets closed, so that
might work out ok assuming you can spare the fd (which might be the
other problem with Evolution?).
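
The unlink() behaviour described for Maildir can be checked with a few
POSIX calls. A minimal sketch, assuming a POSIX filesystem;
read_back_after_unlink and the temp-file template are hypothetical names,
and this is not how Camel's maildir provider is written.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes `msg` to a temp file, unlink()s the file while keeping the fd
 * open, then reads the data back through that fd into `out`.
 * Returns the number of bytes read back, or -1 on failure. */
ssize_t read_back_after_unlink(const char *msg, char *out, size_t outlen)
{
    char path[] = "/tmp/maildir-msg-XXXXXX";
    int fd, written_ok, unlinked_ok;
    ssize_t n = -1;
    size_t len;

    assert(msg != NULL && out != NULL);
    len = strlen(msg);

    if ((fd = mkstemp(path)) == -1)
        return -1;

    written_ok = write(fd, msg, len) == (ssize_t) len;
    unlinked_ok = unlink(path) == 0;   /* the name leaves the directory */

    /* the path no longer exists, yet the data is reachable via the fd */
    if (written_ok && unlinked_ok && access(path, F_OK) == -1)
        n = pread(fd, out, outlen, 0);

    close(fd);   /* only now can the storage be released */
    return n;
}
```

The cost is the one mentioned above: every message kept on disk this way
pins one file descriptor for as long as its content may be needed.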

Jeff




Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm

2008-01-26 Thread Jeffrey Stedfast
On Sat, 2008-01-26 at 13:44 +0100, Philip Van Hoof wrote:
> This is what happens if you try to open a truly large E-mail on a device
> that does not have much memory available:
> 
> Is there something we can do about this? Can we change the MIME parsing
> algorithm to be less memory-demanding, for example?
> 
> Note that GArray is not very sparing with memory once you start
> having a really large array. Perhaps we can instead change this to a
> normal pointer array of a fixed size (do we know the size before we
> start parsing, so that we can allocate an exact size instead, perhaps?)

eh, why would you change it to a GPtrArray? It doesn't hold pointers, it
holds message part content.

Unfortunately we don't know the size ahead of time.

I suppose you could use a custom byte array allocator so that you can
force it to grow by larger chunks or something, dunno.
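
A minimal sketch of what such a chunk-growing byte buffer could look like,
assuming plain malloc-family allocation. ChunkBuf and its functions are
hypothetical and are not GLib's GByteArray API. Growing in fixed steps
keeps the unused tail below one chunk, at the cost of more reallocations
for very large parts.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical chunk-growing byte buffer; not GLib's GByteArray. */
typedef struct {
    unsigned char *data;
    size_t len;     /* bytes in use */
    size_t cap;     /* bytes allocated, always a multiple of chunk */
    size_t chunk;   /* growth step in bytes */
} ChunkBuf;

void chunkbuf_init(ChunkBuf *b, size_t chunk)
{
    assert(b != NULL && chunk > 0);
    b->data = NULL;
    b->len = b->cap = 0;
    b->chunk = chunk;
}

/* Appends n bytes; returns 0 on success, or -1 if allocation fails
 * (the buffer is left unchanged in that case). */
int chunkbuf_append(ChunkBuf *b, const void *src, size_t n)
{
    size_t need;

    assert(b != NULL && (src != NULL || n == 0));
    if (n == 0)
        return 0;

    need = b->len + n;
    if (need > b->cap) {
        /* round the required size up to a whole number of chunks */
        size_t cap = ((need + b->chunk - 1) / b->chunk) * b->chunk;
        unsigned char *p = realloc(b->data, cap);

        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }
    memcpy(b->data + b->len, src, n);
    b->len = need;
    return 0;
}

void chunkbuf_free(ChunkBuf *b)
{
    assert(b != NULL);
    free(b->data);
    b->data = NULL;
    b->len = b->cap = 0;
}
```

Note that this only changes how the buffer grows; the whole part still
ends up in RAM, so it does not help with a message larger than the
available memory.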


The way GMime handles this is by not loading content into RAM, but that
may be harder to do with Camel, especially in the mbox case.
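
To make the keep-it-on-disk idea concrete, here is a rough sketch, assuming
the part's byte range in the file is known. DiskContent, disk_content_read
and make_demo_file are hypothetical names; this is neither GMime's nor
Camel's actual API. pread() is used because it takes an explicit offset and
so does not disturb the file position that a parser may be relying on.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical descriptor for part content left on disk: the bytes live
 * in [start, end) of the file behind fd. */
typedef struct {
    int fd;
    off_t start;
    off_t end;
} DiskContent;

/* Copies up to buflen bytes of the part, beginning `pos` bytes into the
 * part, into buf. Returns the bytes read, 0 at the end of the part, or
 * -1 on error. The fd's own file position is left untouched. */
ssize_t disk_content_read(const DiskContent *c, off_t pos, void *buf, size_t buflen)
{
    off_t left;

    assert(c != NULL && buf != NULL && pos >= 0);

    left = c->end - c->start - pos;
    if (left <= 0)
        return 0;
    if ((off_t) buflen > left)
        buflen = (size_t) left;   /* never read past the part's end */
    return pread(c->fd, buf, buflen, c->start + pos);
}

/* Demo helper: returns an fd for an already-unlinked temp file holding
 * the n bytes at `bytes`, or -1 on failure. */
int make_demo_file(const void *bytes, size_t n)
{
    char path[] = "/tmp/disk-content-XXXXXX";
    int fd = mkstemp(path);

    if (fd == -1)
        return -1;
    unlink(path);
    if (write(fd, bytes, n) != (ssize_t) n) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Only the small buffer passed to disk_content_read is ever resident, which
is the property that matters on a memory-constrained device. The mbox
caveat still applies: the offsets become stale if the file is rewritten.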


Jeff

