Re: [Evolution-hackers] ... and how camel should be

2006-02-14 Thread Philip Van Hoof
On Tue, 2006-02-14 at 10:47 +0530, Parthasarathi Susarla wrote:
 
  Will this branch have likewise functionality?
 
 The disk summaries branch does something along those lines, but it also
 means a lot more disk I/O than before, and only after prolonged
 testing/usage would we know how well it would work.

I have some ideas for reducing the disk I/O. Some of those might
influence the conceptual design, others will probably already have been
initiated by !z.

Keep me involved. I can't allocate huge amounts of time to it, as I have
both a girlfriend and a daytime job, but nevertheless I'm very
interested and planning to help.

  Difficult is fun.
  
 :) Am glad this discussion has begun.

Same.

-- 
Philip Van Hoof, software developer at x-tend 
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
work: vanhoof at x-tend dot be 
http://www.pvanhoof.be - http://www.x-tend.be

___
Evolution-hackers mailing list
Evolution-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/evolution-hackers


Re: [Evolution-hackers] ... and how camel should be

2006-02-14 Thread Lee Revell
On Tue, 2006-02-14 at 18:57 +0100, Philip Van Hoof wrote:
 On Tue, 2006-02-14 at 11:06 -0500, Jeffrey Stedfast wrote:
 
   Imagine spastic users that do nothing but scroll all day long at
   extremely rapid speeds .. multiply that with 10.000 such users, and you
   still wouldn't have any problems at all.
  
  Even for single-user, disk-summary-branch was slower than the current
  in-memory implementation.
 
 Which is of course nothing but pure logic. Memory will always be faster.
 But also more expensive. Using too much memory makes evolution less
 scalable.

I really don't think the message IDs are the main source of bloat in
Evo.

For starters, how about making glib use a sane thread stack size, like
POSIX says you should, rather than counting on the default to be sane?
Currently it defaults to RLIMIT_STACK which is usually 8MB per thread!
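
A minimal sketch of what requesting an explicit stack size looks like with
plain POSIX threads (not GLib, so that it stays self-contained). The helper
name spawn_with_stack and the 256 KB figure in the usage note are
assumptions made for illustration; nothing here is taken from the glib or
Evolution sources.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical helper: start a thread with an explicit stack size instead
 * of inheriting the process default. Returns 0 on success, or a pthread
 * error number. */
static int
spawn_with_stack (pthread_t *thread, size_t stack_bytes,
                  void *(*func) (void *), void *arg)
{
    pthread_attr_t attr;
    int err;

    assert (thread != NULL && func != NULL);

    err = pthread_attr_init (&attr);
    if (err != 0)
        return err;

    /* Fails with EINVAL if stack_bytes is below PTHREAD_STACK_MIN. */
    err = pthread_attr_setstacksize (&attr, stack_bytes);
    if (err == 0)
        err = pthread_create (thread, &attr, func, arg);

    pthread_attr_destroy (&attr);
    return err;
}

/* Trivial thread body used only to exercise the helper. */
static void *
store_answer (void *arg)
{
    *(int *) arg = 42;
    return NULL;
}
```

A caller would use spawn_with_stack (&thread, 256 * 1024, store_answer,
&result) followed by pthread_join (thread, NULL). Whether 256 KB is enough
depends entirely on what the thread does.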

Lee



Re: [Evolution-hackers] ... and how camel should be

2006-02-14 Thread Philip Van Hoof
On Tue, 2006-02-14 at 11:06 -0500, Jeffrey Stedfast wrote:

 Even for single-user, disk-summary-branch was slower than the current
 in-memory implementation.

Try this one:

Even for a single-user, running the entire system from disk was slower
than the current put-everything-in-a-ram-disk implementation.

Duh.


-- 
Philip Van Hoof, software developer at x-tend 
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
work: vanhoof at x-tend dot be 
http://www.pvanhoof.be - http://www.x-tend.be



Re: [Evolution-hackers] ... and how camel should be

2006-02-14 Thread Philip Van Hoof
On Tue, 2006-02-14 at 13:02 -0500, Lee Revell wrote:
 On Tue, 2006-02-14 at 18:57 +0100, Philip Van Hoof wrote:
 
 I really don't think the message IDs are the main source of bloat in
 Evo.

I measured it ;-)

10.000 E-mails used +-8MB of memory.

This is how I quickly measured it.

gint s = 0;

static void measure (gpointer data, gpointer user_data)
{
    s += strlen ((const char *) data);
}

GPtrArray *uids = camel_folder_get_uids (folder);
g_ptr_array_foreach (uids, measure, NULL);
g_print ("%d\n", s);

Luckily the format of those message ids is a small string version of the
follow-up number. So 0, 1, 2, .. . So it's a lot of small
strings. Which is of course better than a lot of message-id headers.

But if it's a follow-up, I wonder why not simply return the total amount
of messages in a folder, and let the developer use a simple loop like:

for (i = 0; i < that_length; i++)
{
    msg = camel_folder_get_message_info (folder, i);
}

I didn't measure the size of the CamelMessageInfo after doing
camel_folder_get_message_info (folder, data) on each of those messages.

Given the fact that evolution seems to simply load all the header
information, I fear that should also be added to the count.




-- 
Philip Van Hoof, software developer at x-tend 
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
work: vanhoof at x-tend dot be 
http://www.pvanhoof.be - http://www.x-tend.be



Re: [Evolution-hackers] ... and how camel should be

2006-02-14 Thread Philip Van Hoof
On Tue, 2006-02-14 at 14:03 -0500, Lee Revell wrote:

   I really don't think the message IDs are the main source of bloat in
   Evo.
  
  I measured it ;-)
  
  10.000 E-mails used +-8MB of memory.
 
 Do you think that's a problem?

For evolution, it might be a lesser problem than for camel. I'd like to
start using camel on mobile devices.


And more importantly, http://bugzilla.gnome.org/show_bug.cgi?id=331017
does not only describe the problem of camel_folder_get_uids being a
function that returns a lot of memory for large folders.

The more important issue is that it can't request the headers in a given
sort order. This makes it necessary to, when sorting a view, read every
single message header (the gtktreesortable stuff will do this behind
your back). Also the ones that weren't loaded yet.

Now read below, my theory about not having to load everything in memory
which I proved using tinymail as proof of concept. Please check it out
before thinking the theory is just a theory: I did successfully
implement this. It does work correctly.

So in case you sort your view, you do have to read everything into memory
just to read the message headers.

A camel_folder_get_uids that can return a sorted list wouldn't require
me to do that.
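
To illustrate the idea only: if the folder kept a small index holding
nothing but the uid and the sort key, a sorted uid list could be produced
without touching any full message header. The IndexEntry layout, the
date_received field, and the function names below are hypothetical; this is
plain C, not camel API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical index entry: just the uid and the one key we sort on. */
typedef struct {
    unsigned int uid;
    long         date_received;   /* seconds since the epoch */
} IndexEntry;

static int
compare_by_date (const void *a, const void *b)
{
    const IndexEntry *ea = a;
    const IndexEntry *eb = b;

    if (ea->date_received < eb->date_received)
        return -1;
    if (ea->date_received > eb->date_received)
        return 1;
    /* Fall back to the uid so equal dates still sort deterministically. */
    return (ea->uid > eb->uid) - (ea->uid < eb->uid);
}

/* Sort the index in place and copy the uids out in that order.
 * out must have room for n uids. */
static void
uids_sorted_by_date (IndexEntry *index, size_t n, unsigned int *out)
{
    size_t i;

    assert (index != NULL && out != NULL);

    qsort (index, n, sizeof (IndexEntry), compare_by_date);
    for (i = 0; i < n; i++)
        out[i] = index[i].uid;
}
```

The view would then ask for full header data only for the uids it is about
to draw.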


The theory of not having to load the CamelMessageInfo instances:

I didn't add the CamelMessageInfo instances to the count. Those will add
a significant number to the total amount of used memory. Far more than
the uids, as those headers are a lot longer in total length than the
uids.

However .. that can be fixed if evolution would call
camel_message_info_free on all the message_info instances that become
invisible.

I quickly studied evolution by searching for invocations on this _free
method, and it doesn't look like evolution is doing that.


A proof of concept of this theory can be found here:

https://svn.cronos.be/svn/tinymail/trunk/libtinymail-camel/tny-msg-header.c
https://svn.cronos.be/svn/tinymail/trunk/libtinymailui/tny-msg-header-list-model.c

Check the methods tny_msg_header_list_model_unref_node and
tny_msg_header_uncache in those files. Feel free to try out tinymail. I
did implement the theory and it does work. And I did measure it.
You can see the measurement happen as I added some debugging printfs
that will tell you when it's allocating and when it's deallocating a
CamelMessageInfo instance.

Try scrolling the view, you'll notice that it does both allocate and
deallocate. Those are the corresponding rows becoming visible and
invisible.

...

The CamelMessageInfo instances get allocated and deallocated accordingly.

Using my custom tree-model, I have very tight control over what I really
really need. And what I don't really need.

-- I don't need the invisible message headers. I simply don't.
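
The scheme described above can be sketched roughly as follows. This is not
the tinymail code: HeaderInfo is a stand-in for CamelMessageInfo, and
HeaderCache, N_ROWS and header_cache_set_visible are names invented for the
illustration. The point is only that the number of live header objects
tracks the visible window, not the folder size.

```c
#include <assert.h>
#include <stdlib.h>

#define N_ROWS 100   /* arbitrary folder size for the sketch */

/* Stand-in for a CamelMessageInfo; the real structure is not reproduced. */
typedef struct {
    int row;
} HeaderInfo;

typedef struct {
    HeaderInfo *rows[N_ROWS];   /* NULL means "not loaded" */
    int         live;           /* number of currently loaded rows */
} HeaderCache;

/* Load headers for rows first..last (inclusive) and free all the others. */
static void
header_cache_set_visible (HeaderCache *cache, int first, int last)
{
    int i;

    assert (cache != NULL && first >= 0 && last < N_ROWS && first <= last);

    for (i = 0; i < N_ROWS; i++) {
        int visible = (i >= first && i <= last);

        if (visible && cache->rows[i] == NULL) {
            /* Row scrolled into view: allocate its header. */
            HeaderInfo *info = malloc (sizeof *info);
            if (info == NULL)
                abort ();
            info->row = i;
            cache->rows[i] = info;
            cache->live++;
        } else if (!visible && cache->rows[i] != NULL) {
            /* Row scrolled out of view: release its header. */
            free (cache->rows[i]);
            cache->rows[i] = NULL;
            cache->live--;
        }
    }
}
```

Scrolling from rows 0..9 to rows 5..14 frees five headers and allocates
five new ones, so the live count stays at ten throughout.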


The fact that evolution does, IS probably a serious memory problem. I'll
spare you guys the measurement. For certain (large) folders, I fear it's
going to be huge.


 USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
 rlrevell 11659  5.4 37.0 260516 162700 ?        Sl   Feb13  66:07 evolution --component=mail

 Even if you're off by 4x, that's only 32MB of a 162MB footprint.  Surely
 we waste a lot more on the GUI...

You are measuring totally wrong. ps aux will not give you a correct
picture of the used memory, as a lot of mmap'ed shared libraries are
added to the count. Please don't use it.
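
One possible alternative, assuming a Linux kernel that provides
/proc/<pid>/smaps: sum the Private_Dirty (and Private_Clean) lines, which
leave out pages shared with other processes. The function name smaps_sum_kb
is made up for this sketch, and it works on any stream in that format
rather than opening /proc itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Sum every "<field> <n> kB" line of an smaps-style stream, in kB.
 * field includes the trailing colon, e.g. "Private_Dirty:". */
static unsigned long
smaps_sum_kb (FILE *stream, const char *field)
{
    char line[512];
    unsigned long total = 0;
    size_t field_len;

    assert (stream != NULL && field != NULL);
    field_len = strlen (field);

    while (fgets (line, sizeof line, stream) != NULL) {
        unsigned long kb;

        /* Only lines that start with the field name are counted. */
        if (strncmp (line, field, field_len) == 0 &&
            sscanf (line + field_len, "%lu", &kb) == 1)
            total += kb;
    }
    return total;
}
```

Comparing the Rss: total with the Private_Dirty: total for the same process
gives a rough idea of how much of the resident size is shared.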

 Personally I think Evo is already slow enough, and we cannot afford to
 trade off any speed to save memory.

So we just load everything in memory?! Why not also load all the e-mails
of all your folders into memory!! That will speed up evolution!

No it won't.

But it will slow down the rest of your computer.


-- 
Philip Van Hoof, software developer at x-tend 
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
work: vanhoof at x-tend dot be 
http://www.pvanhoof.be - http://www.x-tend.be



Re: [Evolution-hackers] ... and how camel should be

2006-02-14 Thread Philip Van Hoof
On Tue, 2006-02-14 at 21:15 +0100, Philip Van Hoof wrote:
 On Tue, 2006-02-14 at 14:03 -0500, Lee Revell wrote:

  Do you think that's a problem?
 
 For evolution, it might be a lesser problem than for camel. I'd like to
 start using camel on mobile devices.

Let me illustrate this specific case . . .

Imagine .. a mobile device has 50 MB of RAM (that's a lot for some devices).

Now your library has a function like this:

GPtrArray *give_ids (void);

It will return pointers to 7MB  of strings like this:

0, 1, 2, ... 1

Wouldn't it be better if I could simply do this then?

gint give_length (void);

for (int a = 0; a < give_length (); a++)
{
    /* or whatever fast implementation */
    gchar *id = g_strdup_printf ("%d", a);

    /* my stuff */

    g_free (id);
}

No? Because, that would save me a 7 MB allocation on a device that has
50 MB of RAM it REALLY wants to use for other purposes (believe me, if
on such a device you don't have to waste it like that, you DON'T waste
it like that).

Now .. I paid money for the memory in my desktop. Imagine EVERY
application wasting 7 MB of memory. At this moment I have 24 applications
running. If they would all start doing things like that only ONCE, it
would waste +- 168 MB of RAM. That's +- 15 euros.

Now let's take a look at the One Laptop Per Child project.

We can save some euros by fixing this flaw. We can make it possible to
give poor children a very good E-mail client that uses camel.

Is it still not worth fixing?


I strongly disagree.


-- 
Philip Van Hoof, software developer at x-tend 
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
work: vanhoof at x-tend dot be 
http://www.pvanhoof.be - http://www.x-tend.be



Re: [Evolution-hackers] ... and how camel should be

2006-02-14 Thread Lee Revell
On Tue, 2006-02-14 at 21:30 +0100, Philip Van Hoof wrote:
 We can save some euros by fixing this flaw. We can make it possible to
 give poor children a very good E-mail client that uses camel.
 
 Is it still not worth fixing?
 
 
 I strongly disagree.
 

Yes it is worth fixing, sorry for not reading your proposal thoroughly.

Lee



Re: [Evolution-hackers] ... and how camel should be

2006-02-14 Thread Jeffrey Stedfast
On Tue, 2006-02-14 at 19:40 +0100, Philip Van Hoof wrote:
 On Tue, 2006-02-14 at 13:02 -0500, Lee Revell wrote:
  On Tue, 2006-02-14 at 18:57 +0100, Philip Van Hoof wrote:
  
  I really don't think the message IDs are the main source of bloat in
  Evo.
 
 I measured it ;-)
 
 10.000 E-mails used +-8MB of memory.
 
 This is how I quickly measured it.
 
 gint s=0;
 
 static void measure (gpointer data, gpointer user_data)
 {
   s += strlen ((const char*)data);
 }
 
 GPtrArray *uids = camel_folder_get_uids (folder);
 g_ptr_array_foreach (uids, measure, NULL);
 g_print ("%d\n", s);
 
 Luckily the format of those message ids is a small string version of the
 follow-up number. So 0, 1, 2, .. . So it's a lot of small
 strings. Which is of course better than a lot of message-id headers.
 
 But if it's a follow-up, I wonder why not simply return the total amount
 of messages in a folder, and let the developer use a simple loop like:
 
 for (i = 0; i < that_length; i++)
 {
   msg = camel_folder_get_message_info (folder, i);
 }

we don't do that because the uids are not necessarily contiguous. You
might have 1, 3, 4, 5, 7, 109, 110

Refer to the IMAP specification for further info on how UIDs work. Hint:
They are NOT message-ids, nor are they sequence-ids.
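
A small sketch of why a counting loop breaks on such a folder. It is plain
C with numeric uids, and the function name is invented for the
illustration; it simply counts how many real uids fall inside the range
that "for (i = 0; i < n; i++)" would generate.

```c
#include <assert.h>
#include <stddef.h>

/* Count how many of the folder's real uids a counting loop
 * "for (i = 0; i < n; i++)" would actually visit.
 * Assumes the uids in the array are unique. */
static size_t
uids_reached_by_counting (const unsigned int *uids, size_t n)
{
    size_t i, reached = 0;

    assert (uids != NULL || n == 0);

    for (i = 0; i < n; i++) {
        /* The loop only generates the values 0 .. n-1. */
        if (uids[i] < n)
            reached++;
    }
    return reached;
}
```

For the seven uids 1, 3, 4, 5, 7, 109, 110 the loop generates 0 through 6,
so it reaches only 1, 3, 4 and 5 and never sees 7, 109 or 110.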

-- 
Jeffrey Stedfast
Evolution Hacker - Novell, Inc.
[EMAIL PROTECTED]  - www.novell.com



Re: [Evolution-hackers] ... and how camel should be

2006-02-14 Thread Jeffrey Stedfast
Have fun implementing this on your own. I guess you don't need my help.

On Tue, 2006-02-14 at 19:03 +0100, Philip Van Hoof wrote:
 On Tue, 2006-02-14 at 11:06 -0500, Jeffrey Stedfast wrote:
 
  Even for single-user, disk-summary-branch was slower than the current
  in-memory implementation.
 
 Try this one:
 
 Even for a single-user, running the entire system from disk was slower
 than the current put-everything-in-a-ram-disk implementation.
 
 Duh.
 
 
-- 
Jeffrey Stedfast
Evolution Hacker - Novell, Inc.
[EMAIL PROTECTED]  - www.novell.com



Re: [Evolution-hackers] ... and how camel should be

2006-02-14 Thread Jeffrey Stedfast
This example is total bollocks, you don't even understand how a uid is
used or what it even is. I've explained it to you a number of times now
and you still don't get it. UIDs are not guaranteed to be contiguous.

Jeff

On Tue, 2006-02-14 at 21:30 +0100, Philip Van Hoof wrote:
 On Tue, 2006-02-14 at 21:15 +0100, Philip Van Hoof wrote:
  On Tue, 2006-02-14 at 14:03 -0500, Lee Revell wrote:
 
   Do you think that's a problem?
  
  For evolution, it might be a lesser problem than for camel. I'd like to
  start using camel on mobile devices.
 
 Let me illustrate this specific case . . .
 
 Imagine .. a mobile device has 50 MB of RAM (that's a lot for some devices).
 
 Now your library has a function like this:
 
 GPtrArray *give_ids (void);
 
 It will return pointers to 7MB  of strings like this:
 
 0, 1, 2, ... 1
 
 Wouldn't it be better if I could simply do this then?
 
 gint give_length (void);
 
 for (int a = 0; a < give_length (); a++)
 {
   /* or whatever fast implementation */
   gchar *id = g_strdup_printf ("%d", a);
 
   /* my stuff */
 
   g_free (id);
 }
 
 No? Because, that would save me a 7 MB allocation on a device that has
 50 MB of RAM it REALLY wants to use for other purposes (believe me, if
 on such a device you don't have to waste it like that, you DON'T waste
 it like that).
 
 Now .. I paid money for the memory in my desktop. Imagine EVERY
 application wasting 7 MB of memory. At this moment I have 24 applications
 running. If they would all start doing things like that only ONCE, it
 would waste +- 168 MB of RAM. That's +- 15 euros.
 
 Now let's take a look at the One Laptop Per Child project.
 
 We can save some euros by fixing this flaw. We can make it possible to
 give poor children a very good E-mail client that uses camel.
 
 Is it still not worth fixing?
 
 
 I strongly disagree.
 
 
-- 
Jeffrey Stedfast
Evolution Hacker - Novell, Inc.
[EMAIL PROTECTED]  - www.novell.com



Re: [Evolution-hackers] ... and how camel should be

2006-02-14 Thread Philip Van Hoof
On Tue, 2006-02-14 at 15:36 -0500, Jeffrey Stedfast wrote:

 we don't do that because the uids are not necessarily contiguous. You
 might have 1, 3, 4, 5, 7, 109, 110

Which is why I proposed the iterator/cursor. Whether or not that will be
fast enough should be experimented with.

And again, the ids aren't the main problem I'm experiencing. The main
problem is that I can't receive the uids in a sorted order, and that I
can't specify in which header information I'm interested.
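
To make the iterator/cursor idea a bit more concrete, here is a rough
sketch. UidCursor, uid_cursor_init and uid_cursor_next are hypothetical
names, not camel API, and the cursor walks an in-memory array only to keep
the example self-contained; a real one might page through an on-disk index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cursor over a folder's uids. Here it walks an in-memory
 * array; a real one might page through an on-disk index instead. */
typedef struct {
    const unsigned int *uids;
    size_t              count;
    size_t              pos;
} UidCursor;

static void
uid_cursor_init (UidCursor *cursor, const unsigned int *uids, size_t count)
{
    assert (cursor != NULL && (uids != NULL || count == 0));
    cursor->uids = uids;
    cursor->count = count;
    cursor->pos = 0;
}

/* Store the next uid in *uid and return true, or return false at the end. */
static bool
uid_cursor_next (UidCursor *cursor, unsigned int *uid)
{
    assert (cursor != NULL && uid != NULL);
    if (cursor->pos >= cursor->count)
        return false;
    *uid = cursor->uids[cursor->pos++];
    return true;
}
```

The caller loops with while (uid_cursor_next (&cursor, &uid)) and never
needs to know whether the uids are contiguous, or hold all of them at once.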


-- 
Philip Van Hoof, software developer at x-tend 
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
work: vanhoof at x-tend dot be 
http://www.pvanhoof.be - http://www.x-tend.be



Re: [Evolution-hackers] ... and how camel should be

2006-02-14 Thread Philip Van Hoof
On Tue, 2006-02-14 at 19:40 +0100, Philip Van Hoof wrote:
 On Tue, 2006-02-14 at 13:02 -0500, Lee Revell wrote:

 I measured it ;-)
 
 10.000 E-mails used +-8MB of memory.

Correction. The uids don't use that much memory. You can more easily
estimate this by assuming that the largest uid is four digits plus the
terminating \0, so 5 bytes; multiplied by 10.000 that gives you 50.000
bytes.
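
The estimate can be checked with a few lines of C, assuming the uids really
are the decimal strings "0" up to "9999" (an assumption; real folders need
not look like that). The function name is made up for the sketch, and the
pointer array and allocator overhead are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Bytes needed to store the decimal strings "0" .. "n-1", each with its
 * terminating NUL. Pointer-array and allocator overhead are not counted. */
static size_t
uid_string_bytes (unsigned int n)
{
    size_t total = 0;
    unsigned int i;
    char buf[16];

    for (i = 0; i < n; i++) {
        /* snprintf returns the number of digits written. */
        int len = snprintf (buf, sizeof buf, "%u", i);
        assert (len > 0 && (size_t) len < sizeof buf);
        total += (size_t) len + 1;
    }
    return total;
}
```

For n = 10000 this gives 48890 bytes (38890 digits plus 10000 terminators),
just under the 50.000 byte upper bound.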

Nevertheless that doesn't take away the fact that you can't get the uids
in a sorted way, and that evolution loads all CamelMessageInfo instances.

By the way ... tinymail loads evolution's hackers mailing list using 3
megabytes (measured with valgrind).

The most important object to create a proxy instance for is the
CamelFolder.

Try to avoid keeping the references coming from camel_store_get_folder.

Because tinymail's design is heavily based on the proxy design pattern, I
managed to cut 12 megabytes of CamelFolder instances.


-- 
Philip Van Hoof, software developer at x-tend 
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
work: vanhoof at x-tend dot be 
http://www.pvanhoof.be - http://www.x-tend.be



Re: [Evolution-hackers] ... and how camel should be

2006-02-13 Thread Philip Van Hoof
On Mon, 2006-02-13 at 17:05 -0500, Dave Richards wrote:
 I added this comment to the bugzilla report.  I'm not a developer, but
 I do have to *deploy* Evolution to hundreds of people.
 
 
 (snip)-
 I have been watching this thread on evolution-hackers.  Please
 remember when considering design of these things, that some of us are
 running multi-user systems with hundreds of users on at a time.  There
 might be cases where memory is better than disks in such cases.  I
 have hundreds of users and can easily get all of this into 16GB.  What
 will be the result of all of those people now using the disk?

Why won't it slow down your evolution instances? Simple:

Your users typically don't need every single message header that can be
viewed using the summary view at every millisecond of the day. They
typically need 50 of them per second, perhaps 100 or 200 for insanely
extreme cases.

Their fingers can't scroll fast enough for a disk (that is largely
cached) to be too slow. Also note that the index which must be iterated
on that disk will be a continuous block of data. Such data can be read
almost instantly (how long does it take to read a few megabytes from a
modern disk? -- that would be the entire index, you only need let's say
100kb of that data per ten seconds IF the user is really fast).

If the users start to scroll lots of times through their summary view,
the Linux kernel will probably put some of these indexes in memory
buffers.

Imagine spastic users that do nothing but scroll all day long at
extremely rapid speeds .. multiply that with 10.000 such users, and you
still wouldn't have any problems at all.


-- 
Philip Van Hoof, software developer at x-tend 
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
work: vanhoof at x-tend dot be 
http://www.pvanhoof.be - http://www.x-tend.be



Re: [Evolution-hackers] ... and how camel should be

2006-02-13 Thread Parthasarathi Susarla
Hiya,

On Mon, 2006-02-13 at 22:38 +0100, Philip Van Hoof wrote:
 On Mon, 2006-02-13 at 15:10 -0500, Jeffrey Stedfast wrote:
  This would take several years to implement - likely to require complete
  rewrites of at least some of the providers.
  
  I don't really see this happening anytime soon.
 
 What about the disk summary branch?
Hmm... Yes, it exists. Michael wrote it, but it has never been tested or
worked on since then. I have just started some work on it - to get it
merged with the HEAD (once we branch for 2-14), and I shall keep you
posted here, whenever it happens.
The details of the branch are here
http://go-evolution.org/On-disk_summaries (written by NotZed)

 
 Will this branch have likewise functionality?

The disk summaries branch does something along those lines, but it also
means a lot more disk I/O than before, and only after prolonged
testing/usage would we know how well it would work.

 Difficult is fun.
 
:) Am glad this discussion has begun.

-partha