Re: [Evolution-hackers] Moving the struct instance heap space to mmap

2006-09-07 Thread Philip Van Hoof
On Thu, 2006-09-07 at 21:14 +0200, Philip Van Hoof wrote:

 typedef struct {
     CamelFolderSummary *fs;
     int nth;
     MemoryMessageInfo *m;
 } CamelMessageInfo;


 #define camel_message_info_from(x) \
     ((x)->m ? (x)->m->from : (x)->fs->sstart + *((x)->fs->istart \
      + (sizeof(int) * 4 * (x)->nth) + 1))
 #define camel_message_info_subject(x) \
     ((x)->m ? (x)->m->subject : (x)->fs->sstart + *((x)->fs->istart \
      + (sizeof(int) * 4 * (x)->nth) + 2))
 #define camel_message_info_to(x) \
     ((x)->m ? (x)->m->to : (x)->fs->sstart + *((x)->fs->istart \
      + (sizeof(int) * 4 * (x)->nth) + 3))

Note of course that once the offsets are stored in the mmapped index
file, you can have a lot more than just the from, subject and to
members at almost no cost (only a larger mmapped file). Today, by
contrast, every extra member makes the camel-message-info structure
grow beyond its current 120 bytes.

That is the core idea here: reduce the size of the camel-message-info
type so that each instance needed is relatively small. Each instance
will do a two-stage lookup in two mmapped files: once to find the
offset and once to get a pointer (at that offset) to the actual
data. Each instance that isn't yet written to disk has a normal memory
copy using the MemoryMessageInfo struct.
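
A minimal sketch of that two-stage lookup with the in-memory fallback, written as functions rather than macros. It is only an illustration under assumptions: the Summary, MemoryInfo and MessageInfo types, the field numbering (from = 0, subject = 1, to = 2) and the four-offsets-per-record layout are hypothetical stand-ins, not the real Camel structures, and plain buffers stand in for the two mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OFFSETS_PER_RECORD 4   /* assumed layout: from, subject, to, flags */

enum field { FIELD_FROM = 0, FIELD_SUBJECT = 1, FIELD_TO = 2 };

/* Stand-in for the two mappings; in the proposal both would come from mmap(). */
typedef struct {
    const unsigned char *istart;   /* index file: uint32_t offsets, 4 per record */
    const unsigned char *sstart;   /* string file: NUL-terminated strings */
} Summary;

/* In-memory copy used while a message is not yet written to disk. */
typedef struct {
    const char *from, *subject, *to;
} MemoryInfo;

typedef struct {
    const Summary *fs;
    int nth;
    const MemoryInfo *m;   /* NULL once the record lives in the mapped files */
} MessageInfo;

/* Stage 1: read the offset for (record, field) from the index mapping. */
static uint32_t
read_offset (const Summary *fs, int nth, enum field f)
{
    uint32_t off;
    size_t pos = sizeof (uint32_t)
        * ((size_t) OFFSETS_PER_RECORD * (size_t) nth + (size_t) f);

    memcpy (&off, fs->istart + pos, sizeof off);   /* safe for unaligned data */
    return off;
}

/* Stage 2: turn the offset into a pointer into the string mapping,
 * unless an in-memory copy exists, in which case that copy wins. */
static const char *
message_info_get (const MessageInfo *mi, enum field f)
{
    if (mi->m != NULL) {
        switch (f) {
        case FIELD_FROM:    return mi->m->from;
        case FIELD_SUBJECT: return mi->m->subject;
        case FIELD_TO:      return mi->m->to;
        }
    }
    assert (mi->fs != NULL);
    return (const char *) mi->fs->sstart + read_offset (mi->fs, mi->nth, f);
}
```

On a 32-bit build a MessageInfo like this is two pointers and an int; everything else stays in the mapped files until it is asked for.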

Instead of 120 bytes, this camel-message-info would consume 12 bytes
(aligned on four bytes, as on 32-bit x86). That's 10 times less per item.

ps. I'm indeed interested in your input/ideas on this.

-- 
Philip Van Hoof, software developer at x-tend 
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
work: vanhoof at x-tend dot be 
http://www.pvanhoof.be - http://www.x-tend.be

___
Evolution-hackers mailing list
Evolution-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/evolution-hackers


Re: [Evolution-hackers] Moving the struct instance heap space to mmap

2006-09-07 Thread Philip Van Hoof
On Thu, 2006-09-07 at 21:14 +0200, Philip Van Hoof wrote:

 To read message n, you would simply do something like:
 
 from = sstart + *(istart + (sizeof (int) * 4 * n) + 1)
 subject = sstart + *(istart + (sizeof (int) * 4 * n) + 2)
 to = sstart + *(istart + (sizeof (int) * 4 * n) + 3)
 flags = sstart + *(istart + (sizeof (int) * 4 * n) + 4)
 

No no no no no. I meant (something like) this of course:


unsigned char *sstart = (unsigned char *) mmap (..), *istart = (unsigned char *) mmap (..);
#define AMOUNT_OF_OFFSETS_PER_RECORD 4
#define AOO AMOUNT_OF_OFFSETS_PER_RECORD

/* Each record holds AOO 32-bit offsets; slot k of record n is read as a uint32_t */
from    = sstart + *(uint32_t *) (istart + (sizeof (uint32_t) * AOO * n) + (sizeof (uint32_t) * 0));
subject = sstart + *(uint32_t *) (istart + (sizeof (uint32_t) * AOO * n) + (sizeof (uint32_t) * 1));
to      = sstart + *(uint32_t *) (istart + (sizeof (uint32_t) * AOO * n) + (sizeof (uint32_t) * 2));
flags   = sstart + *(uint32_t *) (istart + (sizeof (uint32_t) * AOO * n) + (sizeof (uint32_t) * 3));

Too many pointers in my head ;)
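
For anyone who wants to try this against real files, here is a rough, self-contained sketch using POSIX mmap(). The Mapping type and the helper names map_file() and lookup() are made up for this example, and the file layout (an index file of four uint32_t offsets per record plus a file of NUL-terminated strings) is my reading of the proposal, not an existing Camel format. It adds the bounds checks that the one-liners above leave out.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define OFFSETS_PER_RECORD 4   /* assumed: from, subject, to, flags */

typedef struct {
    unsigned char *base;
    size_t len;
} Mapping;

/* Map a whole file read-only; returns 0 on success, -1 on failure. */
static int
map_file (const char *path, Mapping *out)
{
    struct stat st;
    void *p;
    int fd = open (path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat (fd, &st) < 0 || st.st_size == 0) {
        close (fd);
        return -1;
    }
    p = mmap (NULL, (size_t) st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close (fd);   /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;
    out->base = p;
    out->len = (size_t) st.st_size;
    return 0;
}

/* Return the string for slot k of record n, or NULL if out of range.
 * Assumes the strings file ends with a NUL byte. */
static const char *
lookup (const Mapping *index, const Mapping *strings, size_t n, size_t k)
{
    size_t pos = sizeof (uint32_t) * (OFFSETS_PER_RECORD * n + k);
    uint32_t off;

    if (k >= OFFSETS_PER_RECORD || pos + sizeof off > index->len)
        return NULL;
    memcpy (&off, index->base + pos, sizeof off);   /* avoids unaligned reads */
    if (off >= strings->len)
        return NULL;
    return (const char *) strings->base + off;
}
```

Reading the offset with memcpy() instead of casting the pointer sidesteps alignment problems; the offsets are in host byte order here, so such an index file would not be portable between machines of different endianness.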

-- 
Philip Van Hoof, software developer at x-tend 
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
work: vanhoof at x-tend dot be 
http://www.pvanhoof.be - http://www.x-tend.be



Re: [Evolution-hackers] Moving the struct instance heap space to mmap

2006-09-07 Thread Federico Mena Quintero
On Thu, 2006-09-07 at 21:14 +0200, Philip Van Hoof wrote:

 The *new*/*extra* idea is to create a second index file which contains
 the offsets to the pointers in the camel summary file. Then mmap also
 that file. Extra because the idea will build on top of the existing

Ummm, but this won't reduce your working set by very much, will it?

I haven't looked at the details, but can't you just keep an array in
memory with pointers to the *start* of each summary block, and then
compute the other pointers on demand?

Sort of

gpointer *pointers_to_message_summaries;
gpointer base;

base = mmap (summary);

pointers_to_message_summaries = g_new (gpointer, num_messages);

pointers_to_message_summaries[0] = base;
pointers_to_message_summaries[1] = ...;

and then a set of functions

gpointer
gimme_ptr_to_subject_for_message_number (int n)
{
   /* cast first: arithmetic on a void pointer is a GNU extension */
   return (guchar *) pointers_to_message_summaries[n] + SUBJECT_OFFSET;
}

etc.  Then, instead of 120 bytes per message, you'd have just
sizeof(pointer) and a little more code to compute the offsets on demand.
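
Spelled out a bit more, the idea could look like the sketch below. The record layout is invented purely so the example is self-contained (a uint32_t record length, then a fixed 16-byte subject field, then variable-length trailing data); the real summary format is different, and RecordTable, record_table_init() and subject_for_message() are hypothetical names. Plain malloc() stands in for g_new() so it compiles without GLib.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record layout, for illustration only. */
#define SUBJECT_OFFSET  4    /* subject field follows the length word */
#define SUBJECT_SIZE    16
#define HEADER_SIZE     (SUBJECT_OFFSET + SUBJECT_SIZE)

typedef struct {
    const unsigned char **records;   /* one pointer per message */
    size_t count;
} RecordTable;

/* Walk the mapped summary once and remember where each record starts.
 * Returns 0 if all num_messages records were found, -1 otherwise. */
static int
record_table_init (RecordTable *t, const unsigned char *base, size_t size,
                   size_t num_messages)
{
    size_t pos = 0, i;

    t->count = 0;
    t->records = NULL;
    if (num_messages == 0)
        return 0;
    t->records = malloc (num_messages * sizeof *t->records);
    if (t->records == NULL)
        return -1;
    for (i = 0; i < num_messages; i++) {
        uint32_t len;

        if (size - pos < HEADER_SIZE)
            break;                           /* truncated file */
        memcpy (&len, base + pos, sizeof len);
        if (len < HEADER_SIZE || len > size - pos)
            break;                           /* corrupt length */
        t->records[i] = base + pos;
        t->count++;
        pos += len;
    }
    return t->count == num_messages ? 0 : -1;
}

/* Compute the field pointer on demand instead of storing it per message. */
static const char *
subject_for_message (const RecordTable *t, size_t n)
{
    if (n >= t->count)
        return NULL;
    return (const char *) t->records[n] + SUBJECT_OFFSET;
}

static void
record_table_free (RecordTable *t)
{
    free (t->records);
    t->records = NULL;
    t->count = 0;
}
```

The per-message cost in memory is then one pointer; the price is a single pass over the summary at load time to find the record starts.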

  Federico



Re: [Evolution-hackers] Moving the struct instance heap space to mmap

2006-09-07 Thread Michael Zucchi
Hmm, will it really reduce memory usage though? I'm not so sure. Remember that although the summary file contains all strings on disk, in memory they are actually unique-ized. For a lot of load types (e.g. mailing lists), this saves a tremendous amount of memory, even after all the overhead of the hashtable to manage it. Address and subject strings in particular. And an mmap'd file isn't terribly different in performance from malloc'd memory: once the latter is set up, in both cases the kernel is the one loading it off disk (swap, or filesystem).

(An mmap'd file IS just memory, so if mmap'd file size + in-memory support tables > in-memory version, then you haven't generally won, even if 'ps' shows you have.)

Note also that all of the summary items, and all of the strings thus stored, are using special allocators which significantly reduce the overhead of malloc on these small allocations.

In earlier versions I tried a different string allocator for each messageinfo which mimics much of what you're doing here, but in memory. Strings were stored in an array structure, but I only stored a single pointer to the base of the array, packed the strings into a NUL-separated right-sized buffer, and searched each time I needed to access them. Overhead was only about 5 bytes (pointer to the 'array' plus an array 'limit'). Performance was fine, but memory usage was significantly larger than the current model, where strings are 'unique-ized': like 30% or more, iirc.

Having extra indices on disk: you're really just writing a database table, with all of the associated complexity, consistency and coherency issues that implies. And unless you do something 'proper' like a btree with a unique key, you're going to end up storing plenty of extra support stuff in memory anyway, e.g. a hashtable to find items by uid, the sorted list of messages for the view, and so on.

We did a lot to reduce per-message overhead at the camel level, but it's an area of diminishing returns: the ui table view will still take as much memory, etc.
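
To make the 'unique-ized' strings point concrete, here is a generic string-interning sketch. It is not Camel's actual code: StringPool, pool_intern() and pool_free() are hypothetical names, and the fixed-size chained hash table is just the simplest thing that shows the effect. Every message that carries the same address or subject ends up sharing one copy in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 1024

typedef struct Entry {
    struct Entry *next;
    unsigned refs;       /* how many callers asked for this string */
    char str[];          /* the single shared copy */
} Entry;

typedef struct {
    Entry *buckets[BUCKETS];
    size_t unique;       /* number of distinct strings stored */
} StringPool;

/* djb2 string hash */
static unsigned long
hash_string (const char *s)
{
    unsigned long h = 5381;

    while (*s != '\0')
        h = h * 33 + (unsigned char) *s++;
    return h;
}

/* Return the shared copy of s, adding it to the pool on first use.
 * Returns NULL only if allocation fails. */
static const char *
pool_intern (StringPool *p, const char *s)
{
    unsigned long b = hash_string (s) % BUCKETS;
    size_t len = strlen (s) + 1;
    Entry *e;

    for (e = p->buckets[b]; e != NULL; e = e->next) {
        if (strcmp (e->str, s) == 0) {
            e->refs++;
            return e->str;
        }
    }
    e = malloc (sizeof *e + len);
    if (e == NULL)
        return NULL;
    memcpy (e->str, s, len);
    e->refs = 1;
    e->next = p->buckets[b];
    p->buckets[b] = e;
    p->unique++;
    return e->str;
}

static void
pool_free (StringPool *p)
{
    size_t i;

    for (i = 0; i < BUCKETS; i++) {
        Entry *e = p->buckets[i];

        while (e != NULL) {
            Entry *next = e->next;
            free (e);
            e = next;
        }
        p->buckets[i] = NULL;
    }
    p->unique = 0;
}
```

With a mailing-list folder, thousands of messages share the same list address and "Re: ..." subjects, so the number of stored strings tracks the number of distinct values rather than the number of messages. An mmap'd summary that keeps every string verbatim gives that sharing up.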
