Hi there hackers from all over the world,

I have an idea to move the heap memory used by the CamelMessageInfoBase struct instances in TnyCamelHeader into an mmap. It does happen that I code my ideas into implementations, and this is likely going to become such an idea. The problem is that it has been in my head for too long :)

The problem with these struct allocations is that they consume around 120 bytes each. Even with the mmap patch I still had to, for each summary item (which is a lot like a header, as most people receiving this mail already know), malloc a sizeof (CamelMessageInfoBase), assign the pointers (from, subject, to) to locations in the mmap memory area, and so on. (The mmap patch is indeed extremely simple, like most concepts and ideas in programming.)

The *new*/*extra* idea is to create a second index file which contains the offsets of those fields in the camel summary file, and then to mmap that file as well. "Extra" because the idea builds on top of the existing and working mmap idea. It's not yet in Evolution HEAD, but that's not because the patch isn't working. It is working: I've been running it for two months without a single crash or regression.

I'm not yet planning to put this in a Camel that will also work with Evolution. My test will indeed be tinymail. Evolution might not be designed for this. It will probably be possible, after some adaptations, to get Evolution running with this new idea. Changing Evolution once the idea is implemented would indeed save a large amount of memory. Feel free to calculate how much: you can easily make a theoretical calculation of this, and the saving is as significant as your calculation tells you if Evolution is maintaining a lot of messages. Add to that calculation the heap administration needed per malloc call for a specific size (which might be somewhat reduced by gslice-like magazine allocators) and you might get an idea why Evolution is eating your memory .. today.

The reality of both this new/extra idea and the mmap patch is that the amount of memory is not really reduced. HOWEVER, the memory will be mmapped, so the kernel manages it: it pages the data in when you need it and pages it out again when you don't. And the application developer behind Evolution doesn't have to care about this.

Cheap! But effective.

For example "sorting": because the qsort algorithm doesn't take large steps (it's a recursive algorithm that compares nearby values in a list), a kernel that mmaps using four-kilobyte pages will not have to page in often when this happens. The page-ins will mostly be sequential. Also note that the very nature of a mailbox is that it's ALMOST sorted the way most people want it viewed already, IF you append or prepend new messages (don't insert them at a random location). Riddle me this: the mbox format is append, and the summary format of Evolution is also append. No problem here.

The summary file of camel will be generated by the camel_folder_refresh method, which still consumes way too much memory. I would need to redesign Camel for that not to consume as much memory. I'm waiting for the crazy fuckers that are going to implement the disksummary ideas (I repeat .. with me, once you ghost guys start with it, because I will join and put some of my energy in this).

My *new*/*extra* idea (each [bracketed] cell is 32 bits, and 0 here = '\0'):

    [istart] = mmap ("~/fs/index")   | [sstart] = mmap ("~/fs/summary")
    ---------------------------------+-----------------------------------
    [from_o][sub_o][to_o][flgs] ...  | ...
    [from_o][sub_o][to_o][flgs] ...  | ...
    [from_o][sub_o][to_o][flgs]- + sstart -> [0x00 0x00 0x00 0x04]
       |       |      `--------- + sstart -> [Piet][ <[EMAIL PROTECTED]>0]
       |       `---------------- + sstart -> [Yeah][ yea][h000]
       `------------------------ + sstart -> [Hans][ <[EMAIL PROTECTED]>0]
    [from_o][sub_o][to_o][flgs]- + sstart -> [0x00 0x00 0x00 0x03]
       |       |      `--------- + sstart -> [Hans][ <[EMAIL PROTECTED]>0]
       |       `---------------- + sstart -> [Oehh][ooee][h000]
       `------------------------ + sstart -> [Piet][ <[EMAIL PROTECTED]>0]
    ...                                       ...
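
To make the layout in the diagram concrete, here is a minimal sketch that builds the two areas in plain memory buffers instead of mmapped files. Everything named demo_* or DemoFolder is made up for this illustration and is not Camel API, the buffer sizes are arbitrary, and the sample names are invented. It follows the diagram in storing the flags word in the summary area and referencing it by offset, which is an assumption about the final format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical demo types; these stand in for the two mmap()ed files. */
enum { FIELDS_PER_RECORD = 4 };      /* from_o, sub_o, to_o, flgs */

typedef struct {
    char     summary[256];           /* stand-in for the summary file */
    uint32_t index[64];              /* stand-in for the index file */
    uint32_t summary_len;            /* bytes used in summary */
    uint32_t count;                  /* messages written so far */
} DemoFolder;

/* Append a string with its NUL, padded with NULs to a multiple of
 * four bytes, and return the offset at which it starts. */
static uint32_t
demo_append_string (DemoFolder *f, const char *s)
{
    uint32_t offset = f->summary_len;
    uint32_t len = (uint32_t) strlen (s) + 1;
    uint32_t padded = (len + 3u) & ~3u;

    assert (offset + padded <= sizeof f->summary);
    memset (f->summary + offset, 0, padded);
    memcpy (f->summary + offset, s, len);
    f->summary_len += padded;
    return offset;
}

/* Append one 32-bit flags word and return its offset. */
static uint32_t
demo_append_flags (DemoFolder *f, uint32_t flags)
{
    uint32_t offset = f->summary_len;

    assert (offset + sizeof flags <= sizeof f->summary);
    memcpy (f->summary + offset, &flags, sizeof flags);
    f->summary_len += (uint32_t) sizeof flags;
    return offset;
}

/* Write one message: data goes to the summary, offsets to the index. */
static void
demo_append_message (DemoFolder *f, const char *from, const char *subject,
                     const char *to, uint32_t flags)
{
    uint32_t *rec = f->index + FIELDS_PER_RECORD * f->count;

    assert ((f->count + 1) * FIELDS_PER_RECORD
            <= sizeof f->index / sizeof f->index[0]);
    rec[0] = demo_append_string (f, from);
    rec[1] = demo_append_string (f, subject);
    rec[2] = demo_append_string (f, to);
    rec[3] = demo_append_flags (f, flags);
    f->count++;
}

/* Resolve field 0..3 of message n to an address in the summary. */
static const char *
demo_field (const DemoFolder *f, uint32_t n, uint32_t field)
{
    return f->summary + f->index[FIELDS_PER_RECORD * n + field];
}

/* Read the flags word of message n. */
static uint32_t
demo_flags (const DemoFolder *f, uint32_t n)
{
    uint32_t flags;

    memcpy (&flags, demo_field (f, n, 3), sizeof flags);
    return flags;
}
```

With a zeroed DemoFolder f, appending ("Hans", "Yeah yeah", "Piet", 4) and then ("Piet", "Oehhooeeh", "Hans", 3) gives the two records from the diagram; demo_field (&f, 0, 1) then points at "Yeah yeah" and demo_flags (&f, 1) is 3. No pointer is stored anywhere, only offsets, so the buffers could be moved (or the files remapped) without fixing anything up.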

To read message n, you would simply do something like this (treating istart as an array of 32-bit integers, four per message):

    from    = sstart + istart[4 * n + 0]
    subject = sstart + istart[4 * n + 1]
    to      = sstart + istart[4 * n + 2]
    flags   = *(int *) (sstart + istart[4 * n + 3])

Maybe I made a little mistake in calculating the offsets here, maybe I need to think some more about the pointer arithmetic before posting bullshit. But .. anyway, that's the idea.

The idea is indeed to rewrite the index file as often as needed, and to keep the summary file as static as possible. For example, local message flags (like "Important" and "Read or Unread") go in that index file, and setting them will rewrite the entire index file followed by a mremap (or a munmap and mmap on platforms that don't support mremap).

When messages get added, the plan is to mremap both the summary and index file BUT ONLY after a bunch of them are received AND at the end of the camel_folder_refresh (or whatever internal thing in Camel). Because the defines below don't store pointers but compute the address from "nth" on every access, this will indeed work as long as the "nth" variables in the structs are correct (cool, eh .. this solves the current problem of the mmap patch Jeffrey, the one about new messages).

My plan is to make sure that both istart and sstart are written with four-byte-aligned information (strings are padded with NULs to align on the fourth byte). In istart there will only be 32-bit integers, which keeps it aligned on four bytes.

So:

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    typedef struct {
        const char     *sstart;   /* start of the mmapped summary file */
        const uint32_t *istart;   /* start of the mmapped index file */
    } CamelFolderSummary;

    typedef struct {
        char *from, *subject, *to;
        int flags;
    } MemoryMessageInfo;

    typedef struct {
        CamelFolderSummary *fs;
        int nth;
        MemoryMessageInfo *m;     /* NULL unless there is an in-memory copy */
    } CamelMessageInfo;

    /* No space between the macro name and (x), or it is not a function-like macro */
    #define camel_message_info_from(x) \
        ((x)->m ? (x)->m->from : (x)->fs->sstart + (x)->fs->istart[4 * (x)->nth + 0])
    #define camel_message_info_subject(x) \
        ((x)->m ? (x)->m->subject : (x)->fs->sstart + (x)->fs->istart[4 * (x)->nth + 1])
    #define camel_message_info_to(x) \
        ((x)->m ? (x)->m->to : (x)->fs->sstart + (x)->fs->istart[4 * (x)->nth + 2])

    /* Map a whole file read-only; returns NULL on failure */
    static void *
    map_file (const char *path)
    {
        struct stat st;
        void *p;
        int fd = open (path, O_RDONLY);

        if (fd < 0)
            return NULL;
        if (fstat (fd, &st) < 0 || st.st_size == 0) {
            close (fd);
            return NULL;
        }
        p = mmap (NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        close (fd);               /* the mapping stays valid after close */
        return p == MAP_FAILED ? NULL : p;
    }

    CamelFolderSummary *
    camel_folder_summary_new (const char *index_file, const char *summary_file)
    {
        CamelFolderSummary *fs = malloc (sizeof *fs);

        if (fs == NULL)
            return NULL;
        fs->istart = map_file (index_file);
        fs->sstart = map_file (summary_file);
        return fs;
    }

    CamelMessageInfo *
    camel_message_info_get (CamelFolderSummary *fs, int nth)
    {
        CamelMessageInfo *i = malloc (sizeof *i);

        if (i == NULL)
            return NULL;
        i->fs = fs;
        i->nth = nth;
        i->m = NULL;
        return i;
    }

-- 
Philip Van Hoof, software developer at x-tend
home: me at pvanhoof dot be
gnome: pvanhoof at gnome dot org
work: vanhoof at x-tend dot be
http://www.pvanhoof.be - http://www.x-tend.be

_______________________________________________
Evolution-hackers mailing list
Evolution-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/evolution-hackers