Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On Wed, 2009-12-16 at 19:56 -0500, Jeffrey Stedfast wrote:
> Does it really crash? It used to just regenerate the summary files.

Yes, on out of memory, as it tries to allocate a very large memory block due to misreading items.

> > b) you cannot just drop it and regenerate, because it holds some information for local providers, like labels, tags and such.
>
> The point of the version info was so that you could do things like:
>
>     if (summary->version < CAMEL_64BIT_SUMMARY_VERSION) {
>         off_t offset;
>
>         camel_file_utils_load_off_t (file, &offset);
>         info->offset = offset;
>     } else {
>         camel_file_utils_load_int64 (file, &info->offset);
>     }
>
> If you do this, then you don't actually lose any information.

I do not think the above will work together with defaulting to the LARGEFILE compile flag, but the other way would, like defaulting to load_int32 for older summaries and reading off_t for new ones. I'm wondering whether some distro has largefile support enabled these days; if so, then deciding what to use as the off_t size isn't that easy. Maybe they have, or it's enough to compile under 64 bit to have the size changed. I didn't try this; I'm only guessing from the reported issues.

Milan
___
Evolution-hackers mailing list
Evolution-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/evolution-hackers
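[Editor's sketch] The version check discussed above could look roughly like the following. It is only an illustration under stated assumptions: older summaries are assumed to hold a 4-byte big-endian offset and newer ones an 8-byte value (following Milan's "load_int32 for older summaries" variant), and the names SUMMARY_VERSION_64BIT, load_be and load_offset are hypothetical, not Camel API. The point is that the on-disk width is chosen by the stored version and never by sizeof(off_t).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical version number at which offsets became 64-bit on disk. */
#define SUMMARY_VERSION_64BIT 2

/* Read `nbytes` big-endian bytes into *out; 0 on success, -1 on short read. */
static int load_be(FILE *f, int nbytes, uint64_t *out)
{
    uint64_t v = 0;
    for (int i = 0; i < nbytes; i++) {
        int c = fgetc(f);
        if (c == EOF)
            return -1;
        v = (v << 8) | (uint64_t)(unsigned char)c;
    }
    *out = v;
    return 0;
}

/* Load one message offset; the width depends only on the summary version,
 * so the result is the same on 32-bit and 64-bit builds. */
int load_offset(FILE *f, uint32_t version, int64_t *offset)
{
    assert(f != NULL && offset != NULL);

    uint64_t raw;
    int width = (version < SUMMARY_VERSION_64BIT) ? 4 : 8;

    if (load_be(f, width, &raw) != 0)
        return -1;
    *offset = (int64_t)raw;
    return 0;
}
```

A caller would read the version from the summary header first and pass it to load_offset() for every record, rewriting the file in the new width only after a successful load.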
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On Thu, 2009-12-17 at 17:11 +0530, chen wrote:
> So I think we can conclude the way to go as maildir. We will also have a preference option for sharing, which will be disabled by default for the local folders in evolution.

As waiters and waitresses in the US would say on such an occasion: excellent choice, sir ;-)

Thanks for bringing this up and taking the feedback into account.

--
Bye, Patrick Ohly
--
patrick.o...@gmx.de
http://www.estamos.de/
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
chenthill wrote:
> On Wed, 2009-12-16 at 09:56 -0500, Jeffrey Stedfast wrote:
> > On 12/15/2009 02:46 PM, Chenthill wrote:
> > > Hi fellow hackers!!
> > > I have been working for a while during last week on one of the blockers in evolution - https://bugzilla.gnome.org/show_bug.cgi?id=550414 - 'Folder and summary mismatch error' (old one - https://bugzilla.gnome.org/show_bug.cgi?id=213072). As a matter of fact we have been working as a team to get the blockers down. I have not been able to reproduce the issue or yet find the exact problematic area. The mismatch in the frompos index in the folder summary may be caused by either a threading issue or a crash while storing the indexes. I am still investigating it to find the real cause.
> >
> > I don't think it's a threading or crash issue.
>
> Looking through the comments from both the bugs and the fixes that have gone through, I had this thought. Any other clues?

I'd look into the situation where the user expunges a folder. When the mbox gets rewritten, maybe the from_offset values aren't updated or something to reflect the new offset.

That's all I can think of at the moment. I'm sure this avenue has probably been explored before, but maybe something got missed.

> Thinking about the amount of time this bug has been there for (primarily with our mbox implementation), I thought of making some change which could benefit more rather than trying to just fix this. This might not be an ideal way to think though :)

Well, either way the bug should be fixed. Switching to Maildir is arguably a good choice regardless, though.

Jeff
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On Wed, 2009-12-16 at 16:54 +0530, Srinivasa Ragavan wrote:
> > I definitely won't switch away from maildir as my format of choice because it integrates nicely with offlineimap.
>
> Sure, I think users should have that freedom. Camel's local folder implementation has that built in. This new approach should be the default for new users, and as option for users to migrate to it for existing users. If users willingly stay with maildir or 1mbox-per-folder that should also be there.

I don't really see the point of inventing a new file-per-message format when maildir already exists, is already implemented in evolution (albeit buggily), and is a very popular format. NIH seems a bit pointless really.

Ross
--
Ross Burton
mail: r...@burtonini.com
jabber: r...@burtonini.com
www: http://burtonini.com
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On Wed, 2009-12-16 at 16:54 +0530, Srinivasa Ragavan wrote:
> On Wed, Dec 16, 2009 at 4:46 PM, Patrick Ohly patrick.o...@gmx.de wrote:
> > On Wed, 2009-12-16 at 09:19 +0530, Chenthill wrote:
> > > On Tue, 2009-12-15 at 15:09 -0500, Reid Thompson wrote:
> > > > On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
> > > > > * Not able to create subfolders under INBOX - https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
> > > >
> > > > I hadn't noticed the above, so I guess it's a non-issue for me. What is the second issue?
> > >
> > > Sorry missed to mention it here, with maildir we would need to rename files for unread/read flag changes which can be avoided in the later approach.
> >
> > So you expect renaming a file to be slower than rewriting the whole file content? Somehow my gut feeling says that it will be the other way around. But I don't have hard data, of course.
>
> I feel it will be slower compared to the other approach. You don't rewrite the file entirely at all in normal usage.

Setting mail flags was mentioned as the reason for not using maildir. Adding a mail flag to an mbox mail requires rewriting the whole file. Or do you assume that you can overwrite just some bytes in an existing mail header? That will still lead to writing a complete sector to disk, in contrast to renaming a file, which I expect to be implemented more intelligently by the file system.

Actually, writing a micro-benchmark for this is doable. Before you seriously consider investing effort into this, I'd really prefer to see some hard data for a rename vs. rewrite comparison.

> May be when you expunge folder or export it, the summary data could be updated with the mail's mbox. But its debatable at some level, I would say.

We are debating the merits of the actual mail storage, not the summary data. I have wiped out folders.db often enough that I won't use Evolution when it switches to storing valuable, unrecoverable information like the "mail was read" flag there.

> > I definitely won't switch away from maildir as my format of choice because it integrates nicely with offlineimap.
>
> Sure, I think users should have that freedom. Camel's local folder implementation has that built in. This new approach should be the default for new users, and as option for users to migrate to it for existing users. If users willingly stay with maildir or 1mbox-per-folder that should also be there.

In case it wasn't obvious, I don't see the point of diverting resources away from an established format in favor of something new. "It's mbox" doesn't count; you would have to write the complete directory tree handling from scratch.

Of course, it is your time. I'm just expressing my concerns as a user of the somewhat neglected maildir format.

--
Bye, Patrick Ohly
--
patrick.o...@gmx.de
http://www.estamos.de/
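[Editor's sketch] The rename vs. rewrite micro-benchmark asked for above might be started along these lines. This is a rough sketch, not a measurement: the function names and file names are hypothetical, the "rewrite" case only flips one byte in place in a 4 KiB file, and no fsync() is issued, so the numbers mostly reflect system-call and page-cache cost rather than what reaches the disk. Adding fsync() calls, or running it on different file systems, would be needed before drawing conclusions.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static double elapsed(struct timespec a, struct timespec b)
{
    return (double)(b.tv_sec - a.tv_sec) + (double)(b.tv_nsec - a.tv_nsec) / 1e9;
}

/* Toggle a maildir-style ":2,S" suffix `iters` times by renaming.
 * Returns elapsed seconds, or -1.0 on error. */
double bench_rename(const char *dir, int iters)
{
    assert(dir != NULL && iters > 0);

    char a[512], b[512];
    if (snprintf(a, sizeof a, "%s/bench_msg", dir) >= (int)sizeof a ||
        snprintf(b, sizeof b, "%s/bench_msg:2,S", dir) >= (int)sizeof b)
        return -1.0;

    int fd = open(a, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;
    close(fd);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++) {
        /* Even iterations mark the message as seen, odd ones undo it. */
        if (rename(i % 2 ? b : a, i % 2 ? a : b) != 0) {
            unlink(a);
            unlink(b);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    unlink(iters % 2 ? b : a);
    return elapsed(t0, t1);
}

/* Flip one status byte in place `iters` times in a 4 KiB file.
 * Returns elapsed seconds, or -1.0 on error. */
double bench_overwrite(const char *dir, int iters)
{
    assert(dir != NULL && iters > 0);

    char path[512];
    char block[4096] = {0};
    if (snprintf(path, sizeof path, "%s/bench_msg.mbox", dir) >= (int)sizeof path)
        return -1.0;

    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0)
        return -1.0;
    if (write(fd, block, sizeof block) != (ssize_t)sizeof block) {
        close(fd);
        unlink(path);
        return -1.0;
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++) {
        char flag = (i % 2) ? 'R' : 'O';
        if (pwrite(fd, &flag, 1, 0) != 1) {
            close(fd);
            unlink(path);
            return -1.0;
        }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    unlink(path);
    return elapsed(t0, t1);
}
```

Calling both functions with the same directory and iteration count, for example bench_rename("/tmp", 100000) and bench_overwrite("/tmp", 100000), gives two timings that can be compared on the file system of interest.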
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On Wed, 2009-12-16 at 17:34 +0530, Srinivasa Ragavan wrote:
> Really, we aren't inventing a new format. Its mbox, but organized a bit differently, like how some providers store, (Exchange, GW, (IMAP4 ?) store.

Perhaps a naive question, but does it have to be stored as mbox? Could we not just store the raw message content and skip the From_ line (and especially From quoting), since the only purpose it serves is to delimit multiple messages in the same file?

With a file-per-message approach there's no need for a delimiter line, and an mbox can be constructed trivially by passing the content through a CamelMimeFilterFrom. Right?

Matthew Barnes
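[Editor's sketch] For readers unfamiliar with From quoting: when several messages share one mbox file, any body line that starts with "From " has to be escaped so it is not mistaken for a message delimiter. The snippet below illustrates the classic scheme of prefixing such lines with '>'. It is a standalone illustration, not the CamelMimeFilterFrom implementation, and the function name mbox_quote_from is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of `msg` in which every line starting with
 * "From " is prefixed with '>'. The caller frees the result.
 * Returns NULL if allocation fails. */
char *mbox_quote_from(const char *msg)
{
    assert(msg != NULL);

    size_t len = strlen(msg);
    /* Each quoted line grows by one byte and is at least five bytes long,
     * so 2 * len + 1 is a safe upper bound. */
    char *out = malloc(2 * len + 1);
    if (out == NULL)
        return NULL;

    size_t o = 0;
    int at_line_start = 1;
    for (size_t i = 0; i < len; i++) {
        if (at_line_start && strncmp(msg + i, "From ", 5) == 0)
            out[o++] = '>';
        out[o++] = msg[i];
        at_line_start = (msg[i] == '\n');
    }
    out[o] = '\0';
    return out;
}
```

With one file per message, the stored content can stay unquoted and a transformation like this is only needed when exporting to a multi-message mbox.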
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On Wed, 2009-12-16 at 16:54 +0530, Srinivasa Ragavan wrote:
> On Wed, Dec 16, 2009 at 4:46 PM, Patrick Ohly patrick.o...@gmx.de wrote:
> > On Wed, 2009-12-16 at 09:19 +0530, Chenthill wrote:
> > > On Tue, 2009-12-15 at 15:09 -0500, Reid Thompson wrote:
> > > > On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
> > > > > * Not able to create subfolders under INBOX - https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
> > > >
> > > > I hadn't noticed the above, so I guess it's a non-issue for me. What is the second issue?
> > >
> > > Sorry missed to mention it here, with maildir we would need to rename files for unread/read flag changes which can be avoided in the later approach.
> >
> > So you expect renaming a file to be slower than rewriting the whole file content? Somehow my gut feeling says that it will be the other way around. But I don't have hard data, of course.
>
> I feel it will be slower compared to the other approach. You don't rewrite the file entirely at all in normal usage. May be when you expunge folder or export it, the summary data could be updated with the mail's mbox. But its debatable at some level, I would say.

I don't think the rename triggers a rewrite of the file. It isn't a costly operation. But I just wonder why we need to do that at all. Could it be costly in distributed environments? (Not sure how significant this case would be for us.)

I also came across another issue: even if we start using the maildir format, we cannot assume that multiple applications would access the data, especially since local folders belong to evolution and would be used frequently. (See https://bugzilla.gnome.org/show_bug.cgi?id=592310 )

> > I definitely won't switch away from maildir as my format of choice because it integrates nicely with offlineimap.
>
> Sure, I think users should have that freedom. Camel's local folder implementation has that built in. This new approach should be the default for new users, and as option for users to migrate to it for existing users. If users willingly stay with maildir or 1mbox-per-folder that should also be there.

Looking at the information gathered, I am favoring Approach #2 - mboxfile-per-mail. I will start the work this week if I don't see any reasons to change the approach. Just want to put in the best possible solution :)

- Chenthill.

> -Srini
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On Wed, 2009-12-16 at 07:50 -0500, Matthew Barnes wrote:
> On Wed, 2009-12-16 at 13:18 +0100, Patrick Ohly wrote:
> > We are debating the merits of the actual mail storage, not the summary data. I have wiped out folders.db often enough that I won't use Evolution when it switches to storing valuable, unrecoverable information like the "mail was read" flag there.
>
> Valuable, unrecoverable message meta-data (flags, tags, labels, etc.) should really be split off as a separate database: folders.db and, say, metadata.db.

Perhaps it is just me or the way SQLite has been used so far in Evolution, but given the past experience with folders.db, I also have doubts about the reliability of such a metadata.db. I'd prefer a standard format that can be accessed by other tools.

But you are right, in general separating the different kinds of data into different physical files is certainly an improvement over the current situation.

--
Bye, Patrick Ohly
--
patrick.o...@gmx.de
http://www.estamos.de/
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
> One advantage which I see with #1 is that its a standard way.

One thing about both approaches is that they will consume more space; e.g. on my 'Sent' folder with 21k messages - on average (on ext3) we will chew ~2k of space for each of these, which is ~40MB - around 2%. For my cvs commits mail archive, perhaps the worst case, of 350MB (22334 mails) - this would also be around 43MB - at ~12%.

That's not as bad as I was worried about; though of course there is some overhead in terms of inodes and directory entries to worry about that will crank up the overall size - but it doesn't seem horrible even on ext3.

Of course - ext4 / btrfs will do a much better job here too; so - less to worry about in future.

HTH,

Michael.

--
michael.me...@novell.com , Pseudo Engineer, itinerant idiot
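[Editor's sketch] The ~2k-per-message figure above is what one would expect if every file is rounded up to a whole file-system block and the unused tail averages about half a block. The helper below computes that tail for a single file. The 4 KiB block size used in the example is an assumption (it is a common ext3 setting, but the message does not state it), and the function name slack_bytes is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes wasted in the last block when a file of `file_size` bytes is stored
 * on a file system that allocates whole blocks of `block_size` bytes. */
uint64_t slack_bytes(uint64_t file_size, uint64_t block_size)
{
    assert(block_size > 0);

    uint64_t rem = file_size % block_size;
    return rem == 0 ? 0 : block_size - rem;
}
```

Under that assumption, a 5000-byte message wastes 3192 bytes of its second 4096-byte block, and 21,000 messages at an average of 2 KiB each come to roughly 41 MiB, in line with the ~40MB estimate above.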
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
Reid Thompson wrote:
> Patrick Ohly wrote:
> > On Wed, 2009-12-16 at 18:55 +0530, Srinivasa Ragavan wrote:
> > > Maildir is good, none denies it. But maildir is already there, but not sure how many use it.
> >
> > I do, and I know several other people who do
>
> +1
> The local default mbox account on my system is empty. I filter some of the email from my Exchange OWA acct to maildir. I process my isp and gmail mail into maildir accts (along with some other small accounts).

Another reason I vote to use/'fix' maildir: working from home this morning, rather than pipe evo over the network I usually just ssh into my desktop and fire up mutt (already configured to know where the evo local directories are, as well as imap to the exchange server). With this setup I can move emails wherever needed between local/server accounts fairly easily. I also tend to use this method if I have to log in while on vacation or traveling.
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
Matthew Barnes wrote:
> On Wed, 2009-12-16 at 09:56 -0500, Jeffrey Stedfast wrote:
> > This just means the proper LARGEFILE flags are not being used at compile time. Either EDS's configure isn't doing proper checks or else Evolution itself isn't doing proper checks and there is some sort of clash.
> >
> > An easy way to fix this is to do what I did with GMime, which is to simply make all public stream APIs that use off_t use goffset instead (I'm sure Matthew will want to do this anyway). Then the problem is much simpler to solve - just make sure that Camel uses the proper CFLAGS for LARGEFILE support (which you can steal from GMime's configure scripts).
>
> IIRC, the issue is LARGEFILE support is still disabled by default, and

Ah. I'd still suggest switching over to goffset rather than using off_t in the public API (and for internal state on indexes and wherever else).

> there was concern that simply turning it on would somehow break existing installs. I'm fuzzy on the details, but vaguely recall it being about a field size in some binary file being dependent on sizeof(off_t), which would change with LARGEFILE support enabled and thus break the binary format.

The summary files would have had this problem, but they would have just been regenerated, so not really an issue.

> Unfortunately I don't remember which file that was an issue in. It may have already been addressed by the move to a summary database, or I may just be propagating false rumors.

There may have been other files, but the summary files would definitely have been affected (though it would not have been problematic).

Jeff
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
Jeffrey Stedfast wrote:
> Matthew Barnes wrote:
> > there was concern that simply turning it on would somehow break existing installs. I'm fuzzy on the details, but vaguely recall it being about a field size in some binary file being dependent on sizeof(off_t), which would change with LARGEFILE support enabled and thus break the binary format.
>
> The summary files would have had this problem, but they would have just been regenerated, so not really an issue.

Also, this is why the summary files had versioning info.

Jeff
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On Wed, 2009-12-16 at 11:35 -0500, Jeffrey Stedfast wrote:
> The summary files would have had this problem, but they would have just been regenerated, so not really an issue.

Hi,

a) it's similar to moving from a 32bit to a 64bit architecture or the other way; evo crashes in these situations, because the version is fine, but it doesn't know whether the previous machine was 32bit or 64bit, i.e. whether it should do some translation or not. (And doing the translation is not that simple where functions rely on sizeof(...); not a problem with db-summary, but it still might be with indexes and store summaries, I didn't check that.)

b) you cannot just drop it and regenerate, because it holds some information for local providers, like labels, tags and such.

Silly reasons, but that's why I think those bugs are still open (no numbers handy).

Bye,
Milan
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On 12/16/09 5:18 AM, Patrick Ohly wrote:
> > I feel it will be slower compared to the other approach. You don't rewrite the file entirely at all in normal usage.
>
> Setting mail flags was mentioned as the reason for not using maildir. Adding a mail flag to an mbox mail requires rewriting the whole file. Or do you assume that you can overwrite just some bytes in an existing mail header? That will still lead to writing a complete sector to disk, in contrast to renaming a file, which I expect to be implemented more intelligently by the file system.
>
> Actually, writing a micro-benchmark for this is doable. Before you seriously consider investing effort into this, I'd really prefer to see some hard data for a rename vs. rewrite comparison.

Ever since the early controversy in ext4 over lost data in KDE configuration files, file renames force an ordered journal commit in ext4. So file rename is more expensive than you may think.

This is of course still cheaper than fsync, if Evolution makes a habit of fsync'ing its email files.

--
Zan Lynx
zl...@acm.org
Knowledge is Power. Power Corrupts. Study Hard. Be Evil.
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On 12/16/2009 02:40 PM, Milan Crha wrote:
> On Wed, 2009-12-16 at 11:35 -0500, Jeffrey Stedfast wrote:
> > The summary files would have had this problem, but they would have just been regenerated, so not really an issue.
>
> Hi,
> a) it's similar to moving from a 32bit to a 64bit architecture or the other way; evo crashes in these situations, because the version is fine, but it doesn't know whether the previous machine was 32bit or 64bit, i.e. whether it should do some translation or not. (And doing the translation is not that simple where functions rely on sizeof(...); not a problem with db-summary, but it still might be with indexes and store summaries, I didn't check that.)

Does it really crash? It used to just regenerate the summary files.

> b) you cannot just drop it and regenerate, because it holds some information for local providers, like labels, tags and such.

The point of the version info was so that you could do things like:

    if (summary->version < CAMEL_64BIT_SUMMARY_VERSION) {
        off_t offset;

        camel_file_utils_load_off_t (file, &offset);
        info->offset = offset;
    } else {
        camel_file_utils_load_int64 (file, &info->offset);
    }

If you do this, then you don't actually lose any information.

Jeff
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
Hello everyone,

On Wed, Dec 16, 2009 at 1:16 AM, Chenthill pchenth...@novell.com wrote:
> Hi fellow hackers!!
> I have been working for a while during last week on one of the blockers in evolution - https://bugzilla.gnome.org/show_bug.cgi?id=550414 - 'Folder and summary mismatch error' (old one - https://bugzilla.gnome.org/show_bug.cgi?id=213072). As a matter of fact we have been working as a team to get the blockers down. I have not been able to reproduce the issue or yet find the exact problematic area. The mismatch in the frompos index in the folder summary may be caused by either a threading issue or a crash while storing the indexes. I am still investigating it to find the real cause.
>
> Looking at other issues such as https://bugzilla.gnome.org/show_bug.cgi?id=522433 - 'Fails opening mbox 2GB', I just got a thought if we could solve both the issues by,
>
> Approach #1, migrating local storage from mbox to maildir format. With maildir I have heard about two issues,
> * Not able to create subfolders under INBOX - https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
>
> Approach #2, Migrate from a single mbox file per folder to mbox per email. Srini mentioned an advantage that this would avoid the file renames that maildir does. I think this is much like how other remote providers in evo store the email.

It should be rocket fast! Expunge is just unlinking one file. A change of flags etc. rewrites just that file when upsync happens. No rewriting a 2GB file to expunge 10 mails. Startup/shutdown is faster, etc. I'm a fan of this, whatsoever! To cope with 100K mbox files in one folder, distribute them under multi-level subdirs and let the summary know that.

> I thought of bringing this to this list to gather more opinions to choose the right one. The approach #2 seems a better one as we are choosing a way for storing the messages internally in evo. Are we missing to see anything while we choose the second one ? One advantage which I see with #1 is that it's a standard way.

You still store everything as mbox only. But one mbox file per mail, distributed in multiple subdirs. Of course you don't want 100K files in one folder, which could move the bottleneck to a different place. Distribute them under multi-level subfolders or something like that.

-Srini
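[Editor's sketch] One way to spread per-message files over multi-level subdirectories, as suggested above, is to derive the directory names from the message uid (elsewhere in the thread the uid is described as a 32-bit unsigned integer). The layout below, <root>/<low byte>/<next byte>/<uid>, and the function name uid_to_path are purely hypothetical; the thread does not specify a scheme.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Build "<root>/<aa>/<bb>/<uid>" where aa and bb are the two low-order bytes
 * of the uid in hex. Sequential uids therefore rotate through the first-level
 * directories instead of piling up in one. Returns 0 on success, -1 if the
 * buffer is too small. */
int uid_to_path(const char *root, uint32_t uid, char *buf, size_t buflen)
{
    assert(root != NULL && buf != NULL);

    int n = snprintf(buf, buflen, "%s/%02x/%02x/%" PRIu32,
                     root,
                     (unsigned)(uid & 0xff),
                     (unsigned)((uid >> 8) & 0xff),
                     uid);
    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

Because the path is a pure function of the uid, the summary only needs to store the uid to locate a message; the number of levels and the bucket width are tuning choices.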
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
> Approach #1, migrating local storage from mbox to maildir format. With maildir I have heard about two issues,
> * Not able to create subfolders under INBOX - https://bugzilla.gnome.org/show_bug.cgi?id=536240 .

I hadn't noticed the above, so I guess it's a non-issue for me.
What is the second issue?

> Approach #2, Migrate from a single mbox file per folder to mbox per email. Srini mentioned an advantage that this would avoid the file renames that maildir does. I think this is much like how other remote providers in evo store the email.

You still have a filename per email, right? What naming convention would be used?

> I thought of bringing this to this list to gather more opinions to choose the right one. The approach #2 seems a better one as we are choosing a way for storing the messages internally in evo. Are we missing to see anything while we choose the second one ? One advantage which I see with #1 is that it's a standard way.

Would Evo provide a mechanism to migrate/convert a mailbox/folder with this format to a mailbox/folder with a standard format?
I.e., currently, dragging a folder in one format to an account in a different format performs the proper migration to the new account's folder format.
Or would it be up to the user to do something like

    for file in *mbox; do
        movemail "$file" maildir:///tmp/maildr
    done

if they wanted to migrate to a standard format?

> Thanks, Chenthill.
Re: [Evolution-hackers] Moving from the single mbox file format for the local folders
On Tue, 2009-12-15 at 15:09 -0500, Reid Thompson wrote:
> On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
> > Approach #1, migrating local storage from mbox to maildir format. With maildir I have heard about two issues,
> > * Not able to create subfolders under INBOX - https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
>
> I hadn't noticed the above, so I guess it's a non-issue for me.
> What is the second issue?

Sorry, I missed mentioning it here: with maildir we would need to rename files for unread/read flag changes, which can be avoided in the latter approach.

> > Approach #2, Migrate from a single mbox file per folder to mbox per email. Srini mentioned an advantage that this would avoid the file renames that maildir does. I think this is much like how other remote providers in evo store the email.
>
> You still have a filename per email, right?

Yes.

> What naming convention would be used?

We would be using the uid of the message. The uid would be a 32 bit unsigned integer.

> > I thought of bringing this to this list to gather more opinions to choose the right one. The approach #2 seems a better one as we are choosing a way for storing the messages internally in evo. Are we missing to see anything while we choose the second one ? One advantage which I see with #1 is that it's a standard way.
>
> Would Evo provide a mechanism to migrate/convert a mailbox/folder with this format to a mailbox/folder with a standard format?

Yes, we would provide a UI option for exporting data. While going with the second approach, I would prefer exporting in maildir format.

> I.e., currently, dragging a folder in one format to an account in a different format performs the proper migration to the new account's folder format.
> Or would it be up to the user to do something like
>
>     for file in *mbox; do
>         movemail "$file" maildir:///tmp/maildr
>     done
>
> if they wanted to migrate to a standard format?

It would be the same behavior as before with the mbox-file-per-email if we choose to do that.

- Chenthill.

> > Thanks, Chenthill.