Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Ross Burton
On Wed, 2009-12-16 at 16:54 +0530, Srinivasa Ragavan wrote:
  I definitely won't switch away from maildir as my format of choice
  because it integrates nicely with offlineimap.
 
 Sure, I think users should have that freedom. Camel's local folder
 implementation has that built in. This new approach should be the
 default for new users, and an option for existing users to migrate to.
 If users willingly stay with maildir or
 1mbox-per-folder that should also be there.

I don't really see the point of inventing a new file-per-message format
when maildir already exists, is already implemented in evolution (albeit
buggily), and is a very popular format.  NIH seems a bit pointless
really.

Ross
-- 
Ross Burton mail: r...@burtonini.com
  jabber: r...@burtonini.com
   www: http://burtonini.com


___
Evolution-hackers mailing list
Evolution-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/evolution-hackers


Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Patrick Ohly
On Wed, 2009-12-16 at 16:54 +0530, Srinivasa Ragavan wrote:
 On Wed, Dec 16, 2009 at 4:46 PM, Patrick Ohly patrick.o...@gmx.de wrote:
  On Wed, 2009-12-16 at 09:19 +0530, Chenthill wrote:
  On Tue, 2009-12-15 at 15:09 -0500, Reid Thompson wrote:
   On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
* Not able to create subfolders under INBOX -
https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
   I hadn't noticed the above, so I guess it's a non-issue for me
  
   What is the second issue?
  Sorry, I forgot to mention it here: with maildir we would need to rename
  files for unread/read flag changes, which can be avoided in the latter
  approach.
 
  So you expect renaming a file to be slower than rewriting the whole file
  content? Somehow my gut feeling says that it will be the other way
  around. But I don't have hard data, of course.
 
 I feel it will be slower compared to the other approach. You don't
 rewrite the file entirely at all in normal usage.

Setting mail flags was mentioned as the reason for not using maildir.
Adding a mail flag to an mbox mail requires rewriting the whole file. Or
do you assume that you can overwrite just some bytes in an existing mail
header?

That will still lead to writing a complete sector to disk, in contrast
to renaming a file which I expect to be implemented more intelligently
by the file system. Actually, writing a micro-benchmark for this is
doable. Before you seriously consider investing effort into this, I'd
really prefer to see some hard data for a rename vs. rewrite
comparison.
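A minimal sketch of what such a micro-benchmark could look like, assuming a POSIX system. The function names (`flag_by_overwrite`, `flag_by_rename`, `bench_flags`) are made up for illustration and are not Camel code. It times a number of renames against the same number of one-byte in-place overwrites. Note that nothing here calls fsync, so it mostly measures the VFS and page cache rather than what reaches the disk; a serious comparison would have to account for that.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Overwrite one status byte in place at a known offset (header-style update). */
int flag_by_overwrite(const char *path, off_t offset, char flag)
{
    int fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    ssize_t n = pwrite(fd, &flag, 1, offset);
    int rc = close(fd);
    return (n == 1 && rc == 0) ? 0 : -1;
}

/* Change flags the maildir way: rename the file, contents untouched. */
int flag_by_rename(const char *from, const char *to)
{
    return rename(from, to);
}

static double seconds_since(const struct timespec *t0)
{
    struct timespec t1;
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0->tv_sec)
         + (double)(t1.tv_nsec - t0->tv_nsec) / 1e9;
}

/* Time `rounds` renames between names a and b, then `rounds` one-byte
 * overwrites at `offset` in a. The file must exist as `a` on entry and
 * is left as `a` on success. Returns 0 on success, -1 on error. */
int bench_flags(const char *a, const char *b, off_t offset, int rounds,
                double *rename_s, double *overwrite_s)
{
    struct timespec t0;
    const char *cur = a, *other = b;

    assert(rounds > 0);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < rounds; i++) {
        if (flag_by_rename(cur, other) != 0)
            return -1;
        const char *tmp = cur;
        cur = other;
        other = tmp;
    }
    *rename_s = seconds_since(&t0);

    /* Put the file back under its original name if needed. */
    if (cur != a && rename(cur, a) != 0)
        return -1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < rounds; i++) {
        if (flag_by_overwrite(a, offset, (i & 1) ? 'R' : 'O') != 0)
            return -1;
    }
    *overwrite_s = seconds_since(&t0);
    return 0;
}
```

Running `bench_flags` with a few thousand rounds on the file system in question, and again with an fsync added after each operation, would give the kind of hard data asked for above.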

 Maybe when you
 expunge a folder or export it, the summary data could be updated with
 the mail's mbox. But it's debatable at some level, I would say.

We are debating the merits of the actual mail storage, not the summary
data. I have wiped out folders.db often enough that I won't use
Evolution when it switches to storing valuable, unrecoverable
information like the "mail was read" flag there.

  I definitely won't switch away from maildir as my format of choice
  because it integrates nicely with offlineimap.
 
 Sure, I think users should have that freedom. Camel's local folder
 implementation has that built in. This new approach should be the
 default for new users, and an option for existing users to migrate to.
 If users willingly stay with maildir or
 1mbox-per-folder that should also be there.

In case it wasn't obvious, I don't see the point of diverting resources
away from an established format in favor of something new. "It's mbox"
doesn't count; you would have to write the complete directory tree
handling from scratch.

Of course, it is your time. I'm just expressing my concerns as a user of
the somewhat neglected maildir format.

-- 
Bye, Patrick Ohly
--  
patrick.o...@gmx.de
http://www.estamos.de/




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Matthew Barnes
On Wed, 2009-12-16 at 17:34 +0530, Srinivasa Ragavan wrote:
 Really, we aren't inventing a new format. It's mbox, but organized a
 bit differently, like how some providers (Exchange, GW, IMAP4?) store
 mail.

Perhaps a naive question, but does it have to be stored as mbox?  Could
we not just store the raw message content and skip the From_ line (and
especially From quoting), since the only purpose it serves is to
delimit multiple messages in the same file?

With a file-per-message approach there's no need for a delimiter line,
and an mbox can be constructed trivially by passing the content through
a CamelMimeFilterFrom.  Right?
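To make the From quoting concrete, here is a small sketch of the simple quoting rule (prefix `>` to any body line that begins with `From `). The helper `mbox_quote_from` is hypothetical and is not how CamelMimeFilterFrom is implemented; it only illustrates the transformation that becomes unnecessary once each message lives in its own file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy msg into out, prefixing '>' to every line that starts with "From ".
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if out is too small. */
long mbox_quote_from(const char *msg, char *out, size_t outsz)
{
    size_t o = 0;
    int at_line_start = 1;

    assert(msg != NULL && out != NULL);

    for (const char *p = msg; *p != '\0'; p++) {
        if (at_line_start && strncmp(p, "From ", 5) == 0) {
            if (o + 1 >= outsz)
                return -1;
            out[o++] = '>';
        }
        if (o + 1 >= outsz)
            return -1;
        out[o++] = *p;
        at_line_start = (*p == '\n');
    }
    if (o >= outsz)
        return -1;
    out[o] = '\0';
    return (long)o;
}
```

Building an mbox from per-message files would then amount to writing a `From ` separator line, followed by the quoted message, for each file in turn.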

Matthew Barnes



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread chen
On Wed, 2009-12-16 at 16:54 +0530, Srinivasa Ragavan wrote:
 On Wed, Dec 16, 2009 at 4:46 PM, Patrick Ohly patrick.o...@gmx.de wrote:
  On Wed, 2009-12-16 at 09:19 +0530, Chenthill wrote:
  On Tue, 2009-12-15 at 15:09 -0500, Reid Thompson wrote:
   On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
* Not able to create subfolders under INBOX -
https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
   I hadn't noticed the above, so I guess it's a non-issue for me
  
   What is the second issue?
  Sorry, I forgot to mention it here: with maildir we would need to rename
  files for unread/read flag changes, which can be avoided in the latter
  approach.
 
  So you expect renaming a file to be slower than rewriting the whole file
  content? Somehow my gut feeling says that it will be the other way
  around. But I don't have hard data, of course.
 
 I feel it will be slower compared to the other approach. You don't
 rewrite the file entirely at all in normal usage. Maybe when you
 expunge a folder or export it, the summary data could be updated with
 the mail's mbox. But it's debatable at some level, I would say.
I don't think the rename triggers a rewrite of the file. It isn't a costly
operation. But I just wonder why we need to do that at all. Could it be
costly in distributed environments? (Not sure how significant this case
would be for us.)

I also came across another issue: even if we start using the maildir format,
we cannot assume that multiple applications would access the data,
especially since local folders belong to Evolution and would be used
frequently. (See https://bugzilla.gnome.org/show_bug.cgi?id=592310 )

 
 
  I definitely won't switch away from maildir as my format of choice
  because it integrates nicely with offlineimap.
 
 Sure, I think users should have that freedom. Camel's local folder
 implementation has that built in. This new approach should be the
 default for new users, and an option for existing users to migrate to.
 If users willingly stay with maildir or
 1mbox-per-folder that should also be there.
Looking at the information gathered, I am favoring Approach #2 -
mboxfile-per-mail. I will start the work this week unless I see
a reason to change the approach. Just want to put in the best
possible solution :)

- Chenthill.
 
 -Srini




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Patrick Ohly
On Wed, 2009-12-16 at 07:50 -0500, Matthew Barnes wrote:
 On Wed, 2009-12-16 at 13:18 +0100, Patrick Ohly wrote:
  We are debating the merits of the actual mail storage, not the summary
  data. I have wiped out folders.db often enough that I won't use
  Evolution when it switches to storing valuable, unrecoverable
  information like the "mail was read" flag there.
 
 Valuable, unrecoverable message meta-data (flags, tags, labels, etc.)
 should really be split off as a separate database: folders.db and, say,
 metadata.db.

Perhaps it is just me or the way SQLite has been used so far in
Evolution, but given the past experience with folders.db, I also have
doubts about the reliability of such a metadata.db. I'd prefer a
standard format that can be accessed by other tools.

But you are right, in general separating the different kinds of data
into different physical files is certainly an improvement over the
current situation.

-- 
Bye, Patrick Ohly
--  
patrick.o...@gmx.de
http://www.estamos.de/




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Michael Meeks

On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
 One advantage which I see with #1 is that its a standard way.

One thing about both approaches is that they will consume more space;
e.g. on my 'Sent' folder with 21k messages - on average (on ext3) we will
chew ~2k of space for each of these, which is ~40MB - around 2%.

For my cvs commits mail archive, perhaps the worst case, of 350MB
(22334 mails) - this would also be around 43MB - at ~12%.
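The arithmetic behind these estimates can be sketched as follows. The 4096-byte block size is an assumption (the message only states ~2k per file, which is what half a block of that size comes to), and the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes allocated but unused when a file of `size` bytes is stored
 * in whole blocks of `block_size` bytes. */
uint64_t slack_bytes(uint64_t size, uint64_t block_size)
{
    assert(block_size > 0);
    uint64_t rem = size % block_size;
    return rem == 0 ? 0 : block_size - rem;
}

/* Expected total slack for `count` files, assuming that on average
 * half a block is wasted per file. */
uint64_t expected_slack(uint64_t count, uint64_t block_size)
{
    assert(block_size > 0);
    return count * (block_size / 2);
}
```

With 4096-byte blocks, `expected_slack(21000, 4096)` gives 43008000 bytes and `expected_slack(22334, 4096)` gives 45740032 bytes, which is in line with the ~40MB and ~43MB figures quoted above. Inode and directory entry overhead is not included.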

That's not as bad as I was worried about; though of course there is
some overhead in terms of inodes and directory entries to worry about
that will crank up the overall size - but it doesn't seem horrible even
on ext3.

Of course - ext4 / btrfs will do a much better job here too; so - less
to worry about in future.

HTH,

Michael.

-- 
 michael.me...@novell.com  , Pseudo Engineer, itinerant idiot




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Reid Thompson

Reid Thompson wrote:

Patrick Ohly wrote:

On Wed, 2009-12-16 at 18:55 +0530, Srinivasa Ragavan wrote:

Maildir is good, no one denies it. Maildir is already there, but I'm not
sure how many use it.


I do, and I know several other people who do

+1
The local default mbox account on my system is empty.
I filter some of the email from my Exchange OWA acct to maildir.
I process my ISP and Gmail mail into maildir accts (along with some other
small accounts).



another reason i vote to use/'fix' maildir..
working from home this morning, rather than pipe evo over the network I usually just ssh into my desktop and 
fire up mutt (already configured to know where the evo local directories are, as well as imap to the exchange 
server).  with this setup I can move emails wherever needed between local/server accounts fairly easily.  I 
also tend to use this method if I have to login while on vacation or traveling.



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Jeffrey Stedfast
Matthew Barnes wrote:
 On Wed, 2009-12-16 at 09:56 -0500, Jeffrey Stedfast wrote:
   
 This just means the proper LARGEFILE flags are not being used at
 compile time. Either EDS's configure isn't doing proper checks or else
 Evolution itself isn't doing proper checks and there is some sort of clash.

 An easy way to fix this is to do what I did with GMime, which is to
 simply make all public stream APIs that use off_t use goffset instead
 (I'm sure Matthew will want to do this anyway). Then the problem is
 much simpler to solve - just make sure that Camel uses the proper CFLAGS
 for LARGEFILE support (which you can steal from GMime's configure
 scripts).
 

 IIRC, the issue is LARGEFILE support is still disabled by default, and
   

Ah. I'd still suggest switching over to goffset rather than using off_t
in the public API (and for internal state on indexes and wherever else).
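A rough sketch of why a fixed-width offset type helps, independent of how LARGEFILE is configured. `msg_offset` is a hypothetical stand-in for GLib's goffset (which is a 64-bit signed integer), and the encode/decode helpers are illustrative, not Camel functions. The on-disk width is always 8 bytes, whatever sizeof(off_t) happens to be in a given build.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t msg_offset;   /* hypothetical stand-in for goffset */

/* Store an offset as 8 bytes, most significant byte first. */
void offset_encode(msg_offset value, unsigned char out[8])
{
    assert(value >= 0);
    uint64_t v = (uint64_t)value;
    for (int i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Read back an offset written by offset_encode(). */
msg_offset offset_decode(const unsigned char in[8])
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v = (v << 8) | in[i];
    return (msg_offset)v;
}
```

An offset beyond 4GB, such as 5000000000, round-trips through these helpers on both 32-bit and 64-bit builds, which is not guaranteed when the field is written as a raw off_t.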

 there was concern that simply turning it on would somehow break existing
 installs.  I'm fuzzy on the details, but vaguely recall it being about a
 field size in some binary file being dependent on sizeof(off_t), which
 would change with LARGEFILE support enabled and thus break the binary
 format.
   

The summary files would have had this problem, but they would have just
been regenerated, so not really an issue.

 Unfortunately I don't remember which file that was an issue in.  It may
 have already been addressed by the move to a summary database, or I may
 just be propagating false rumors.
   

There may have been other files, but the summary files would definitely
have been affected (tho it would not have been problematic).

Jeff



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Jeffrey Stedfast
Jeffrey Stedfast wrote:
 Matthew Barnes wrote:
   
 there was concern that simply turning it on would somehow break existing
 installs.  I'm fuzzy on the details, but vaguely recall it being about a
 field size in some binary file being dependent on sizeof(off_t), which
 would change with LARGEFILE support enabled and thus break the binary
 format.
   
 

 The summary files would have had this problem, but they would have just
 been regenerated, so not really an issue.
   

Also, this is why the summary files had versioning info.

Jeff



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Milan Crha
On Wed, 2009-12-16 at 11:35 -0500, Jeffrey Stedfast wrote:
 The summary files would have had this problem, but they would have
 just been regenerated, so not really an issue. 

Hi,
a) it's similar to moving from a 32-bit to a 64-bit architecture or the
other way round; evo crashes in these situations because the version is
fine, but it doesn't know whether the previous machine was 32-bit or
64-bit, i.e. whether it should do some translation or not. (And doing the
translation is not that simple where functions rely on sizeof(...); this
is not a problem with db-summary, but it still might be with indexes and
store summaries; I didn't check that.)

b) you cannot just drop it and regenerate, because it holds some
information for local providers, like labels, tags and such.

Silly reasons, but that's why I think those bugs are still open (no
numbers handy).
Bye,
Milan



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Zan Lynx

On 12/16/09 5:18 AM, Patrick Ohly wrote:

I feel it will be slower compared to the other approach. You don't
rewrite the file entirely at all in normal usage.


Setting mail flags was mentioned as the reason for not using maildir.
Adding a mail flag to an mbox mail requires rewriting the whole file. Or
do you assume that you can overwrite just some bytes in an existing mail
header?

That will still lead to writing a complete sector to disk, in contrast
to renaming a file which I expect to be implemented more intelligently
by the file system. Actually, writing a micro-benchmark for this is
doable. Before you seriously consider investing effort into this, I'd
really prefer to see some hard data for a rename vs. rewrite
comparison.


Ever since the early controversy in ext4 over lost data in KDE 
configuration files, file renames force an ordered journal commit in ext4.


So file rename is more expensive than you may think.

This is of course still cheaper than fsync, if Evolution makes a habit 
of fsync'ing its email files.
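For reference, the write-then-rename pattern at the heart of that ext4 discussion looks roughly like this. It is a sketch assuming POSIX; `replace_file` and its `do_fsync` switch are invented for illustration and say nothing about what Evolution actually does. The switch makes it easy to compare the cost of the rename alone against rename plus fsync.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write `len` bytes to tmp_path, optionally fsync them, then rename
 * tmp_path over path so readers see either the old or the new file.
 * Returns 0 on success, -1 on error (the temporary file is removed). */
int replace_file(const char *path, const char *tmp_path,
                 const void *data, size_t len, int do_fsync)
{
    assert(path != NULL && tmp_path != NULL && data != NULL);

    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            close(fd);
            unlink(tmp_path);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    /* The expensive step: force the data to disk before the rename. */
    if (do_fsync && fsync(fd) != 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0) {
        unlink(tmp_path);
        return -1;
    }
    if (rename(tmp_path, path) != 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```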

--
Zan Lynx
zl...@acm.org

Knowledge is Power.  Power Corrupts.  Study Hard.  Be Evil.


Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Jeffrey Stedfast
On 12/16/2009 02:40 PM, Milan Crha wrote:
 On Wed, 2009-12-16 at 11:35 -0500, Jeffrey Stedfast wrote:
   
 The summary files would have had this problem, but they would have
 just been regenerated, so not really an issue. 
 
   Hi,
 a) it's similar to moving from a 32-bit to a 64-bit architecture or the
 other way round; evo crashes in these situations because the version is
 fine, but it doesn't know whether the previous machine was 32-bit or
 64-bit, i.e. whether it should do some translation or not. (And doing the
 translation is not that simple where functions rely on sizeof(...); this
 is not a problem with db-summary, but it still might be with indexes and
 store summaries; I didn't check that.)
   

Does it really crash? It used to just regenerate the summary files.

 b) you cannot just drop it and regenerate, because it holds some
 information for local providers, like labels, tags and such.
   

The point of the version info was so that you could do things like:

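if (summary->version < CAMEL_64BIT_SUMMARY_VERSION) {
   /* Old summary: the field was written with the platform's off_t width. */
   off_t offset;
   camel_file_utils_load_off_t (file, &offset);
   info->offset = offset;
} else {
   /* New summary: the field is always 64 bits wide. */
   camel_file_utils_load_int64 (file, &info->offset);
}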

If you do this, then you don't actually lose any information.
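A self-contained variant of the same idea, for illustration only. It assumes that old-format files stored the offset as a 4-byte big-endian field and new-format files as an 8-byte one; the version constant, the field layout and the function names are all hypothetical and do not describe the real Camel summary format.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SUMMARY_VERSION_64BIT 2   /* hypothetical version number */

/* Read a big-endian unsigned integer of `width` bytes (at most 8). */
static int load_be(FILE *in, int width, uint64_t *out)
{
    unsigned char buf[8];
    if (fread(buf, 1, (size_t)width, in) != (size_t)width)
        return -1;
    uint64_t v = 0;
    for (int i = 0; i < width; i++)
        v = (v << 8) | buf[i];
    *out = v;
    return 0;
}

/* Load an offset field whose on-disk width depends on the file version.
 * Either way the result ends up in a 64-bit variable, so no information
 * is lost when an old file is read by a newer build. */
int load_offset(FILE *in, int version, int64_t *offset)
{
    uint64_t raw;
    int width = (version < SUMMARY_VERSION_64BIT) ? 4 : 8;

    assert(in != NULL && offset != NULL);

    if (load_be(in, width, &raw) != 0)
        return -1;
    *offset = (int64_t)raw;
    return 0;
}
```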


Jeff