Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Jeffrey Stedfast
On 12/16/2009 02:40 PM, Milan Crha wrote:
> On Wed, 2009-12-16 at 11:35 -0500, Jeffrey Stedfast wrote:
>   
>> The summary files would have had this problem, but they would have
>> just been regenerated, so not really an issue. 
>> 
>   Hi,
> a) it's similar as moving from 32bit to 64bit architecture or the other
> way; evo crashes for these situations, because the version is fine, but
> it doesn't know whether the previous one was a 32bit or 64bit machine,
> aka whether it should do some "translation" or not. (and doing
> translation is not as that simple for usage of functions which are doing
> sizeof(...); not a problem with db-summary, but still might be with
> indexes and store summaries, I didn't check that.)
>   

Does it really crash? It used to just regenerate the summary files.

> b) you cannot just drop it and regenerate, because it holds some
> information for local providers, like labels, tags and such.
>   

The point of the version info was so that you could do things like:

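if (summary->version < CAMEL_64BIT_SUMMARY_VERSION) {
   /* old summary: the offset was written with the platform's off_t size */
   off_t offset;
   camel_file_utils_load_off_t (file, &offset);
   info->offset = offset;
} else {
   /* new summary: the offset is always stored as a 64-bit value */
   camel_file_utils_load_int64 (file, &info->offset);
}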

If you do this, then you don't actually lose any information.


Jeff
___
Evolution-hackers mailing list
Evolution-hackers@gnome.org
http://mail.gnome.org/mailman/listinfo/evolution-hackers


Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Jeffrey Stedfast
On 12/16/2009 02:50 PM, Milan Crha wrote:
> On Wed, 2009-12-16 at 17:34 +0530, Srinivasa Ragavan wrote:
>   
>> Btw, just don't remember well, but Milan did a research of the same,
>> moving from mbox to maildir. Milan do you remember the points to
>> consider? It will be helpful
>> 
>   Hi,
> I'm sorry, I forgot those, it's quite long time ago. Some of them were
> mentioned in this thread, like:
>  - cannot use ':' in a file name for Windows
>  - cannot create a subfolder of an Inbox
>   

The standard way to nest Maildir folders is as follows:

Maildir/
   cur/
   new/
   tmp/
   .GNOME/
      cur/
      new/
      tmp/
   .GNOME.Evolution/
      cur/
      new/
      tmp/
   .GNOME.Evolution.Hackers/
      cur/
      new/
      tmp/
   .Xorg/
      cur/
      new/
      tmp/

this will give you the following folder tree:

Inbox
  GNOME
    Evolution
      Hackers
  Xorg
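
A minimal sketch of the mapping implied by the layout above (this is not Camel's actual code). The helper name maildirpp_dirname() is made up for illustration. It turns a folder path such as "GNOME/Evolution/Hackers" into the on-disk directory name ".GNOME.Evolution.Hackers" by prefixing a dot and replacing each '/' with '.'. It assumes folder names themselves contain no '.', which a real implementation would have to escape or reject.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: write the dot-separated directory name for
 * folder_path into out.  Returns 0 on success, -1 if the path is empty,
 * contains a '.', or the result does not fit in out. */
int
maildirpp_dirname (const char *folder_path, char *out, size_t out_len)
{
	size_t len = strlen (folder_path);
	size_t i;

	if (len == 0 || strchr (folder_path, '.') != NULL)
		return -1;
	if (len + 2 > out_len)	/* leading '.' plus trailing NUL */
		return -1;

	out[0] = '.';
	for (i = 0; i < len; i++)
		out[i + 1] = (folder_path[i] == '/') ? '.' : folder_path[i];
	out[len + 1] = '\0';

	return 0;
}
```

Because every folder directory starts with a dot and sits directly under the top-level Maildir, subfolders never collide with the cur/, new/ and tmp/ directories of their parent.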

>  - cannot use folder names 'new'/'cur'/'tmp' as those are maildir's
>   

see above.

>  - should choose folder hierarchy model (there is some already, but it
>    has some issue, but I'm not sure what it is)
>   

see above.

>  - recently also some slowness for refresh of large folders (should be
>    partially fixed, but not fully, if I recall correctly)
>   

Is this related to readdir() performance?

> I've a feeling there were more, but I forgot them. :(
>
> As others in this thread I would also prefer to use maildir instead of
> creating new provider for this. The maildir would be fixed and changed
> slightly to satisfy evo needs for those above issues, but otherwise
> there's no difference for mbox-per-file, as maildir does pretty the same
> thing (message-per-file).
>   

I agree.

Jeff



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Zan Lynx

On 12/16/09 5:18 AM, Patrick Ohly wrote:
>> I feel it will be slower compared to the other approach. You don't
>> rewrite the file entirely at all in normal usage.
>
> Setting mail flags was mentioned as the reason for not using maildir.
> Adding a mail flag to an mbox mail requires rewriting the whole file. Or
> do you assume that you can overwrite just some bytes in an existing mail
> header?
>
> That will still lead to writing a complete sector to disk, in contrast
> to renaming a file which I expect to be implemented more intelligently
> by the file system. Actually, writing a micro-benchmark for this is
> doable. Before you seriously consider investing effort into this, I'd
> really prefer to see some hard data for a "rename vs. rewrite"
> comparison.


Ever since the early controversy in ext4 over lost data in KDE 
configuration files, file renames force an ordered journal commit in ext4.


So file rename is more expensive than you may think.

This is of course still cheaper than fsync, if Evolution makes a habit 
of fsync'ing its email files.

--
Zan Lynx
zl...@acm.org

"Knowledge is Power.  Power Corrupts.  Study Hard.  Be Evil."


Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Milan Crha
On Wed, 2009-12-16 at 17:34 +0530, Srinivasa Ragavan wrote:
> Btw, just don't remember well, but Milan did a research of the same,
> moving from mbox to maildir. Milan do you remember the points to
> consider? It will be helpful

Hi,
I'm sorry, I forgot those; it was quite a long time ago. Some of them were
mentioned in this thread, like:
 - cannot use ':' in a file name for Windows
 - cannot create a subfolder of an Inbox
 - cannot use folder names 'new'/'cur'/'tmp' as those are maildir's
 - should choose folder hierarchy model (there is some already, but it
   has some issue, but I'm not sure what it is)
 - recently also some slowness for refresh of large folders (should be
   partially fixed, but not fully, if I recall correctly)
I've a feeling there were more, but I forgot them. :(

Like others in this thread, I would also prefer to use maildir instead of
creating a new provider for this. The maildir provider would be fixed and
changed slightly to satisfy evo's needs for the issues above, but otherwise
there's no real difference from mbox-per-file, as maildir does pretty much
the same thing (message-per-file).

My two cents.
Bye,
Milan



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Milan Crha
On Wed, 2009-12-16 at 11:35 -0500, Jeffrey Stedfast wrote:
> The summary files would have had this problem, but they would have
> just been regenerated, so not really an issue. 

Hi,
a) it's similar to moving from a 32-bit to a 64-bit architecture or the
other way around; evo crashes in these situations, because the version is
fine, but it doesn't know whether the previous machine was 32-bit or
64-bit, i.e. whether it should do some "translation" or not. (And doing
the translation is not that simple where functions use sizeof(...); it's
not a problem with db-summary, but it still might be with indexes and
store summaries. I didn't check that.)

b) you cannot just drop it and regenerate, because it holds some
information for local providers, like labels, tags and such.

Silly reasons, but that's why I think those bugs are still open (no
numbers handy).
Bye,
Milan



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Jeffrey Stedfast
Jeffrey Stedfast wrote:
> Matthew Barnes wrote:
>   
>> there was concern that simply turning it on would somehow break existing
>> installs.  I'm fuzzy on the details, but vaguely recall it being about a
>> field size in some binary file being dependent on sizeof(off_t), which
>> would change with LARGEFILE support enabled and thus break the binary
>> format.
>>   
>> 
>
> The summary files would have had this problem, but they would have just
> been regenerated, so not really an issue.
>   

Also, this is why the summary files had versioning info.

Jeff



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Jeffrey Stedfast
Matthew Barnes wrote:
> On Wed, 2009-12-16 at 09:56 -0500, Jeffrey Stedfast wrote:
>   
>> This just means the proper LARGEFILE flags are not being used at
>> compile time. Either EDS's configure isn't doing proper checks or else
>> Evolution itself isn't doing proper checks and there is some sort of clash.
>>
>> An easy way to fix this is to do what I did with GMime, which is to
>> simply make all public stream APIs that use off_t use goffset instead
>> (I'm sure Matthew will want to do this anyway). Then the problem is
>> much simpler to solve - just make sure that Camel uses the proper CFLAGS
>> for LARGEFILE support (which you can steal from GMime's configure
>> scripts).
>> 
>
> IIRC, the issue is LARGEFILE support is still disabled by default, and
>   

Ah. I'd still suggest switching over to goffset rather than using off_t
in the public API (and for internal state on indexes and wherever else).

> there was concern that simply turning it on would somehow break existing
> installs.  I'm fuzzy on the details, but vaguely recall it being about a
> field size in some binary file being dependent on sizeof(off_t), which
> would change with LARGEFILE support enabled and thus break the binary
> format.
>   

The summary files would have had this problem, but they would have just
been regenerated, so not really an issue.

> Unfortunately I don't remember which file that was an issue in.  It may
> have already been addressed by the move to a summary database, or I may
> just be propagating false rumors.
>   

There may have been other files, but the summary files would definitely
have been affected (though it would not have been problematic).

Jeff



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Reid Thompson

Reid Thompson wrote:
> Patrick Ohly wrote:
>> On Wed, 2009-12-16 at 18:55 +0530, Srinivasa Ragavan wrote:
>>> Maildir is good, none denies it. But maildir is already there, but not
>>> sure how many use it.
>>
>> I do, and I know several other people who do
>
> +1
> The local default mbox account on my system is empty.
> I filter some of the email from my Exchange OWA acct to maildir.
> I process my isp and gmail mail into maildir accts (along with some other
> small accounts).



Another reason I vote to use/'fix' maildir:
working from home this morning, rather than pipe evo over the network, I
usually just ssh into my desktop and fire up mutt (already configured to
know where the evo local directories are, as well as imap to the exchange
server). With this setup I can move emails wherever needed between
local/server accounts fairly easily. I also tend to use this method if I
have to log in while on vacation or traveling.



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Matthew Barnes
On Wed, 2009-12-16 at 09:56 -0500, Jeffrey Stedfast wrote:
> This just means the proper LARGEFILE flags are not being used at
> compile time. Either EDS's configure isn't doing proper checks or else
> Evolution itself isn't doing proper checks and there is some sort of clash.
> 
> An easy way to fix this is to do what I did with GMime, which is to
> simply make all public stream APIs that use off_t use goffset instead
> (I'm sure Matthew will want to do this anyway). Then the problem is
> much simpler to solve - just make sure that Camel uses the proper CFLAGS
> for LARGEFILE support (which you can steal from GMime's configure
> scripts).

IIRC, the issue is LARGEFILE support is still disabled by default, and
there was concern that simply turning it on would somehow break existing
installs.  I'm fuzzy on the details, but vaguely recall it being about a
field size in some binary file being dependent on sizeof(off_t), which
would change with LARGEFILE support enabled and thus break the binary
format.

Unfortunately I don't remember which file that was an issue in.  It may
have already been addressed by the move to a summary database, or I may
just be propagating false rumors.

I'm mightily tempted to just enable it by default and see what breaks.
Can't be much worse than the issues we're already haunted by.

Matthew Barnes



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Jeffrey Stedfast
On 12/15/2009 02:46 PM, Chenthill wrote:
> Hi fellow hackers!!
> I have been working for a while during last week on one the blockers
> in evolution - https://bugzilla.gnome.org/show_bug.cgi?id=550414 -
> 'Folder and summary mismatch error'(old one -
> https://bugzilla.gnome.org/show_bug.cgi?id=213072). As a matter of fact
> we have been working as a team to get the blockers down. I have not been
> able to reproduce the issue or yet find the exact problematic area.
>
>   The mismatch in the frompos index in the folder summary may be caused
> by either a threading issue or a crash while storing the indexes. I am
> still investigating it to find the real cause.
>   

I don't think it's a threading or crash issue.

>   Looking at other issues such as,
> https://bugzilla.gnome.org/show_bug.cgi?id=522433 - 'Fails opening mbox
> > 2GB', just got a thought if we could solve both the issues by,
>

This just means the proper LARGEFILE flags are not being used at compile
time. Either EDS's configure isn't doing proper checks or else Evolution
itself isn't doing proper checks and there is some sort of clash.

An easy way to fix this is to do what I did with GMime, which is to
simply make all public stream APIs that use off_t use goffset instead
(I'm sure Matthew will want to do this anyway). Then the problem is much
simpler to solve - just make sure that Camel uses the proper CFLAGS for
LARGEFILE support (which you can steal from GMime's configure scripts).
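
To illustrate why a fixed-width type sidesteps the sizeof(off_t) problem, here is a minimal sketch (not Camel or GMime code). It uses int64_t as a stand-in for GLib's goffset, which is a signed 64-bit type, and the made-up helpers encode_offset() and decode_offset() to store an offset as exactly 8 bytes, most significant byte first, no matter how the program was compiled.

```c
#include <stdint.h>

/* Store an offset as exactly 8 bytes, most significant byte first, so
 * the on-disk size never depends on sizeof(off_t). */
void
encode_offset (int64_t offset, unsigned char buf[8])
{
	uint64_t v = (uint64_t) offset;
	int i;

	for (i = 7; i >= 0; i--) {
		buf[i] = (unsigned char) (v & 0xff);
		v >>= 8;
	}
}

/* Read back an offset written by encode_offset(). */
int64_t
decode_offset (const unsigned char buf[8])
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | buf[i];

	return (int64_t) v;
}
```

With something like this in place, the LARGEFILE flags (on glibc, typically -D_FILE_OFFSET_BITS=64) only affect how files are opened and seeked, not the layout of any binary file.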

> Approach #1,
> migrating local storage from mbox to maildir format. With maildir I have
> heard about two issues,
>
> * Not able to create subfolders under INBOX -
> https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
>   

This is just a bug and it should be fixed regardless.

> Approach #2,
> Migrate from a single mbox file per folder to mbox per email. Srini
> mentioned an advantage that this would avoid the file renames that
> maildir does. I think this is much like how other remote providers in
> evo store the email.
>   

I'm not sure if you mean the CamelImapMessageCache way or CamelDataCache
(as someone else mentioned in this thread).

Either way, it seems a messy way of organizing messages as well as
costly in terms of inodes (and possibly in wasted disk space, although
Meeks' email seems to suggest file-per-message might not be that bad).
Then there's the problem of getting mail into and out of this storage
scheme. (Note: Maildir would be less inode-intensive)

I think that Evolution should always choose to use a standard mailbox
format rather than make up its own. So if the consensus is to move away
from mbox, then my vote would be with Maildir.

When we originally chose mbox (by originally, I mean for 2.x), it was
largely because most other popular clients also used mbox. One of
the things we had to do was to figure out a way to structure an mbox
folder tree, and the way I did that was to mimic Thunderbird's folder
layout (which was quite nice). IMHO, this was an added bonus because I
think that Thunderbird was the most likely candidate for people to be
switching to/from as it was the other major mail client at the time -
making migration as simple as a cp -r (or an mv) is pretty slick.

Performance gotchas with Maildir:

There were some comments earlier in this thread about not wanting to use
Maildir because of performance problems with rename(). It's not the
rename() which is the performance problem, but rather the fact that once
a message file is renamed, the client must scan the directory listing(s)
looking for the new name (strcmp()ing everything up to the ':' iirc). If
the volume of mail gets large enough, this could potentially be
problematic. I believe it was problematic on ext2, but things may have
changed since then. I should point out that this is ONLY a problem if
other clients are involved because Evo could (should) keep track of the
name changes in its summary files.
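
A rough sketch of the lookup described above, assuming the usual Maildir naming where the unique part of a file name is everything before the ':'. The function names are invented for illustration; this is not how Camel's maildir provider is written. maildir_find() does the linear directory scan whose cost grows with the number of messages in the folder.

```c
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Length of the unique part of a Maildir file name: everything before
 * the first ':' (or the whole name if there is no ':'). */
size_t
maildir_unique_len (const char *name)
{
	return strcspn (name, ":");
}

/* Return 1 if both names refer to the same message, i.e. they are
 * identical up to the ':' that introduces the flags. */
int
maildir_same_message (const char *a, const char *b)
{
	size_t la = maildir_unique_len (a);
	size_t lb = maildir_unique_len (b);

	return la == lb && strncmp (a, b, la) == 0;
}

/* Scan dir_path for the file that currently holds the message last seen
 * as known_name.  Copies the current name into out and returns 0, or
 * returns -1 if the directory cannot be read or no entry matches. */
int
maildir_find (const char *dir_path, const char *known_name,
              char *out, size_t out_len)
{
	DIR *dir = opendir (dir_path);
	struct dirent *ent;
	int found = -1;

	if (dir == NULL)
		return -1;

	while ((ent = readdir (dir)) != NULL) {
		if (maildir_same_message (ent->d_name, known_name) &&
		    strlen (ent->d_name) < out_len) {
			strcpy (out, ent->d_name);
			found = 0;
			break;
		}
	}

	closedir (dir);
	return found;
}
```

If Evolution records each rename in its summary, it never needs to call something like maildir_find() for its own changes; the scan is only needed when another client has renamed the file.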

Hope this helps,

Jeff


Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Michael Meeks

On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
> One advantage which I see with #1 is that its a standard way.

One thing about both approaches is that they will consume more space;
e.g. on my 'Sent' folder with 21k messages - on average (on ext3) we will
chew ~2KB of space for each of these; which is ~40MB - around 2%.

For my cvs commits mail archive, perhaps the worst case, of 350MB
(22334 mails) - this would also be around 43MB - at ~12%.

That's not as bad as I was worried about; though of course there is
some overhead in terms of inodes and directory entries to worry about
that will crank up the overall size - but it doesn't seem horrible even
on ext3.

Of course - ext4 / btrfs will do a much better job here too; so - less
to worry about in future.
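
The estimate above can be reproduced with a few lines of arithmetic. This sketch assumes, as in the figures quoted, an average of about 2 KiB of wasted space per message file (roughly half of a 4 KiB block); the function names are made up for illustration, and inode and directory-entry overhead is not counted.

```c
#include <stdint.h>

/* Estimated extra bytes when every message becomes its own file and each
 * file wastes slack_per_file bytes on average. */
uint64_t
slack_bytes (uint64_t n_messages, uint64_t slack_per_file)
{
	return n_messages * slack_per_file;
}

/* The same overhead as a percentage of the current mbox size. */
double
slack_percent (uint64_t n_messages, uint64_t slack_per_file,
               uint64_t mbox_bytes)
{
	return 100.0 * (double) slack_bytes (n_messages, slack_per_file)
	       / (double) mbox_bytes;
}
```

For the 22334-message archive, slack_bytes (22334, 2048) gives 45740032 bytes, about 43.6 MiB, which is roughly 12.5% of a 350 MiB mbox. For the 21k-message Sent folder the same calculation gives about 41 MiB.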

HTH,

Michael.

-- 
 michael.me...@novell.com  <><, Pseudo Engineer, itinerant idiot




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Reid Thompson

Patrick Ohly wrote:
> On Wed, 2009-12-16 at 18:55 +0530, Srinivasa Ragavan wrote:
>> Maildir is good, none denies it. But maildir is already there, but not
>> sure how many use it.
>
> I do, and I know several other people who do

+1
The local default mbox account on my system is empty.
I filter some of the email from my Exchange OWA acct to maildir.
I process my isp and gmail mail into maildir accts (along with some other
small accounts).



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Srinivasa Ragavan
On Wed, Dec 16, 2009 at 7:05 PM, Patrick Ohly wrote:
> On Wed, 2009-12-16 at 18:55 +0530, Srinivasa Ragavan wrote:
>> Maildir is good, none denies it. But maildir is already there, but not
>> sure how many use it.
>
> I do, and I know several other people who do. The question how to enable
> maildir for an account is a question that comes up often on the users
> mailing list, so the demand exists. If not that many people actually is
> it, that's probably because switching to it requires quite a bit of
> fiddling.
>
>> I remember multiple problems, some subdirs,
>> windows support etc. Milan did an analysis, some time back, dont
>> remember that very well tbh.
>
> Deciding to move away from mbox as the default format would be the
> perfect opportunity to address these problems and make more users happy:
> those who already use maildir and those who can start to use it and then
> benefit from its advantages.
>

I really like the above statement, honestly ;-). Fix up or (re)invent?
That's the choice that needs to be made, given the options and constraints
of the current situation. That is how I would summarize it.

-Srini.


Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Patrick Ohly
On Wed, 2009-12-16 at 18:55 +0530, Srinivasa Ragavan wrote:
> Maildir is good, none denies it. But maildir is already there, but not
> sure how many use it.

I do, and I know several other people who do. How to enable maildir for
an account is a question that comes up often on the users mailing list,
so the demand exists. If not that many people actually use it, that's
probably because switching to it requires quite a bit of fiddling.

> I remember multiple problems, some subdirs,
> windows support etc. Milan did an analysis, some time back, dont
> remember that very well tbh.

Deciding to move away from mbox as the default format would be the
perfect opportunity to address these problems and make more users happy:
those who already use maildir and those who can start to use it and then
benefit from its advantages.

> Cool. Its about when you write and how you schedule it. In current
> mbox design, expunge, rewrites the flags back to mails from summary.
> Its not about keeping it permanent in the summary.

With maildir, applying flag changes would be so cheap that they wouldn't
have to go through the summary.
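
A minimal sketch of what such a flag change looks like, assuming the usual Maildir convention that flags are single letters kept in ASCII order after ":2," in the file name (for example 'S' for seen). The helper name maildir_name_with_flag() is hypothetical; it only computes the new name, and the actual change would then be a single rename() from the old path to the new one.

```c
#include <stddef.h>
#include <string.h>

/* Build the file name a message would have after adding `flag`.
 * Returns 0 on success, -1 if the result does not fit in out. */
int
maildir_name_with_flag (const char *name, char flag,
                        char *out, size_t out_len)
{
	const char *info = strstr (name, ":2,");
	size_t base_len = info ? (size_t) (info - name) : strlen (name);
	const char *flags = info ? info + 3 : "";
	size_t n_flags = strlen (flags);
	int present = strchr (flags, flag) != NULL;
	size_t need = base_len + 3 + n_flags + (present ? 0 : 1) + 1;
	size_t pos, i;

	if (need > out_len)
		return -1;

	memcpy (out, name, base_len);
	memcpy (out + base_len, ":2,", 3);
	pos = base_len + 3;

	/* Copy the existing flags, slotting the new one in so the flag
	 * letters stay in ASCII order. */
	for (i = 0; i < n_flags; i++) {
		if (!present && flags[i] > flag) {
			out[pos++] = flag;
			present = 1;
		}
		out[pos++] = flags[i];
	}
	if (!present)
		out[pos++] = flag;
	out[pos] = '\0';

	return 0;
}
```

For example, "123.host:2,RT" with flag 'S' becomes "123.host:2,RST", and a name without any info part, "123.host", becomes "123.host:2,S". The message content is never touched.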

-- 
Bye, Patrick Ohly
--  
patrick.o...@gmx.de
http://www.estamos.de/




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Srinivasa Ragavan
On Wed, Dec 16, 2009 at 5:48 PM, Patrick Ohly wrote:
> On Wed, 2009-12-16 at 16:54 +0530, Srinivasa Ragavan wrote:
>> On Wed, Dec 16, 2009 at 4:46 PM, Patrick Ohly  wrote:
>> > On Wed, 2009-12-16 at 09:19 +0530, Chenthill wrote:
>> >> On Tue, 2009-12-15 at 15:09 -0500, Reid Thompson wrote:
>> >> > On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
>> >> > > * Not able to create subfolders under INBOX -
>> >> > > https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
>> >> > I hadn't noticed the above, so I guess it's a non-issue for me
>> >> >
>> >> > What is the second issue?
>> >> Sorry missed to mention it here, with maildir we would need to rename
>> >> files for unread/read flag changes which can be avoided in the later
>> >> approach.
>> >
>> > So you expect renaming a file to be slower than rewriting the whole file
>> > content? Somehow my gut feeling says that it will be the other way
>> > around. But I don't have hard data, of course.
>>
>> I fell it will be slower compared to the other approach. You dont
>> rewrite the file entirely at all in normal usage.
>
> Setting mail flags was mentioned as the reason for not using maildir.
> Adding a mail flag to an mbox mail requires rewriting the whole file. Or
> do you assume that you can overwrite just some bytes in an existing mail
> header?
>
> That will still lead to writing a complete sector to disk, in contrast
> to renaming a file which I expect to be implemented more intelligently
> by the file system. Actually, writing a micro-benchmark for this is
> doable. Before you seriously consider investing effort into this, I'd
> really prefer to see some hard data for a "rename vs. rewrite"
> comparison.
>

Maildir is good, nobody denies it. But maildir is already there, and I'm
not sure how many use it. I remember multiple problems: subdirs, Windows
support, etc. Milan did an analysis some time back; I don't remember it
very well, tbh.

>> May be when you
>> expunge folder or export it, the summary data could be updated with
>> the mail's mbox. But its debatable at some level, I would say.
>
> We are debating the merits of the actual mail storage, not the summary
> data. I have wiped out folders.db often enough that I won't use
> Evolution when it switches to storing valuable, unrecoverable
> information like the "mail was read" flag there.

Cool. It's about when you write and how you schedule it. In the current
mbox design, expunge rewrites the flags from the summary back to the mails.
It's not about keeping them permanently in the summary.

>
>> > I definitely won't switch away from maildir as my format of choice
>> > because it integrates nicely with offlineimap.
>>
>> Sure, I think users should have that freedom. Camel's local folder
>> implementation has that built in. This new approach should be the
>> default for new users, and as option for users to migrate to it for
>> existing users. If users willingly stay with maildir or
>> 1mbox-per-folder that should also be there.
>
> In case it wasn't obvious, I don't see the point of diverting resources
> away from an established format in favor of something new. "It's mbox"
> doesn't count, you would have to write the complete directory tree
> handling from scratch.
>
> Of course, it is your time. I'm just expressing my concerns as a user of
> the somewhat neglected maildir format.

I appreciate your feedback. It's not a decision. We want to start a
discussion to see how we can improve the existing situation. I really
hate the 1-mbox-per-folder design. Maildir isn't the best backend written
in Evo. Given that, I preferred the data cache scheme that other
backends use. Nothing more! We will choose the best option that comes
out of this discussion.

-Srini


Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Martin Owens
On Wed, 2009-12-16 at 13:56 +0100, Patrick Ohly wrote:
> On Wed, 2009-12-16 at 18:07 +0530, chen wrote:
> > I also come across another issue, even if we start using maildir format,
> > we cannot assume that multiple applications would access the data
> > especially since local folders belong to evolution and would be used
> > frequently. (see https://bugzilla.gnome.org/show_bug.cgi?id=592310 )
> 
> So you intend to go to a proprietary format because it prevents other
> tools from meddling with the internal data while Evolution runs? Sure,
> that's one way of solving this "problem". But remember, for several of
> your users being able to have one maildir storage shared between apps is
> a valuable *feature*.

I quite agree, the last thing we need is more proprietary data formats.

If there is a problem with other applications accessing the data, then
you must bring the access API lower, instead of insisting that the
Evolution client be the only privileged user of the information.

Think of emails like pictures taken with your webcam (cheese): you
wouldn't want cheese storing your photos in some configuration directory
~/.cheese/nih_database.db where no outside application could possibly
get at them. Even when cheese stored photos in its config folder, it
still used the jpeg format.

Emails are just text files with field data. They don't need to be stored
in some database, they just need to be indexed properly by a file system
service so searching and listing is quick enough.

Think about making these systems less complex.

Regards, Martin Owens



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Patrick Ohly
On Wed, 2009-12-16 at 07:50 -0500, Matthew Barnes wrote:
> On Wed, 2009-12-16 at 13:18 +0100, Patrick Ohly wrote:
> > We are debating the merits of the actual mail storage, not the summary
> > data. I have wiped out folders.db often enough that I won't use
> > Evolution when it switches to storing valuable, unrecoverable
> > information like the "mail was read" flag there.
> 
> Valuable, unrecoverable message meta-data (flags, tags, labels, etc.)
> should really be split off as a separate database: folders.db and, say,
> metadata.db.

Perhaps it is just me or the way SQLite has been used so far in
Evolution, but given the past experience with folders.db, I also have
doubts about the reliability of such a metadata.db. I'd prefer a
standard format that can be accessed by other tools.

But you are right, in general separating the different kinds of data
into different physical files is certainly an improvement over the
current situation.

-- 
Bye, Patrick Ohly
--  
patrick.o...@gmx.de
http://www.estamos.de/




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Patrick Ohly
On Wed, 2009-12-16 at 18:07 +0530, chen wrote:
> I also come across another issue, even if we start using maildir format,
> we cannot assume that multiple applications would access the data
> especially since local folders belong to evolution and would be used
> frequently. (see https://bugzilla.gnome.org/show_bug.cgi?id=592310 )

So you intend to go to a proprietary format because it prevents other
tools from meddling with the internal data while Evolution runs? Sure,
that's one way of solving this "problem". But remember, for several of
your users being able to have one maildir storage shared between apps is
a valuable *feature*.

To address your performance concerns regarding maildir and rescanning,
the same can be achieved by declaring that directory Evolution-internal
and not allowing users to touch it with other apps - pretty much the
SKIP_LSUMMARY_CHECK env variable which is already in the code.

> > > I definitely won't switch away from maildir as my format of choice
> > > because it integrates nicely with offlineimap.
> > 
> > Sure, I think users should have that freedom. Camel's local folder
> > implementation has that built in. This new approach should be the
> > default for new users, and as option for users to migrate to it for
> > existing users. If users willingly stay with maildir or
> > 1mbox-per-folder that should also be there.
> Looking at the information gathered, am favoring Approach #2 -
> mboxfile-per-mail. I would be starting the work this week if I don't see
> any reasons to change the approach. Just want to put in the best
> possible solution :)

I still don't see the advantage and fear that Evolution will degrade
further because effort is directed towards writing new code instead of
fixing the known problems in the existing code base.

-- 
Bye, Patrick Ohly
--  
patrick.o...@gmx.de
http://www.estamos.de/




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Matthew Barnes
On Wed, 2009-12-16 at 13:18 +0100, Patrick Ohly wrote:
> We are debating the merits of the actual mail storage, not the summary
> data. I have wiped out folders.db often enough that I won't use
> Evolution when it switches to storing valuable, unrecoverable
> information like the "mail was read" flag there.

Valuable, unrecoverable message meta-data (flags, tags, labels, etc.)
should really be split off as a separate database: folders.db and, say,
metadata.db.

Martin Owens had a good point earlier in the thread about this being a
good opportunity to start using XDG directories.  folders.db then would
live under $XDG_CACHE_HOME and could be safely destroyed, whereas
metadata.db would live under $XDG_DATA_HOME.
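
A small sketch of how those two locations could be resolved, assuming the defaults from the XDG Base Directory specification ($HOME/.cache and $HOME/.local/share when the variables are unset or empty). The helper xdg_dir() is hypothetical, and it skips details such as rejecting relative paths.

```c
#include <stdio.h>
#include <stdlib.h>

/* Resolve an XDG base directory: use the environment variable if it is
 * set and non-empty, otherwise fall back to $HOME/<fallback>.
 * Returns 0 on success, -1 if HOME is unset or out is too small. */
int
xdg_dir (const char *env_name, const char *fallback,
         char *out, size_t out_len)
{
	const char *value = getenv (env_name);
	int n;

	if (value != NULL && value[0] != '\0') {
		n = snprintf (out, out_len, "%s", value);
	} else {
		const char *home = getenv ("HOME");

		if (home == NULL)
			return -1;
		n = snprintf (out, out_len, "%s/%s", home, fallback);
	}

	return (n >= 0 && (size_t) n < out_len) ? 0 : -1;
}
```

Under this scheme, folders.db would be placed below xdg_dir ("XDG_CACHE_HOME", ".cache", ...) and metadata.db below xdg_dir ("XDG_DATA_HOME", ".local/share", ...), so wiping the cache directory could never destroy flags, tags or labels.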

Matthew Barnes



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread chen
On Wed, 2009-12-16 at 16:54 +0530, Srinivasa Ragavan wrote:
> On Wed, Dec 16, 2009 at 4:46 PM, Patrick Ohly wrote:
> > On Wed, 2009-12-16 at 09:19 +0530, Chenthill wrote:
> >> On Tue, 2009-12-15 at 15:09 -0500, Reid Thompson wrote:
> >> > On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
> >> > > * Not able to create subfolders under INBOX -
> >> > > https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
> >> > I hadn't noticed the above, so I guess it's a non-issue for me
> >> >
> >> > What is the second issue?
> >> Sorry missed to mention it here, with maildir we would need to rename
> >> files for unread/read flag changes which can be avoided in the later
> >> approach.
> >
> > So you expect renaming a file to be slower than rewriting the whole file
> > content? Somehow my gut feeling says that it will be the other way
> > around. But I don't have hard data, of course.
> 
> I feel it will be slower compared to the other approach. You don't
> rewrite the file entirely at all in normal usage. Maybe when you
> expunge a folder or export it, the summary data could be updated with
> the mail's mbox. But it's debatable at some level, I would say.
I don't think the rename triggers a rewrite of the file. It isn't a costly
operation. But I just wonder why we need to do that at all? Could it be
costly in distributed environments? (Not sure how significant this case
would be for us.)
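
For context on why maildir renames at all: the format keeps the flags in
the file name itself, after a ":2," suffix, with the flag letters in
ASCII order (for example S for seen, R for replied). A minimal sketch of
computing the new name, with a made-up helper name, not Camel's maildir
code:

```c
#include <stdio.h>
#include <string.h>

/* Compute the maildir file name that results from adding `flag` to `name`.
 * Flags live after ":2," and are kept in ASCII order. Adding a flag that
 * is already present leaves the name unchanged.
 * Returns 0 on success, -1 if a buffer is too small. */
int maildir_add_flag(const char *name, char flag, char *out, size_t out_len)
{
    const char *info = strstr(name, ":2,");
    size_t base_len = info ? (size_t)(info - name) : strlen(name);
    const char *old = info ? info + 3 : "";
    char flags[32];
    size_t n = 0;
    int placed = 0;
    int written;

    if (strlen(old) + 2 > sizeof flags)
        return -1;

    /* Copy the old flags, slotting the new one in at its sorted position. */
    for (const char *p = old; *p != '\0'; p++) {
        if (!placed && *p >= flag) {
            if (*p != flag)
                flags[n++] = flag;
            placed = 1;
        }
        flags[n++] = *p;
    }
    if (!placed)
        flags[n++] = flag;
    flags[n] = '\0';

    written = snprintf(out, out_len, "%.*s:2,%s", (int)base_len, name, flags);
    return (written < 0 || (size_t)written >= out_len) ? -1 : 0;
}
```

The flag change itself is then a single rename() from the old name to the
computed one; the message content is never touched.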

I also came across another issue: even if we start using the maildir
format, we cannot assume that multiple applications would access the data,
especially since local folders belong to Evolution and would be used
frequently (see https://bugzilla.gnome.org/show_bug.cgi?id=592310).

> 
> >
> > I definitely won't switch away from maildir as my format of choice
> > because it integrates nicely with offlineimap.
> 
> Sure, I think users should have that freedom. Camel's local folder
> implementation has that built in. This new approach should be the
> default for new users, and as option for users to migrate to it for
> existing users. If users willingly stay with maildir or
> 1mbox-per-folder that should also be there.
Looking at the information gathered, I am favoring Approach #2 -
mboxfile-per-mail. I would be starting the work this week if I don't see
any reasons to change the approach. Just want to put in the best
possible solution :)

- Chenthill.
> 
> -Srini




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Matthew Barnes
On Wed, 2009-12-16 at 17:34 +0530, Srinivasa Ragavan wrote:
> Really, we aren't inventing a new format. It's mbox, but organized a
> bit differently, like how some providers (Exchange, GW, IMAP4?) store
> their mail.

Perhaps a naive question, but does it have to be stored as mbox?  Could
we not just store the raw message content and skip the From_ line (and
especially >From quoting), since the only purpose it serves is to
delimit multiple messages in the same file?

With a file-per-message approach there's no need for a delimiter line,
and an mbox can be constructed trivially by passing the content through
a CamelMimeFilterFrom.  Right?
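
To illustrate what that filtering step amounts to, here is a minimal
sketch of classic ">From" quoting in plain C. It is not
CamelMimeFilterFrom's actual code, and the function name is made up; it
only shows that the quoting can be applied at export time instead of
being baked into each stored message.

```c
#include <stddef.h>
#include <string.h>

/* Copy `in` to `out`, prefixing '>' to every line that starts with "From ".
 * Returns the length the quoted text needs (excluding the NUL), so the
 * caller can detect truncation by comparing it with out_len. Output is
 * always NUL-terminated when out_len > 0. */
size_t mbox_quote_from(const char *in, char *out, size_t out_len)
{
    size_t n = 0;
    int at_line_start = 1;

    for (const char *p = in; *p != '\0'; p++) {
        if (at_line_start && strncmp(p, "From ", 5) == 0) {
            if (n + 1 < out_len)
                out[n] = '>';
            n++;
        }
        if (n + 1 < out_len)
            out[n] = *p;
        n++;
        at_line_start = (*p == '\n');
    }
    if (out_len > 0)
        out[n < out_len ? n : out_len - 1] = '\0';
    return n;
}
```

Building an mbox from per-message files would then mean writing a From_
line, the quoted body, and a blank line for each message in turn.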

Matthew Barnes



Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Patrick Ohly
On Wed, 2009-12-16 at 16:54 +0530, Srinivasa Ragavan wrote:
> On Wed, Dec 16, 2009 at 4:46 PM, Patrick Ohly  wrote:
> > On Wed, 2009-12-16 at 09:19 +0530, Chenthill wrote:
> >> On Tue, 2009-12-15 at 15:09 -0500, Reid Thompson wrote:
> >> > On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
> >> > > * Not able to create subfolders under INBOX -
> >> > > https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
> >> > I hadn't noticed the above, so I guess it's a non-issue for me
> >> >
> >> > What is the second issue?
> >> Sorry missed to mention it here, with maildir we would need to rename
> >> files for unread/read flag changes which can be avoided in the later
> >> approach.
> >
> > So you expect renaming a file to be slower than rewriting the whole file
> > content? Somehow my gut feeling says that it will be the other way
> > around. But I don't have hard data, of course.
> 
> I feel it will be slower compared to the other approach. You don't
> rewrite the file entirely at all in normal usage.

Setting mail flags was mentioned as the reason for not using maildir.
Adding a mail flag to an mbox mail requires rewriting the whole file. Or
do you assume that you can overwrite just some bytes in an existing mail
header?

That will still lead to writing a complete sector to disk, in contrast
to renaming a file which I expect to be implemented more intelligently
by the file system. Actually, writing a micro-benchmark for this is
doable. Before you seriously consider investing effort into this, I'd
really prefer to see some hard data for a "rename vs. rewrite"
comparison.
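
A starting point for such a micro-benchmark, in standard C, could look
like the sketch below. The function names are made up, and two caveats
apply: clock() reports CPU time rather than wall-clock time, and nothing
here forces data to disk, so without an added fsync() step the numbers
mostly reflect the page cache rather than the file system's on-disk
behaviour.

```c
#include <stdio.h>
#include <time.h>

/* mbox-style flag change: overwrite one byte of an existing header in place. */
int flag_by_overwrite(const char *path, long offset, char flag)
{
    FILE *f = fopen(path, "r+b");
    int ok;

    if (f == NULL)
        return -1;
    ok = fseek(f, offset, SEEK_SET) == 0 && fputc(flag, f) != EOF;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Toggle the byte at `offset` between 'R' and 'O' `rounds` times.
 * Returns the CPU seconds used, or -1.0 on error. */
double time_overwrites(const char *path, long offset, int rounds)
{
    clock_t start = clock();

    for (int i = 0; i < rounds; i++)
        if (flag_by_overwrite(path, offset, (i % 2 == 0) ? 'R' : 'O') != 0)
            return -1.0;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* maildir-style flag change: rename the file back and forth between two
 * names `rounds` times. Returns the CPU seconds used, or -1.0 on error. */
double time_renames(const char *a, const char *b, int rounds)
{
    clock_t start = clock();

    for (int i = 0; i < rounds; i++) {
        const char *from = (i % 2 == 0) ? a : b;
        const char *to = (i % 2 == 0) ? b : a;

        if (rename(from, to) != 0)
            return -1.0;
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Running both with the same number of rounds on files of realistic mail
size, on the file systems people actually use, would give the hard data
asked for above.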

> Maybe when you
> expunge a folder or export it, the summary data could be updated with
> the mail's mbox. But it's debatable at some level, I would say.

We are debating the merits of the actual mail storage, not the summary
data. I have wiped out folders.db often enough that I won't use
Evolution when it switches to storing valuable, unrecoverable
information like the "mail was read" flag there.

> > I definitely won't switch away from maildir as my format of choice
> > because it integrates nicely with offlineimap.
> 
> Sure, I think users should have that freedom. Camel's local folder
> implementation has that built in. This new approach should be the
> default for new users, and as option for users to migrate to it for
> existing users. If users willingly stay with maildir or
> 1mbox-per-folder that should also be there.

In case it wasn't obvious, I don't see the point of diverting resources
away from an established format in favor of something new. "It's mbox"
doesn't count, you would have to write the complete directory tree
handling from scratch.

Of course, it is your time. I'm just expressing my concerns as a user of
the somewhat neglected maildir format.

-- 
Bye, Patrick Ohly
--  
patrick.o...@gmx.de
http://www.estamos.de/




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Srinivasa Ragavan
On Wed, Dec 16, 2009 at 5:03 PM, Ross Burton  wrote:
> On Wed, 2009-12-16 at 16:54 +0530, Srinivasa Ragavan wrote:
>> > I definitely won't switch away from maildir as my format of choice
>> > because it integrates nicely with offlineimap.
>>
>> Sure, I think users should have that freedom. Camel's local folder
>> implementation has that built in. This new approach should be the
>> default for new users, and as option for users to migrate to it for
>> existing users. If users willingly stay with maildir or
>> 1mbox-per-folder that should also be there.
>
> I don't really see the point of inventing a new file-per-message format
> when maildir already exists, is already implemented in evolution (albeit
> buggily), and is a very popular format.  NIH seems a bit pointless
> really.

Really, we aren't inventing a new format. It's mbox, but organized a
bit differently, like how some providers (Exchange, GW, IMAP4?) store
their mail.

Btw, I don't remember it well, but Milan did some research on the same
thing, moving from mbox to maildir. Milan, do you remember the points to
consider? It would be helpful.

-Srini


Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Sankar P
>>> On 12/16/2009 at 01:16 AM, in message
<1260906365.615.858.ca...@linux-e1q4.site>, Chenthill 
wrote: 
> Hi fellow hackers!!
> I have been working for a while during last week on one of the blockers
> in evolution - https://bugzilla.gnome.org/show_bug.cgi?id=550414 -
> 'Folder and summary mismatch error'(old one -
> https://bugzilla.gnome.org/show_bug.cgi?id=213072). As a matter of fact
> we have been working as a team to get the blockers down. I have not been
> able to reproduce the issue or yet find the exact problematic area.
> 
>   The mismatch in the frompos index in the folder summary may be caused
> by either a threading issue or a crash while storing the indexes. I am
> still investigating it to find the real cause.
> 
>   Looking at other issues such as, 
> https://bugzilla.gnome.org/show_bug.cgi?id=522433 - 'Fails opening mbox > 2GB',
> just got a thought if we could solve both the issues by,
> 
> Approach #1,
> migrating local storage from mbox to maildir format. With maildir I have
> heard about two issues,
> 
> * Not able to create subfolders under INBOX -
> https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
> 
> Approach #2,
> Migrate from a single mbox file per folder to mbox per email. Srini
> mentioned an advantage that this would avoid the file renames that
> maildir does. I think this is much like how other remote providers in
> evo store the email.
> 
> I thought of bringing this to this list to gather more opinions to choose
> the right one. Approach #2 seems a better one as we are choosing a
> way for storing the messages internally in evo. Are we missing
> anything in choosing the second one?
> 
> One advantage which I see with #1 is that it's a standard way.
> 
> Thanks, Chenthill.
> 

Last I checked, maildir is known to have problems on Windows.

If so, is Evo-on-Win a priority, etc.?

The mail storage part used by the GroupWise provider is the same as what
you suggest (afaics).
IIRC, I was told that it was derived from some Cyrus storage style.

Thanks.




--
Sankar 
http://psankar.blogspot.com 




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Adam Tauno Williams
> > I thought of bringing this to this list to gather more opinions to choose
> > the right one. Approach #2 seems a better one as we are choosing a
> > way for storing the messages internally in evo. Are we missing
> > anything in choosing the second one?
> > One advantage which I see with #1 is that it's a standard way.
> You store everything as mbox only still. But one mbox file per mail,
> distributed in multiple subdirs.

This sounds very much like the format the Cyrus IMAPd server uses
internally.

>  Of course you don't want 100K files in one folder

Since the advent of dir_index in ext3 this isn't such a big issue
anymore.  Large directories are pretty fast.  It used to be a *huge*
problem for Cyrus server admins.

> , which could move the bottleneck to a different place.
> Distribute under multi-level subfolders or something like that.

Cyrus calls that 'directory hashing'.
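
A minimal sketch of the idea, in plain C. This is neither Cyrus's actual
scheme nor Camel's: it uses FNV-1a as a stand-in hash (the
MD5-of-Message-ID idea mentioned elsewhere in this thread would slot in
the same way), takes the low byte as a two-hex-digit bucket, and the
function names and path layout are made up for illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* 32-bit FNV-1a, used here only as a simple stand-in for MD5. */
static uint32_t fnv1a32(const char *s)
{
    uint32_t h = 2166136261u;

    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

/* Build "<root>/<xx>/<uid>", where xx is the low byte of the hash of the
 * Message-ID as two hex digits, giving 256 buckets per folder.
 * Returns 0 on success, -1 if the buffer is too small. */
int hashed_message_path(const char *root, const char *message_id,
                        const char *uid, char *out, size_t out_len)
{
    unsigned bucket = fnv1a32(message_id) & 0xffu;
    int n = snprintf(out, out_len, "%s/%02x/%s", root, bucket, uid);

    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

With 256 buckets, a 100K-message folder would average roughly 400 files
per directory, and the path can be recomputed from the Message-ID alone.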




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Ross Burton
On Wed, 2009-12-16 at 16:54 +0530, Srinivasa Ragavan wrote:
> > I definitely won't switch away from maildir as my format of choice
> > because it integrates nicely with offlineimap.
> 
> Sure, I think users should have that freedom. Camel's local folder
> implementation has that built in. This new approach should be the
> default for new users, and as option for users to migrate to it for
> existing users. If users willingly stay with maildir or
> 1mbox-per-folder that should also be there.

I don't really see the point of inventing a new file-per-message format
when maildir already exists, is already implemented in evolution (albeit
buggily), and is a very popular format.  NIH seems a bit pointless
really.

Ross
-- 
Ross Burton mail: r...@burtonini.com
  jabber: r...@burtonini.com
   www: http://burtonini.com




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Srinivasa Ragavan
On Wed, Dec 16, 2009 at 4:46 PM, Patrick Ohly  wrote:
> On Wed, 2009-12-16 at 09:19 +0530, Chenthill wrote:
>> On Tue, 2009-12-15 at 15:09 -0500, Reid Thompson wrote:
>> > On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
>> > > * Not able to create subfolders under INBOX -
>> > > https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
>> > I hadn't noticed the above, so I guess it's a non-issue for me
>> >
>> > What is the second issue?
>> Sorry missed to mention it here, with maildir we would need to rename
>> files for unread/read flag changes which can be avoided in the later
>> approach.
>
> So you expect renaming a file to be slower than rewriting the whole file
> content? Somehow my gut feeling says that it will be the other way
> around. But I don't have hard data, of course.

I feel it will be slower compared to the other approach. You don't
rewrite the file entirely at all in normal usage. Maybe when you
expunge a folder or export it, the summary data could be updated with
the mail's mbox. But it's debatable at some level, I would say.

>
> I definitely won't switch away from maildir as my format of choice
> because it integrates nicely with offlineimap.

Sure, I think users should have that freedom. Camel's local folder
implementation has that built in. This new approach should be the
default for new users, and as option for users to migrate to it for
existing users. If users willingly stay with maildir or
1mbox-per-folder that should also be there.

-Srini


Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Patrick Ohly
On Wed, 2009-12-16 at 09:19 +0530, Chenthill wrote:
> On Tue, 2009-12-15 at 15:09 -0500, Reid Thompson wrote:
> > On Wed, 2009-12-16 at 01:16 +0530, Chenthill wrote:
> > > * Not able to create subfolders under INBOX -
> > > https://bugzilla.gnome.org/show_bug.cgi?id=536240 .
> > I hadn't noticed the above, so I guess it's a non-issue for me
> > 
> > What is the second issue?
> Sorry missed to mention it here, with maildir we would need to rename
> files for unread/read flag changes which can be avoided in the later
> approach.

So you expect renaming a file to be slower than rewriting the whole file
content? Somehow my gut feeling says that it will be the other way
around. But I don't have hard data, of course.

I definitely won't switch away from maildir as my format of choice
because it integrates nicely with offlineimap.

-- 
Bye, Patrick Ohly
--  
patrick.o...@gmx.de
http://www.estamos.de/




Re: [Evolution-hackers] Moving from the single mbox file format for the local folders

2009-12-16 Thread Martin Owens
On Wed, 2009-12-16 at 01:10 -0500, Matthew Barnes wrote:
> Did you have a scheme in mind for how to partition the mbox files into
> subdirectories?  One possibility might be to use a similar approach as
> CamelDataCache.  That is, take the last two (or three?) digits of the
> MD5 checksum of the Message-ID and file the message into a subdirectory
> of that name.  That should give you a relatively even distribution and
> the mbox file can easily be located once you have its Message-ID.

If work is done for this (which I would absolutely love to see) then
could the root location of emails/account folders be made configurable
so I can tie mine up to an XDG Emails directory?

The more user data out of configuration directories, the better. User
visible, indexable by outside services, etc.

Martin Owens,
