Re: Maildir vs. mbox in Debian

2012-11-29 Thread brian m. carlson
On Thu, Nov 29, 2012 at 05:52:06PM +0100, Adam Borowski wrote:
> Outside of dpkg, sqlite in non-WAL mode, other databases and virtualbox/
> qemu, btrfs is pretty fast.

That may be true, but it glosses over how awful performance is on those
workloads on btrfs.  A single Berkeley DB transaction can literally take
minutes.  btrfs in its default configuration is completely unusable for
any system that uses databases *at all*, which is essentially everything
but tiny embedded systems.  I won't even use it on /tmp.

-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187




Re: Maildir vs. mbox in Debian

2012-11-29 Thread Adam Borowski
On Thu, Nov 29, 2012 at 04:32:33PM +0100, Vincent Lefevre wrote:
> On 2012-11-29 16:16:25 +0100, Adam Borowski wrote:
> > *cough* btrfs -ocompress=lzo.  Small files are packed inline in metadata
> > blocks, and you get compression you wanted.  Using lzo is faster than no
> > compression for most loads, adding negligible cost for incompressible data
> > (especially if not all cores are at 100% usage).
> 
> Great! Nice to know.
> 
> This should be the default in Debian. :)

Not while dpkg calls fsync() every, approximately, 0.1 bits written.
Btrfs has transactions one could use to wrap around a whole dpkg operation,
avoiding fsync entirely -- at the cost of having filesystem specific code.
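
The cost pattern can be illustrated with a minimal sketch. This is not dpkg's actual code: the function name, the file naming and the use of a single os.sync() as the "batched" alternative are assumptions made for the example, and os.sync() is not equivalent to a btrfs transaction.

```python
import os


def write_files(directory, payloads, fsync_each):
    """Write each payload to its own file; return the number of fsync() calls.

    With fsync_each=True every file is flushed to disk individually.
    With fsync_each=False nothing is flushed per file and a single
    os.sync() is issued once the whole batch has been written.
    """
    fsync_calls = 0
    for index, payload in enumerate(payloads):
        path = os.path.join(directory, f"file{index}")
        with open(path, "wb") as handle:
            handle.write(payload)
            if fsync_each:
                handle.flush()              # push Python's buffer to the kernel
                os.fsync(handle.fileno())   # force this one file to stable storage
                fsync_calls += 1
    if not fsync_each:
        os.sync()  # one system-wide flush for the whole batch (Unix only)
    return fsync_calls
```

Timing the two modes on the same set of small files is one way to see how much of the run is spent waiting on per-file flushes on a given filesystem.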

For now, btrfs is only worth considering if you either run stable and
don't mind upgrades taking forever, or use eatmydata.  The latter is
actually safe if you take btrfs snapshots and revert when power fails during
a dpkg run, but sadly it currently requires quite a bit of manual work,
especially to build an appropriate filesystem layout.

A machine that's primarily a mail server could use btrfs for the filesystem
that holds the mail, of course.

Outside of dpkg, sqlite in non-WAL mode, other databases and virtualbox/
qemu, btrfs is pretty fast.

-- 
How to squander your resources: those silly Swedes have a sauce named
"hovmästarsås", the best thing ever to put on cheese, yet they waste it
solely on mere salmon.


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20121129165206.ga23...@angband.pl



Re: Maildir vs. mbox in Debian

2012-11-29 Thread Vincent Lefevre
On 2012-11-29 16:16:25 +0100, Adam Borowski wrote:
> *cough* btrfs -ocompress=lzo.  Small files are packed inline in metadata
> blocks, and you get compression you wanted.  Using lzo is faster than no
> compression for most loads, adding negligible cost for incompressible data
> (especially if not all cores are at 100% usage).

Great! Nice to know.

This should be the default in Debian. :)

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)





Re: Maildir vs. mbox in Debian

2012-11-29 Thread Andrey Rahmatullin
On Thu, Nov 29, 2012 at 04:16:25PM +0100, Adam Borowski wrote:
> *cough* btrfs -ocompress=lzo.  Small files are packed inline in metadata
> blocks, and you get compression you wanted.  
It's nice to see more features from '93 Windows NT implemented for Linux
at last.

-- 
WBR, wRAR




Re: Maildir vs. mbox in Debian

2012-11-29 Thread Adam Borowski
On Thu, Nov 29, 2012 at 03:20:34PM +0100, Vincent Lefevre wrote:
> On 2012-11-29 01:28:37 +0100, Christoph Anton Mitterer wrote:
> > But it also has disadvantages to the mbox formats which may be crucial
> > for some people:
> > - wasting a lot of storage, which can be significant even if you use
> > small file systems block sizes...
> 
> This is a problem with the file system, not with maildir. Here's
> an example of a file system (though not for Unix) that was partly
> optimized to avoid these block size problems:
> 
>   http://www.chiark.greenend.org.uk/~theom/riscos/docs/ultimate/a252efmt.txt
> 
> (just for the pleasure of citing mail from 1990 :)
> 
> Now, I would say that in general, the wasted space is small compared
> to large attachments. And if you have only text and care about disk
> space, you should consider a compressed format, not "pure mbox".

*cough* btrfs -ocompress=lzo.  Small files are packed inline in metadata
blocks, and you get compression you wanted.  Using lzo is faster than no
compression for most loads, adding negligible cost for incompressible data
(especially if not all cores are at 100% usage).

-- 
How to squander your resources: those silly Swedes have a sauce named
"hovmästarsås", the best thing ever to put on cheese, yet they waste it
solely on mere salmon.





Re: Maildir vs. mbox in Debian

2012-11-29 Thread Vincent Lefevre
On 2012-11-29 15:30:47 +0100, Christoph Anton Mitterer wrote:
> On Thu, 2012-11-29 at 15:20 +0100, Vincent Lefevre wrote:
> > Now, I would say that in general, the wasted space is small compared
> > to large attachments. And if you have only text and care about disk
> > space, you should consider a compressed format, not "pure mbox".
> Well it's not that small:
> http://dovecot.org/pipermail/dovecot/2012-October/069130.html

So, around 10% in this example. Not really significant. If this 10%
is a problem, I would also consider other means, such as compressing
the mailbox (one can gain 50% or more) and/or removing "useless"
headers.
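
As a rough way to estimate the block-size overhead for a given set of messages, here is a minimal sketch. It assumes every file is simply rounded up to a whole number of blocks; it ignores inode overhead, inline or tail packing and compression, and the function names and the 4096-byte default are choices made for the example.

```python
def slack_bytes(file_sizes, block_size=4096):
    """Return the bytes lost by rounding every file up to whole blocks."""
    allocated = sum(-(-size // block_size) * block_size for size in file_sizes)
    return allocated - sum(file_sizes)


def slack_ratio(file_sizes, block_size=4096):
    """Return the slack as a fraction of the allocated space (0.0 if empty)."""
    allocated = sum(file_sizes) + slack_bytes(file_sizes, block_size)
    return slack_bytes(file_sizes, block_size) / allocated if allocated else 0.0
```

Feeding it the message sizes of a real maildir (for instance collected with os.scandir()) gives an upper bound on what a single-file format could save on a filesystem without small-file packing.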

> > This depends. I index my archive mailbox with mairix, and with mairix,
> > it is better to use maildir as a search result is built with symlinks
> > instead of copying the individual mail messages.
> Do these tools (mairix, notmuch, etc.) also help with real full text
> search? I just thought they'd index some stuff.

If by real full text you mean a sequence of words, one solution is
to search for one or two meaningful words with mairix (this should be
very fast), then do a full search on the resulting mailbox (which
should be small enough). This might also be scripted...
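
A minimal sketch of the second stage, assuming the index search has already written its results to a maildir folder. The mairix invocation itself is not shown, and the function name and the folder path passed to it are hypothetical. Only text/plain parts are searched, and whitespace is collapsed so that a phrase can match across line breaks.

```python
import mailbox


def messages_containing(maildir_path, phrase):
    """Yield (key, subject) for messages whose plain-text body contains phrase."""
    needle = " ".join(phrase.lower().split())
    box = mailbox.Maildir(maildir_path, create=False)
    for key, message in box.iteritems():
        for part in message.walk():
            if part.get_content_type() != "text/plain":
                continue
            payload = part.get_payload(decode=True) or b""
            charset = part.get_content_charset() or "utf-8"
            try:
                text = payload.decode(charset, errors="replace")
            except LookupError:  # unknown charset name in the message
                text = payload.decode("utf-8", errors="replace")
            # Collapse runs of whitespace so wrapped lines do not break a match.
            if needle in " ".join(text.lower().split()):
                yield key, message.get("Subject", "")
                break
```

Because the result folder is small, reading every message in it is cheap compared with scanning the whole archive.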

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)





Re: Maildir vs. mbox in Debian

2012-11-29 Thread Ryan Kavanagh
On Thu, Nov 29, 2012 at 03:30:47PM +0100, Christoph Anton Mitterer wrote:
> Do these tools (mairix, notmuch, etc.) also help with real full text
> search? I just thought they'd index some stuff.

I can't speak for mairix, etc., but notmuch can handle full text search.
To quote from notmuch-search-terms(7),

    The search terms can consist of free-form text (and quoted phrases)
    which will match all messages that contain all of the given
    terms/phrases in the body, the subject, or any of the sender or
    recipient headers.

Ryan

-- 
|_)|_/  Ryan Kavanagh |  GnuPG key
| \| \  http://ryanak.ca/ |  4A11C97A





Re: Maildir vs. mbox in Debian

2012-11-29 Thread Christoph Anton Mitterer
On Thu, 2012-11-29 at 15:20 +0100, Vincent Lefevre wrote:
> On 2012-11-29 01:28:37 +0100, Christoph Anton Mitterer wrote:
> > But it also has disadvantages to the mbox formats which may be crucial
> > for some people:
> > - wasting a lot of storage, which can be significant even if you use
> > small file systems block sizes...
> This is a problem with the file system, not with maildir.
Well, I don't think you can say that so easily... or at least not in
practice... because all the major filesystems seem to have this "problem".
Even if you use Reiser3 (which has its own issues anyway), its packed mode
has other drawbacks...



>   http://www.chiark.greenend.org.uk/~theom/riscos/docs/ultimate/a252efmt.txt
> (just for the pleasure of citing mail from 1990 :)
:D


> Now, I would say that in general, the wasted space is small compared
> to large attachments. And if you have only text and care about disk
> space, you should consider a compressed format, not "pure mbox".
Well it's not that small:
http://dovecot.org/pipermail/dovecot/2012-October/069130.html


> This depends. I index my archive mailbox with mairix, and with mairix,
> it is better to use maildir as a search result is built with symlinks
> instead of copying the individual mail messages.
Do these tools (mairix, notmuch, etc.) also help with real full text
search? I just thought they'd index some stuff.


Chris.




Re: Maildir vs. mbox in Debian

2012-11-29 Thread Vincent Lefevre
On 2012-11-29 01:28:37 +0100, Christoph Anton Mitterer wrote:
> But it also has disadvantages to the mbox formats which may be crucial
> for some people:
> - wasting a lot of storage, which can be significant even if you use
> small file systems block sizes...

This is a problem with the file system, not with maildir. Here's
an example of a file system (though not for Unix) that was partly
optimized to avoid these block size problems:

  http://www.chiark.greenend.org.uk/~theom/riscos/docs/ultimate/a252efmt.txt

(just for the pleasure of citing mail from 1990 :)

Now, I would say that in general, the wasted space is small compared
to large attachments. And if you have only text and care about disk
space, you should consider a compressed format, not "pure mbox".

> - full text search will typically be slower, as one has to open/close
> many files

This depends. I index my archive mailbox with mairix, and with mairix,
it is better to use maildir as a search result is built with symlinks
instead of copying the individual mail messages.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)





Re: Maildir vs. mbox in Debian

2012-11-29 Thread Andrei POPESCU
On Thu, 29 Nov 12, 11:35:44, Ivan Shmakov wrote:
> 
>   What are the estimates?  And wouldn't it be better to use some
>   kind of a specialized search engine if searching is deemed
>   “crucial”?  I guess that it may render the difference between
>   the formats somewhat irrelevant.

Something like notmuch for example.

Kind regards,
Andrei
-- 
Offtopic discussions among Debian users and developers:
http://lists.alioth.debian.org/mailman/listinfo/d-community-offtopic




Re: Maildir vs. mbox in Debian

2012-11-28 Thread jidanni
I wouldn't put all my eggs in the same single file.





Re: Maildir vs. mbox in Debian

2012-11-28 Thread Ivan Shmakov
> Christoph Anton Mitterer  writes:

[…]

 > But it also has disadvantages to the mbox formats which may be
 > crucial for some people:

 > - wasting a lot of storage, which can be significant even if you use
 > small file systems block sizes...

Only as long as static mbox files are considered.  However,
removing a message from an mbox requires an amount of free space
equal to the size of the mbox in question, sans the message to
be removed.  OTOH, removing a message from Maildir requires no
additional filesystem space.
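
The difference can be sketched with Python's standard mailbox module (a minimal example; the function names are made up, and the paths and keys are whatever the caller supplies). As far as I know, that module's mbox flush writes the remaining messages to a temporary file and renames it over the original, which is the extra-space behaviour described above, whereas the Maildir removal is a single unlink.

```python
import mailbox
import os


def remove_from_maildir(maildir_path, key):
    """Remove one message: only that message's file is deleted."""
    box = mailbox.Maildir(maildir_path, create=False)
    box.remove(key)


def remove_from_mbox(mbox_path, key):
    """Remove one message; flushing rewrites all remaining messages."""
    box = mailbox.mbox(mbox_path, create=False)
    box.lock()  # mbox needs locking against concurrent delivery
    try:
        box.remove(key)
        box.flush()  # the rewrite of the whole file happens here
    finally:
        box.unlock()
        box.close()
```

Note that the mbox variant also needs the lock for the whole duration of the rewrite, which grows with the size of the mailbox rather than the size of the message.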

 > - full text search will typically be slower, as one has to open/close
 > many files

What are the estimates?  And wouldn't it be better to use some
kind of a specialized search engine if searching is deemed
“crucial”?  I guess that it may render the difference between
the formats somewhat irrelevant.

-- 
FSF associate member #7257





Re: Maildir vs. mbox in Debian

2012-11-28 Thread Christoph Anton Mitterer
On Wed, 2012-11-28 at 14:32 +0700, Ivan Shmakov wrote:
>  > # With the advent and now widespread adoption of the superior Maildir
>  > # format over the past several years, the entire "mbox" family of
>  > # mailbox formats is gradually becoming irrelevant, and of only
>  > # historical interest.

Just throwing something in here:
While mbox is frowned upon by many people and maildir is declared
generally superior... this is not totally true, IMHO.

Maildir of course has the following advantages over the mbox family of
formats:
- fewer or even no locking issues
- no obscurities of different subformats
- no need for any message separation and therefore no need for any
quoting as in mbox
- messages need not be "modified" in the sense that the message flags
(sent, forwarded, etc.) or other meta-data (IMAP's UID, etc.) are stored
as pseudo headers
- given that each mail is a single file, life is easier with things
like backups or integrity protection...
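
To illustrate the point about flags: in Maildir they are carried in the file name (after ":2,") rather than inside the message. A minimal sketch of reading them follows; the helper name and the example file name in the test are made up, and only the standard single-letter flags are handled.

```python
# Standard Maildir flag letters; anything else in the info section is ignored.
FLAG_NAMES = {
    "D": "draft",
    "F": "flagged",
    "P": "passed",
    "R": "replied",
    "S": "seen",
    "T": "trashed",
}


def parse_maildir_flags(filename):
    """Return (unique_name, sorted list of flag names) for a Maildir file name."""
    unique, separator, info = filename.partition(":")
    if not separator or not info.startswith("2,"):
        return unique, []  # no info section, e.g. a file still in new/
    return unique, sorted(FLAG_NAMES[c] for c in info[2:] if c in FLAG_NAMES)
```

Changing a flag is then a rename of the file, so the message content itself never has to be rewritten.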


But it also has disadvantages to the mbox formats which may be crucial
for some people:
- wasting a lot of storage, which can be significant even if you use
small file systems block sizes...
- full text search will typically be slower, as one has to open/close
many files



Cheers,
Chris.




Maildir vs. mbox in Debian

2012-11-27 Thread Ivan Shmakov
> Adam Borowski  writes:

[…]

 > Quoting from that page:

 > # With the advent and now widespread adoption of the superior Maildir
 > # format over the past several years, the entire "mbox" family of
 > # mailbox formats is gradually becoming irrelevant, and of only
 > # historical interest.

 > which is no news.  And you can't really run a mail server in mbox if
 > you ever receive mail from business users: for them, sending the text
 > as an image wrapped in a Word document is the rule rather than an
 > exception[1].

Unfortunately, it's not just the business users.  The so-called
“office productivity suites” are seemingly widespread in
academia and science, for instance.

[…]

 > So, what's the reason mbox is still the default in Debian?

That's what I wonder about, too.

[…]

 > With current disk sizes, no one should care about a few gigs here, a
 > few gigs there.  Unless you need to read a mbox linearly, that is.

Seconded.

JFTR: I switched my mailservers to Maildir c. 2006, for much
improved performance and manageability, and have never had an issue
with that.

-- 
FSF associate member #7257

