Joseph Tam <jtam.h...@gmail.com> wrote:
> Sven Hartge <s...@svenhartge.de> wrote:

>> Interesting datapoint: NetApp deduplication recovered only about 1%
>> of storage space with mdbox-based mail storage, while on a
>> maildir-based mail storage, the rate was about 15%. (This was tested
>> with a copy of real user data, so it is accurate for my workload.)

> Just a guess, but I expect the difference is because NetApp de-dupes
> by checksumming blocks and marking whole blocks as duplicates if they
> have the same checksum.

> The message body has the same block offset in maildir (i.e. the start
> of a message is at byte 0), whereas mdbox might align message body
> anywhere in a block, so you might have 512 different block
> configurations for the same message.

True, in maildir the start of the message is always at byte 0, but the
headers of the same message differ in length from user to user
(different mail addresses with different lengths), so the body will
generally not start at the same byte.
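
To illustrate the alignment effect, here is a minimal Python sketch. It
is not how NetApp implements deduplication; it only assumes a
deduplicator that checksums fixed-size blocks. The 4096-byte block
size, the example headers, and the helper names block_hashes and
shared_blocks are all made up for the illustration.

```python
import hashlib
import random

BLOCK = 4096  # assumed dedup block size, for illustration only


def block_hashes(data, block=BLOCK):
    """Checksum each fixed-size block, as a block-level deduplicator might."""
    return [
        hashlib.sha256(data[i:i + block]).digest()
        for i in range(0, len(data), block)
    ]


def shared_blocks(a, b):
    """Count the blocks of b whose checksum also occurs among the blocks of a."""
    seen = set(block_hashes(a))
    return sum(1 for h in block_hashes(b) if h in seen)


# The same (random, hence incompressible) body delivered to two users.
rng = random.Random(0)
body = bytes(rng.getrandbits(8) for _ in range(10 * BLOCK))

# Hypothetical headers that differ only in the length of the address.
header_a = b"To: al@example.org\r\n\r\n"
header_b = b"To: a.much.longer.name@example.org\r\n\r\n"

# Identical headers: every block lines up and can be deduplicated.
same_headers = shared_blocks(header_a + body, header_a + body)

# Different header lengths: the body is shifted by a few bytes, so the
# block boundaries cut it at different places and no checksum matches.
diff_headers = shared_blocks(header_a + body, header_b + body)

print("matching blocks, same headers:     ", same_headers)
print("matching blocks, different headers:", diff_headers)
```

A shift of only a few bytes is enough to change every block checksum,
even though the body itself is byte-for-byte identical.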

In the end, light compression (gzip level 3) via Dovecot resulted in
better space savings than compression and deduplication via NetApp.

The most space can obviously be saved via SiS of attachments in Dovecot,
but to be frank, this feature scares me a bit.

Regards,
Sven.

-- 
Sigmentation fault. Core dumped.
