Re: [Dovecot] Doubts about dsync, mdbox, SIS

2012-02-03 Thread Jan-Frode Myklebust
On Thu, Feb 02, 2012 at 02:41:11PM +0200, Timo Sirainen wrote:
> 
> >> That is most likely related to your troubles. If the dsync runs crash,
> >> the result could leave extra files lying around etc..
> > 
> > If dsync backup is supposed to be a viable backup solution, I think it
> > should fail much better. If it sees errors on the target side it should
> > clear the target and do a full sync. Manually cleaning up after its
> > problems is too much work.
> 
> Of course. But if no one gives me enough information to reproduce problems, I 
> can't really fix anything. I don't really have time to spend guessing ways to 
> make it break. I've been using dsync to back up my own mail for over a year,
> with zero problems.

I'm reducing the complexity now, removing SIS and starting the backups
from scratch again. I'll start posting the problems I see over the
weekend..

> 
> >>>   Error: Mailboxes don't have unique GUIDs: 
> >>> 08b46439069d3d4db049e671bf84 is shared by INBOX and INBOX
> 
> What about:
> 
> doveadm mailbox status -u user@domain guid '*'
> 
> in source server?

INBOX   guid=08b46439069d3d4db049e671bf84
INBOX.Sent  guid=e8f6e431bf6e014f2d78e671bf84
INBOX.Trash guid=c858f2234a1d5d4e154758d3d19f
INBOX.Drafts guid=e9f6e431bf6e014f2d78e671bf84
INBOX.Spam  guid=eaf6e431bf6e014f2d78e671bf84
INBOX.Sent Messages guid=d837512bed7d674e685c58d3d19f
INBOX.INBOX.Sent Messages guid=ebf6e431bf6e014f2d78e671bf84
INBOX.Notes guid=c0d2250109645e4eed5c58d3d19f

> in dest server? Does one list show two INBOXes or otherwise duplicate GUIDs? 
> Perhaps this was a bug in v2.0.14..

Scratched dest server before I replied.. sorry. 
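
A minimal sketch (not part of Dovecot) of how the "doveadm mailbox status ... guid '*'" output from both servers could be checked for duplicate GUIDs. It assumes one mailbox per line in the "name guid=<hex>" form pasted above; the function name duplicate_guids and the shortened sample are illustrative only.

```python
import re
from collections import defaultdict

# Assumed line shape: "<mailbox name><whitespace>guid=<hex digits>".
LINE_RE = re.compile(r"^(?P<name>.+?)\s+guid=(?P<guid>[0-9a-f]+)\s*$")

def duplicate_guids(status_output):
    """Map every GUID seen more than once to the mailboxes that carry it."""
    by_guid = defaultdict(list)
    for line in status_output.splitlines():
        match = LINE_RE.match(line)
        if match:
            by_guid[match.group("guid")].append(match.group("name"))
    return {guid: names for guid, names in by_guid.items() if len(names) > 1}

# Shortened copy of the source-server listing from this thread.
sample = """\
INBOX   guid=08b46439069d3d4db049e671bf84
INBOX.Sent  guid=e8f6e431bf6e014f2d78e671bf84
INBOX.Sent Messages guid=d837512bed7d674e685c58d3d19f
"""
print(duplicate_guids(sample))
```

Saving the listing from each server to a file and feeding both through the same function would show whether either side reports a GUID twice.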


> 
> >>>   Error: Failed to sync mailbox INBOX.ferie 2006.: Invalid mailbox name
> >> 
> >> Is this a namespace prefix? It shouldn't be trying to sync a mailbox
> >> named this (there's an extra "." suffix).
> > 
> > I believe it's a folder named "INBOX.ferie 2006.", with the user using
> > the namespace separator in the folder name..  I believe dovecot allows
> > this, so it should also handle backing it up.
> 
> It has never been possible to create such a folder via Dovecot. The IMAP protocol
> itself prevents that. "CREATE foo." will end up creating "foo", not "foo." If 
> you manually mkdir that, it's not possible to access the mailbox in any way 
> via Dovecot. Everything will simply fail as:

Oh, sorry.. then this is a problem created by @mail, which poked
directly in the filesystem. Guess we'll have to clean these up manually.


  -jf


Re: [Dovecot] Doubts about dsync, mdbox, SIS

2012-02-02 Thread Timo Sirainen
On 2.2.2012, at 13.31, Jan-Frode Myklebust wrote:

>>> "/srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-057274283bb51f4f917ebf34f6ab" is
>>> missing, but there are 205 other copies of this file named
>>> /srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-* with
>>> identical sha1sum.
>> 
>> All of them have a link count of 2, with the other link being in hashes/
>> directory?
> 
> No, these have link count=207.

OK, so they aren't actual copies, they are links to the same file.

> I don't know what you mean by link being in hashes directory.

If you have e.g. aa/bb/aabbccdd- file, there should be a matching 
aa/bb/hashes/aabbccdd file.
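
The layout described above can be checked with a short script. This is only a sketch under the assumption stated in this message: every "<hash>-<suffix>" file should have a "hashes/<hash>" entry in the same directory. The function name missing_hash_links is made up for illustration, and it only tests that the entry exists, not that it is a hard link to the same inode.

```python
import os
import tempfile

def missing_hash_links(attachment_root):
    """Yield attachment files that have no hashes/<hash> entry beside them."""
    for dirpath, _dirnames, filenames in os.walk(attachment_root):
        if os.path.basename(dirpath) == "hashes":
            continue  # entries inside hashes/ are the targets, not the files to check
        for name in filenames:
            if "-" not in name:
                continue
            content_hash = name.split("-", 1)[0]
            if not os.path.exists(os.path.join(dirpath, "hashes", content_hash)):
                yield os.path.join(dirpath, name)

# Demo on a throwaway tree; no real attachment directory is touched.
with tempfile.TemporaryDirectory() as root:
    leaf = os.path.join(root, "aa", "bb")
    os.makedirs(os.path.join(leaf, "hashes"))
    for name in ("aabbccdd-1", "aabbccee-1", os.path.join("hashes", "aabbccdd")):
        open(os.path.join(leaf, name), "w").close()
    demo_missing = [os.path.relpath(p, root) for p in missing_hash_links(root)]
    print(demo_missing)
```

In the demo, only the file whose hash has no entry under hashes/ is reported.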

>> That is most likely related to your troubles. If the dsync runs crash,
>> the result could leave extra files lying around etc..
> 
> If dsync backup is supposed to be a viable backup solution, I think it
> should fail much better. If it sees errors on the target side it should
> clear the target and do a full sync. Manually cleaning up after its
> problems is too much work.

Of course. But if no one gives me enough information to reproduce problems, I 
can't really fix anything. I don't really have time to spend guessing ways to 
make it break. I've been using dsync to back up my own mail for over a year,
with zero problems.

>>> Error: Mailboxes don't have unique GUIDs: 
>>> 08b46439069d3d4db049e671bf84 is shared by INBOX and INBOX

What about:

doveadm mailbox status -u user@domain guid '*'

on the source server? And on the dest server? Does one list show two INBOXes or otherwise
duplicate GUIDs? Perhaps this was a bug in v2.0.14..

>>> Error: Failed to sync mailbox INBOX.ferie 2006.: Invalid mailbox name
>> 
>> Is this a namespace prefix? It shouldn't be trying to sync a mailbox
>> named this (there's an extra "." suffix).
> 
> I believe it's a folder named "INBOX.ferie 2006.", with the user using
> the namespace separator in the folder name..  I believe dovecot allows
> this, so it should also handle backing it up.

It has never been possible to create such a folder via Dovecot. The IMAP protocol
itself prevents that. "CREATE foo." will end up creating "foo", not "foo." If 
you manually mkdir that, it's not possible to access the mailbox in any way via 
Dovecot. Everything will simply fail as:

a select foo.
a NO [CANNOT] Invalid mailbox name
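
For finding such folders before a sync, a name check along these lines might help. It is a rough sketch, not Dovecot's own validation: it assumes "." as the hierarchy separator (as in the namespace config in this thread) and flags any name with an empty hierarchy level, which covers a trailing separator as well as a leading or doubled one. The function name has_empty_component is hypothetical.

```python
def has_empty_component(mailbox, separator="."):
    """True if splitting on the separator leaves an empty hierarchy level."""
    return any(part == "" for part in mailbox.split(separator))

# Names taken from this thread; only the one with the trailing "." is flagged.
names = ["INBOX.Sent", "INBOX.ferie 2006.", "INBOX.Notes"]
bad = [name for name in names if has_empty_component(name)]
print(bad)  # ['INBOX.ferie 2006.']
```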

Re: [Dovecot] Doubts about dsync, mdbox, SIS

2012-02-02 Thread Jan-Frode Myklebust
On Thu, Feb 02, 2012 at 12:31:20PM +0100, Jan-Frode Myklebust wrote:
> 
> and numlinks=4:
> 
>   # ls -al /srv/mailbackup/attachments/c3/1b/c31beb42ef78810f7fb81a7086144034fb0fd794*|wc -l
>   3
> 
> Is Dovecot somehow creating numlinks+1 copies of every file it
> hardlinks?? That would explain my disk usage :-)
> 

Sorry, brainfart.. Yes, these are hardlinks to the same inode..


# ls -i  c31beb42ef78810f7fb81a7086144034fb0fd794* ../c31beb42ef78810f7fb81a7086144034fb0fd794*
2422693 c31beb42ef78810f7fb81a7086144034fb0fd794
2422693 ../c31beb42ef78810f7fb81a7086144034fb0fd794-13b405342e24284f6153bf34f6ab
2422693 ../c31beb42ef78810f7fb81a7086144034fb0fd794-1cb405342e24284f6153bf34f6ab
2422693 ../c31beb42ef78810f7fb81a7086144034fb0fd794-4eb405342e24284f6153bf34f6ab
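
The same check can be scripted when there are many names to compare. The sketch below groups names by (device, inode) and sums sizes once per inode, which shows why several hard links cost the space of one file. The helper names and the demo file names are illustrative, and the demo runs in a temporary directory rather than on real attachments.

```python
import os
import tempfile
from collections import defaultdict

def group_by_inode(paths):
    """Group path names by the (device, inode) pair they refer to."""
    groups = defaultdict(list)
    for path in paths:
        st = os.lstat(path)
        groups[(st.st_dev, st.st_ino)].append(path)
    return dict(groups)

def unique_bytes(paths):
    """Sum apparent file sizes, counting each inode only once."""
    seen = {}
    for path in paths:
        st = os.lstat(path)
        seen[(st.st_dev, st.st_ino)] = st.st_size
    return sum(seen.values())

# Demo: one 1000-byte file plus two hard links to it.
with tempfile.TemporaryDirectory() as d:
    first = os.path.join(d, "c31b")
    with open(first, "wb") as f:
        f.write(b"x" * 1000)
    names = [first]
    for suffix in ("-13b4", "-1cb4"):
        os.link(first, first + suffix)
        names.append(first + suffix)
    demo_groups = group_by_inode(names)
    demo_bytes = unique_bytes(names)
    demo_nlink = os.lstat(first).st_nlink
    print(len(demo_groups), demo_bytes, demo_nlink)
```

Three names, one inode, a link count of 3, and the size counted once: the same picture as the listing above.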


  -jf


Re: [Dovecot] Doubts about dsync, mdbox, SIS

2012-02-02 Thread Jan-Frode Myklebust
On Thu, Feb 02, 2012 at 12:23:01PM +0200, Timo Sirainen wrote:
> 
> Note that with SIS the attachments aren't compressed.

Yes, I know. 

> 
> > "/srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-057274283bb51f4f917ebf34f6ab" is
> > missing, but there are 205 other copies of this file named
> > /srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-* with
> > identical sha1sum.
> 
> All of them have a link count of 2, with the other link being in hashes/
> directory?

No, these have link count=207. I don't know what you mean by link being
in hashes directory.

# ls -l /srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-*|head
-rw--- 207 mailbackup mailbackup 149265 Jan  9 23:31 /srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-0069222e0c080f4f754abf34f6ab
-rw--- 207 mailbackup mailbackup 149265 Jan  9 23:31 /srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-00ffb9312a370e4f6b61bf34f6ab
-rw--- 207 mailbackup mailbackup 149265 Jan  9 23:31 /srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-0442c5163ad3114fb478bf34f6ab
-rw--- 207 mailbackup mailbackup 149265 Jan  9 23:31 /srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-04f288390052144f012dbf34f6ab
-rw--- 207 mailbackup mailbackup 149265 Jan  9 23:31 /srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-053b6c0f185a0d4fc421bf34f6ab
-rw--- 207 mailbackup mailbackup 149265 Jan  9 23:31 /srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-06c98213c3b30e4fac3cbf34f6ab
-rw--- 207 mailbackup mailbackup 149265 Jan  9 23:31 /srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-076573234fbd0b4fa862bf34f6ab

This is just one example, I can provide tons of other examples.. Hmm,
I see now that there are 206 files of that first example with the 207
links, and here's another example with numlinks=7:

# ls -l /srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-*|wc -l
206

and numlinks=4:

# ls -al /srv/mailbackup/attachments/c3/1b/c31beb42ef78810f7fb81a7086144034fb0fd794*|wc -l
3

Is Dovecot somehow creating numlinks+1 copies of every file it
hardlinks?? That would explain my disk usage :-)



> That is most likely related to your troubles. If the dsync runs crash,
> the result could leave extra files lying around etc..

If dsync backup is supposed to be a viable backup solution, I think it
should fail much better. If it sees errors on the target side it should
clear the target and do a full sync. Manually cleaning up after its
problems is too much work.

> 
> > Some samples:
> > 
> > Error: Mailboxes don't have unique GUIDs: 
> > 08b46439069d3d4db049e671bf84 is shared by INBOX and INBOX
> 
> This is a little bit strange. What is the doveconf -n output of the
> source server?


# 2.0.14: /etc/dovecot/dovecot.conf
# OS: Linux 2.6.18-194.26.1.el5 x86_64 Red Hat Enterprise Linux Server release 5.5 (Tikanga)
auth_cache_size = 100 M
auth_verbose = yes
auth_verbose_passwords = sha1
disable_plaintext_auth = no
login_trusted_networks = 192.168.0.0/16
mail_gid = 3000
mail_home = /srv/mailstore/%256RHu/%d/%n
mail_location = maildir:~/:INDEX=/indexes/%1u/%1.1u/%u
mail_max_userip_connections = 20
mail_plugins = quota zlib
mail_uid = 3000
maildir_stat_dirs = yes
maildir_very_dirty_syncs = yes
managesieve_notify_capability = mailto
managesieve_sieve_capability = fileinto reject envelope encoded-character vacation subaddress comparator-i;ascii-numeric relational regex imap4flags copy include variables body enotify environment mailbox date
mmap_disable = yes
namespace {
  inbox = yes
  location = 
  prefix = INBOX.
  separator = .
  type = private
}
passdb {
  args = /etc/dovecot/dovecot-ldap.conf.ext
  driver = ldap
}
plugin {
  quota = dict:UserQuota::file:%h/dovecot-quota
  sieve = /sieve/%1u/%1.1u/%u/.dovecot.sieve
  sieve_dir = /sieve/%1u/%1.1u/%u
  sieve_max_script_size = 1M
  zlib_save = gz
  zlib_save_level = 6
}
postmaster_address = postmas...@example.net
protocols = imap pop3 lmtp sieve
service auth-worker {
  user = $default_internal_user
}
service auth {
  client_limit = 4521
  unix_listener auth-userdb {
    group =
    mode = 0600
    user = atmail
  }
}
service imap-login {
  inet_listener imap {
    address = *
    port = 143
  }
  process_min_avail = 4
  service_count = 0
  vsz_limit = 1 G
}
service imap-postlogin {
  executable = script-login /usr/local/sbin/imap-postlogin.sh
}
service imap {
  executable = imap imap-postlogin
  process_limit = 2048
}
service lmtp {
  client_limit = 1
  inet_listener lmtp {
    address = *
    port = 24
  }
  process_limit = 25
}
service managesieve-login {
  inet_listener sieve {
  

Re: [Dovecot] Doubts about dsync, mdbox, SIS

2012-02-02 Thread Timo Sirainen
On Wed, 2012-02-01 at 13:29 +0100, Jan-Frode Myklebust wrote:

> I'm surprised that the destination server is so large, was expecting zlib and
> mdbox and SIS would compress it down to much less than what we're seeing 
> (12TB -> 5TB):

Note that with SIS the attachments aren't compressed.

> Lots and lots of the attachment storage is duplicated into identical files,
> instead of hard linked.

Something's wrong then.

> When running "doveadm purge -u $user", we're seeing lots of 
> 
>   Error: unlink(/srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-057274283bb51f4f917ebf34f6ab) failed: No such file or directory

Something's wrong.

> "/srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-057274283bb51f4f917ebf34f6ab" is
> missing, but there are 205 other copies of this file named
> /srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-* with
> identical sha1sum.

All of them have a link count of 2, with the other link being in hashes/
directory?

> Also on the source side, during dsync, we see too many problems. 

That is most likely related to your troubles. If the dsync runs crash,
the result could leave extra files lying around etc..

> Some samples:
> 
>   Error: Mailboxes don't have unique GUIDs: 
> 08b46439069d3d4db049e671bf84 is shared by INBOX and INBOX

This is a little bit strange. What is the doveconf -n output of the
source server?

>   Error: Failed to sync mailbox INBOX.ferie 2006.: Invalid mailbox name

Is this a namespace prefix? It shouldn't be trying to sync a mailbox
named this (there's an extra "." suffix).

>   Error: read() from proxy client failed: EOF

I guess the remote dsync crashed or otherwise aborted.

>   Error: Failed to sync mailbox INBOX.INBOX.Gerda: Mailbox doesn't exist: 
> INBOX/Gerda

I guess some kind of mismatch related to namespace configuration.

>   Error: read() failed: Broken pipe
>   Panic: file dsync-worker-local.c: line 1678 
> (local_worker_save_msg_continue): assertion failed: (ret == -1)

It probably can't properly handle the remote dsync dying. Of course it
still shouldn't crash. There seem to be some bugs left when dsyncing to
a remote host (instead of locally).

It would help if I could reproduce the errors that you're seeing. Can
you easily reproduce them with some accounts? If so, if you can give
enough details for me to reproduce the problems I can fix them. (Except
for the "file not found" issues, since those problems occurred earlier
already. I should probably somehow make Dovecot fix those missing files
though..)



Re: [Dovecot] Doubts about dsync, mdbox, SIS

2012-02-02 Thread Jan-Frode Myklebust
On Thu, Feb 02, 2012 at 08:46:55AM +0100, Alessio Cecchi wrote:
> 
> How many users are there in this installation?

Quite a few :-) This is for an ISP.

> >The active servers are using Maildir, and have:
> >
> > $ df -h /usr/local/atmail/users/
> > Filesystem            Size  Used Avail Use% Mounted on
> > /dev/atmailusers       14T   12T  2.2T  85% /usr/local/atmail/users
> > $ df -hi /usr/local/atmail/users/
> > Filesystem            Inodes   IUsed   IFree IUse% Mounted on
> > /dev/atmailusers        145M    113M     33M   78% /usr/local/atmail/users
> >
> >very little of this is compressed (zlib plugin enabled during Christmas).
> 
> This is the old storage in Maildir format?

Correct.

> 
> >I'm surprised that the destination server is so large, was expecting zlib and
> >mdbox and SIS would compress it down to much less than what we're seeing
> >(12TB ->  5TB):
> >
> > $ df -h /srv/mailbackup
> > Filesystem            Size  Used Avail Use% Mounted on
> > /dev/mapper/mailbackupvg-mailbackuplv
> >                       5.7T  4.8T  882G  85% /srv/mailbackup
> 
> This is the new storage in mdbox format?

Correct.

> What size would you expect?

With Maildir I see message files shrink to about 20%* of their original size
after turning on zlib with zlib_save_level=6. I was expecting better
compression with mdbox (and zlib_save_level=9), and I would expect SIS to
help even further.

The mdbox+SIS+zlib_save_level=9 variant taking up 40% of the space of a mixed**
compressed/non-compressed Maildir storage isn't very impressive to me --
and the mdbox backup isn't even complete (it only covers the 25% most active users).

Yes, I see there might be holes in my logic, expecting compressed messages to
compress further after the move to mdbox. But I also expect that
most of the messages are not already compressed on the Maildir side.
Sorry, these are expectations and guesses, not hard facts.

[*] based on a couple of samples, not thorough research
[**] Only messages saved after we enabled zlib on December 25 are compressed.
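
The two guesses above (level 9 versus level 6, and recompressing data that is already compressed) can be tried out with Python's zlib module. This is only a sketch on synthetic word-like text, not on real mail, so the ratios it prints say nothing about an actual mailstore. The zlib plugin's "gz" setting writes gzip files, while zlib.compress uses the zlib container; both wrap the same deflate algorithm.

```python
import random
import zlib

# Synthetic "text": 20000 words drawn from a made-up 500-word vocabulary.
rng = random.Random(0)
letters = "abcdefghijklmnopqrstuvwxyz"
vocab = ["".join(rng.choice(letters) for _ in range(rng.randint(3, 9)))
         for _ in range(500)]
text = " ".join(rng.choice(vocab) for _ in range(20000)).encode("ascii")

level6 = zlib.compress(text, 6)   # comparable to zlib_save_level = 6
level9 = zlib.compress(text, 9)   # comparable to zlib_save_level = 9
twice = zlib.compress(level9, 9)  # compressing already-compressed data again

print("original:", len(text))
print("level 6: ", len(level6))
print("level 9: ", len(level9))
print("level 9, compressed twice:", len(twice))
```

On input like this, level 9 usually gains little over level 6, and the second pass gains essentially nothing, which fits the "holes in my logic" remark: messages that are already compressed will not shrink further after the move to mdbox.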


  -jf


Re: [Dovecot] Doubts about dsync, mdbox, SIS

2012-02-01 Thread Alessio Cecchi

On 01/02/2012 13:29, Jan-Frode Myklebust wrote:

> I've been running continuous dsync backups of our Maildirs for a few
> weeks now, with the destination dsync server using mdbox and SIS. The
> idea was that the destination server would act as a warm copy of
> all our active users' data.

How many users are there in this installation?

> The active servers are using Maildir, and have:
>
> $ df -h /usr/local/atmail/users/
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/atmailusers       14T   12T  2.2T  85% /usr/local/atmail/users
> $ df -hi /usr/local/atmail/users/
> Filesystem            Inodes   IUsed   IFree IUse% Mounted on
> /dev/atmailusers        145M    113M     33M   78% /usr/local/atmail/users
>
> very little of this is compressed (zlib plugin enabled during Christmas).

This is the old storage in Maildir format?

> I'm surprised that the destination server is so large, was expecting zlib and
> mdbox and SIS would compress it down to much less than what we're seeing
> (12TB -> 5TB):
>
> $ df -h /srv/mailbackup
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/mapper/mailbackupvg-mailbackuplv
>                       5.7T  4.8T  882G  85% /srv/mailbackup

This is the new storage in mdbox format?

What size would you expect?
--
Alessio Cecchi is:
@ ILS -> http://www.linux.it/~alessice/
on LinkedIn -> http://www.linkedin.com/in/alessice
GNU/Linux Systems Support -> http://www.cecchi.biz/
@ PLUG -> former president, now senator for life, http://www.prato.linux.it
@ LOLUG -> Member http://www.lolug.net


[Dovecot] Doubts about dsync, mdbox, SIS

2012-02-01 Thread Jan-Frode Myklebust
I've been running continuous dsync backups of our Maildirs for a few
weeks now, with the destination dsync server using mdbox and SIS. The
idea was that the destination server would act as a warm copy of
all our active users' data.

The active servers are using Maildir, and have:

$ df -h /usr/local/atmail/users/
Filesystem            Size  Used Avail Use% Mounted on
/dev/atmailusers       14T   12T  2.2T  85% /usr/local/atmail/users
$ df -hi /usr/local/atmail/users/
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/atmailusers        145M    113M     33M   78% /usr/local/atmail/users

very little of this is compressed (zlib plugin enabled during Christmas).

I'm surprised that the destination server is so large, was expecting zlib and
mdbox and SIS would compress it down to much less than what we're seeing 
(12TB -> 5TB):

$ df -h /srv/mailbackup
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/mailbackupvg-mailbackuplv
                      5.7T  4.8T  882G  85% /srv/mailbackup

Lots and lots of the attachment storage is duplicated into identical files,
instead of hard linked.

When running "doveadm purge -u $user", we're seeing lots of

Error: unlink(/srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-057274283bb51f4f917ebf34f6ab) failed: No such file or directory

"/srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-057274283bb51f4f917ebf34f6ab" is
missing, but there are 205 other copies of this file named
/srv/mailbackup/attachments/c3/17/c317b32b97688c16859956f11b803e3bba434349-* with
identical sha1sum.

Also we see corrupted indexes during the purge. This makes me quite uncertain
if dsync is a workable backup solution.. or if we can trust mdboxes.  

Also on the source side, during dsync, we see too many problems. Some samples:

Error: Mailboxes don't have unique GUIDs: 
08b46439069d3d4db049e671bf84 is shared by INBOX and INBOX
Error: command BOX-LIST failed
Error: Worker server's mailbox iteration failed
Error: read() from worker server failed: EOF

Error: Failed to sync mailbox INBOX.ferie 2006.: Invalid mailbox name
Error: read() from proxy client failed: EOF

Error: Unexpected finish reply: 1  596fec275888dbd89f6d1f5356c22db6 
   37200   \dsync-expunged 0
Error: Unexpected reply from server: 1 
12200572a70726fca946da6f9378dc0337210   \dsync-expunged 0

Error: Failed to sync mailbox INBOX.INBOX.Gerda: Mailbox doesn't exist: 
INBOX/Gerda
Error: command BOX-LIST failed

Error: read() failed: Broken pipe
Panic: file dsync-worker-local.c: line 1678 
(local_worker_save_msg_continue): assertion failed: (ret == -1)
Error: Raw backtrace: /usr/lib64/dovecot/libdovecot.so.0 [0x367703c680] 
-> /usr/lib64/dovecot/libdovecot.so.0(default_fatal_handler+0x35) 
[0x367703c765] -> /usr/lib64/dovecot/libdovecot.so.0 [0x367703bb93] -> 
/usr/bin/dsync [0x40f48d] -> /usr/bin/dsync [0x40f589] -> 
/usr/bin/dsync(dsync_worker_msg_save+0x8e) [0x40eb3e] -> /usr/bin/dsync 
[0x40d71a] -> /usr/bin/dsync [0x40cdbf] -> /usr/bin/dsync [0x40d105] -> 
/usr/lib64/dovecot/libdovecot.so.0(io_loop_call_io+0x48) [0x3677047278] -> 
/usr/lib64/dovecot/libdovecot.so.0(io_loop_handler_run+0xd5) [0x36770485c5] -> 
/usr/lib64/dovecot/libdovecot.so.0(io_loop_run+0x2d) [0x367704720d] -> 
/usr/lib64/dovecot/libdovecot.so.0(master_service_run+0x13) [0x3677035a83] -> 
/usr/bin/dsync(main+0x71e) [0x406c4e] -> 
/lib64/libc.so.6(__libc_start_main+0xf4) [0x3e3941d994] -> /usr/bin/dsync 
[0x406369]


Do you have any idea what our problems might be? Should we:

avoid SIS?
avoid doing Maildir on one side and mdbox on the other?
try another Dovecot version for dsync?
anything else?


   -jf

- destination server, running dovecot v2.0.14 
mail_attachment_dir = /srv/mailbackup/attachments
mail_location = mdbox:~/mdbox
mail_plugins = zlib
mdbox_rotate_size = 5 M
namespace {
  inbox = yes
  location = 
  prefix = INBOX.
  separator = .
  type = private
}
passdb {
  driver = static
}
plugin {
  zlib_save = gz
  zlib_save_level = 9
}
protocols = 
service auth-worker {
  user = $default_internal_user
}
service auth {
  unix_listener auth-userdb {
    mode = 0600
    user = mailbackup
  }
}
ssl = no
userdb {
  args = home=/srv/mailbackup/%256Hu/%d/%n
  driver = static
}
-/destination server 


  -jf