Re: warning: NFS hangs with dovecot 2.3.8 on Debian buster

2019-11-01 Thread Jan-Pieter Cornet via dovecot

On 25-10-19 19:41, Jan-Pieter Cornet via dovecot wrote:

We are in the process of contacting the linux-nfs folks about this, but I'm 
trying to reproduce this on a test-cluster first, to be able to present a 
well-documented case. Since this hang doesn't happen immediately, but takes a 
few hours to a day to occur in the wild, or a few thousand writes to the same 
mailbox, it's a bit hard to debug.


Just FTR, I finally sent mail to the linux-nfs list about this. See eg 
https://marc.info/?l=linux-nfs=157260601632323=2

No replies yet - if^H^Hwhen this gets resolved I'll report back to this list.

--
Jan-Pieter Cornet 
Systeembeheer XS4ALL Internet bv
www.xs4all.nl




signature.asc
Description: OpenPGP digital signature


warning: NFS hangs with dovecot 2.3.8 on Debian buster

2019-10-25 Thread Jan-Pieter Cornet via dovecot

A warning to those considering to upgrade to Debian 10 (buster): we have seen 
occasional NFS hangs with dovecot when using the stock debian buster kernel 
(4.19.67-2+deb10u1).

When we downgrade to the debian stretch kernel (4.9.189-3+deb9u1), the issue 
does not occur. Note that we *only* downgraded the kernel, the rest of the OS 
is still debian buster. Dovecot 2.3.8.

A little more info: we have a dovecot cluster, using mdbox for storage, on an 
NFS server (netapp Cmode version 9.6P2). We use a dovecot director layer, so a 
user is always connected to the same back-end dovecot server. The NFS hang 
occurs on the back-end server.

Once the process hangs, other processes trying to write to the same mailbox, 
will get an error like this:

Timeout (180s) while waiting for lock for transaction log file 
/var/mail/.../index/storage/dovecot.map.index.log (WRITE lock held by pid )

The stuck process itself doesn't seem to do anything, is stuck in "D" disk state, 
"strace" doesn't show anything (and after attaching, strace itself needs a kill -KILL to 
stop). The only way to unwedge the process seems to be to do a kill -KILL of the stuck process. 
Reading from the mailbox is still possible.

We are in the process of contacting the linux-nfs folks about this, but I'm 
trying to reproduce this on a test-cluster first, to be able to present a 
well-documented case. Since this hang doesn't happen immediately, but takes a 
few hours to a day to occur in the wild, or a few thousand writes to the same 
mailbox, it's a bit hard to debug.

--
Jan-Pieter Cornet 
Systeembeheer XS4ALL Internet bv
www.xs4all.nl




signature.asc
Description: OpenPGP digital signature


Re: sieve scripts not synching for 2.3.5.1 pre-built

2019-04-03 Thread Jan-Pieter Cornet via dovecot

On 2-4-19 21:51, Timo Sirainen via dovecot wrote:

Looks like this is trivial to reproduce. It used to work still in v2.3.1, but 
then something broke it. Tracking internally in DOP-1062.


Reverting 
https://github.com/dovecot/pigeonhole/commit/479c5e57046dec76078597df844daccbfc0eb75f
 fixes this.


Looks like that last patch segments puts the arguments to str_begin in the 
wrong order...
strncmp(prefix, MAILBOX_ATTRIBUTE_PREFIX_SIEVE, strlen(prefix)) == 0
should be translated to:
str_begins(MAILBOX_ATTRIBUTE_PREFIX_SIEVE, prefix)

--
Jan-Pieter Cornet 
Systeembeheer XS4ALL Internet bv
www.xs4all.nl



signature.asc
Description: OpenPGP digital signature


sieve scripts not synching for 2.3.5.1 pre-built

2019-04-02 Thread Jan-Pieter Cornet via dovecot
ug: Module loaded: 
/usr/lib/dovecot/modules/doveadm/lib10_doveadm_sieve_plugin.so
[... skipping some lines again ..]
dsync-local(xtra30): Debug: brain M: Mailbox Spam: 
local=88b39d12998d9e5cbb70436edcce/0/1, 
remote=/0/0: mailbox not selectable yet
dsync-local(xtra30): Debug: brain M: Mailbox Test: 
local=801bb932e98d9e5c3d6d436edcce/0/1, 
remote=/0/0: mailbox not selectable yet
dsync-local(xtra30): Debug: brain M: Mailbox Trash: 
local=fc9b162b86479f5cc85a436edcce/0/1, 
remote=/0/0: mailbox not selectable yet
dsync-remote(xtra30): Debug: brain S: Skipping 
unchanged mailbox e7c1961d84479f5c885a436edcce
dsync-remote(xtra30): Debug: brain S: Import INBOX: 
Last common UID=0. Delayed expunges=
dsync-remote(xtra30): Debug: brain S: Import INBOX: 
Import change type=save 
GUID=1553894809.M254847P28859.userimap11.xs4all.net,S=1311,W=1342 UID=1 hdr_hash= 
result=Mail's UID is above local UIDNEXT - No more local mails found
[...]

In the above output from 2.3.5.1, all lines starting with "Debug: doveadm-sieve: 
Iterating Sieve mailbox attributes" are absent. I have no idea why.

We can "work around" it by explicitly doing an rsync of the sieve storage after 
the migration, but that's a bit of a hack.

Attached is a doveconf -n output for 2.3.5.1.

--
Jan-Pieter Cornet 
Systeembeheer XS4ALL Internet bv
www.xs4all.nl


doveconf-n.txt.gz
Description: GNU Zip compressed data


signature.asc
Description: OpenPGP digital signature


Re: lazy expunge folder delete bug

2018-08-21 Thread Jan-Pieter Cornet

Answering my own post...

On 22-6-18 16:52, Jan-Pieter Cornet wrote:

There's a bug in "folder delete" for lazy expunge, type "1 namespace", as 
descibed on https://wiki2.dovecot.org/Plugins/Lazyexpunge
When trying to delete a mailbox that still has messages in it, but that has no 
EXPUNGED/ counterpart, the process hangs after the imap "DELETE" command, 
and the following appears in the log file after a 60s timeout:

Jun 22 15:48:15 userimap6 dovecot: imap(xtra30): Error: Couldn't create mailbox 
list lock /var/mail/.8d1/index/4/03/xtra30/index/mailboxes.lock: 
file_create_locked(/var/mail/.8d1/index/4/03/xtra30/index/mailboxes.lock) 
failed: flock(/var/mail/.8d1/index/4/03/xtra30/index/mailboxes.lock, 
write-lock) failed: Timed out after 60 seconds (BUG: lock is held by our own 
process)

[...detailed example deleted...]

Seems the delete operation locks the mailbox list, and then the expunge create 
hits the same lock. Is this something we can fix by changing settings? Eg use 
another location for the expunge lock?


Yes, that is indeed the solution.

I now set the VOLATILEDIR in the Expunged namespace to another directory, and 
it's no longer a problem to delete an entire folder that still contains mails.

The full setting of the location for the Expunged namespace is now:

location = 
mdbox:$home/:INDEX=$index/:MAILBOXDIR=expunged:LISTINDEX=expunged.list.index:SUBSCRIPTIONS=expunged.subscriptions:VOLATILEDIR=$index/expunged-volatile

(The variables $home and $index are expanded by the userdb before it returns 
this setting to dovecot).

--
Jan-Pieter Cornet 
Systeembeheer XS4ALL Internet bv
www.xs4all.nl



signature.asc
Description: OpenPGP digital signature


variable forwarding buglet

2018-06-22 Thread Jan-Pieter Cornet

I wanted to forward information from the director to the backend dovecot 
(original login name), so I had the userdb on the director return a 
forward_ologin variable.

However, when I tried to use that variable in the "password_key" query on the 
backend dovecot, ${forward_ologin} was expanded to UNSUPPORTED_VARIABLE_forward_ologin.

After testing a bit and looking around in the source a bit, and turning on debugging 
logging, it appeared that the forwarded variables actually appear as "passdb/userdb 
extra fields". So in my case, the forwarded information is available as 
%{passdb:forward_ologin}.

So, problem solved, but this is either an implementation bug, or a 
documentation bug, or oversight.

--
Jan-Pieter Cornet 
Systeembeheer XS4ALL Internet bv
www.xs4all.nl



signature.asc
Description: OpenPGP digital signature


lazy expunge folder delete bug

2018-06-22 Thread Jan-Pieter Cornet

There's a bug in "folder delete" for lazy expunge, type "1 namespace", as 
descibed on https://wiki2.dovecot.org/Plugins/Lazyexpunge

When trying to delete a mailbox that still has messages in it, but that has no 
EXPUNGED/ counterpart, the process hangs after the imap "DELETE" command, 
and the following appears in the log file after a 60s timeout:

Jun 22 15:48:15 userimap6 dovecot: imap(xtra30): Error: Couldn't create mailbox 
list lock /var/mail/.8d1/index/4/03/xtra30/index/mailboxes.lock: 
file_create_locked(/var/mail/.8d1/index/4/03/xtra30/index/mailboxes.lock) 
failed: flock(/var/mail/.8d1/index/4/03/xtra30/index/mailboxes.lock, 
write-lock) failed: Timed out after 60 seconds (BUG: lock is held by our own 
process)
Jun 22 15:48:15 userimap6 dovecot: imap(xtra30): Error: lazy_expunge: Couldn't 
open expunge mailbox: Failed to create mailbox EXPUNGED/Test: Internal error 
occurred. Refer to server log for more information. [2018-06-22 15:47:15]
Jun 22 15:48:15 userimap6 dovecot: imap(xtra30): Error: Lazy-expunge 
transaction failed: Internal error occurred. Refer to server log for more 
information. [2018-06-22 15:47:15]

very quick summage of settings:
8<-
mail_plugins = $mail_plugins lazy_expunge
mail_location = 
mdbox:/var/mail/.8d1/mail/4/03/xtra30:INDEX=/var/mail/.8d1/index/4/03/xtra30/index
  ### from userdb
namespace {
  inbox = yes
  list = yes
  prefix =
  separator = /
}
namespace expunged {
  hidden = yes
  inbox = no
  list = no
  prefix = EXPUNGED/
  separator = /
  location = 
mdbox:/var/mail/.8d1/mail/4/03/xtra30:INDEX=/var/mail/.8d1/index/4/03/xtra30/index:MAILBOXDIR=expunged:LISTINDEX=expunged.list.index:SUBSCRIPTIONS=expunged.subscriptions
 ### also from userdb
}
plugin {
  lazy_expunge = EXPUNGED/
}
8<-
(full doveconf available on request)

State of the account before delete: mailbox "Test" exists, 20 messages in it:

x LIST * *
* LIST (\HasNoChildren \UnMarked \Trash) "/" Trash
* LIST (\HasNoChildren \Marked) "/" Test
* LIST (\HasNoChildren \UnMarked \Junk) "/" Spam
* LIST (\HasNoChildren \UnMarked) "/" Sent
* LIST (\HasNoChildren \UnMarked) "/" Drafts
* LIST (\HasNoChildren) "/" INBOX
x OK List completed (0.009 + 0.000 + 0.008 secs).
x LIST EXPUNGED/* *
x OK List completed (0.001 + 0.000 secs).
x DELETE Test
x NO [SERVERBUG] Internal error occurred. Refer to server log for more 
information. [2018-06-22 15:47:15] (60.064 + 0.000 + 60.063 secs).

... resulting in the above log lines.

However, if you first delete one message (causing EXPUNGED/Test to be created), 
and then remove the folder, it works fine:

x SELECT Test
* FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
* OK [PERMANENTFLAGS (\Answered \Flagged \Deleted \Seen \Draft \*)] Flags 
permitted.
* 20 EXISTS
* 2 RECENT
* OK [UNSEEN 1] First unseen.
* OK [UIDVALIDITY 1352910749] UIDs valid
* OK [UIDNEXT 21] Predicted next UID
* OK [HIGHESTMODSEQ 21] Highest
x OK [READ-WRITE] Select completed (0.010 + 0.000 + 0.009 secs).
x FETCH 1:* (FLAGS)
* 1 FETCH (FLAGS ())
[...boring...]
* 20 FETCH (FLAGS (\Recent))
x OK Fetch completed (0.002 + 0.000 + 0.001 secs).
x STORE 1 +FLAGS (\Deleted)
* 1 FETCH (FLAGS (\Deleted))
x OK Store completed (0.011 + 0.000 + 0.010 secs).
x EXPUNGE
* 1 EXPUNGE
x OK Expunge completed (0.131 + 0.000 + 0.130 secs).
X LIST * *
* LIST (\HasNoChildren \UnMarked \Trash) "/" Trash
* LIST (\HasNoChildren \UnMarked) "/" Test
* LIST (\HasNoChildren \UnMarked \Junk) "/" Spam
* LIST (\HasNoChildren \UnMarked) "/" Sent
* LIST (\HasNoChildren \UnMarked) "/" Drafts
* LIST (\HasNoChildren) "/" INBOX
X OK List completed (0.005 + 0.000 + 0.004 secs).
x LIST EXPUNGED/* *
* LIST (\HasNoChildren \Marked) "/" EXPUNGED/Test
x OK List completed (0.002 + 0.000 + 0.001 secs).
x SELECT INBOX
* OK [CLOSED] Previous mailbox closed.
* FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
[...INBOX opening msgs...]
x OK [READ-WRITE] Select completed (0.005 + 0.000 + 0.004 secs).
x DELETE Test
x OK Delete completed (0.129 + 0.000 + 0.128 secs).
x LIST * *
* LIST (\HasNoChildren \UnMarked \Trash) "/" Trash
* LIST (\HasNoChildren \UnMarked \Junk) "/" Spam
* LIST (\HasNoChildren \UnMarked) "/" Sent
* LIST (\HasNoChildren \UnMarked) "/" Drafts
* LIST (\HasNoChildren) "/" INBOX
x OK List completed (0.005 + 0.000 + 0.004 secs).
x LIST EXPUNGED/* *
* LIST (\HasNoChildren \Marked) "/" EXPUNGED/Test
x OK List completed (0.003 + 0.000 + 0.002 secs).

Seems the delete opeation locks the mailbox list, and then the expunge create 
hits the same lock. Is this something we can fix by changing settings? Eg use 
another location for the expunge lock?

--
Jan-Pieter Cornet 
Systeembeheer XS4ALL Internet bv
www.xs4all.nl



signature.asc
Description: OpenPGP digital signature


director stuck in inifite loop on 2.2.35

2018-04-26 Thread Jan-Pieter Cornet
x7fb4ab489963 in io_loop_handler_run_internal ()
   from target:/usr/lib/dovecot/libdovecot.so.0
#9  0x7fb4ab48839c in io_loop_handler_run () from 
target:/usr/lib/dovecot/libdovecot.so.0
#10 0x7fb4ab488548 in io_loop_run () from 
target:/usr/lib/dovecot/libdovecot.so.0
#11 0x7fb4ab40c0a3 in master_service_run () from 
target:/usr/lib/dovecot/libdovecot.so.0
#12 0x55c8c350b46d in main ()

For completeness, doveconf -n output attached.

--
Jan-Pieter Cornet <joh...@xs4all.net>
Systeembeheer XS4ALL Internet bv
www.xs4all.nl


doveconf-n.txt.gz
Description: GNU Zip compressed data


signature.asc
Description: OpenPGP digital signature


destuser setting useless on LMTP proxy

2018-03-26 Thread Jan-Pieter Cornet

I tried setting the "destuser" setting on the LMTP director as follows, to 
preserve the original envelope rcpt:

protocol lmtp {
  auth_socket_path = director-userdb
  passdb {
driver = ...
override_fields = destuser=%{orig_user}
  }
}

The passdb driver would return the appropriate "user" for each alias. Suppose, 
for example, user1 has emails us...@domain.tld, but also ali...@domain.tld.

Now, it turns out that setting the destuser *changes* the backend. It seems that when the 
passdb returns "destuser", that username is completely ignored and the hashing 
of the destuser determines the backend chosen.

This is incorrect, the backend should be chosen based on the returned "user", and the 
"destuser" should only be used for the remote login (or rcpt, in case of LMTP).

I'm using version 2.2.35. The problem seems to be in lmtp/commands.c, in 
client_proxy_rcpt_parse_fields, line 281-285 says:
} else if (strcmp(key, "user") == 0 ||
   strcmp(key, "destuser") == 0) {
/* changing the username */
*address = value;
} ...

So it looks as if "user" and "destuser" are treated equally in the LMTP proxy.

--
Jan-Pieter Cornet <joh...@xs4all.net>
Systeembeheer XS4ALL Internet bv
www.xs4all.nl



signature.asc
Description: OpenPGP digital signature


migrating from maildir to mdbox, preserving pop3 UIDL

2017-11-03 Thread Jan-Pieter Cornet

Hi,

I'm trying to migrate from maildir to mdbox while preserving the pop3 UIDL (and 
the imap UID).

The problem is, for maildir we use (for compatiblity with qpopper):

 pop3_uidl_format = %f

Problem is, as soon as I convert that to mdbox, then whenever a client issues 
the UIDL command via a POP connection, the connection is closed and this error 
is displayed in the log:

Error: UIDL: File name not found (pop3_uidl_format=%f not supported by storage?)

As a workaround, I enabled "pop3_reuse_xuidl = yes", and before conversion I put an X-UIDL: 
header in the mails. However, since the UIDL is the filename and the filename also contains 
S=,W=, I need to change the header in such a way that the size and 
number of lines stay the same. That's... tedious and error-prone.

Is there an easier solution, eg teaching 'doveadm sync' to preserve pop3 UIDL 
somehow?

For the record, I currently convert maildir to mdbox using:

doveadm -D backup -u user@domain 'doveadm -o 
mail=mdbox:/path/to/storage:INDEX=/path/to/fast/storage -o plugin/zlib_save=gz 
-o mail_uid=$UID -o mail_gid=$GID dsync-server'

Thanks for any help!

--
Jan-Pieter Cornet <joh...@xs4all.net>
Systeembeheer XS4ALL Internet bv
www.xs4all.nl



signature.asc
Description: OpenPGP digital signature