Re: replication timeout

2024-03-28 Thread Andy Balholm

I'm not using any scheduler.
It's just being activated by the standard replication-notify mechanism 
in Dovecot.


Andy

On Thursday, March 28, 2024 12:56:18 AM PDT, Aki Tuomi wrote:

Hi!

We received some more information on this. Are you by chance 
running these from some scheduler? It seems that when Dovecot is 
logging, some schedulers can start blocking on log 
writes to stdout/stderr, which can lead to this problem.


I wonder if this could be the case for you?

Aki


On 21/01/2024 17:36 EET Aki Tuomi via dovecot  wrote:

 
Can you try with doveadm -D and send the log?


Aki
 ...





___
dovecot mailing list -- dovecot@dovecot.org
To unsubscribe send an email to dovecot-le...@dovecot.org


Re: replication timeout

2024-03-28 Thread Aki Tuomi via dovecot
Hi!

We received some more information on this. Are you by chance running these from 
some scheduler? It seems that when Dovecot is logging, some schedulers can 
start blocking on log writes to stdout/stderr, which can lead to this 
problem.

I wonder if this could be the case for you?
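
To illustrate the kind of blocking described above: a pipe holds only a limited amount of unread data, so if whatever launched the process stops draining stdout/stderr, the next log write waits until something reads. The Python sketch below only demonstrates that general pipe behavior; it is not Dovecot code, `fill_pipe` is a hypothetical helper, and the capacity it reports depends on the operating system.

```python
import os


def fill_pipe(chunk_size: int = 4096) -> int:
    """Write to a pipe nobody reads; return how many bytes fit before it is full."""
    read_fd, write_fd = os.pipe()
    # Non-blocking mode turns "would wait for a reader" into an exception
    # that we can observe instead of hanging.
    os.set_blocking(write_fd, False)
    written = 0
    try:
        while True:
            written += os.write(write_fd, b"x" * chunk_size)
    except BlockingIOError:
        # The pipe buffer is full. A writer in the default blocking mode
        # would hang here until something reads from read_fd.
        pass
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return written


if __name__ == "__main__":
    print(f"pipe accepted {fill_pipe()} bytes before a write would block")
```

A process that logs to such a pipe in blocking mode would simply stop at the point where this sketch catches the exception.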

Aki

> On 21/01/2024 17:36 EET Aki Tuomi via dovecot  wrote:
> 
>  
> Can you try with doveadm -D and send the log?
> 
> Aki
> 
> > On 20/01/2024 19:51 EET Andy Balholm  wrote:
> > 
> >  
> > I forgot to mention in my original message that I'm running Dovecot
> > 2.3.21 (47349e2482).
> > 
> > It seems like the stalls are more likely to happen
> > when the type of sync is "incremental" rather than
> > "normal" or "full".
> > (I'm inclined to think they only happen for incremental syncs,
> > but I'm not sure.)
> > 
> > Andy
> > 
> > 
> > On Friday, January 19, 2024 9:26:29 AM PST, Andy Balholm wrote:
> > > I have two Dovecot mail servers that replicate to each other.
> > > Sometimes there are delays in the synchronization,
> > > and I notice that the mail log has entries like this:
> > >
> > > Error: dsync(spokane): I/O has stalled, no activity for 600 
> > > seconds (last sent=mailbox, last recv=mailbox_state)
> > >
> > > > Ten minutes seems like a long time to sit there waiting with 
> > > nothing happening.
> > > Is there a way to reduce this timeout so that I don't have so many
> > > > replication connections just sitting around doing nothing?
> > >
> > > (Of course, a way to prevent the I/O stalls would be great too,
> > > but with my limited upload bandwidth, they may be unavoidable.)
> > >
> > > Andy
> > >
> > 


Re: replication timeout

2024-01-22 Thread Andy Balholm

That lets me run sync jobs and get verbose output.
But I haven't managed to get a manually started sync to stall.

It's only a small fraction of the sync jobs that stall,
even though at any given time there are usually several stalled
jobs running.
(Because the stalled jobs take much longer than the ones that
complete normally.)
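
The reasoning above can be checked with a little arithmetic: at steady state, the number of jobs of each kind in flight is proportional to how often they start times how long they run. The sketch below uses made-up numbers (2% of jobs stalling for 600 seconds, the rest finishing in 2 seconds); none of these figures come from the actual servers, and `stalled_share` is a hypothetical helper.

```python
def stalled_share(stall_fraction: float, stall_secs: float, normal_secs: float) -> float:
    """Fraction of in-flight sync jobs that are stalled, at steady state.

    Each kind of job contributes (share of started jobs) * (run time)
    to the total time spent in flight.
    """
    stalled_time = stall_fraction * stall_secs
    normal_time = (1.0 - stall_fraction) * normal_secs
    return stalled_time / (stalled_time + normal_time)


if __name__ == "__main__":
    # Hypothetical inputs: 2% of jobs stall for 600 s, the rest take 2 s.
    share = stalled_share(0.02, 600, 2)
    print(f"{share:.0%} of running jobs are stalled")
```

With those assumed inputs, most of the jobs visible at any moment are stalled ones, even though only one job in fifty stalls.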

Andy

On Monday, January 22, 2024 11:22:14 AM PST, Aki Tuomi wrote:

doveconf replication_dsync_parameters

then you can do

doveadm sync -u   

Aki


On 22/01/2024 21:05 EET Andy Balholm  wrote:

 
Is there a way to find out the exact command line that the
replicator is using to invoke doveadm sync?

Andy ...







Re: replication timeout

2024-01-22 Thread Aki Tuomi via dovecot
doveconf replication_dsync_parameters

then you can do

doveadm sync -u   

Aki

> On 22/01/2024 21:05 EET Andy Balholm  wrote:
> 
>  
> Is there a way to find out the exact command line that the
> replicator is using to invoke doveadm sync?
> 
> Andy
> 
> On Monday, January 22, 2024 10:55:12 AM PST, Aki Tuomi wrote:
> > you could try running it manually from cli..
> >
> > doveadm -D sync 
> >
> > Aki
> >
> >> On 22/01/2024 20:32 EET Andy Balholm  wrote:
> >> 
> >>  
> >> I'm not sure how to do that, because I'm doing automatic replication,
> >> not running doveadm sync manually.
> >> I tried adding -D to replication_dsync_parameters, ...
> >
> >
> 


Re: replication timeout

2024-01-22 Thread Andy Balholm

Is there a way to find out the exact command line that the
replicator is using to invoke doveadm sync?

Andy

On Monday, January 22, 2024 10:55:12 AM PST, Aki Tuomi wrote:

you could try running it manually from cli..

doveadm -D sync 

Aki


On 22/01/2024 20:32 EET Andy Balholm  wrote:

 
I'm not sure how to do that, because I'm doing automatic replication,
not running doveadm sync manually.
I tried adding -D to replication_dsync_parameters, ...







Re: replication timeout

2024-01-22 Thread Aki Tuomi via dovecot
you could try running it manually from cli..

doveadm -D sync 

Aki

> On 22/01/2024 20:32 EET Andy Balholm  wrote:
> 
>  
> I'm not sure how to do that, because I'm doing automatic replication,
> not running doveadm sync manually.
> I tried adding -D to replication_dsync_parameters,
> but that gave me an error, because the -D was in the wrong place
> on the command line.
> (It should be doveadm -D sync, and it was ending up with something like
> doveadm sync -D)
> 
> Andy
> 
> On Sunday, January 21, 2024 7:36:13 AM PST, Aki Tuomi wrote:
> > Can you try with doveadm -D and send the log?
> >
> > Aki
> >
> >> On 20/01/2024 19:51 EET Andy Balholm  wrote:
> >> 
> >>  
> >> I forgot to mention in my original message that I'm running Dovecot
> >> 2.3.21 (47349e2482).
> >> 
> >> It seems like the stalls are more likely to happen ...
> >
> >
> 


Re: replication timeout

2024-01-22 Thread Andy Balholm

I'm not sure how to do that, because I'm doing automatic replication,
not running doveadm sync manually.
I tried adding -D to replication_dsync_parameters,
but that gave me an error, because the -D was in the wrong place
on the command line.
(It should be doveadm -D sync, and it was ending up with something like
doveadm sync -D)
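
The placement problem can be sketched as follows: -D is a global doveadm option, so it has to come before the subcommand, while the dsync parameters come after it. This is a minimal illustration, not what the replicator does internally; `manual_sync_argv` is a hypothetical helper, the user and the parameter string are placeholders, and the argument order after `sync` is an assumption based on the doveadm-sync(1) synopsis.

```python
import shlex


def manual_sync_argv(user: str, dsync_params: str, debug: bool = True) -> list[str]:
    """Build an argv for a manually started doveadm sync run.

    Global options such as -D go between "doveadm" and the subcommand;
    the dsync parameters follow "sync -u <user>".
    """
    argv = ["doveadm"]
    if debug:
        argv.append("-D")  # global option: must precede the subcommand
    argv += ["sync", "-u", user]
    argv += shlex.split(dsync_params)
    return argv


if __name__ == "__main__":
    # Placeholder user and parameter string, for illustration only; take the
    # real parameters from `doveconf replication_dsync_parameters`.
    print(shlex.join(manual_sync_argv("user@example.com", "-d -N -l 30 -U")))
```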

Andy

On Sunday, January 21, 2024 7:36:13 AM PST, Aki Tuomi wrote:

Can you try with doveadm -D and send the log?

Aki


On 20/01/2024 19:51 EET Andy Balholm  wrote:

 
I forgot to mention in my original message that I'm running Dovecot
2.3.21 (47349e2482).

It seems like the stalls are more likely to happen ...







Re: replication timeout

2024-01-21 Thread Aki Tuomi via dovecot
Can you try with doveadm -D and send the log?

Aki

> On 20/01/2024 19:51 EET Andy Balholm  wrote:
> 
>  
> I forgot to mention in my original message that I'm running Dovecot
> 2.3.21 (47349e2482).
> 
> It seems like the stalls are more likely to happen
> when the type of sync is "incremental" rather than
> "normal" or "full".
> (I'm inclined to think they only happen for incremental syncs,
> but I'm not sure.)
> 
> Andy
> 
> 
> On Friday, January 19, 2024 9:26:29 AM PST, Andy Balholm wrote:
> > I have two Dovecot mail servers that replicate to each other.
> > Sometimes there are delays in the synchronization,
> > and I notice that the mail log has entries like this:
> >
> > Error: dsync(spokane): I/O has stalled, no activity for 600 
> > seconds (last sent=mailbox, last recv=mailbox_state)
> >
> > Ten minutes seems like a long time to sit there waiting with 
> > nothing happening.
> > Is there a way to reduce this timeout so that I don't have so many
> > replication connections just sitting around doing nothing?
> >
> > (Of course, a way to prevent the I/O stalls would be great too,
> > but with my limited upload bandwidth, they may be unavoidable.)
> >
> > Andy
> >
> 


Re: replication timeout

2024-01-20 Thread Andy Balholm

I forgot to mention in my original message that I'm running Dovecot
2.3.21 (47349e2482).

It seems like the stalls are more likely to happen
when the type of sync is "incremental" rather than
"normal" or "full".
(I'm inclined to think they only happen for incremental syncs,
but I'm not sure.)

Andy


On Friday, January 19, 2024 9:26:29 AM PST, Andy Balholm wrote:

I have two Dovecot mail servers that replicate to each other.
Sometimes there are delays in the synchronization,
and I notice that the mail log has entries like this:

Error: dsync(spokane): I/O has stalled, no activity for 600 
seconds (last sent=mailbox, last recv=mailbox_state)


Ten minutes seems like a long time to sit there waiting with 
nothing happening.

Is there a way to reduce this timeout so that I don't have so many
replication connections just sitting around doing nothing?

(Of course, a way to prevent the I/O stalls would be great too,
but with my limited upload bandwidth, they may be unavoidable.)

Andy





replication timeout

2024-01-19 Thread Andy Balholm

I have two Dovecot mail servers that replicate to each other.
Sometimes there are delays in the synchronization,
and I notice that the mail log has entries like this:

Error: dsync(spokane): I/O has stalled, no activity for 600 seconds (last 
sent=mailbox, last recv=mailbox_state)


Ten minutes seems like a long time to sit there waiting with nothing 
happening.

Is there a way to reduce this timeout so that I don't have so many
replication connections just sitting around doing nothing?

(Of course, a way to prevent the I/O stalls would be great too,
but with my limited upload bandwidth, they may be unavoidable.)
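
One possible approach to the timeout question, assuming the `-T secs` option described in the doveadm-sync(1) man page (I/O stall timeout, default 600 seconds) also applies to syncs started by the replicator: add it to replication_dsync_parameters. The sketch below only rewrites a parameter string; `with_io_timeout` is a hypothetical helper, and the input string is what I believe to be the 2.3 default, so check `doveconf replication_dsync_parameters` for the real value.

```python
import shlex


def with_io_timeout(dsync_params: str, secs: int) -> str:
    """Return dsync_params with the -T option set to secs.

    Any existing separate "-T <value>" pair is dropped first. The attached
    form ("-T120") is not handled in this sketch.
    """
    args = shlex.split(dsync_params)
    kept = []
    i = 0
    while i < len(args):
        if args[i] == "-T":
            i += 2  # skip "-T" and its value
            continue
        kept.append(args[i])
        i += 1
    return shlex.join(kept + ["-T", str(secs)])


if __name__ == "__main__":
    # Assumed current value; 120 seconds is an arbitrary example timeout.
    params = with_io_timeout("-d -N -l 30 -U", 120)
    print(f"replication_dsync_parameters = {params}")
```

A shorter timeout only makes stalled jobs give up sooner; it does not address whatever causes the stall.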

Andy