Re: replication timeout
I'm not using any scheduler. It's just being activated by the standard replication-notify mechanism in Dovecot.

Andy

On Thursday, March 28, 2024 12:56:18 AM PDT, Aki Tuomi wrote:
> Hi!
>
> We received some more information about this. Are you by chance running these from some scheduler? It seems that if Dovecot is logging, some schedulers can actually start blocking on log writes to stdout/stderr, which can lead to this problem. I wonder if this could be the case for you?
>
> Aki

___
dovecot mailing list -- dovecot@dovecot.org
To unsubscribe send an email to dovecot-le...@dovecot.org
Re: replication timeout
Hi!

We received some more information about this. Are you by chance running these from some scheduler? It seems that if Dovecot is logging, some schedulers can actually start blocking on log writes to stdout/stderr, which can lead to this problem. I wonder if this could be the case for you?

Aki

> On 21/01/2024 17:36 EET Aki Tuomi via dovecot wrote:
>
> Can you try with doveadm -D and send the log?
>
> Aki
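For anyone who does launch doveadm sync from a scheduler, here is a minimal sketch of one way to keep the job from blocking on log writes: send stdout and stderr to a regular file instead of a pipe that the scheduler has to drain. The helper name run_with_file_logs, the log path, and the user in the usage comment are hypothetical, and this has not been verified as a fix for the stalls discussed in this thread.

```shell
# Run a command with stdout/stderr appended to a regular file, so the
# command never waits on a pipe that the scheduler is slow to read.
# run_with_file_logs is a hypothetical helper name, not part of Dovecot.
run_with_file_logs() {
  logfile=$1
  shift
  "$@" >>"$logfile" 2>&1 </dev/null
}

# Usage (the log path and user are placeholders):
#   run_with_file_logs /var/log/dsync-manual.log doveadm -D sync -u user@example.com -d
```

The function returns the exit status of the wrapped command, so a scheduler can still detect failures.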
Re: replication timeout
That lets me run sync jobs and get verbose output, but I haven't managed to get a manually started sync to stall. Only a small fraction of the sync jobs stall, even though at any given time there are usually several stalled jobs running (because the stalled jobs take much longer than the ones that complete normally).

Andy

On Monday, January 22, 2024 11:22:14 AM PST, Aki Tuomi wrote:
> doveconf replication_dsync_parameters
>
> then you can do
>
> doveadm sync -u
>
> Aki
Re: replication timeout
doveconf replication_dsync_parameters

then you can do

doveadm sync -u

Aki

> On 22/01/2024 21:05 EET Andy Balholm wrote:
>
> Is there a way to find out the exact command line that the replicator is using to invoke doveadm sync?
>
> Andy
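Combined with Andy's observation elsewhere in the thread that -D has to come before the subcommand, the two steps might be put together as in the sketch below. The helper name debug_sync_cmd and the user address are hypothetical. doveconf -h prints a setting's value without its name. The helper only prints the command line, so it can be reviewed before being run by hand.

```shell
# Print a doveadm sync command line with debugging enabled.
# -D is a global doveadm option, so it goes before "sync".
# debug_sync_cmd is a hypothetical helper name, not part of Dovecot.
debug_sync_cmd() {
  user=$1    # mailbox owner to sync (placeholder)
  params=$2  # value of replication_dsync_parameters
  printf 'doveadm -D sync -u %s %s\n' "$user" "$params"
}

# Usage on a host with Dovecot installed:
#   debug_sync_cmd user@example.com "$(doveconf -h replication_dsync_parameters)"
```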
Re: replication timeout
Is there a way to find out the exact command line that the replicator is using to invoke doveadm sync?

Andy

On Monday, January 22, 2024 10:55:12 AM PST, Aki Tuomi wrote:
> You could try running it manually from the CLI:
>
> doveadm -D sync
>
> Aki
Re: replication timeout
You could try running it manually from the CLI:

doveadm -D sync

Aki

> On 22/01/2024 20:32 EET Andy Balholm wrote:
>
> I'm not sure how to do that, because I'm doing automatic replication, not running doveadm sync manually. I tried adding -D to replication_dsync_parameters, but that gave me an error, because the -D was in the wrong place on the command line. (It should be doveadm -D sync, and it was ending up with something like doveadm sync -D.)
>
> Andy
Re: replication timeout
I'm not sure how to do that, because I'm doing automatic replication, not running doveadm sync manually. I tried adding -D to replication_dsync_parameters, but that gave me an error, because the -D was in the wrong place on the command line. (It should be doveadm -D sync, and it was ending up with something like doveadm sync -D.)

Andy

On Sunday, January 21, 2024 7:36:13 AM PST, Aki Tuomi wrote:
> Can you try with doveadm -D and send the log?
>
> Aki
Re: replication timeout
Can you try with doveadm -D and send the log?

Aki

> On 20/01/2024 19:51 EET Andy Balholm wrote:
>
> I forgot to mention in my original message that I'm running Dovecot 2.3.21 (47349e2482).
>
> It seems like the stalls are more likely to happen when the type of sync is "incremental" rather than "normal" or "full". (I'm inclined to think they only happen for incremental syncs, but I'm not sure.)
>
> Andy
Re: replication timeout
I forgot to mention in my original message that I'm running Dovecot 2.3.21 (47349e2482).

It seems like the stalls are more likely to happen when the type of sync is "incremental" rather than "normal" or "full". (I'm inclined to think they only happen for incremental syncs, but I'm not sure.)

Andy

On Friday, January 19, 2024 9:26:29 AM PST, Andy Balholm wrote:
> I have two Dovecot mail servers that replicate to each other. Sometimes there are delays in the synchronization, and I notice that the mail log has entries like this:
>
> Error: dsync(spokane): I/O has stalled, no activity for 600 seconds (last sent=mailbox, last recv=mailbox_state)
>
> Ten minutes seems like a long time to sit there waiting with nothing happening. Is there a way to reduce this timeout so that I don't have so many replication connections just sitting around doing nothing?
>
> (Of course, a way to prevent the I/O stalls would be great too, but with my limited upload bandwidth, they may be unavoidable.)
>
> Andy
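One way to put numbers on how often the stalls occur is to tally the stall errors in the mail log by their "last sent"/"last recv" state. The sketch below assumes each error is logged on a single line in the format quoted in this thread. The helper name tally_stalls and the log path are hypothetical. The error line does not carry the sync type, so this cannot by itself confirm the pattern for incremental syncs.

```shell
# Tally dsync stall errors by their "last sent=..., last recv=..." state.
# Reads a mail log on stdin and prints "<count><TAB><state>", most frequent first.
# tally_stalls is a hypothetical helper name, not part of Dovecot.
tally_stalls() {
  awk '/I\/O has stalled/ && match($0, /last sent=[^)]*/) {
         count[substr($0, RSTART, RLENGTH)]++
       }
       END { for (k in count) print count[k] "\t" k }' | sort -rn
}

# Usage (the log path is an assumption; adjust for your system):
#   tally_stalls < /var/log/mail.log
```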
replication timeout
I have two Dovecot mail servers that replicate to each other. Sometimes there are delays in the synchronization, and I notice that the mail log has entries like this:

Error: dsync(spokane): I/O has stalled, no activity for 600 seconds (last sent=mailbox, last recv=mailbox_state)

Ten minutes seems like a long time to sit there waiting with nothing happening. Is there a way to reduce this timeout so that I don't have so many replication connections just sitting around doing nothing?

(Of course, a way to prevent the I/O stalls would be great too, but with my limited upload bandwidth, they may be unavoidable.)

Andy
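The 600 seconds in the error matches the documented default of dsync's -T option, which the doveadm-sync man page describes as the time to wait for stalled I/O operations. Below is a sketch of how the timeout might be lowered for replication, assuming the replicator passes -T through replication_dsync_parameters like any other dsync option. The value 120 and the other parameters shown are illustrative, and this approach was not confirmed in the thread.

```
# dovecot.conf (sketch): keep whatever parameters are already set and
# append -T <seconds>. The parameters and the value 120 are illustrative.
replication_dsync_parameters = -d -N -l 30 -U -T 120
```

The current value can be checked first with doveconf replication_dsync_parameters, so that existing options are kept when -T is added.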