Re: replication: sync_client on master stops after restarting the replica
On 09/27/14 10:59, Marcus Schopen wrote:

> Hi,
>
> Whenever I have to reboot the replica or restart Cyrus on it, synchronization on the master side stops, /var/lib/cyrus/sync/log fills up, and I no longer see a /usr/lib/cyrus/bin/sync_client -r process.
>
> /var/log/mail.err on master when restarting replica:
>
> Sep 27 10:06:28 master cyrus/sync_client[1023]: Error in do_sync(): bailing out! Bad protocol
> Sep 27 10:06:28 master cyrus/sync_client[1023]: Processing sync log file /var/lib/cyrus/sync/log-1023 failed: Bad protocol
>
> When I restart cyrus on master side, synchronization starts again. Is there another way to get synchronization working again?

I have added this to the EVENTS { } section of cyrus.conf:

synccheck cmd=/usr/share/cyrus-ugent/cyrus-synccheck -i mail1 -v cyrus-2.4.17 period=10

where /usr/share/cyrus-ugent/cyrus-synccheck is a script that checks whether sync_client is running and starts it if it is not.

Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe: https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
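The cyrus-synccheck script itself is not shown in this thread, so the following is only a minimal sketch of the idea it describes (check whether sync_client is running, start it if not), not a reconstruction of that script. The function names are hypothetical, the -i and -v options of the real script are not modelled, and the sketch assumes that pgrep is available and that the process is started by a user allowed to run sync_client. The binary path and the -r flag are taken from the process name quoted above.

```python
import subprocess
from typing import Callable, Sequence

# Assumption: path and flag as quoted in the original report.
SYNC_CLIENT_CMD = ["/usr/lib/cyrus/bin/sync_client", "-r"]


def sync_client_running(pattern: str = "sync_client -r") -> bool:
    """Return True if pgrep finds a process whose command line matches pattern."""
    result = subprocess.run(
        ["pgrep", "-f", pattern],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # pgrep exits with 0 when at least one process matched.
    return result.returncode == 0


def start_sync_client(cmd: Sequence[str] = SYNC_CLIENT_CMD) -> None:
    """Launch sync_client in the background without waiting for it."""
    subprocess.Popen(list(cmd))


def ensure_sync_client(
    is_running: Callable[[], bool] = sync_client_running,
    start: Callable[[], None] = start_sync_client,
) -> bool:
    """Start sync_client if it is not running.

    Returns True if a start was attempted, False if it was already running.
    """
    if is_running():
        return False
    start()
    return True
```

Calling ensure_sync_client() with no arguments from a periodically run script would perform the real check. The two callables are parameters only so that the logic can be exercised without touching a live system.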
Re: Strange load issue with 2.4.17
Apologies if I'm misreading, but that bug suggests many processes are created over a period of time. In contrast, your grab shows that the number of processes hasn't grown but the load has grown exponentially. I'd say it's not the same bug.

The grab shows system CPU staying around the same, contrary to your description. Which of them is correct?

If load has increased while the CPU has dropped, I'd say you're still waiting on IO.

On 13 October 2014 15:35, Sebastian Hagedorn haged...@uni-koeln.de wrote:

> Hi,
>
> for the last week we have seen strange load issues on our Cyrus server. All of a sudden the load increases to several thousand, user CPU goes down to basically zero, system CPU spikes. In the past we've had trouble with poor I/O performance, but that went along with an increase in Wait I/O. We don't see that now. vmstat shows a massive increase in context switches. When the system reaches this state, all we can do is restart Cyrus or reboot the machine if that doesn't work anymore.
>
> I'm attaching a Ganglia screenshot that shows the problem clearly. When the problem exists, there's not much we can do to analyze it.
>
> A colleague suggested that what we see could be related to this bug:
>
> https://bugzilla.cyrusimap.org/show_bug.cgi?id=3744
>
> It was reported for 2.4.16, and it sounds as if it has been fixed, but is that fix really part of 2.4.17?
>
> Any other ideas?
>
> Thanks
> Sebastian
> --
> .:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
> .:.Regionales Rechenzentrum (RRZK).:.
> .:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.
Re: Strange load issue with 2.4.17
Thanks for your reply. I agree that it doesn't really look like bug 3744.

System CPU *does* increase when the load starts to spike. It only goes to about 20 percent, but it's still a notable increase. Most of the system *appears* to be idle, but interactively you have to wait for each character you type to appear on the screen.

My first assumption when dealing with a mail server is that all issues are I/O-related, but usually we would see an increase in Wait I/O when something was up ...

Cheers
Sebastian

--On 13 October 2014 15:48:03 +0100 Geoff Winkless cy...@geoff.dj wrote:

> Apologies if I'm misreading, but that bug suggests many processes are created over a period of time. In contrast your grab shows the number of processes hasn't grown but the load has grown exponentially. I'd say it's not the same bug.
>
> The grab shows system CPU staying around the same, contrary to your description - which of them is correct?
>
> If load has increased while the CPU has dropped, I'd say you're still waiting on IO.

--
.:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
.:.Regionales Rechenzentrum (RRZK).:.
.:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.
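The figures discussed in this thread (context switches, system CPU, Wait I/O) are all available as cumulative counters in /proc/stat, so they can be logged continuously and inspected after an incident, when the machine itself is too sluggish to analyze interactively. The following is a minimal sketch, assuming a Linux host; the function names are hypothetical and nothing here is specific to Cyrus.

```python
import time
from typing import Dict


def parse_proc_stat(text: str) -> Dict[str, int]:
    """Extract cumulative counters from the contents of /proc/stat."""
    counters: Dict[str, int] = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "cpu":
            # Aggregate CPU line, in clock ticks: user nice system idle iowait ...
            names = ["user", "nice", "system", "idle", "iowait"]
            for name, value in zip(names, fields[1:6]):
                counters[name] = int(value)
        elif fields[0] == "ctxt":
            # Total number of context switches since boot.
            counters["ctxt"] = int(fields[1])
    return counters


def delta(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return how much each counter grew between two samples."""
    return {key: after[key] - before[key] for key in after if key in before}


def sample(interval: float = 10.0, path: str = "/proc/stat") -> Dict[str, int]:
    """Read the counters twice, interval seconds apart, and return the growth."""
    with open(path) as stat_file:
        before = parse_proc_stat(stat_file.read())
    time.sleep(interval)
    with open(path) as stat_file:
        after = parse_proc_stat(stat_file.read())
    return delta(before, after)
```

Logging the result of sample() once a minute would show whether the jump in context switches starts before or after system CPU rises, and whether iowait really stays flat during an incident.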
Re: Strange load issue with 2.4.17
Is this a physical host or running virtualized?

Simon
Re: Strange load issue with 2.4.17
--On 13 October 2014 17:35:25 +0200 Simon Matter simon.mat...@invoca.ch wrote:

> Is this a physical host or running virtualized?

It's virtualized, but it's been that way for more than a year.

--
.:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
.:.Regionales Rechenzentrum (RRZK).:.
.:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.
Re: Strange load issue with 2.4.17
> --On 13 October 2014 17:35:25 +0200 Simon Matter simon.mat...@invoca.ch wrote:
>
>> Is this a physical host or running virtualized?
>
> It's virtualized, but it's been that way for more than a year.

Is this by any chance running on KVM, maybe on an AMD CPU?
Re: Strange load issue with 2.4.17
--On 13 October 2014 17:39:23 +0200 Simon Matter simon.mat...@invoca.ch wrote:

>> It's virtualized, but it's been that way for more than a year.
>
> Is this by any chance running on KVM, maybe on an AMD cpu?

No, it's VMware ESX on Intel CPUs.

--
.:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
.:.Regionales Rechenzentrum (RRZK).:.
.:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.