Re: replication: sync_client on master stops after restarting the replica

2014-10-14 Thread Marcus Schopen
Hi Rudy,

On Monday, 13.10.2014, at 10:41 +0200, Rudy Gevaert wrote:
> 
> 
> On 09/27/14 10:59, Marcus Schopen wrote:
> > Hi,
> >
> > whenever I have to reboot the replica or restart its Cyrus, the
> > synchronization on the master side stops, /var/lib/cyrus/sync/log fills
> > up, and I no longer see a "/usr/lib/cyrus/bin/sync_client -r" process.
> >
> > /var/log/mail.err on master when restarting replica:
> >
> > Sep 27 10:06:28 master cyrus/sync_client[1023]: Error in do_sync():
> > bailing out! Bad protocol
> > Sep 27 10:06:28 master cyrus/sync_client[1023]: Processing sync log
> > file /var/lib/cyrus/sync/log-1023 failed: Bad protocol
> >
> > When I restart cyrus on master side, synchronization starts again.
> >
> > Is there another way to get synchronization working again?
> 
> I have added this in EVENTS { }
> 
> synccheck cmd="/usr/share/cyrus-ugent/cyrus-synccheck -i mail1 -v 
> cyrus-2.4.17" period=10
> 
> 
> Where /usr/share/cyrus-ugent/cyrus-synccheck is a script that checks if
> sync_client is running. If not, it starts it.

Thanks, what a great idea.

Is it this script?

https://github.com/rgevaert/cyrus-ugent/blob/master/src/cyrus-synccheck
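
In case the link goes away, the idea Rudy describes can be sketched in a few lines. This is only an illustration in Python and not the contents of the script linked above. The function names are made up, the sync_client path is the one from my original mail, and it assumes a Linux-style `ps -eo args`. Whether sync_client has to be started as the cyrus user is something to check for your own setup.

```python
import subprocess

# Path as quoted in the original post; adjust for your installation.
SYNC_CLIENT = "/usr/lib/cyrus/bin/sync_client"


def is_rolling_sync_running(ps_output):
    """Return True if some command line in ps_output is 'sync_client ... -r'."""
    for line in ps_output.splitlines():
        argv = line.split()
        if argv and argv[0].endswith("sync_client") and "-r" in argv[1:]:
            return True
    return False


def read_process_list():
    """Return one full command line per running process (Linux procps ps)."""
    result = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    )
    return result.stdout


def start_sync_client():
    """Start a rolling sync_client in its own session (hypothetical helper)."""
    subprocess.Popen([SYNC_CLIENT, "-r"], start_new_session=True)


def ensure_running(ps_output, start):
    """Call start() if no rolling sync_client is listed; report whether it was called."""
    if is_rolling_sync_running(ps_output):
        return False
    start()
    return True


# Typical call from a script run by an EVENTS entry:
# ensure_running(read_process_list(), start_sync_client)
```

The process listing and the start action are passed in as arguments so the check itself can be tried against sample `ps` output without touching a live server.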

Ciao!
Marcus



Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus


Re: Strange load issue with 2.4.17

2014-10-14 Thread lst_hoe02


Hello,

I'm not a Cyrus expert, but to my knowledge a high %sys load means CPU
time is being spent in kernel space. One common cause is slow or
overloaded I/O, but that would show up in %wait, at least as long as
the I/O is making any progress at all. So from my point of view you
either have a hardware problem where the kernel is busy-waiting on
something it should not have to wait for, or you have hit a kernel bug
somewhere. That would also be in line with your observation that the
system is completely unresponsive while the number of processes, the
memory usage and %user are at normal levels or lower than expected.
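
To tell these cases apart while the problem is happening, it may help to sample /proc/stat twice and compare %sys, %iowait and the number of context switches in between. The following is a rough Python sketch, assuming a Linux kernel with the usual /proc/stat layout (an aggregate "cpu" line and a "ctxt" line). The function names are my own, and counting irq and softirq time as kernel time is a choice made for this sketch.

```python
import time


def parse_proc_stat(text):
    """Return (cpu_jiffies, context_switches) from /proc/stat content."""
    cpu = ctxt = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "cpu":      # aggregate line; per-core lines are cpu0, cpu1, ...
            cpu = [int(v) for v in fields[1:]]
        elif fields[0] == "ctxt":   # context switches since boot
            ctxt = int(fields[1])
    if cpu is None or ctxt is None:
        raise ValueError("no 'cpu' or 'ctxt' line found")
    return cpu, ctxt


def cpu_breakdown(before, after):
    """Compare two /proc/stat snapshots and return CPU percentages and the ctxt delta."""
    cpu0, ctxt0 = parse_proc_stat(before)
    cpu1, ctxt1 = parse_proc_stat(after)
    # Field order: user nice system idle iowait irq softirq steal.
    # Guest time is already included in user, so only eight fields are used.
    delta = [b - a for a, b in zip(cpu0, cpu1)][:8]
    delta += [0] * (8 - len(delta))
    total = sum(delta) or 1

    def pct(*indexes):
        return 100.0 * sum(delta[i] for i in indexes) / total

    return {
        "user": pct(0, 1),       # user + nice
        "sys": pct(2, 5, 6),     # system + irq + softirq
        "iowait": pct(4),
        "idle": pct(3),
        "ctxt": ctxt1 - ctxt0,   # context switches during the interval
    }


def sample(interval=5.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and summarise the difference."""
    with open(path) as f:
        before = f.read()
    time.sleep(interval)
    with open(path) as f:
        after = f.read()
    return cpu_breakdown(before, after)
```

A high "sys" value together with a low "iowait" and a large "ctxt" delta would match the pattern described above. A high "iowait" would point back to storage.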


Do you have any events in the kernel/system log or on the console at
the time the problem starts?


Regards

Andreas





Re: Strange load issue with 2.4.17

2014-10-14 Thread Michael Menge

Quoting Sebastian Hagedorn:


Hi,

--On 14 October 2014 12:45:42 +0200 Michael Menge wrote:



No, it's VMware ESX on Intel CPUs.


How is the memory usage? Is the system swapping?


the first time it happened (Monday a week ago), there was swapping  
and not enough memory. We have since increased the amount of RAM  
from 32 GB to 48 GB and now there is ample free memory at all times.  
You can see that from the screenshot in my first post.




Sorry, I missed that. What I have seen once or twice was a situation
where slow responses caused clients to open another connection and
send the request again, resulting in more I/O, slower responses,
and more clients reconnecting. AFAIR there was a high load (many processes)
but not much CPU usage. But I think that was on an i586 system with only
4 GB of RAM.

Regards,

  Michael


M.Menge                        Tel.: (49) 7071/29-70316
Universität Tübingen           Fax.: (49) 7071/29-5912
Zentrum für Datenverarbeitung  mail: michael.me...@zdv.uni-tuebingen.de
Wächterstraße 76
72074 Tübingen


Re: Strange load issue with 2.4.17

2014-10-14 Thread Sebastian Hagedorn

Hi,

--On 14 October 2014 12:45:42 +0200 Michael Menge wrote:



No, it's VMware ESX on Intel CPUs.


How is the memory usage? Is the system swapping?


the first time it happened (Monday a week ago), there was swapping and not 
enough memory. We have since increased the amount of RAM from 32 GB to 48 
GB and now there is ample free memory at all times. You can see that from 
the screenshot in my first post.


Cheers
Sebastian
--
   .:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
.:.Regionales Rechenzentrum (RRZK).:.
  .:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.


Re: Strange load issue with 2.4.17

2014-10-14 Thread Michael Menge

Hi,

Quoting Sebastian Hagedorn:

--On 13 October 2014 17:39:23 +0200 Simon Matter wrote:



--On 13 October 2014 17:35:25 +0200 Simon Matter wrote:


Hi,

for the last week we have seen strange load issues on our Cyrus server.
All of a sudden the load increases to several thousands, user CPU goes
down to basically zero, system CPU spikes. In the past we've had trouble
with poor I/O performance, but that went along with an increase in Wait
I/O. We don't see that now. vmstat shows a massive increase in context
switches. When the system reaches this state, all we can do is restart
Cyrus or reboot the machine if that doesn't work anymore.

I'm attaching a Ganglia screenshot that shows the problem clearly. When
the problem exists, there's not much we can do to analyze it. A colleague
suggested that what we see could be related to this bug:

https://bugzilla.cyrusimap.org/show_bug.cgi?id=3744

It was reported for 2.4.16, and it sounds as if it has been fixed, but is
that fix really part of 2.4.17? Any other ideas?


Is this a physical host or running virtualized?


It's virtualized, but it's been that way for more than a year.


Is this by any chance running on KVM, maybe on an AMD cpu?


No, it's VMware ESX on Intel CPUs.


How is the memory usage? Is the system swapping?



M.Menge                        Tel.: (49) 7071/29-70316
Universität Tübingen           Fax.: (49) 7071/29-5912
Zentrum für Datenverarbeitung  mail: michael.me...@zdv.uni-tuebingen.de
Wächterstraße 76
72074 Tübingen
