We have a jabberd2 installation with LDAP client authentication, and the roster is published from Active Directory over LDAP. Our jabber service has stopped working a few times in the last few weeks in the middle of the day: clients (Pidgin) are disconnected and can't reconnect (or only partially reconnect, I'm not sure).
Up until today, I was able to resolve the problem by restarting the nonfunctional (but still running) sm component. This morning, however, I started getting these:

    router: Thu Oct 24 10:04:41 2013 [notice] [127.0.0.1, port=48998] error: XML parse error (not well-formed (invalid token))
    sm: Fri Oct 25 11:45:48 2013 [notice] error from router: XML parse error (junk after document element)

That could be related to bad data in AD, but I don't believe any users were changed by anyone but myself, and I don't see anything wrong.

In the process of trying to figure out whether it was a jabber bug or a data error, I ran it under valgrind, which detected (I believe) a buffer overrun or underrun in sx_can_read. That was a stripped binary, so I have no other details. I'm now trying to reproduce the problem on a second server (which also experienced the problem during the failure period), so far without success.

Intimate details follow:

Until last week, the server was running a Debian package of 2.2.17 that I created; it used Ubuntu's package as a template, plus a couple of changes I made. Since those changes are now either included upstream or otherwise unnecessary, I installed Ubuntu's vanilla 2.2.17 from their "saucy" suite. I believe the sm hangs (or whatever they are) happened both before and after the upgrade. Checking Ubuntu's patch, they only patch one C file, and that patch fixes a typo in an error message.

I found XML parse errors in earlier log files, but those were in c2s only. It's possible they came from a misconfigured client.

Also, we recently installed two new AD servers and retired the old one. It's possible that's not all working as intended.

Justin