We run jabberd2 with LDAP client authentication and a roster
published from Active Directory over LDAP.  Our Jabber service has
stopped working several times in the last few weeks, in the middle of
the day: clients (Pidgin) are disconnected and can't reconnect (or
reconnect only partially; I'm not sure which).

Up until today, I was able to resolve the problem by restarting the
nonfunctional (but running) sm component.  This morning, however, I
started getting these:

router:
Thu Oct 24 10:04:41 2013 [notice] [127.0.0.1, port=48998] error: XML
parse error (not well-formed (invalid token))
sm:
Fri Oct 25 11:45:48 2013 [notice] error from router: XML parse error
(junk after document element)

That could be related to bad data in AD, but I don't believe any users
were changed by anyone but me, and I don't see anything wrong.
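
One way to test the bad-data theory would be to scan the AD attribute
values that feed the roster for characters that XML 1.0 does not
allow, since an XML parser would typically reject those.  Below is a
minimal Python sketch, assuming the values have already been exported
as strings.  The function names, the dict layout and the sample data
are all hypothetical; none of this is part of jabberd2, and it covers
only one possible cause of a parse error.

```python
import re

# Matches any character outside the XML 1.0 "Char" production
# (tab, LF, CR, U+0020-U+D7FF, U+E000-U+FFFD, U+10000-U+10FFFF).
ILLEGAL_XML_CHAR = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_illegal_xml_chars(value):
    """Return (offset, code point) pairs for characters XML 1.0 forbids."""
    return [(m.start(), ord(m.group())) for m in ILLEGAL_XML_CHAR.finditer(value)]


def suspicious_attributes(entries):
    """Yield (entry name, attribute, hits) for values with illegal characters.

    `entries` maps an entry name (for example a DN) to a dict of
    attribute name -> string value; this layout is an assumption.
    """
    for name, attrs in entries.items():
        for attr, value in attrs.items():
            hits = find_illegal_xml_chars(value)
            if hits:
                yield name, attr, hits


if __name__ == "__main__":
    sample = {  # made-up data, not taken from any real directory
        "cn=Jane Doe": {"displayName": "Jane\x0bDoe", "mail": "jane@example.com"},
        "cn=José": {"displayName": "José"},
    }
    for name, attr, hits in suspicious_attributes(sample):
        print(name, attr, hits)
```

Accented letters pass this check; only control characters and other
code points outside the XML character range are reported.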

While trying to figure out whether this is a jabberd bug or a data
error, I ran it under valgrind, which detected (I believe) a buffer
overrun or underrun in sx_can_read.  The binary was stripped, so I
have no further details.

I'm now trying to reproduce the problem on a second server (which also
experienced the problem during the failure period), so far without
success.

Intimate details follow:

Until last week, we were running a Debian package of 2.2.17 that I
built, using Ubuntu's package as a template plus a couple of changes
of my own.  Since those changes are now either included upstream or
otherwise unnecessary, I installed Ubuntu's vanilla 2.2.17 from their
"saucy" suite.  I believe the sm hangs (or whatever they are) happened
both before and after the upgrade.  I checked Ubuntu's patches: they
only patch one C file, to fix a typo in an error message.

I also found XML parse errors in earlier log files, but those were in
c2s only.  They may come from a misconfigured client.
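
To line these earlier c2s errors up against the router and sm errors,
the log files could be scanned for XML parse errors with their
timestamps.  Below is a rough Python sketch, assuming each log entry
sits on a single line in the format quoted earlier (the entries in
this mail were wrapped) and that weekday and month names are in
English.  The function name is hypothetical.

```python
import re
from datetime import datetime

# Timestamp, then [level], then a message mentioning an XML parse error.
ENTRY = re.compile(
    r"^(\w{3} \w{3} [ \d]\d \d\d:\d\d:\d\d \d{4}) \[(\w+)\] (.*XML parse error.*)$"
)


def xml_parse_errors(lines):
    """Return (timestamp, level, message) for each XML parse error entry."""
    hits = []
    for line in lines:
        m = ENTRY.match(line.rstrip("\n"))
        if m:
            # %a and %b are locale-dependent; this assumes the C/English locale.
            when = datetime.strptime(m.group(1), "%a %b %d %H:%M:%S %Y")
            hits.append((when, m.group(2), m.group(3)))
    return hits
```

Running this over each component's log file in turn and sorting the
combined results by timestamp would show whether the c2s errors
cluster around the times sm stopped responding.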

Also, we recently installed two new AD servers and retired the old
one.  It's possible that not everything there is working as intended.

Justin

