Re: sm/router: XML parser error

2013-10-31 Thread Justin T Pryzby
On Wed, Oct 30, 2013 at 07:34:46PM +0100, Tomasz Sterna wrote:
> On Wed, 2013-10-30 at 10:28 -0700, Justin T Pryzby wrote:
> > Thanks, I got a debug log by specifying a  path in router.xml,
> > but it doesn't say "XML parser error"; am I missing something from
> > stderr that doesn't make it to the logfile?
> 
> This is logged in the normal log, not the debug log.
> Check the standard log to see when that happens and send me a few
> screens of the debug log from before that happened.
> It should help me debug and fix your issue.

Sorry, I don't think I understood your message until now; the string
"XML parser error" goes to the logfile, but the XML itself goes to
error.log.

Anyway, it seems we're dealing with two problems.  The first happened
only once: it threw the XML error and broke jabber for ~30 minutes,
including on a 2nd machine with the same config, which nobody but
myself attempted to connect to.  I'm currently unable to reproduce
that problem.

The second has been happening for the last several weeks and continues
to happen every day or so (it happened again this morning).  It has
always been resolved by restarting SM, and does not throw an XML
error.  Perhaps they have the same root cause; I'm not sure.  Do you
have any advice on how to track down this 2nd type of problem?

Thanks,
Justin




Re: sm/router: XML parser error

2013-10-30 Thread Tomasz Sterna
On Wed, 2013-10-30 at 10:28 -0700, Justin T Pryzby wrote:
> Thanks, I got a debug log by specifying a  path in router.xml,
> but it doesn't say "XML parser error"; am I missing something from
> stderr that doesn't make it to the logfile?

This is logged in the normal log, not the debug log.
Check the standard log to see when that happens and send me a few
screens of the debug log from before that happened.
It should help me debug and fix your issue.
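
A minimal sketch of that lookup in Python, for illustration only.  It
assumes both logs start each entry with the 24-character timestamp seen
in this thread (e.g. "Fri Oct 25 11:45:48 2013") and that the locale is
English; the debug log layout is an assumption, and the function names
are hypothetical.

```python
from datetime import datetime

STAMP_LEN = 24  # length of e.g. "Fri Oct 25 11:45:48 2013"
STAMP_FMT = "%a %b %d %H:%M:%S %Y"


def parse_stamp(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    try:
        return datetime.strptime(line[:STAMP_LEN], STAMP_FMT)
    except ValueError:
        return None


def error_times(log_lines, needle="XML parse error"):
    """Timestamps of standard-log lines that mention the error."""
    times = []
    for line in log_lines:
        stamp = parse_stamp(line)
        if stamp is not None and needle in line:
            times.append(stamp)
    return times


def debug_context(debug_lines, when, count=100):
    """Last `count` debug-log lines written at or before `when`."""
    cutoff = 0
    for i, line in enumerate(debug_lines):
        stamp = parse_stamp(line)
        if stamp is not None and stamp <= when:
            cutoff = i + 1  # one past the newest line that is not after `when`
    return debug_lines[max(0, cutoff - count):cutoff]
```

Feeding each timestamp from error_times() into debug_context() gives
the "few screens" of debug output leading up to each error.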





Re: sm/router: XML parser error

2013-10-30 Thread Justin T Pryzby
Thanks, I got a debug log by specifying a  path in router.xml,
but it doesn't say "XML parser error"; am I missing something from
stderr that doesn't make it to the logfile?

Or, if that file has everything, could I mail you the relevant section
from around this morning's failure?  I can't find anything obviously
broken.

Justin

On Sun, Oct 27, 2013 at 10:03:34PM +0100, Tomasz Sterna wrote:
> On Fri, 2013-10-25 at 14:17 -0700, Justin T Pryzby wrote:
> > router:
> > Fri Oct 25 11:45:48 2013 [notice] error from router: XML parse error
> > (junk after document element)
> > That could be related to bad data in AD, but I don't believe any users
> > were changed by anyone but myself, and don't see anything wrong.
> 
> Here's what I do to debug such obscure issues:
> 
> 1. rebuild jabberd with --enable-debug
> 2. run router and sm under screen with -D enabled
> 3. wait for a crash
> 
> Usually there is an offending stanza visible right after I reattach to
> the screen of the crashed process.
> 
> I've caught several parser/serializer bugs using this method. 




Re: sm/router: XML parser error

2013-10-27 Thread Tomasz Sterna
On Fri, 2013-10-25 at 14:17 -0700, Justin T Pryzby wrote:
> router:
> Fri Oct 25 11:45:48 2013 [notice] error from router: XML parse error
> (junk after document element)
> That could be related to bad data in AD, but I don't believe any users
> were changed by anyone but myself, and don't see anything wrong.

Here's what I do to debug such obscure issues:

1. rebuild jabberd with --enable-debug
2. run router and sm under screen with -D enabled
3. wait for a crash

Usually there is an offending stanza visible right after I reattach to
the screen of the crashed process.

I've caught several parser/serializer bugs using this method. 
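
If leaving a screen session open is not practical, the same idea (keep
the last output of a component so the offending stanza is still visible
after it dies) can be approximated with a small wrapper.  This is only a
sketch, not part of jabberd: run_and_keep_tail is a hypothetical helper,
and the command line passed to it is whatever you would otherwise type
inside screen.

```python
import subprocess
import sys
from collections import deque


def run_and_keep_tail(argv, keep=200):
    """Run argv, echo its output, and return (exit_code, last `keep` lines)."""
    tail = deque(maxlen=keep)  # ring buffer: older lines fall off the front
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so nothing is missed
        text=True,
        errors="replace",  # a malformed stanza may not be valid UTF-8
    )
    for line in proc.stdout:
        sys.stdout.write(line)
        tail.append(line.rstrip("\n"))
    # On POSIX, a negative exit code means the process was killed by a signal.
    return proc.wait(), list(tail)


# Hypothetical usage (adjust the binary path to your installation):
#   code, tail = run_and_keep_tail(["sm", "-D"])
#   print("sm exited with", code)
#   print("\n".join(tail))
```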


-- 
Tomasz Sterna                          :(){ :|:&};:
Instant Messaging Consultant           Open Source Developer
http://abadcafe.pl/                    http://www.xiaoka.com/portfolio





sm/router: XML parser error

2013-10-25 Thread Justin T Pryzby
We have a jabberd2 installation with LDAP client authentication, and
the roster published from Active Directory LDAP.  Our jabber service
has stopped working a few times in the last few weeks in the middle of
the day; clients (Pidgin) are disconnected, and can't reconnect (or
only partially reconnect, not sure).

Up until today, I was able to resolve the problem by restarting the
nonfunctional (but running) sm component.  This morning, however, I
started getting these:

router:
Thu Oct 24 10:04:41 2013 [notice] [127.0.0.1, port=48998] error: XML
parse error (not well-formed (invalid token))
sm:
Fri Oct 25 11:45:48 2013 [notice] error from router: XML parse error
(junk after document element)

That could be related to bad data in AD, but I don't believe any users
were changed by anyone but myself, and don't see anything wrong.
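
For reference, the two messages above match error strings produced by
the expat XML parser.  Assuming the parser in use is expat, the
following Python sketch (standard library only; parse_error is a
hypothetical helper) shows what kind of input triggers each one.

```python
import xml.parsers.expat


def parse_error(data):
    """Return expat's error string for `data`, or None if it parses cleanly."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True: this is the final chunk of input
    except xml.parsers.expat.ExpatError as exc:
        return xml.parsers.expat.ErrorString(exc.code)
    return None


# A single well-formed document parses without error.
print(parse_error(b"<a/>"))
# A second top-level element after the root has closed:
# "junk after document element"
print(parse_error(b"<a/><b/>"))
# A raw control byte inside the document:
# "not well-formed (invalid token)"
print(parse_error(b"<a>\x01</a>"))
```

So "invalid token" points at bytes that are not legal XML, while "junk
after document element" points at extra data arriving after the
top-level element was closed.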

In the process of trying to figure out if it was a jabber bug or a
data error, I ran under valgrind, which detected (I believe) a buffer
overrun or underrun in sx_can_read.  That was a stripped binary, so I
have no other details.

I'm trying to reproduce the problem now on a 2nd server (which also
experienced the problem during the failure period), so far without
success.

Intimate details follow:

Until last week, the server was running a Debian package I created of
2.2.17; that used Ubuntu's package as a template, plus a couple of
changes I made.  Since those changes are now either included upstream,
or otherwise unnecessary, I installed Ubuntu's vanilla 2.2.17 from
their "saucy" suite.  I believe the sm hangs (or whatever) happened
before and after the upgrade.  Checking Ubuntu's patch, they only
patch one C file, which fixes a typo in an error message.

I found XML parse errors in earlier log files, but this time in c2s
(only).  It's possible that's from a misconfigured client.

Also, we recently installed two new AD servers, and retired the old
one.  It's possible that's not all working as intended.  

Justin