Tomasz Sterna <to...@xiaoka.com> writes:

> On Tue, 03.01.2017 at 23:35 -0500, Greg Troxel wrote:
>
>> Jabberd mostly works fine, but on boot sm crashes.  I have adjusted
>> sequencing, although in theory it should not matter
>
> Does 48125019 [1] fix your issue?
>
> [1] https://github.com/jabberd2/jabberd2/commit/48125019452e291b2c57275c789f3d7df87d7146

I applied the patch and rebuilt with -g.  I get the same behavior: on
first booting the machine (which starts jabberd), sm crashes.  If I then
run sm again, it starts fine and runs reliably indefinitely.  I am
running 2.4.0, with sqlite3 backend, on NetBSD 6 i386, built from pkgsrc
with gcc 4.5.

Reproduction recipe which may or may not work for you:
  send sm a HUP
  after that, log in

From the logs, sm connected to the router and disconnected 6 seconds
later.  That is when the user below (the first to connect) succeeded in
authenticating.  The same user was the second to authenticate when I
restarted sm seconds later.  The last lines of the sm log before the
crash were:

Thu Jan  5 06:44:30 2017 [notice] module 'iq-vcard' added to chain 'user-delete' (order 9 index 6 seq 2)
Thu Jan  5 06:44:30 2017 [notice] module 'iq-version' added to chain 'disco-extend' (order 0 index 17 seq 1)
Thu Jan  5 06:44:30 2017 [notice] module 'help' added to chain 'disco-extend' (order 1 index 18 seq 1)
Thu Jan  5 06:44:30 2017 [notice] reopening log ...
Thu Jan  5 06:44:30 2017 [notice] log started

I did notice this on startup:

Thu Jan  5 06:44:29 2017 [notice] module 'help' added to chain 'disco-extend' (order 1 index 18 seq 1)
Thu Jan  5 06:44:29 2017 [notice] version: jabberd sm 2.4.0
Thu Jan  5 06:44:29 2017 [notice] [example.com] configured
Thu Jan  5 06:44:29 2017 [notice] attempting connection to router at 127.0.0.1, port=5347
Thu Jan  5 06:44:29 2017 [notice] connection to router established
Thu Jan  5 06:44:29 2017 [notice] sm ready for sessions
Thu Jan  5 06:44:30 2017 [notice] HUP handled. reloading modules...
Thu Jan  5 06:44:30 2017 [notice] modules search path: /usr/pkg/lib/jabberd
Thu Jan  5 06:44:30 2017 [notice] module 'status' added to chain 'sess-start' (order 0 index 0 seq 0)
Thu Jan  5 06:44:30 2017 [notice] module 'status' added to chain 'sess-end' (order 0 index 0 seq 1)
Thu Jan  5 06:44:30 2017 [notice] module 'iq-last' added to chain 'sess-end' (order 1 index 1 seq 0)

The HUP is probably (wildly speculating) because the controlling tty of
the init script was the console, sm didn't detach, and when the init
scripts finished the tty was revoked to clean it up for console login.
But why the HUP happened is minor; the issue is the behavior when it
happened.  I don't see HUP in the router or c2s logs.  But now it makes
sense why it crashes on boot and not later.

After sm had been running for most of an hour, I sent it a HUP; it
reloaded modules and stayed up.  I then logged out and back in, and it
crashed on login, with the same "status" value showing up in c and h.

Here is the backtrace from the crash on boot, which is similar to one I
posted yesterday.  (I have replaced the JID string; but the issue seems
to be the first connection, not this particular user.)

#0  0x080635de in xhash_getx (h=0x74617473, key=0xbb7e334c "storage.path", len=12) at xhash.c:174
#1  0x0806364f in xhash_get (h=0x74617473, key=<optimized out>) at xhash.c:187
#2  0x0805c426 in config_get_one (c=0xbb102060, key=0xbb7e334c "storage.path", num=0) at config.c:280
#3  0xbb7e258b in storage_add_type (st=0xbb10b040, driver=0xbb119028 "sqlite", type=0xbb06cb6a "active") at storage.c:114
#4  0xbb7e2a44 in storage_get (st=0xbb10b040, type=0xbb06cb6a "active", owner=0xbb12fc80 "user@examplecom", filter=0x0, os=0xbf7fe258) at storage.c:239
#5  0xbb06ca23 in _active_user_load (mi=0xbb12f660, user=0xbb11b550) at mod_active.c:35

You can see that the argument h in frames 0 and 1 is suspect: it should
be a valid pointer, but 0x74617473 looks like ASCII text.  In frame 2,
config_get_one has a config_t c whose c->hash does contain that value.
But then I realized that (char *) &c->hash (the same address as c) is
the string "status".

(gdb) print c
$1 = (config_t) 0xbb102060
(gdb) print *c
$2 = {hash = 0x74617473, nad = 0xbb007375}
(gdb) x/s c
0xbb102060:      "status"
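
As a cross-check of the x/s output, here is a small sketch (not jabberd2
code; the function name is made up for illustration) that lays the two
words from $2 out in memory the way a little-endian i386 stores them and
reads the bytes back as a C string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store two 32-bit words in little-endian byte
 * order (as on i386) and copy the 8 resulting bytes, plus a terminating
 * NUL, into out.  out must hold at least 9 bytes. */
void words_to_string(uint32_t w0, uint32_t w1, char *out, size_t outlen)
{
    unsigned char bytes[8];
    size_t i;

    assert(outlen >= sizeof bytes + 1);
    for (i = 0; i < 4; i++) {
        /* least significant byte goes to the lowest address */
        bytes[i]     = (unsigned char)(w0 >> (8 * i));
        bytes[4 + i] = (unsigned char)(w1 >> (8 * i));
    }
    memcpy(out, bytes, sizeof bytes);
    out[sizeof bytes] = '\0';
}
```

Calling words_to_string(0x74617473, 0xbb007375, buf, sizeof buf) with a
9-byte buf leaves the C string "status" in buf: the bytes are 's' 't'
'a' 't' from hash, then 'u' 's' and a NUL from the low three bytes of
nad.  So the config_t memory has been overwritten with text, rather than
hash merely holding a bad pointer.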

In frame 3, the storage_t looks ok:

(gdb) print st
$3 = (storage_t) 0xbb10b040
(gdb) print *st
$4 = {config = 0xbb102060, log = 0xbb102068, drivers = 0xbb119000, types = 0xbb11a000, default_drv = 0xbb11b040}
(gdb) print *st->log
$5 = {type = log_FILE, file = 0xbb3d94c0}
(gdb) print *st->drivers
$6 = {p = 0xbb103080, prime = 101, dirty = 1, count = 1, zen = 0xbb122800, free_list = 0x0, iter_bucket = -1, iter_node = 0x0, stat = 0x0}
(gdb) print *st->types
$7 = {p = 0xbb1030c0, prime = 101, dirty = 0, count = 0, zen = 0xbb123000, free_list = 0x0, iter_bucket = -1, iter_node = 0x0, stat = 0x0}

My next step would be some guard for config_t, and to turn on the
existing guards.

But, I thought I would post what I have found out so far in case
something jumps out at somebody.  With no basis, I am suspicious of
util/config.c.
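
The kind of guard for config_t mentioned above could look roughly like
the following.  This is only a sketch under assumptions: struct
guarded_config, CFG_MAGIC and the guarded_config_* functions are
hypothetical names, not the real jabberd2 config_t or its API.  The idea
is a magic word at the start of the structure that is set on creation
and checked before each use.

```c
#include <stdint.h>
#include <stdlib.h>

/* Arbitrary marker value, chosen for this sketch only. */
#define CFG_MAGIC 0x43464721u

/* Hypothetical stand-in for a config structure; NOT jabberd2's config_t. */
struct guarded_config {
    uint32_t magic;   /* must equal CFG_MAGIC while the object is valid */
    void    *hash;
    void    *nad;
};

/* Allocate a zeroed structure and stamp it with the magic word. */
struct guarded_config *guarded_config_new(void)
{
    struct guarded_config *c = calloc(1, sizeof *c);
    if (c != NULL)
        c->magic = CFG_MAGIC;
    return c;
}

/* Return nonzero if c is non-NULL and still carries the magic word. */
int guarded_config_ok(const struct guarded_config *c)
{
    return c != NULL && c->magic == CFG_MAGIC;
}

/* Clear the magic word before releasing the memory. */
void guarded_config_free(struct guarded_config *c)
{
    if (c == NULL)
        return;
    c->magic = 0;
    free(c);
}
```

A lookup function would call guarded_config_ok(c) first and log or abort
if it fails.  With the memory overwritten by "status", the first word
would read 0x74617473 and the check would fail at the point of use,
before the bogus hash pointer is dereferenced.  It would not by itself
say who overwrote the memory.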
