Hi

I reproduced the problem using the install-ns.sh script running under gdb. 
Here's the output of backtrace and bt full. I'm new to using gdb so please let 
me know if you'd like to see some other info.

[15/Aug/2023:13:56:52][13147.7fffe35fe640][-driver:nsssl:0-] Notice: ... 
sockAccept accepted 2 connections
free(): invalid next size (fast)
Thread 4 "nsd" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff4ad8640 (LWP 13651)]
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140737298400832) at 
./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.

(gdb) backtrace
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737016493632) 
at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140737016493632) at 
./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140737016493632, signo=signo@entry=6) at 
./nptl/pthread_kill.c:89
#3  0x00007ffff7c7d476 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/posix/raise.c:26
#4  0x00007ffff7c637f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x00007ffff7cc46f6 in __libc_message (action=action@entry=do_abort, 
fmt=fmt@entry=0x7ffff7e16b8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#6  0x00007ffff7cdbd7c in malloc_printerr (str=str@entry=0x7ffff7e19230 
"munmap_chunk(): invalid pointer") at ./malloc/malloc.c:5664
#7  0x00007ffff7cdc05c in munmap_chunk (p=<optimized out>) at 
./malloc/malloc.c:3060
#8  0x00007ffff7ce051a in __GI___libc_free (mem=<optimized out>) at 
./malloc/malloc.c:3381
#9  0x00007ffff7bdb1e5 in ns_free (ptr=0x7fffd4de0ba0) at memory.c:94
#10 0x00007ffff7f09b64 in Ns_SetFree (set=0x7fffd5886210) at set.c:397
#11 0x00007ffff7f3e119 in NsTclSetObjCmd (clientData=0x7fffd403d590, 
interp=0x7fffd4005250, objc=2, objv=0x7fffd453a510) at tclset.c:330
#12 0x00007ffff79cb18e in Dispatch (data=0x7fffd410e3b8, interp=0x7fffd4005250, 
result=0) at /usr/local/src/tcl8.6.13/generic/tclBasic.c:4467
#13 0x00007ffff79cb21f in TclNRRunCallbacks (interp=0x7fffd4005250, result=0, 
rootPtr=0x0) at /usr/local/src/tcl8.6.13/generic/tclBasic.c:4503
#14 0x00007ffff79ca949 in Tcl_EvalObjv (interp=0x7fffd4005250, objc=1, 
objv=0x7fffd453a2b0, flags=2097168) at 
/usr/local/src/tcl8.6.13/generic/tclBasic.c:4226
#15 0x00007ffff79cd384 in TclEvalEx (interp=0x7fffd4005250, 
script=0x7fffe3dfe880 "ns_cleanup", numBytes=10, flags=0, line=1, 
clNextOuter=0x0,
    outerScript=0x7fffe3dfe880 "ns_cleanup") at 
/usr/local/src/tcl8.6.13/generic/tclBasic.c:5372
#16 0x00007ffff79cc5d9 in Tcl_EvalEx (interp=0x7fffd4005250, 
script=0x7fffe3dfe880 "ns_cleanup", numBytes=10, flags=0) at 
/usr/local/src/tcl8.6.13/generic/tclBasic.c:5037
#17 0x00007ffff7f18c02 in Ns_TclEvalCallback (interp=0x7fffd4005250, 
cbPtr=0x5555556a1b30, resultDString=0x0) at tclcallbacks.c:186
#18 0x00007ffff7f29764 in NsTclTraceProc (interp=0x7fffd4005250, 
arg=0x5555556a1b30) at tclinit.c:1913
#19 0x00007ffff7f2a158 in RunTraces (itPtr=0x7fffd403d590, 
why=NS_TCL_TRACE_DEALLOCATE) at tclinit.c:2375
#20 0x00007ffff7f29976 in PushInterp (itPtr=0x7fffd403d590) at tclinit.c:2026
#21 0x00007ffff7f29717 in NsFreeConnInterp (connPtr=0x55555562ebd0) at 
tclinit.c:1885
#22 0x00007ffff7efdf11 in ConnRun (connPtr=0x55555562ebd0) at queue.c:2648
#23 0x00007ffff7efd0de in NsConnThread (arg=0x555555649030) at queue.c:2211
#24 0x00007ffff7bdd734 in NsThreadMain (arg=0x55555855cdc0) at thread.c:232
#25 0x00007ffff7bdf6f5 in ThreadMain (arg=0x55555855cdc0) at pthread.c:870
#26 0x00007ffff7ccfb43 in start_thread (arg=<optimized out>) at 
./nptl/pthread_create.c:442
#27 0x00007ffff7d61a00 in clone3 () at 
../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

(gdb) bt full
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737016493632) 
at ./nptl/pthread_kill.c:44
        tid = <optimized out>
        ret = 0
        pd = 0x7fffe3dff640
        old_mask = {__val = {140737016487840, 140736755639472, 3823099840, 512, 
140737016487920, 140737348535152, 140736755639488, 140736750178896, 
140736755638224,
            140733193388032, 140736750163408, 140736755639472, 140736755639432, 
140736752725904, 93825010219632, 93825010219632}}
        ret = <optimized out>
        pd = <optimized out>
        old_mask = <optimized out>
        ret = <optimized out>
        tid = <optimized out>
        ret = <optimized out>
        resultvar = <optimized out>
        resultvar = <optimized out>
        __arg3 = <optimized out>
        __arg2 = <optimized out>
        __arg1 = <optimized out>
        _a3 = <optimized out>
        _a2 = <optimized out>
        _a1 = <optimized out>
        __futex = <optimized out>
        resultvar = <optimized out>
        __arg3 = <optimized out>
        __arg2 = <optimized out>
        __arg1 = <optimized out>
        _a3 = <optimized out>
        _a2 = <optimized out>
        _a1 = <optimized out>
        __futex = <optimized out>
        __private = <optimized out>
        __oldval = <optimized out>
        result = <optimized out>
#1  __pthread_kill_internal (signo=6, threadid=140737016493632) at 
./nptl/pthread_kill.c:78
No locals.
#2  __GI___pthread_kill (threadid=140737016493632, signo=signo@entry=6) at 
./nptl/pthread_kill.c:89
No locals.
#3  0x00007ffff7c7d476 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/posix/raise.c:26
        ret = <optimized out>
#4  0x00007ffff7c637f3 in __GI_abort () at ./stdlib/abort.c:79
        save_stage = 1
        act = {__sigaction_handler = {sa_handler = 0x600000004, sa_sigaction = 
0x600000004}, sa_mask = {__val = {140736789161264, 140733193388042, 
140737347688968,
              140737016488736, 279037356156, 18446744073709551615, 
140736755314240, 140737016488272, 140733193388033, 140736792190256, 
140736790551568, 0, 140736755639936,
              93825035611088, 140736753589312, 140736756049120}}, sa_flags = 
1487610384, sa_restorer = 0x1}
        sigs = {__val = {32, 140737350793296, 140737488347040, 140737350862035, 
93824993127520, 140736755639576, 8589934656, 93825010219632, 25769803776, 
193273528320,
            140737016488160, 140737349119905, 3823100240, 4294967296, 
2202846355952, 3556773632}}
#5  0x00007ffff7cc46f6 in __libc_message (action=action@entry=do_abort, 
fmt=fmt@entry=0x7ffff7e16b8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
        ap = {{gp_offset = 24, fp_offset = 0, overflow_arg_area = 
0x7fffe3dfe2a0, reg_save_area = 0x7fffe3dfe230}}
        fd = <optimized out>
        list = <optimized out>
        nlist = <optimized out>
        cp = <optimized out>
#6  0x00007ffff7cdbd7c in malloc_printerr (str=str@entry=0x7ffff7e19230 
"munmap_chunk(): invalid pointer") at ./malloc/malloc.c:5664
No locals.
#7  0x00007ffff7cdc05c in munmap_chunk (p=<optimized out>) at 
./malloc/malloc.c:3060
        pagesize = <optimized out>
        size = <optimized out>
        __PRETTY_FUNCTION__ = "munmap_chunk"
        mem = <optimized out>
        block = <optimized out>
        total_size = <optimized out>
#8  0x00007ffff7ce051a in __GI___libc_free (mem=<optimized out>) at 
./malloc/malloc.c:3381
        ar_ptr = <optimized out>
        p = <optimized out>
        err = 25
#9  0x00007ffff7bdb1e5 in ns_free (ptr=0x7fffd4de0ba0) at memory.c:94
No locals.
#10 0x00007ffff7f09b64 in Ns_SetFree (set=0x7fffd5886210) at set.c:397
        i = 10
        __PRETTY_FUNCTION__ = "Ns_SetFree"
#11 0x00007ffff7f3e119 in NsTclSetObjCmd (clientData=0x7fffd403d590, 
interp=0x7fffd4005250, objc=2, objv=0x7fffd453a510) at tclset.c:330
        key = 0x7fffd464eb50 "d8"
        itPtr = 0x7fffd403d590
        set = 0x7fffd5886210
        ds = {string = 0x7fffd6650c80 "%", length = -738176432, spaceAvl = 
32767,
          staticSpace = 
"\320\344\337\343\377\177\000\000\240\236t\336\377\177\000\000\260\356\004\324\377\177\000\000PZ\000\324\377\177\000\000\000\345\337\343\377\177\000\000\312Ĝ\367\377\177\000\000\360\357Y\324\377\177\000\000\000\000\000\000\000\000\000\000\200\fe\326\377\177\000\000PR\000\324\377\177\000\000\000\000\000\000\001\000\000\000PR\000\324\377\177\000\000PZ\000\324\377\177\000\000\260\356\004\324\377\177\000\000\300\345\337\343\377\177\000\000Э\234\367\377\177\000\000`\345\337\343\377\177\000\000p\203\252\367\000\000\000\000PR\000\324\377\177\000\000H\003Z\324\377\177\000\000\000\017\000\324\377\177\000\000\000\000\000\000\020\000
 
\000\002\000\000\000\377\177\000\000\260\356\004\324\377\177\000\000\000\000\000\000\000\000\000"}
        tablePtr = 0x7fffd403d760
        hPtr = 0x7fffd464eb30
        search = {tablePtr = 0x7fffd403d760, nextIndex = 13, nextEntryPtr = 0x0}
        opt = 1
        result = 0
        opts = {0x7ffff7f89745 "array", 0x7ffff7f8974b "cleanup", 
0x7ffff7f89753 "copy", 0x7ffff7f89758 "cput", 0x7ffff7f8975d "create", 
0x7ffff7f89764 "delete", 0x7ffff7f8976b "delkey", 0x7ffff7f89772 "find", 
0x7ffff7f89777 "free", 0x7ffff7f8977c "get", 0x7ffff7f89780 "icput", 
0x7ffff7f89786 "idelkey", 0x7ffff7f8978e "ifind", 0x7ffff7f89794 "iget", 
0x7ffff7f89799 "imerge", 0x7ffff7f897a0 "isnull", 0x7ffff7f897a7 "iunique", 
0x7ffff7f897af "iupdate", 0x7ffff7f897b7 "key", 0x7ffff7f897bb "keys", 
0x7ffff7f897c0 "list", 0x7ffff7f897c5 "merge", 0x7ffff7f897cb "move", 
0x7ffff7f897d0 "name", 0x7ffff7f897d5 "new", 0x7ffff7f897d9 "print", 
0x7ffff7f897df "put", 0x7ffff7f897e3 "size", 0x7ffff7f897e8 "split", 
0x7ffff7f897ee "truncate", 0x7ffff7f897f7 "unique", 0x7ffff7f897fe "update", 
0x7ffff7f89805 "value", 0x7ffff7f8980b "values", 0x0}
        SArrayIdx = SArrayIdx
        SCleanupIdx = SCleanupIdx
        SCopyIdx = SCopyIdx
        SCPutIdx = SCPutIdx
        SCreateidx = SCreateidx
        SDeleteIdx = SDeleteIdx
        SDelkeyIdx = SDelkeyIdx
        SFindIdx = SFindIdx
        SFreeIdx = SFreeIdx
        SGetIdx = SGetIdx
        SICPutIdx = SICPutIdx
        SIDelkeyIdx = SIDelkeyIdx
        SIFindIdx = SIFindIdx
        SIGetIdx = SIGetIdx
        SIMergeIdx = SIMergeIdx
        SIsNullIdx = SIsNullIdx
        SIUniqueIdx = SIUniqueIdx
        SIUpdateIdx = SIUpdateIdx
        SKeyIdx = SKeyIdx
        SKeysIdx = SKeysIdx
        SListIdx = SListIdx
        SMergeIdx = SMergeIdx
        SMoveIdx = SMoveIdx
        sINameIdx = sINameIdx
        SNewIdx = SNewIdx
        SPrintIdx = SPrintIdx
        SPutIdx = SPutIdx
        SSizeIdx = SSizeIdx
        SSplitIdx = SSplitIdx
        STruncateIdx = STruncateIdx
        SUniqueIdx = SUniqueIdx
        SUpdateIdx = SUpdateIdx
        SValueIdx = SValueIdx
        SValuesIdx = SValuesIdx
        __PRETTY_FUNCTION__ = "NsTclSetObjCmd"
#12 0x00007ffff79cb18e in Dispatch (data=0x7fffd410e3b8, interp=0x7fffd4005250, 
result=0) at /usr/local/src/tcl8.6.13/generic/tclBasic.c:4467
        objProc = 0x7ffff7f3df2d <NsTclSetObjCmd>
        clientData = 0x7fffd403d590
        objc = 2
        objv = 0x7fffd453a510
        iPtr = 0x7fffd4005250
#13 0x00007ffff79cb21f in TclNRRunCallbacks (interp=0x7fffd4005250, result=0, 
rootPtr=0x0) at /usr/local/src/tcl8.6.13/generic/tclBasic.c:4503
        callbackPtr = 0x7fffd410e3b0
        procPtr = 0x7ffff79cb10e <Dispatch>
        iPtr = 0x7fffd4005250
#14 0x00007ffff79ca949 in Tcl_EvalObjv (interp=0x7fffd4005250, objc=1, 
objv=0x7fffd453a2b0, flags=2097168) at 
/usr/local/src/tcl8.6.13/generic/tclBasic.c:4226
        result = 0
        rootPtr = 0x0
#15 0x00007ffff79cd384 in TclEvalEx (interp=0x7fffd4005250, 
script=0x7fffe3dfe880 "ns_cleanup", numBytes=10, flags=0, line=1, 
clNextOuter=0x0, outerScript=0x7fffe3dfe880 "ns_cleanup") at 
/usr/local/src/tcl8.6.13/generic/tclBasic.c:5372
        wordLine = 1
        wordCLNext = 0x0
        objectsNeeded = 1
        wordStart = 0x7fffe3dfe880 "ns_cleanup"
        numWords = 1
        iPtr = 0x7fffd4005250
        p = 0x7fffe3dfe880 "ns_cleanup"
        next = 0x1e3dfe820 <error: Cannot access memory at address 0x1e3dfe820>
        minObjs = 20
        objv = 0x7fffd453a2b0
        objvSpace = 0x7fffd453a2b0
        expand = 0x7fffd453a360
        lines = 0x7fffd453a3c0
        lineSpace = 0x7fffd453a3c0
        tokenPtr = 0x7fffd453a090
        commandLength = 32767
        bytesLeft = 10
        expandRequested = 0
        code = 0
        savedVarFramePtr = 0x7fffd4001550
        allowExceptions = 0
        gotParse = 1
        i = 3823101680
        objectsUsed = 1
        parsePtr = 0x7fffd453a000
        eeFramePtr = 0x7fffd453a250
        stackObjArray = 0x7fffd453a2b0
        expandStack = 0x7fffd453a360
        linesStack = 0x7fffd453a3c0
        clNext = 0x0
#16 0x00007ffff79cc5d9 in Tcl_EvalEx (interp=0x7fffd4005250, 
script=0x7fffe3dfe880 "ns_cleanup", numBytes=10, flags=0) at 
/usr/local/src/tcl8.6.13/generic/tclBasic.c:5037
No locals.
#17 0x00007ffff7f18c02 in Ns_TclEvalCallback (interp=0x7fffd4005250, 
cbPtr=0x5555556a1b30, resultDString=0x0) at tclcallbacks.c:186
        arg = 0x0
        ii = 0
        ap = {{gp_offset = 32, fp_offset = 48, overflow_arg_area = 
0x7fffe3dfea10, reg_save_area = 0x7fffe3dfe950}}
        ds = {string = 0x7fffe3dfe880 "ns_cleanup", length = 10, spaceAvl = 
200, staticSpace = 
"ns_cleanup\000\367\377\177\000\000\300\350\337\343\377\177\000\000P\351\337\343\377\177\000\000\210\277jUUU\000\000@\351\337\343\377\177\000\000\000\351\337\343\377\177\000\000`\354bU\001\001\001\000\340\350\337\343\377\177\000\000\360\350\337\343\377\177\000\000\020\351\337\343\377\177\000\000\270\277jUUU\000\000\020\351\337\343\377\177\000\000\332\356\275\367\377\177\000\000\223z\333d\000\000\000\000
 
\300jUUU\000\000\000\000\000\000\000\000\000\000\270\277jU\005\000\000\000\220\351\337\343\377\177\000\000\023\275\275\367\377\177\000\000\060\352\337\343\377\177\000\000¢\362\367\377\177\000\000\223z\333d\000\000\000\000P\340\025\324\b\000\000\000\220\033jUUU\000"}
        deallocInterp = false
        status = 1
        __PRETTY_FUNCTION__ = "Ns_TclEvalCallback"
#18 0x00007ffff7f29764 in NsTclTraceProc (interp=0x7fffd4005250, 
arg=0x5555556a1b30) at tclinit.c:1913
        cbPtr = 0x5555556a1b30
        result = 0
#19 0x00007ffff7f2a158 in RunTraces (itPtr=0x7fffd403d590, 
why=NS_TCL_TRACE_DEALLOCATE) at tclinit.c:2375
        tracePtr = 0x5555556a1b90
        servPtr = 0x555555628560
        __PRETTY_FUNCTION__ = "RunTraces"
#20 0x00007ffff7f29976 in PushInterp (itPtr=0x7fffd403d590) at tclinit.c:2026
        interp = 0x7fffd4005250
        ok = true
        __PRETTY_FUNCTION__ = "PushInterp"
#21 0x00007ffff7f29717 in NsFreeConnInterp (connPtr=0x55555562ebd0) at 
tclinit.c:1885
        itPtr = 0x7fffd403d590
#22 0x00007ffff7efdf11 in ConnRun (connPtr=0x55555562ebd0) at queue.c:2648
        sockPtr = 0x7fffd98f68a0
        conn = 0x55555562ebd0
        servPtr = 0x555555628560
        status = NS_OK
        auth = 0x0
        __PRETTY_FUNCTION__ = "ConnRun"
#23 0x00007ffff7efd0de in NsConnThread (arg=0x555555649030) at queue.c:2211
        argPtr = 0x555555649030
        poolPtr = 0x55555562d7c0
        servPtr = 0x555555628560
        connPtr = 0x55555562ebd0
        wait = {sec = 1692105481, usec = 312006}
        timePtr = 0x7fffe3dfec20
        threadId = 1
        duringShutdown = 219
        fromQueue = true
        cpt = 1000
        ncons = 996
        current = 2
        status = NS_OK
        timeout = {sec = 120, usec = 0}
        exitMsg = 0x7fffd4000b70 ""
        joinThread = 0x7fffe3dff640
        threadsLockPtr = 0x55555562d830
        tqueueLockPtr = 0x55555562d878
        wqueueLockPtr = 0x55555562d808
        __PRETTY_FUNCTION__ = "NsConnThread"
#24 0x00007ffff7bdd734 in NsThreadMain (arg=0x55555855cdc0) at thread.c:232
        thrPtr = 0x55555855cdc0
#25 0x00007ffff7bdf6f5 in ThreadMain (arg=0x55555855cdc0) at pthread.c:870
No locals.
#26 0x00007ffff7ccfb43 in start_thread (arg=<optimized out>) at 
./nptl/pthread_create.c:442
        ret = <optimized out>
        pd = <optimized out>
        out = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140737488346688, 
-3886469656811452993, 140737016493632, 0, 140737350793296, 140737488347040, 
3886531503754790335, 3886487635365545407}, mask_was_saved = 0}}, priv = {pad = 
{0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#27 0x00007ffff7d61a00 in clone3 () at 
../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.
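
An interactive gdb session pages long output, so a full trace is easier to capture non-interactively. What follows is only a minimal sketch, assuming gdb is installed: the helper name write_gdb_script is hypothetical, and the nsd path, config file path and the -u user shown in the comments are assumptions that need adjusting to the actual install.

```shell
#!/bin/sh
# Hypothetical helper: write a gdb command file that runs the program
# and prints backtraces without stopping at the pager prompt.
write_gdb_script() {
    cat > "$1" <<'EOF'
set pagination off
run
bt full
thread apply all bt
EOF
}

# Example use (paths and the -u user are assumptions; adjust to your install):
#   write_gdb_script /tmp/nsd-bt.gdb
#   gdb -batch -x /tmp/nsd-bt.gdb --args \
#       /usr/local/ns/bin/nsd -f -u nsadmin -t /usr/local/ns/conf/openacs-config.tcl \
#       > /tmp/nsd-bt.txt 2>&1
```

In batch mode gdb executes the commands in order, so the two backtrace commands run once the process has stopped on the fatal signal, and the redirected file can be attached to a mail as-is.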


thanks
Brian

________________________________
From: Brian Fenton <brian.fen...@aimssoftware.ie>
Sent: Monday 14 August 2023 5:40 pm
To: naviserver-devel@lists.sourceforge.net 
<naviserver-devel@lists.sourceforge.net>
Subject: Re: [naviserver-devel] Crashing on all versions >4.99.24 on Ubuntu

Hi Gustaf

thanks again for the advice. Today I made some more progress on this. There 
do appear to be some differences between your script and the Oupfiz5 
installer, e.g. his ns-build.sh script:
https://github.com/oupfiz5/tcl-build/blob/master/src/builds/ns-build.sh
I have reached the conclusion that I would be wasting your time if I can't 
reproduce this problem using your scripts, so my next task will be to run your 
script and try to reproduce it. I am now seeing the downsides of using a 
non-official Docker approach!

Today I took the approach of installing (through the APM) our OpenACS packages 
one by one. For example, we use packages such as Categories, General Comments, 
etc., as well as many of our own custom packages. After each package I bounced 
Naviserver and tested the site. The system worked perfectly until after I 
installed the last package, which is the main core of our product, very large 
and old with a lot of features. This makes me very confident that Oracle and 
nsoracle are working fine. The problem could be some API call in our custom 
package that may have changed in 4.99.25.

To answer some of your questions:

  *   did you run at this state any Oracle queries? Yes, I did. I'm 95% 
confident that Oracle and nsoracle are working fine.
  *   did you recompile in the "clean install" also the oracle driver? Yes, I'm 
building nsoracle from scratch (I am also running the same version of nsoracle 
in the 4.99.24 build that is working without issue)
  *   you mean the crash happens in the plain openacs-config.tcl, with no 
additional drivers etc, no oracle involved? No, this does use Oracle, sorry for 
not being clear. We have our own heavily modified config file, so I wanted to 
rule that out by using the openacs-config.tcl that you provide. I just changed 
the database to Oracle and left everything else as is. The fact that it crashed 
too means that I can eliminate some strange configuration setting in our custom 
config file as a possible cause.
  *   My request in the last mail was to try to reproduce the problem with 
nsd-config.tcl (i.e. no OpenACS involved). Yes, I replied previously that it 
runs fine. And also a simple OpenACS install on Oracle runs fine. The problems 
only start with our custom OpenACS package.
  *   To be on the safe side, all /usr/local/ns/bin/*.so files should be newly 
compiled. Yes, these all appear to be freshly compiled.

# ls -l /usr/local/ns/bin/*.so
-rwxr-xr-x 1 nsadmin nsadmin 32560 Aug 10 15:31 /usr/local/ns/bin/nscgi.so
-rwxr-xr-x 1 nsadmin nsadmin 27360 Aug 10 15:31 /usr/local/ns/bin/nscp.so
-rwxr-xr-x 1 nsadmin nsadmin 15808 Aug 10 15:31 /usr/local/ns/bin/nsdb.so
-rwxr-xr-x 1 nsadmin nsadmin 50808 Aug 10 15:31 /usr/local/ns/bin/nsdbpg.so
-rwxr-xr-x 1 nsadmin nsadmin 16176 Aug 10 15:31 /usr/local/ns/bin/nsdbtest.so
-rwxr-xr-x 1 nsadmin nsadmin 32640 Aug 10 15:31 /usr/local/ns/bin/nslog.so
-rwxr-xr-x 1 nsadmin nsadmin 90688 Aug 10 15:42 /usr/local/ns/bin/nsoracle.so
-rwxr-xr-x 1 nsadmin nsadmin 90848 Aug 10 15:42 /usr/local/ns/bin/nsoraclecass.so
-rwxr-xr-x 1 nsadmin nsadmin 31712 Aug 10 15:31 /usr/local/ns/bin/nsperm.so
-rwxr-xr-x 1 nsadmin nsadmin 15888 Aug 10 15:31 /usr/local/ns/bin/nsproxy.so
-rwxr-xr-x 1 nsadmin nsadmin 16536 Aug 10 15:31 /usr/local/ns/bin/nssock.so
-rwxr-xr-x 1 nsadmin nsadmin 26624 Aug 10 15:31 /usr/local/ns/bin/nsssl.so

So my next steps are to try to reproduce the problem using your install-ns.sh 
script. Then I can compile with debugging and have some fun with gdb.

thanks
Brian

________________________________
From: Gustaf Neumann <neum...@wu.ac.at>
Sent: Saturday 12 August 2023 11:55 am
To: naviserver-devel@lists.sourceforge.net 
<naviserver-devel@lists.sourceforge.net>
Subject: Re: [naviserver-devel] Crashing on all versions >4.99.24 on Ubuntu



On 11.08.23 20:15, Brian Fenton wrote:
Hi Gustaf

thanks for the response. I've been looking at this in more detail this 
afternoon and it does appear to be caused by something in the interaction of 
our OpenACS application with 4.99.27. As I previously mentioned, it has been 
running fine on 4.99.24 on the same Ubuntu version. I realise that I may not 
have been clear on this point in my previous email: this is Naviserver running 
on Ubuntu in a Docker container. The version of Naviserver is based on this 
Docker build https://github.com/oupfiz5/naviserver-s6 which I have forked and 
updated to 4.99.27 (I may well have missed something in updating the NS version; 
maybe I should have waited until oupfiz updates his build).

  *   I can confirm that nsd-config.tcl runs fine with 4.99.27
  *   Some good news: I am able to do an OpenACS clean install on Oracle with 
4.99.27. I then successfully installed our application using the APM.

did you run at this state any Oracle queries?
did you recompile in the "clean install" also the oracle driver?

  *   However, once I restart Naviserver the problems start.
  *   I tried using the openacs-config.tcl that ships with 4.99.27 and the 
problems are happening with that too.

you mean the crash happens in the plain openacs-config.tcl, with no additional 
drivers etc, no oracle involved?
This can get us closer to something I might be able to reproduce. My request in 
the last mail was to try to reproduce the problem with nsd-config.tcl (i.e. no 
OpenACS involved). If you can reproduce the crash, you should compile with 
debugging turned on and run nsd under gdb or lldb. First, one should find the 
simplest case causing the crash.


What is odd is that it seems to be able to handle one request before crashing. 
E.g. I type in the URL, it shows the /register page but then crashes. After 
restarting, I enter my login details on the register page, press return. It 
then crashes. After restarting, it successfully logs me in, then crashes again.
The memory errors are normally hinting at some buffer overflow, or a mixture 
of 32-bit and 64-bit compilation, etc.

There is no clear pattern in the logs. I thought it might be related to OCSP 
and disabled that, but the problems continued to occur.
If you suspect nsssl, then one potential problem might be a mixture of 
different OpenSSL versions during compilation (when using install-ns.sh, this 
will not happen).
Turning on debug hasn't helped - but maybe there is so much information in the 
log that I have missed something important.

What drivers are you referring to in your question?

actually all naviserver modules you are using, including the db drivers (since 
you mentioned nsoracle, which is not part of the regular regression tests). To 
be on the safe side, all /usr/local/ns/bin/*.so files should be newly compiled.
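
One way to look for mixed OpenSSL versions across the compiled modules is sketched below. This is an illustration only: the helper names filter_ssl and ssl_libs are hypothetical, the default directory is the /usr/local/ns/bin path mentioned in this thread, and it assumes ldd, awk and sort are available. It inspects only the *.so modules in that directory, not the nsd binary itself.

```shell
#!/bin/sh
# Hypothetical helper: reduce ldd output (read from stdin) to the distinct
# libssl/libcrypto libraries and the paths they resolve to.
filter_ssl() {
    awk '/libssl|libcrypto/ { print $1, $3 }' | sort -u
}

# Hypothetical helper: run ldd over every module in a directory.
ssl_libs() {
    dir=${1:-/usr/local/ns/bin}
    for so in "$dir"/*.so; do
        [ -e "$so" ] || continue   # skip the unexpanded glob if nothing matches
        ldd "$so" 2>/dev/null
    done | filter_ssl
}
```

If ssl_libs prints more than one resolved path for libssl or for libcrypto, that would suggest the modules were linked against different OpenSSL builds; a single path per library is the expected result.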


all the best

-gn

thanks
Brian

________________________________
From: Gustaf Neumann <neum...@wu.ac.at>
Sent: Thursday 10 August 2023 7:27 pm
To: naviserver-devel@lists.sourceforge.net 
<naviserver-devel@lists.sourceforge.net>
Subject: Re: [naviserver-devel] Crashing on all versions >4.99.24 on Ubuntu


Hi Brian,


The new NaviServer versions are running fine on Ubuntu 22.04. Have you 
recompiled the drivers you are using with the updated version?


A good test for the NaviServer binary is to test it with one of the packaged 
configuration files, e.g. nsd-config.tcl.


all the best

-gn


On 10.08.23 18:23, Brian Fenton wrote:
Hello

we have been testing out our OpenACS application on Ubuntu 22.04.2 LTS 
(previously we only ran on Windows). It was working great with Naviserver 
4.99.24 but I have been getting constant crashes on more recent versions.

I get this error on 4.99.25, 4.99.26 and today I also got it on 4.99.27. The 
server runs fine until I click on a page, then it immediately crashes.
The log has only the following error:
free(): invalid size

and today I got this one:
[10/Aug/2023:15:02:23][303.7fa3a64ee640][-conn:openacs:default:1:119-] Fatal: 
received fatal signal 11

We have an Oracle application and are using the latest nsoracle driver, which 
might be a factor here.
We have been running it with a pretty old OpenACS config file, so I am 
currently looking to merge in all the latest changes to ensure that is not an 
issue.
Also note that I am running Naviserver on Docker on Windows, but as mentioned 
it was running great on 4.99.24.

thanks for any help
Brian





_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel


--
Univ.Prof. Dr. Gustaf Neumann
Head of the Institute of Information Systems and New Media
of Vienna University of Economics and Business
Program Director of MSc "Information Systems"