Hi, I reproduced the problem using the install-ns.sh script and running under gdb. Here's the output of backtrace and bt full. I'm new to gdb, so please let me know if you'd like to see any other info.
[15/Aug/2023:13:56:52][13147.7fffe35fe640][-driver:nsssl:0-] Notice: ... sockAccept accepted 2 connections
free(): invalid next size (fast)

Thread 4 "nsd" received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffff4ad8640 (LWP 13651)]
__pthread_kill_implementation (no_tid=0, signo=6, threadid=140737298400832) at ./nptl/pthread_kill.c:44
44 ./nptl/pthread_kill.c: No such file or directory.
(gdb) backtrace
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737016493632) at ./nptl/pthread_kill.c:44
#1 __pthread_kill_internal (signo=6, threadid=140737016493632) at ./nptl/pthread_kill.c:78
#2 __GI___pthread_kill (threadid=140737016493632, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3 0x00007ffff7c7d476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4 0x00007ffff7c637f3 in __GI_abort () at ./stdlib/abort.c:79
#5 0x00007ffff7cc46f6 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7e16b8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#6 0x00007ffff7cdbd7c in malloc_printerr (str=str@entry=0x7ffff7e19230 "munmap_chunk(): invalid pointer") at ./malloc/malloc.c:5664
#7 0x00007ffff7cdc05c in munmap_chunk (p=<optimized out>) at ./malloc/malloc.c:3060
#8 0x00007ffff7ce051a in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3381
#9 0x00007ffff7bdb1e5 in ns_free (ptr=0x7fffd4de0ba0) at memory.c:94
#10 0x00007ffff7f09b64 in Ns_SetFree (set=0x7fffd5886210) at set.c:397
#11 0x00007ffff7f3e119 in NsTclSetObjCmd (clientData=0x7fffd403d590, interp=0x7fffd4005250, objc=2, objv=0x7fffd453a510) at tclset.c:330
#12 0x00007ffff79cb18e in Dispatch (data=0x7fffd410e3b8, interp=0x7fffd4005250, result=0) at /usr/local/src/tcl8.6.13/generic/tclBasic.c:4467
#13 0x00007ffff79cb21f in TclNRRunCallbacks (interp=0x7fffd4005250, result=0, rootPtr=0x0) at /usr/local/src/tcl8.6.13/generic/tclBasic.c:4503
#14 0x00007ffff79ca949 in Tcl_EvalObjv (interp=0x7fffd4005250, objc=1, objv=0x7fffd453a2b0, flags=2097168) at /usr/local/src/tcl8.6.13/generic/tclBasic.c:4226
#15 0x00007ffff79cd384 in TclEvalEx (interp=0x7fffd4005250, script=0x7fffe3dfe880 "ns_cleanup", numBytes=10, flags=0, line=1, clNextOuter=0x0, outerScript=0x7fffe3dfe880 "ns_cleanup") at /usr/local/src/tcl8.6.13/generic/tclBasic.c:5372
#16 0x00007ffff79cc5d9 in Tcl_EvalEx (interp=0x7fffd4005250, script=0x7fffe3dfe880 "ns_cleanup", numBytes=10, flags=0) at /usr/local/src/tcl8.6.13/generic/tclBasic.c:5037
#17 0x00007ffff7f18c02 in Ns_TclEvalCallback (interp=0x7fffd4005250, cbPtr=0x5555556a1b30, resultDString=0x0) at tclcallbacks.c:186
#18 0x00007ffff7f29764 in NsTclTraceProc (interp=0x7fffd4005250, arg=0x5555556a1b30) at tclinit.c:1913
#19 0x00007ffff7f2a158 in RunTraces (itPtr=0x7fffd403d590, why=NS_TCL_TRACE_DEALLOCATE) at tclinit.c:2375
#20 0x00007ffff7f29976 in PushInterp (itPtr=0x7fffd403d590) at tclinit.c:2026
#21 0x00007ffff7f29717 in NsFreeConnInterp (connPtr=0x55555562ebd0) at tclinit.c:1885
#22 0x00007ffff7efdf11 in ConnRun (connPtr=0x55555562ebd0) at queue.c:2648
#23 0x00007ffff7efd0de in NsConnThread (arg=0x555555649030) at queue.c:2211
#24 0x00007ffff7bdd734 in NsThreadMain (arg=0x55555855cdc0) at thread.c:232
#25 0x00007ffff7bdf6f5 in ThreadMain (arg=0x55555855cdc0) at pthread.c:870
#26 0x00007ffff7ccfb43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#27 0x00007ffff7d61a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
(gdb) bt full
#0 __pthread_kill_implementation (no_tid=0, signo=6, threadid=140737016493632) at ./nptl/pthread_kill.c:44
tid = <optimized out> ret = 0 pd = 0x7fffe3dff640 old_mask = {__val = {140737016487840, 140736755639472, 3823099840, 512, 140737016487920, 140737348535152, 140736755639488, 140736750178896, 140736755638224, 140733193388032, 140736750163408, 140736755639472, 140736755639432, 140736752725904, 93825010219632, 93825010219632}} ret = <optimized out> pd = <optimized out> old_mask = <optimized out> ret = <optimized out> tid = <optimized out>
ret = <optimized out> resultvar = <optimized out> resultvar = <optimized out> __arg3 = <optimized out> __arg2 = <optimized out> __arg1 = <optimized out> _a3 = <optimized out> _a2 = <optimized out> _a1 = <optimized out> __futex = <optimized out> resultvar = <optimized out> __arg3 = <optimized out> __arg2 = <optimized out> __arg1 = <optimized out> _a3 = <optimized out> _a2 = <optimized out> _a1 = <optimized out> __futex = <optimized out> __private = <optimized out> __oldval = <optimized out> result = <optimized out> #1 __pthread_kill_internal (signo=6, threadid=140737016493632) at ./nptl/pthread_kill.c:78 No locals. #2 __GI___pthread_kill (threadid=140737016493632, signo=signo@entry=6) at ./nptl/pthread_kill.c:89 No locals. #3 0x00007ffff7c7d476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 ret = <optimized out> #4 0x00007ffff7c637f3 in __GI_abort () at ./stdlib/abort.c:79 save_stage = 1 act = {__sigaction_handler = {sa_handler = 0x600000004, sa_sigaction = 0x600000004}, sa_mask = {__val = {140736789161264, 140733193388042, 140737347688968, 140737016488736, 279037356156, 18446744073709551615, 140736755314240, 140737016488272, 140733193388033, 140736792190256, 140736790551568, 0, 140736755639936, 93825035611088, 140736753589312, 140736756049120}}, sa_flags = 1487610384, sa_restorer = 0x1} --Type <RET> for more, q to quit, c to continue without paging-- sigs = {__val = {32, 140737350793296, 140737488347040, 140737350862035, 93824993127520, 140736755639576, 8589934656, 93825010219632, 25769803776, 193273528320, 140737016488160, 140737349119905, 3823100240, 4294967296, 2202846355952, 3556773632}} #5 0x00007ffff7cc46f6 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7e16b8c "%s\n") at ../sysdeps/posix/libc_fatal.c:155 ap = {{gp_offset = 24, fp_offset = 0, overflow_arg_area = 0x7fffe3dfe2a0, reg_save_area = 0x7fffe3dfe230}} fd = <optimized out> list = <optimized out> nlist = <optimized out> cp = <optimized out> #6 0x00007ffff7cdbd7c 
in malloc_printerr (str=str@entry=0x7ffff7e19230 "munmap_chunk(): invalid pointer") at ./malloc/malloc.c:5664 No locals. #7 0x00007ffff7cdc05c in munmap_chunk (p=<optimized out>) at ./malloc/malloc.c:3060 pagesize = <optimized out> size = <optimized out> __PRETTY_FUNCTION__ = "munmap_chunk" mem = <optimized out> block = <optimized out> total_size = <optimized out> #8 0x00007ffff7ce051a in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3381 ar_ptr = <optimized out> p = <optimized out> err = 25 #9 0x00007ffff7bdb1e5 in ns_free (ptr=0x7fffd4de0ba0) at memory.c:94 No locals. #10 0x00007ffff7f09b64 in Ns_SetFree (set=0x7fffd5886210) at set.c:397 i = 10 __PRETTY_FUNCTION__ = "Ns_SetFree" #11 0x00007ffff7f3e119 in NsTclSetObjCmd (clientData=0x7fffd403d590, interp=0x7fffd4005250, objc=2, objv=0x7fffd453a510) at tclset.c:330 key = 0x7fffd464eb50 "d8" itPtr = 0x7fffd403d590 set = 0x7fffd5886210 ds = {string = 0x7fffd6650c80 "%", length = -738176432, spaceAvl = 32767, staticSpace = "\320\344\337\343\377\177\000\000\240\236t\336\377\177\000\000\260\356\004\324\377\177\000\000PZ\000\324\377\177\000\000\000\345\337\343\377\177\000\000\312Ĝ\367\377\177\000\000\360\357Y\324\377\177\000\000\000\000\000\000\000\000\000\000\200\fe\326\377\177\000\000PR\000\324\377\177\000\000\000\000\000\000\001\000\000\000PR\000\324\377\177\000\000PZ\000\324\377\177\000\000\260\356\004\324\377\177\000\000\300\345\337\343\377\177\000\000Э\234\367\377\177\000\000`\345\337\343\377\177\000\000p\203\252\367\000\000\000\000PR\000\324\377\177\000\000H\003Z\324\377\177\000\000\000\017\000\324\377\177\000\000\000\000\000\000\020\000 \000\002\000\000\000\377\177\000\000\260\356\004\324\377\177\000\000\000\000\000\000\000\000\000"} tablePtr = 0x7fffd403d760 hPtr = 0x7fffd464eb30 search = {tablePtr = 0x7fffd403d760, nextIndex = 13, nextEntryPtr = 0x0} opt = 1 result = 0 opts = {0x7ffff7f89745 "array", 0x7ffff7f8974b "cleanup", 0x7ffff7f89753 "copy", 0x7ffff7f89758 "cput", 0x7ffff7f8975d "create", 
0x7ffff7f89764 "delete", 0x7ffff7f8976b "delkey", 0x7ffff7f89772 "find", 0x7ffff7f89777 "free", 0x7ffff7f8977c "get", 0x7ffff7f89780 "icput", 0x7ffff7f89786 "idelkey", 0x7ffff7f8978e "ifind", 0x7ffff7f89794 "iget", 0x7ffff7f89799 "imerge", 0x7ffff7f897a0 "isnull", 0x7ffff7f897a7 "iunique", 0x7ffff7f897af "iupdate", 0x7ffff7f897b7 "key", 0x7ffff7f897bb "keys", 0x7ffff7f897c0 "list", 0x7ffff7f897c5 "merge", 0x7ffff7f897cb "move", 0x7ffff7f897d0 "name", 0x7ffff7f897d5 "new", 0x7ffff7f897d9 "print", 0x7ffff7f897df "put", 0x7ffff7f897e3 "size", 0x7ffff7f897e8 "split", 0x7ffff7f897ee "truncate", 0x7ffff7f897f7 "unique", 0x7ffff7f897fe "update", 0x7ffff7f89805 "value", 0x7ffff7f8980b "values", 0x0} SArrayIdx = SArrayIdx SCleanupIdx = SCleanupIdx SCopyIdx = SCopyIdx SCPutIdx = SCPutIdx SCreateidx = SCreateidx SDeleteIdx = SDeleteIdx SDelkeyIdx = SDelkeyIdx SFindIdx = SFindIdx SFreeIdx = SFreeIdx SGetIdx = SGetIdx SICPutIdx = SICPutIdx SIDelkeyIdx = SIDelkeyIdx SIFindIdx = SIFindIdx SIGetIdx = SIGetIdx SIMergeIdx = SIMergeIdx SIsNullIdx = SIsNullIdx SIUniqueIdx = SIUniqueIdx SIUpdateIdx = SIUpdateIdx SKeyIdx = SKeyIdx SKeysIdx = SKeysIdx SListIdx = SListIdx SMergeIdx = SMergeIdx SMoveIdx = SMoveIdx sINameIdx = sINameIdx SNewIdx = SNewIdx SPrintIdx = SPrintIdx SPutIdx = SPutIdx SSizeIdx = SSizeIdx SSplitIdx = SSplitIdx STruncateIdx = STruncateIdx SUniqueIdx = SUniqueIdx SUpdateIdx = SUpdateIdx SValueIdx = SValueIdx SValuesIdx = SValuesIdx __PRETTY_FUNCTION__ = "NsTclSetObjCmd" #12 0x00007ffff79cb18e in Dispatch (data=0x7fffd410e3b8, interp=0x7fffd4005250, result=0) at /usr/local/src/tcl8.6.13/generic/tclBasic.c:4467 objProc = 0x7ffff7f3df2d <NsTclSetObjCmd> clientData = 0x7fffd403d590 objc = 2 objv = 0x7fffd453a510 iPtr = 0x7fffd4005250 #13 0x00007ffff79cb21f in TclNRRunCallbacks (interp=0x7fffd4005250, result=0, rootPtr=0x0) at /usr/local/src/tcl8.6.13/generic/tclBasic.c:4503 callbackPtr = 0x7fffd410e3b0 procPtr = 0x7ffff79cb10e <Dispatch> iPtr = 0x7fffd4005250 #14 
0x00007ffff79ca949 in Tcl_EvalObjv (interp=0x7fffd4005250, objc=1, objv=0x7fffd453a2b0, flags=2097168) at /usr/local/src/tcl8.6.13/generic/tclBasic.c:4226 result = 0 rootPtr = 0x0 #15 0x00007ffff79cd384 in TclEvalEx (interp=0x7fffd4005250, script=0x7fffe3dfe880 "ns_cleanup", numBytes=10, flags=0, line=1, clNextOuter=0x0, outerScript=0x7fffe3dfe880 "ns_cleanup") at /usr/local/src/tcl8.6.13/generic/tclBasic.c:5372 wordLine = 1 wordCLNext = 0x0 objectsNeeded = 1 wordStart = 0x7fffe3dfe880 "ns_cleanup" numWords = 1 iPtr = 0x7fffd4005250 p = 0x7fffe3dfe880 "ns_cleanup" next = 0x1e3dfe820 <error: Cannot access memory at address 0x1e3dfe820> minObjs = 20 objv = 0x7fffd453a2b0 objvSpace = 0x7fffd453a2b0 expand = 0x7fffd453a360 lines = 0x7fffd453a3c0 lineSpace = 0x7fffd453a3c0 tokenPtr = 0x7fffd453a090 commandLength = 32767 bytesLeft = 10 expandRequested = 0 code = 0 savedVarFramePtr = 0x7fffd4001550 allowExceptions = 0 gotParse = 1 i = 3823101680 objectsUsed = 1 parsePtr = 0x7fffd453a000 eeFramePtr = 0x7fffd453a250 stackObjArray = 0x7fffd453a2b0 expandStack = 0x7fffd453a360 linesStack = 0x7fffd453a3c0 clNext = 0x0 #16 0x00007ffff79cc5d9 in Tcl_EvalEx (interp=0x7fffd4005250, script=0x7fffe3dfe880 "ns_cleanup", numBytes=10, flags=0) at /usr/local/src/tcl8.6.13/generic/tclBasic.c:5037 No locals. 
#17 0x00007ffff7f18c02 in Ns_TclEvalCallback (interp=0x7fffd4005250, cbPtr=0x5555556a1b30, resultDString=0x0) at tclcallbacks.c:186 arg = 0x0 ii = 0 ap = {{gp_offset = 32, fp_offset = 48, overflow_arg_area = 0x7fffe3dfea10, reg_save_area = 0x7fffe3dfe950}} ds = {string = 0x7fffe3dfe880 "ns_cleanup", length = 10, spaceAvl = 200, staticSpace = "ns_cleanup\000\367\377\177\000\000\300\350\337\343\377\177\000\000P\351\337\343\377\177\000\000\210\277jUUU\000\000@\351\337\343\377\177\000\000\000\351\337\343\377\177\000\000`\354bU\001\001\001\000\340\350\337\343\377\177\000\000\360\350\337\343\377\177\000\000\020\351\337\343\377\177\000\000\270\277jUUU\000\000\020\351\337\343\377\177\000\000\332\356\275\367\377\177\000\000\223z\333d\000\000\000\000 \300jUUU\000\000\000\000\000\000\000\000\000\000\270\277jU\005\000\000\000\220\351\337\343\377\177\000\000\023\275\275\367\377\177\000\000\060\352\337\343\377\177\000\000¢\362\367\377\177\000\000\223z\333d\000\000\000\000P\340\025\324\b\000\000\000\220\033jUUU\000"} deallocInterp = false status = 1 __PRETTY_FUNCTION__ = "Ns_TclEvalCallback" #18 0x00007ffff7f29764 in NsTclTraceProc (interp=0x7fffd4005250, arg=0x5555556a1b30) at tclinit.c:1913 cbPtr = 0x5555556a1b30 result = 0 #19 0x00007ffff7f2a158 in RunTraces (itPtr=0x7fffd403d590, why=NS_TCL_TRACE_DEALLOCATE) at tclinit.c:2375 tracePtr = 0x5555556a1b90 servPtr = 0x555555628560 __PRETTY_FUNCTION__ = "RunTraces" #20 0x00007ffff7f29976 in PushInterp (itPtr=0x7fffd403d590) at tclinit.c:2026 interp = 0x7fffd4005250 ok = true __PRETTY_FUNCTION__ = "PushInterp" #21 0x00007ffff7f29717 in NsFreeConnInterp (connPtr=0x55555562ebd0) at tclinit.c:1885 itPtr = 0x7fffd403d590 #22 0x00007ffff7efdf11 in ConnRun (connPtr=0x55555562ebd0) at queue.c:2648 sockPtr = 0x7fffd98f68a0 conn = 0x55555562ebd0 servPtr = 0x555555628560 status = NS_OK auth = 0x0 __PRETTY_FUNCTION__ = "ConnRun" #23 0x00007ffff7efd0de in NsConnThread (arg=0x555555649030) at queue.c:2211 argPtr = 0x555555649030 poolPtr = 
0x55555562d7c0 servPtr = 0x555555628560 connPtr = 0x55555562ebd0 wait = {sec = 1692105481, usec = 312006} timePtr = 0x7fffe3dfec20 threadId = 1 duringShutdown = 219 fromQueue = true cpt = 1000 ncons = 996 current = 2 status = NS_OK timeout = {sec = 120, usec = 0} exitMsg = 0x7fffd4000b70 "" joinThread = 0x7fffe3dff640 threadsLockPtr = 0x55555562d830 tqueueLockPtr = 0x55555562d878 wqueueLockPtr = 0x55555562d808 __PRETTY_FUNCTION__ = "NsConnThread"
#24 0x00007ffff7bdd734 in NsThreadMain (arg=0x55555855cdc0) at thread.c:232
thrPtr = 0x55555855cdc0
#25 0x00007ffff7bdf6f5 in ThreadMain (arg=0x55555855cdc0) at pthread.c:870
No locals.
#26 0x00007ffff7ccfb43 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
ret = <optimized out> pd = <optimized out> out = <optimized out> unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140737488346688, -3886469656811452993, 140737016493632, 0, 140737350793296, 140737488347040, 3886531503754790335, 3886487635365545407}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}} not_first_call = <optimized out>
#27 0x00007ffff7d61a00 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
No locals.

thanks
Brian

________________________________
From: Brian Fenton <brian.fen...@aimssoftware.ie>
Sent: Monday 14 August 2023 5:40 pm
To: naviserver-devel@lists.sourceforge.net <naviserver-devel@lists.sourceforge.net>
Subject: Re: [naviserver-devel] Crashing on all versions >4.99.24 on Ubuntu

Hi Gustaf,

thanks again for the advice. Today I made some more progress on this. There do appear to be some differences between your script and the Oupfiz5 installer, e.g. his ns-build.sh script https://github.com/oupfiz5/tcl-build/blob/master/src/builds/ns-build.sh

I have reached the conclusion that I will be wasting your time if I can't reproduce this problem using your scripts, so my next task will be to run your script and try to reproduce.
I am now seeing the downsides to using a non-official Docker approach!

Today I took the approach of installing (through the APM) our OpenACS packages one by one. For example, we use packages such as Categories, General Comments etc as well as many of our own custom packages. After each package I bounced Naviserver and tested the site. The system worked perfectly until after I installed the last package, which is the main core of our product, very large and old with a lot of features. This makes me very confident that Oracle and nsoracle are working fine. The problem could be some API call in our custom package that maybe changed in 4.99.25.

To answer some of your questions:

* did you run at this state any Oracle queries?
Yes, I did. I'm 95% confident that Oracle and nsoracle are working fine.

* did you recompile in the "clean install" also the oracle driver?
Yes, I'm building nsoracle from scratch (I am also running the same version of nsoracle in the 4.99.24 build that is working without issue).

* you mean the crash happens in the plain openacs-config.tcl, with no additional drivers etc, no oracle involved?
No, this does use Oracle, sorry for not being clear. We have our own heavily modified config file, so I wanted to rule that out by using the openacs-config.tcl that you provide. I just changed the database to Oracle and left everything else as is. The fact that it crashed too means that I can eliminate some strange configuration setting in our custom config file as a possible cause.

* My request in the last mail was to try to reproduce the problem with nsd-config.tcl (i.e. no OpenACS involved).
Yes, I replied previously that it runs fine. And also a simple OpenACS install on Oracle runs fine. The problems only start with our custom OpenACS package.

* To be on the safe side, all /usr/local/ns/bin/*.so files should be newly compiled.
Yes, these all appear to be freshly compiled.
# ls -l /usr/local/ns/bin/*.so
-rwxr-xr-x 1 nsadmin nsadmin 32560 Aug 10 15:31 /usr/local/ns/bin/nscgi.so
-rwxr-xr-x 1 nsadmin nsadmin 27360 Aug 10 15:31 /usr/local/ns/bin/nscp.so
-rwxr-xr-x 1 nsadmin nsadmin 15808 Aug 10 15:31 /usr/local/ns/bin/nsdb.so
-rwxr-xr-x 1 nsadmin nsadmin 50808 Aug 10 15:31 /usr/local/ns/bin/nsdbpg.so
-rwxr-xr-x 1 nsadmin nsadmin 16176 Aug 10 15:31 /usr/local/ns/bin/nsdbtest.so
-rwxr-xr-x 1 nsadmin nsadmin 32640 Aug 10 15:31 /usr/local/ns/bin/nslog.so
-rwxr-xr-x 1 nsadmin nsadmin 90688 Aug 10 15:42 /usr/local/ns/bin/nsoracle.so
-rwxr-xr-x 1 nsadmin nsadmin 90848 Aug 10 15:42 /usr/local/ns/bin/nsoraclecass.so
-rwxr-xr-x 1 nsadmin nsadmin 31712 Aug 10 15:31 /usr/local/ns/bin/nsperm.so
-rwxr-xr-x 1 nsadmin nsadmin 15888 Aug 10 15:31 /usr/local/ns/bin/nsproxy.so
-rwxr-xr-x 1 nsadmin nsadmin 16536 Aug 10 15:31 /usr/local/ns/bin/nssock.so
-rwxr-xr-x 1 nsadmin nsadmin 26624 Aug 10 15:31 /usr/local/ns/bin/nsssl.so

So my next steps are to try to reproduce the problem using your install-ns.sh script. Then I can compile with debugging and have some fun with gdb.

thanks
Brian

________________________________
From: Gustaf Neumann <neum...@wu.ac.at>
Sent: Saturday 12 August 2023 11:55 am
To: naviserver-devel@lists.sourceforge.net <naviserver-devel@lists.sourceforge.net>
Subject: Re: [naviserver-devel] Crashing on all versions >4.99.24 on Ubuntu

On 11.08.23 20:15, Brian Fenton wrote:

Hi Gustaf

thanks for the response. I've been looking at this in more detail this afternoon and it does appear to be caused by something in the interaction of our OpenACS application with 4.99.27. As I previously mentioned, it has been running fine on 4.99.24 on the same Ubuntu version.

I realise that I may not have been clear on this point in my previous email: this is Naviserver running on Ubuntu in a Docker container.
The version of Naviserver is based on this Docker build https://github.com/oupfiz5/naviserver-s6 which I have forked and updated to 4.99.27 (I may well have missed something in updating NS version - maybe I should have waited until oupfiz updates his build).

* I can confirm that nsd-config.tcl runs fine with 4.99.27
* Some good news: I am able to do an OpenACS clean install on Oracle with 4.99.27. I then successfully installed our application using the APM.

did you run at this state any Oracle queries?
did you recompile in the "clean install" also the oracle driver?

* However, once I restart Naviserver the problems start.
* I tried using the openacs-config.tcl that ships with 4.99.27 and the problems are happening with that too.

you mean the crash happens in the plain openacs-config.tcl, with no additional drivers etc, no oracle involved? this can get us closer to something i might be able to reproduce. My request in the last mail was to try to reproduce the problem with nsd-config.tcl (i.e. no OpenACS involved). If you can reproduce the crash, you should compile with debugging turned on and run nsd under gdb or lldb. First one should get the simplest case causing the crash.

What is odd is that it seems to be able to handle one request before crashing. E.g. I type in the URL, it shows the /register page but then crashes. After restarting, I enter my login details on the register page, press return. It then crashes. After restarting, it successfully logs me in, then crashes again.

the memory errors are normally hinting at some buffer overflow, or a mixture between 32bit and 64bit compilation, etc.

There is no clear pattern in the logs. I thought it might be related to OCSP and disabled that, but the problems continued to occur.

if you suspect nsssl, then one potential problem might be a mixture of different OpenSSL versions during compilation (when using install-ns.sh, this will not happen).
Turning on debug hasn't helped - but maybe there is so much information in the log that I have missed something important. What drivers are you referring to in your question?

actually all naviserver modules you are using, including the db drivers (since you mentioned nsoracle, which is not part of the regular regression tests). To be on the safe side, all /usr/local/ns/bin/*.so files should be newly compiled.

all the best
-gn

thanks
Brian

________________________________
From: Gustaf Neumann <neum...@wu.ac.at>
Sent: Thursday 10 August 2023 7:27 pm
To: naviserver-devel@lists.sourceforge.net <naviserver-devel@lists.sourceforge.net>
Subject: Re: [naviserver-devel] Crashing on all versions >4.99.24 on Ubuntu

Hi Brian,

The new NaviServer versions are running fine on Ubuntu 22.04. Have you recompiled the drivers you are using with the updated version? A good test for the NaviServer binary is to test it with one of the packaged configuration files, e.g. nsd-config.tcl.

all the best
-gn

On 10.08.23 18:23, Brian Fenton wrote:

Hello

we have been testing out our OpenACS application on Ubuntu 22.04.2 LTS (previously we only ran on Windows). It was working great with Naviserver 4.99.24 but I have been getting constant crashes on more recent versions. I get this error on 4.99.25, 4.99.26 and today I also got it on 4.99.27. The server runs fine until I click on a page, then it immediately crashes. The log has only the following error:

free(): invalid size

and today I got this one:

[10/Aug/2023:15:02:23][303.7fa3a64ee640][-conn:openacs:default:1:119-] Fatal: received fatal signal 11

We have an Oracle application and are using the latest nsoracle driver, which might be a factor here. We have been running it with a pretty old OpenACS config file, so I am currently looking to merge in all the latest changes to ensure that is not an issue.
Also note that I am running Naviserver on Docker on Windows, but as mentioned it was running great on 4.99.24.

thanks for any help
Brian

_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel

--
Univ.Prof. Dr. Gustaf Neumann
Head of the Institute of Information Systems and New Media
of Vienna University of Economics and Business
Program Director of MSc "Information Systems"
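
For readers following the backtrace posted at the top of this thread, here is a minimal sketch of how one might pick out the first frame that is not inside glibc. It is only an illustration: the names `Frame`, `parse_frames` and `first_app_frame` are hypothetical, and the rule "source paths starting with ./ or ../ belong to glibc" is an assumption that happens to fit this particular trace, not a general property of gdb output. The sample frames are copied from the backtrace in this thread.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Frame(NamedTuple):
    index: int
    function: str
    source: str
    line: int


# Matches gdb frame lines such as
#   #9 0x00007ffff7bdb1e5 in ns_free (ptr=0x7fffd4de0ba0) at memory.c:94
# The address part is optional because innermost frames may omit it.
FRAME_RE = re.compile(
    r"#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?([\w.]+) \(.*?\) at (\S+):(\d+)"
)


def parse_frames(text: str) -> List[Frame]:
    """Extract (index, function, source file, line) from backtrace text."""
    return [
        Frame(int(idx), func, src, int(line))
        for idx, func, src, line in FRAME_RE.findall(text)
    ]


def first_app_frame(
    frames: List[Frame], skip_prefixes: Tuple[str, ...] = ("./", "../")
) -> Optional[Frame]:
    """Return the first frame whose source path lacks a skipped prefix.

    Assumption: in this trace the glibc frames carry relative paths
    (./malloc/..., ../sysdeps/...), so skipping them leaves server code.
    """
    for frame in frames:
        if not frame.source.startswith(skip_prefixes):
            return frame
    return None


# Frames #6 to #10, copied from the backtrace in this thread.
SAMPLE = (
    '#6 0x00007ffff7cdbd7c in malloc_printerr (str=str@entry=0x7ffff7e19230 '
    '"munmap_chunk(): invalid pointer") at ./malloc/malloc.c:5664 '
    '#7 0x00007ffff7cdc05c in munmap_chunk (p=<optimized out>) at ./malloc/malloc.c:3060 '
    '#8 0x00007ffff7ce051a in __GI___libc_free (mem=<optimized out>) at ./malloc/malloc.c:3381 '
    '#9 0x00007ffff7bdb1e5 in ns_free (ptr=0x7fffd4de0ba0) at memory.c:94 '
    '#10 0x00007ffff7f09b64 in Ns_SetFree (set=0x7fffd5886210) at set.c:397 '
)

if __name__ == "__main__":
    print(first_app_frame(parse_frames(SAMPLE)))
```

On the sample, the first non-glibc frame is `ns_free` at memory.c:94, called from `Ns_SetFree` at set.c:397. Note that with heap corruption the frame where the abort fires is often not where the damage was done, so this only narrows down where to start looking.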
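
The advice in this thread that all /usr/local/ns/bin/*.so files should be newly compiled was checked by eye with `ls -l`. The same check could be sketched as a small script that lists modules older than a reference file. This is an illustration only: `stale_modules` is a hypothetical helper, the choice of the `nsd` binary in the same directory as the reference is an assumption, and a modification time is just a rough proxy that cannot prove a module was built against matching headers or libraries.

```python
from pathlib import Path
from typing import List, Union

PathLike = Union[str, Path]


def stale_modules(bin_dir: PathLike, reference: PathLike) -> List[str]:
    """Return names of *.so files in bin_dir modified before `reference`."""
    ref_mtime = Path(reference).stat().st_mtime
    return sorted(
        module.name
        for module in Path(bin_dir).glob("*.so")
        if module.stat().st_mtime < ref_mtime
    )


if __name__ == "__main__":
    # Assumed layout: modules and the nsd binary both live in this directory.
    bin_dir = Path("/usr/local/ns/bin")
    nsd = bin_dir / "nsd"
    if nsd.exists():
        stale = stale_modules(bin_dir, nsd)
        print(stale if stale else "no modules older than nsd")
    else:
        print("nsd binary not found at the assumed path")
```

An empty result only says that no module predates the reference file; a driver built later against different OpenSSL or Tcl headers would still pass this check.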