Dear all,
This problem is now fixed on bitbucket. It occurred when one
thread freed an nsv-array while the internal representation of a
Tcl_Obj for this array in another thread still contained a pointer
to the (freed) array. It seems that whole nsv-arrays are not
freed very often in applications.
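As an aside on the first core dump quoted further down: the mutex address passed to Ns_MutexLock there (0x206c617665757274) consists entirely of printable ASCII bytes, which would be consistent with the freed array memory having been reused for string data. A minimal sketch for checking such values follows. DecodeWordAsAscii is a hypothetical helper, not a NaviServer function, and it assumes the little-endian byte order of x86_64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of NaviServer): render the 8 bytes of a
 * 64-bit word in little-endian memory order, as they would sit in memory
 * on x86_64. Non-printable bytes are shown as '.'.
 * "out" must provide room for 8 characters plus the terminating NUL.
 */
void
DecodeWordAsAscii(uint64_t word, char out[9])
{
    size_t i;

    assert(out != NULL);

    for (i = 0; i < 8; i++) {
        /* Byte i in memory is bits 8*i .. 8*i+7 of the word. */
        unsigned char c = (unsigned char)((word >> (8 * i)) & 0xffu);

        out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '.';
    }
    out[8] = '\0';
}
```

Called with 0x206c617665757274, this yields "trueval ". The typePtr value 0x6c757365722d207d from the second dump yields "} -resul". Both look like fragments of script text rather than pointers.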
The bug was introduced many years ago, when the code started to use
TclSetOpaqueObj() for Array structures. It did no harm for a
long time, since the arrays were never freed, which caused
a memory leak. The problem became acute with a change of mine
in Nov 2014 that fixed this memory leak of unset nsv-array structures.
After the change, we now use less aggressive caching and
store just the bucket pointer in the internal representation of
the Tcl_Obj. To get full caching of the array as before,
the best approach would probably be to introduce a new
Tcl_Obj type which uses an epoch on the bucket to
validate the array structure.
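The epoch idea could look roughly like the sketch below. This is only an illustration under stated assumptions, not the committed fix: all type and function names (Bucket, Array, CachedRef, CacheArray, GetCachedArray, NoteArrayFreed) are hypothetical, buckets are assumed never to be freed while the server runs, and in real code every function shown would have to be called with the bucket lock held.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; they only mirror the idea described in the text. */
typedef struct Bucket {
    unsigned long epoch;      /* bumped whenever an array in this bucket is freed */
} Bucket;

typedef struct Array {
    Bucket *bucketPtr;        /* bucket that owns this array */
} Array;

/* What the internal representation of the cached object would carry. */
typedef struct CachedRef {
    Bucket       *bucketPtr;  /* assumed to stay valid for the server lifetime */
    Array        *arrayPtr;   /* trusted only while the epochs match */
    unsigned long epoch;      /* bucket epoch at the time of caching */
} CachedRef;

/* Remember the array together with the current epoch of its bucket. */
void
CacheArray(CachedRef *ref, Array *arrayPtr)
{
    assert(ref != NULL);
    assert(arrayPtr != NULL && arrayPtr->bucketPtr != NULL);

    ref->bucketPtr = arrayPtr->bucketPtr;
    ref->arrayPtr  = arrayPtr;
    ref->epoch     = arrayPtr->bucketPtr->epoch;
}

/*
 * Return the cached array if no array was freed in the bucket since
 * caching; otherwise return NULL so the caller falls back to a lookup
 * by name.
 */
Array *
GetCachedArray(const CachedRef *ref)
{
    assert(ref != NULL);

    if (ref->bucketPtr != NULL
        && ref->arrayPtr != NULL
        && ref->epoch == ref->bucketPtr->epoch) {
        return ref->arrayPtr;
    }
    return NULL;
}

/* To be called by whoever frees an array of this bucket. */
void
NoteArrayFreed(Bucket *bucketPtr)
{
    assert(bucketPtr != NULL);
    bucketPtr->epoch++;
}
```

The trade-off in this sketch is that freeing any array invalidates all cached references into the same bucket, which seems acceptable if whole arrays are freed rarely.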
For the time being it is more important to get a robust version out.
There are two more fixes already committed on bitbucket, which
were flagged by the testing of Wolfgang Winkler (many thanks!),
so I think we should treat 4.99.7 as a pre-release of 4.99.8,
which we could release next week or so.
all the best
-gustaf neumann
On 02.03.15 at 14:06, David Osborne wrote:
Thanks Gustaf.
I've overwritten the original core dump I sent to you, but this is the
equivalent info from a new core (this was a seg fault this time, but it
appears to be at exactly the same location). The arrayObj is
0x2b45a40084e0 in this case.
Does any of this help further?
PS. This does not happen every time. It's intermittent, maybe 50% of
the times I run "make test".
(gdb) bt
#0 Ns_MutexLock (mutex=0x2b4500001004) at mutex.c:239
#1 0x00002b45989bf9ed in LockArrayObj
(interp=interp@entry=0x2b45a4030ea0, arrayObj=0x2b45a40084e0,
create=create@entry=0) at tclvar.c:1265
#2 0x00002b45989bef7e in NsTclNsvArrayObjCmd
(UNUSED_clientData=<optimized out>, interp=0x2b45a4030ea0, objc=3,
objv=0x2b45a40571b8) at tclvar.c:669
#3 0x00002b4599288e59 in TclEvalObjvInternal () from
/usr/lib/x86_64-linux-gnu/libtcl8.5.so.0
#4 0x00002b45992cf95e in TclExecuteByteCode () from
/usr/lib/x86_64-linux-gnu/libtcl8.5.so.0
#5 0x00002b4599312ce9 in TclObjInterpProcCore () from
/usr/lib/x86_64-linux-gnu/libtcl8.5.so.0
#6 0x00002b4599288e59 in TclEvalObjvInternal () from
/usr/lib/x86_64-linux-gnu/libtcl8.5.so.0
#7 0x00002b4599289b29 in TclEvalEx () from
/usr/lib/x86_64-linux-gnu/libtcl8.5.so.0
#8 0x00002b4599289473 in Tcl_EvalEx () from
/usr/lib/x86_64-linux-gnu/libtcl8.5.so.0
#9 0x00002b45989ac1cb in Ns_TclEval (dsPtr=dsPtr@entry=0x0,
server=<optimized out>,
script=script@entry=0x2b459c382d74 "\n # If necessary due
to running this code in a different environment, you\n # can
have the newly spawned worker thread first source this file here.\n
tst_cond_worker\n ") at tclinit.c:334
#10 0x00002b45989bd073 in NsTclThread (arg=0x2b459c382d60) at
tclthread.c:834
#11 0x00002b459905184c in NsThreadMain (arg=<optimized out>) at
thread.c:227
#12 0x00002b4599052839 in ThreadMain (arg=<optimized out>) at
pthread.c:809
#13 0x00002b459a04f0a4 in start_thread () from
/lib/x86_64-linux-gnu/libpthread.so.0
#14 0x00002b4599b7fccd in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:111
(gdb) frame 1
#1 0x00002b45989bf9ed in LockArrayObj
(interp=interp@entry=0x2b45a4030ea0, arrayObj=0x2b45a40084e0,
create=create@entry=0) at tclvar.c:1265
1265 Ns_MutexLock(&(arrayPtr->bucketPtr->lock));
(gdb) list
1260 assert(interp != NULL);
1261 assert(arrayObj != NULL);
1262
1263 if (likely(Ns_TclGetOpaqueFromObj(arrayObj, arrayType,
(void **) &arrayPtr) == TCL_OK)
1264 && arrayPtr->bucketPtr != NULL) {
1265 Ns_MutexLock(&(arrayPtr->bucketPtr->lock));
1266 arrayPtr->locks++;
1267 } else {
1268 NsInterp *itPtr = NsGetInterpData(interp);
1269
(gdb) print *arrayPtr
$41 = {bucketPtr = 0x2b4500001004, entryPtr = 0x2b459c320010, vars =
{buckets = 0x1, staticBuckets = {[0] = 0x0, [1] = 0x2b45a4023210, [2]
= 0x3557e6,
[3] = 0x2b459c2190f0}, numBuckets = -1674442720, numEntries =
11077, rebuildSize = -1725047104, downShift = 11077, mask = -1725047072,
keyType = 11077, findProc = 0x2b459957f5c0 <tclVarHashKeyType>,
createProc = 0, typePtr = 0x6c757365722d207d}, locks = 13545984096477300}
(gdb) print arrayObj
$42 = (Tcl_Obj *) 0x2b45a40084e0
(gdb) print *arrayObj
$43 = {refCount = 3, bytes = 0x2b459c392dd0 "ct1_work_queue", length =
14, typePtr = 0x2b4598bf5ac0, internalRep = {longValue = 47577913202687,
doubleValue = 2.3506612414264323e-310, otherValuePtr =
0x2b45989d97ff, wideValue = 47577913202687, twoPtrValue = {ptr1 =
0x2b45989d97ff,
ptr2 = 0x2b459c2190f0}, ptrAndLongRep = {ptr = 0x2b45989d97ff,
value = 47577972183280}}}
(gdb) x/s 0x2b45989d97ff
0x2b45989d97ff: "nsv:array"
(gdb) print *arrayObj->typePtr
$46 = {name = 0x2b45989d7f03 "ns:addr", freeIntRepProc = 0,
dupIntRepProc = 0, updateStringProc = 0x2b45989b3930 <UpdateStringOfAddr>,
setFromAnyProc = 0x2b45989b39c0 <SetAddrFromAny>}
On 27 February 2015 at 20:48, Gustaf Neumann <neum...@wu.ac.at> wrote:
Hi David,
this is certainly not as expected, but I can't reproduce it.
Does this happen on every run?
It would be interesting to see the content of
arrayObj=0xa05a8b0
in frame #1, where the bytes should be the name of the array in
question,
the type should be "ns:addr", ptr1 should be "nsv:array" and ptr2
the arrayPtr. It would also be interesting to see the contents of arrayPtr.
I've built and tested the server on various Unix systems; the closest
was probably Ubuntu 12.04.
-g
On 27.02.15 at 17:52, David Osborne wrote:
Hi,
We're looking at doing a build of Naviserver tagged as 4.99.7.
But we seem to be hitting an intermittent seg fault or bus error
during the "make test".
Sometimes the tests complete cleanly.
It's often while running ns_conn.test: usually ns_conn-1.2 is
reported as PASSED and then the server crashes.
This is on a Debian 7.8 server.
Built by doing:
./autogen.sh --with-tcl=/usr/lib/tcl8.5
./configure --with-tcl=/usr/lib/tcl8.5 --prefix=/usr/local/ns
--enable-symbols --enable-threads
make
make test
I have core dumps. There's a backtrace at the end of the bus
error (but gdb isn't my area so apologies if there's nothing
useful in there).
Is this anything of concern?
--
David
Qcode Software Limited
http://www.qcode.co.uk
Using host libthread_db library
"/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./nsd/nsd -u root -c -d -t
/root/naviserver/tests/test.nscfg /root/naviserver/t'.
Program terminated with signal 7, Bus error.
#0 Ns_MutexLock (mutex=0x206c617665757274) at mutex.c:239
239 mutexPtr = GETMUTEX(mutex);
(gdb) info threads
Id Target Id Frame
18 Thread 0x2b8bd7b5f700 (LWP 7173) 0x00002b8bd685b5f2 in
Tcl_ExternalToUtfDString () from /usr/lib/libtcl8.5.so.0
17 Thread 0x2b8bdc580700 (LWP 7192)
pthread_cond_timedwait@@GLIBC_2.3.2
()
at
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
16 Thread 0x2b8bdc37f700 (LWP 7189) 0x00002b8bd7075d13 in
*__GI___poll (fds=<optimized out>, nfds=<optimized out>,
timeout=timeout@entry=30000) at
../sysdeps/unix/sysv/linux/poll.c:87
15 Thread 0x2b8bdc17e700 (LWP 7188) 0x00002b8bd7075d13 in
*__GI___poll (fds=<optimized out>, nfds=<optimized out>,
timeout=timeout@entry=30000) at
../sysdeps/unix/sysv/linux/poll.c:87
14 Thread 0x2b8bdbf7d700 (LWP 7187) 0x00002b8bd7075d13 in
*__GI___poll (fds=<optimized out>, nfds=<optimized out>,
timeout=timeout@entry=30000) at
../sysdeps/unix/sysv/linux/poll.c:87
13 Thread 0x2b8bdbd7c700 (LWP 7186) 0x00002b8bd7075d13 in
*__GI___poll (fds=<optimized out>, nfds=<optimized out>,
timeout=timeout@entry=30000) at
../sysdeps/unix/sysv/linux/poll.c:87
12 Thread 0x2b8bdbb7b700 (LWP 7185) 0x00002b8bd7075d13 in
*__GI___poll (fds=<optimized out>, nfds=<optimized out>,
timeout=timeout@entry=30000) at
../sysdeps/unix/sysv/linux/poll.c:87
11 Thread 0x2b8bdb97a700 (LWP 7184) 0x00002b8bd7075d13 in
*__GI___poll (fds=<optimized out>, nfds=<optimized out>,
timeout=timeout@entry=30000) at
../sysdeps/unix/sysv/linux/poll.c:87
10 Thread 0x2b8bdb779700 (LWP 7183) 0x00002b8bd7075d13 in
*__GI___poll (fds=<optimized out>, nfds=<optimized out>,
timeout=timeout@entry=10000) at
../sysdeps/unix/sysv/linux/poll.c:87
9 Thread 0x2b8bdb578700 (LWP 7182)
pthread_cond_timedwait@@GLIBC_2.3.2
()
at
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
8 Thread 0x2b8bdb377700 (LWP 7181)
pthread_cond_timedwait@@GLIBC_2.3.2
()
at
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
7 Thread 0x2b8bdb176700 (LWP 7180)
pthread_cond_timedwait@@GLIBC_2.3.2
()
at
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
6 Thread 0x2b8bdaf75700 (LWP 7179)
pthread_cond_timedwait@@GLIBC_2.3.2
()
at
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
5 Thread 0x2b8bdad74700 (LWP 7178)
pthread_cond_timedwait@@GLIBC_2.3.2
()
at
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
4 Thread 0x2b8bda567700 (LWP 7177)
pthread_cond_timedwait@@GLIBC_2.3.2
()
at
../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:216
3 Thread 0x2b8bd7ed7700 (LWP 7174) 0x00002b8bd707a453 in
select () at ../sysdeps/unix/syscall-template.S:82
2 Thread 0x2b8bd77522e0 (LWP 7172) do_sigwait
(set=0x7fff74ac2f50, sig=0x7fff74ac2f4c)
at
../nptl/sysdeps/unix/sysv/linux/../../../../../sysdeps/unix/sysv/linux/sigwait.c:65
* 1 Thread 0x2b8bdcba2700 (LWP 7196) Ns_MutexLock
(mutex=0x206c617665757274) at mutex.c:239
(gdb) bt
#0 Ns_MutexLock (mutex=0x206c617665757274) at mutex.c:239
#1 0x00002b8bd5f569ed in LockArrayObj
(interp=interp@entry=0xa0022a0, arrayObj=0xa05a8b0,
create=create@entry=0) at tclvar.c:1265
#2 0x00002b8bd5f55f7e in NsTclNsvArrayObjCmd
(UNUSED_clientData=<optimized out>, interp=0xa0022a0, objc=3,
objv=0xa060c38)
at tclvar.c:669
#3 0x00002b8bd681fdbe in ?? () from /usr/lib/libtcl8.5.so.0
#4 0x00002b8bd68624be in ?? () from /usr/lib/libtcl8.5.so.0
#5 0x00002b8bd68a427b in TclObjInterpProcCore () from
/usr/lib/libtcl8.5.so.0
#6 0x00002b8bd681fdbe in ?? () from /usr/lib/libtcl8.5.so.0
#7 0x00002b8bd68209f5 in ?? () from /usr/lib/libtcl8.5.so.0
#8 0x00002b8bd6820546 in Tcl_EvalEx () from /usr/lib/libtcl8.5.so.0
#9 0x00002b8bd5f431cb in Ns_TclEval (dsPtr=dsPtr@entry=0x0,
server=<optimized out>,
script=script@entry=0x9fb2884 "\n # If necessary due
to running this code in a different environment, you\n #
can have the newly spawned worker thread first source this file
here.\n tst_cond_worker\n ") at tclinit.c:334
#10 0x00002b8bd5f54073 in NsTclThread (arg=0x9fb2870) at
tclthread.c:834
#11 0x00002b8bd65e884c in NsThreadMain (arg=<optimized out>) at
thread.c:227
#12 0x00002b8bd65e9839 in ThreadMain (arg=<optimized out>) at
pthread.c:809
#13 0x00002b8bd753bb50 in start_thread (arg=<optimized out>) at
pthread_create.c:304
#14 0x00002b8bd708095d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#15 0x0000000000000000 in ?? ()
(gdb) frame 15
#15 0x0000000000000000 in ?? ()
(gdb) frame 14
#14 0x00002b8bd708095d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:112
112 in ../sysdeps/unix/sysv/linux/x86_64/clone.S
(gdb) frame 13
#13 0x00002b8bd753bb50 in start_thread (arg=<optimized out>) at
pthread_create.c:304
304 pthread_create.c: No such file or directory.
(gdb) frame 12
#12 0x00002b8bd65e9839 in ThreadMain (arg=<optimized out>) at
pthread.c:809
809 NsThreadMain(arg);
(gdb) frame 11
#11 0x00002b8bd65e884c in NsThreadMain (arg=<optimized out>) at
thread.c:227
227 (*thrPtr->proc) (thrPtr->arg);
(gdb) frame 10
#10 0x00002b8bd5f54073 in NsTclThread (arg=0x9fb2870) at
tclthread.c:834
834 (void) Ns_TclEval(dsPtr, argPtr->server, argPtr->script);
(gdb) frame 9
#9 0x00002b8bd5f431cb in Ns_TclEval (dsPtr=dsPtr@entry=0x0,
server=<optimized out>,
script=script@entry=0x9fb2884 "\n # If necessary due
to running this code in a different environment, you\n #
can have the newly spawned worker thread first source this file
here.\n tst_cond_worker\n ") at tclinit.c:334
334 if (Tcl_EvalEx(interp, script, -1, 0) != TCL_OK) {
(gdb) frame 8
#8 0x00002b8bd6820546 in Tcl_EvalEx () from /usr/lib/libtcl8.5.so.0
(gdb) frame 7
#7 0x00002b8bd68209f5 in ?? () from /usr/lib/libtcl8.5.so.0
(gdb) frame 6
#6 0x00002b8bd681fdbe in ?? () from /usr/lib/libtcl8.5.so.0
(gdb) frame 5
#5 0x00002b8bd68a427b in TclObjInterpProcCore () from
/usr/lib/libtcl8.5.so.0
(gdb) frame 4
#4 0x00002b8bd68624be in ?? () from /usr/lib/libtcl8.5.so.0
(gdb) frame 3
#3 0x00002b8bd681fdbe in ?? () from /usr/lib/libtcl8.5.so.0
(gdb) frame 2
#2 0x00002b8bd5f55f7e in NsTclNsvArrayObjCmd
(UNUSED_clientData=<optimized out>, interp=0xa0022a0, objc=3,
objv=0xa060c38)
at tclvar.c:669
669 arrayPtr = LockArrayObj(interp, objv[2], 0);
(gdb) frame 1
#1 0x00002b8bd5f569ed in LockArrayObj
(interp=interp@entry=0xa0022a0, arrayObj=0xa05a8b0,
create=create@entry=0) at tclvar.c:1265
1265 Ns_MutexLock(&(arrayPtr->bucketPtr->lock));
(gdb) frame 0
#0 Ns_MutexLock (mutex=0x206c617665757274) at mutex.c:239
239 mutexPtr = GETMUTEX(mutex);
(gdb) list
234 Ns_GetTime(&startTime);
235 #endif
236
237 assert(mutex != NULL);
238
239 mutexPtr = GETMUTEX(mutex);
240 if (unlikely(!NsLockTry(mutexPtr->lock))) {
241 NsLockSet(mutexPtr->lock);
242 ++mutexPtr->nbusy;
243
(gdb)