RH9 = NPTL (Native POSIX Thread Library), which is a totally different, far more 
POSIX-compatible implementation of POSIX threads for Linux vs. linuxthreads.

RH8 is linuxthreads.

There's a way to turn off NPTL and run RH9 under linuxthreads...

set the following environment variable:

$ export LD_ASSUME_KERNEL=2.4.1

before running ./configure and make (and probably before running ntop, too).
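
To confirm which thread library a process actually ends up with, something like the sketch below may help. It assumes a glibc that supports the _CS_GNU_LIBPTHREAD_VERSION name for confstr(); the function names thread_library_version() and is_nptl() are made up for this example and are not part of ntop.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* expose confstr() and the glibc-specific name */
#endif
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Copy the thread library's self-description into buf.
 * Returns 0 on success, -1 if the C library does not report one. */
int thread_library_version(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    /* confstr() returns 0 when the name is not supported at run time. */
    if (confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len) > 0)
        return 0;
#endif
    return -1;
}

/* Non-zero when the reported string names NPTL rather than linuxthreads. */
int is_nptl(const char *version)
{
    return strncmp(version, "NPTL", 4) == 0;
}
```

Calling thread_library_version() from a small main() and printing the buffer, once with LD_ASSUME_KERNEL set and once without, should show whether the switch took effect. The string starts with "NPTL" or "linuxthreads" depending on the library in use.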

I'd be interested to hear whether you see problems when running under linuxthreads that way.

If so, the patch to turn off sched_yield gets a bit ugly, but it's workable.  Please 
let me know ASAP...

Without sched_yield, I'm up 6+ hours; with it, the longest ntop stayed up is 15m before 
deadlocking.
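
This is not the actual patch, just a rough sketch of one way the call could be made switchable, so the two cases can be compared without rebuilding. The names yield_enabled, set_yield_enabled() and maybe_yield() are hypothetical and do not exist in ntop.

```c
#include <sched.h>

/* Hypothetical switch, not an existing ntop option: 1 = yield, 0 = skip. */
static int yield_enabled = 1;

void set_yield_enabled(int on)
{
    yield_enabled = (on != 0);
}

/* Returns 1 if sched_yield() was called, 0 if the call was skipped. */
int maybe_yield(void)
{
    if (!yield_enabled)
        return 0;
    (void)sched_yield();   /* give up the CPU to other runnable threads */
    return 1;
}
```

Each direct sched_yield() call, such as the one in freeHostSessions() shown in the backtrace below, would then go through maybe_yield() instead.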

-----Burton


---------- Original Message ----------------------------------
From: Luca Deri <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Date:  Tue, 26 Aug 2003 08:54:23 +0200

>Burton,
>I have a dual Athlon XP, 3 NICs (2xIntel e100 and 1x Realtek), RH9 and 
>never experienced this problem.
>
>Luca
>
>Burton M. Strauss III wrote:
>
>>Tigger is a dual CPU (P3-1000) running RH8 (2.4.20 kernel) with two NICs for
>>ntop - one WAN, one LAN.
>>
>>I'm seeing deadlocks, which look - under gdb - like they're occurring in the
>>sched_yield() call.  It actually looks like it's deadlocking the POSIX
>>thread control thread.
>>
>>If I disable the SCHED_YIELD, it seems better - at least I'm now up to 52m
>>of run time vs. the usual 1-10 before lockup...
>>
>>Specifically, I'm seeing this when I connect to the hung ntop via gdb:
>>
>>(gdb) info thread
>>  9 Thread 114696 (LWP 27300)  0x40253c68 in recvfrom () from
>>/lib/i686/libpthread.so.0
>>  8 Thread 98311 (LWP 27299)  0x4209ad41 in __tzfile_compute () from
>>/lib/i686/libc.so.6
>>  7 Thread 81926 (LWP 27298)  0x40250a35 in __pthread_sigsuspend () from
>>/lib/i686/libpthread.so.0
>>  6 Thread 65541 (LWP 27297)  0x420b0226 in nanosleep () from
>>/lib/i686/libc.so.6
>>  5 Thread 49156 (LWP 27296)  0x40250a35 in __pthread_sigsuspend () from
>>/lib/i686/libpthread.so.0
>>  4 Thread 32771 (LWP 27295)  0x420cd207 in sched_yield () from
>>/lib/i686/libc.so.6
>>  3 Thread 16386 (LWP 27294)  0x40250a35 in __pthread_sigsuspend () from
>>/lib/i686/libpthread.so.0
>>  2 Thread 32769 (LWP 27293)  0x420db1a7 in poll () from /lib/i686/libc.so.6
>>  1 Thread 16384 (LWP 27290)  0x420b0226 in nanosleep () from
>>/lib/i686/libc.so.6
>>
>>(gdb) thread 4
>>[Switching to thread 4 (Thread 32771 (LWP 27295))]#0  0x420cd207 in
>>sched_yield ()
>>   from /lib/i686/libc.so.6
>>(gdb) info stac
>>#0  0x420cd207 in sched_yield () from /lib/i686/libc.so.6
>>#1  0x4008ff47 in freeHostSessions (host=0x41a1ebe8, theDevice=0) at
>>hash.c:176
>>#2  0x400900a8 in freeHostInfo (host=0x41a1ebe8, actualDeviceId=0) at
>>hash.c:232
>>#3  0x40090bd6 in purgeIdleHosts (actDevice=0) at hash.c:545
>>#4  0x40097f19 in scanIdleLoop (notUsed=0x0) at ntop.c:588
>>#5  0x4024e881 in pthread_start_thread () from /lib/i686/libpthread.so.0
>>
>>
>>Other threads that reference mutexes, semaphores, anything POSIX are hung,
>>even though the mutex/semaphore shows not locked...:
>>
>>[Switching to thread 7 (Thread 81926 (LWP 27298))]#0  0x40250a35 in
>>__pthread_sigsuspend ()
>>   from /lib/i686/libpthread.so.0
>>(gdb) info stack
>>#0  0x40250a35 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
>>#1  0x4024fdb8 in __pthread_wait_for_restart_signal () from
>>/lib/i686/libpthread.so.0
>>#2  0x40252190 in __pthread_alt_lock () from /lib/i686/libpthread.so.0
>>#3  0x4024ed77 in pthread_mutex_lock () from /lib/i686/libpthread.so.0
>>#4  0x400a86fe in _accessMutex (mutexId=0x400b8ba8, where=0x4006a8df
>>"returnHTTPPage",
>>    fileName=0x40069e3d "http.c", fileLine=2590) at util.c:1143
>>#5  0x4003c09e in handleHTTPrequest (from={s_addr = 3232246305}) at
>>http.c:2590
>>#6  0x40068091 in handleSingleWebConnection (fdmask=0x4413aa3c) at
>>webInterface.c:5423
>>#7  0x40067ed6 in handleWebConnections (notUsed=0x0) at webInterface.c:5288
>>#8  0x4024e881 in pthread_start_thread () from /lib/i686/libpthread.so.0
>>
>>(gdb) thread 3
>>[Switching to thread 3 (Thread 16386 (LWP 27294))]#0  0x40250a35 in
>>__pthread_sigsuspend ()
>>   from /lib/i686/libpthread.so.0
>>(gdb) info stac
>>#0  0x40250a35 in __pthread_sigsuspend () from /lib/i686/libpthread.so.0
>>#1  0x4024fdb8 in __pthread_wait_for_restart_signal () from
>>/lib/i686/libpthread.so.0
>>#2  0x4025163b in sem_wait@@GLIBC_2.1 () from /lib/i686/libpthread.so.0
>>#3  0x400a8dd8 in waitSem (semId=0x400b8874) at util.c:1476
>>#4  0x4009d03a in dequeuePacket (notUsed=0x0) at pbuf.c:1693
>>#5  0x4024e881 in pthread_start_thread () from /lib/i686/libpthread.so.0
>>
>>
>>Oddly, ntop's extra data shows the mutex locked, but the internal flag
>>(__m_lock) shows some weird status...
>>
>>(gdb) frame 4
>>#4  0x400a86fe in _accessMutex (mutexId=0x400b8ba8, where=0x4006a8df
>>"returnHTTPPage",
>>    fileName=0x40069e3d "http.c", fileLine=2590) at util.c:1143
>>1143      rc = pthread_mutex_lock(&(mutexId->mutex));
>>(gdb) list
>>1138
>>1139      strcpy(mutexId->lockAttemptFile, fileName);
>>1140      mutexId->lockAttemptLine=fileLine;
>>1141      mutexId->lockAttemptPid=myPid;
>>1142
>>1143      rc = pthread_mutex_lock(&(mutexId->mutex));
>>1144
>>1145      pthread_mutex_lock(&stateChangeMutex);
>>1146      mutexId->lockAttemptFile[0] = '\0';
>>1147      mutexId->lockAttemptLine=0;
>>(gdb) print *mutexId
>>$2 = {mutex = {__m_reserved = 0, __m_count = 0, __m_owner = 0x0, __m_kind =
>>0, __m_lock = {
>>      __status = 1142136828, __spinlock = 0}}, isLocked = 1 '\001',
>>isInitialized = 1 '\001',
>>  lockFile = "hash.c", '\0' <repeats 57 times>, lockLine = 465, lockPid =
>>27295,
>>  unlockFile = "http.c", '\0' <repeats 57 times>, unlockLine = 2618,
>>unlockPid = 27298,
>>  numLocks = 361, numReleases = 360, lockTime = 1061844745,
>>  maxLockedDurationUnlockFile = "http.c", '\0' <repeats 57 times>,
>>  maxLockedDurationUnlockLine = 2618, maxLockedDuration = 1,
>>  where = "purgeIdleHosts", '\0' <repeats 49 times>,
>>  lockAttemptFile = "http.c", '\0' <repeats 57 times>, lockAttemptLine =
>>2590,
>>  lockAttemptPid = 27298}
>>
>>vs. a normal LOCKED mutex:
>>
>>(gdb) print myGlobals.packetProcessMutex
>>$4 = {mutex = {__m_reserved = 0, __m_count = 0, __m_owner = 0x0, __m_kind =
>>0, __m_lock = {
>>      __status = 1, __spinlock = 0}}, isLocked = 1 '\001', isInitialized = 1
>>'\001',
>>  lockFile = "pbuf.c", '\0' <repeats 57 times>, lockLine = 1591, lockPid =
>>2849,
>>  unlockFile = "pbuf.c", '\0' <repeats 57 times>, unlockLine = 1602,
>>unlockPid = 2849,
>>  numLocks = 50182, numReleases = 50181, lockTime = 1061847839,
>>  maxLockedDurationUnlockFile = "pbuf.c", '\0' <repeats 57 times>,
>>  maxLockedDurationUnlockLine = 1602, maxLockedDuration = 5,
>>  where = "queuePacket\000t", '\0' <repeats 50 times>,
>>  lockAttemptFile = "\000buf.c", '\0' <repeats 57 times>, lockAttemptLine =
>>0, lockAttemptPid = 0}
>>
>>
>>and UNLOCKED:
>>
>>(gdb) print myGlobals.purgePortsMutex
>>$2 = {mutex = {__m_reserved = 0, __m_count = 0, __m_owner = 0x0, __m_kind =
>>0, __m_lock = {
>>      __status = 0, __spinlock = 0}}, isLocked = 0 '\0', isInitialized = 1
>>'\001',
>>  lockFile = "pbuf.c", '\0' <repeats 57 times>, lockLine = 0, lockPid =
>>2849,
>>  unlockFile = "pbuf.c", '\0' <repeats 57 times>, unlockLine = 591,
>>unlockPid = 2849,
>>  numLocks = 24424, numReleases = 24424, lockTime = 1061847839,
>>  maxLockedDurationUnlockFile = "ntop.c", '\0' <repeats 57 times>,
>>  maxLockedDurationUnlockLine = 572, maxLockedDuration = 1,
>>  where = "updateInterfacePorts", '\0' <repeats 43 times>,
>>  lockAttemptFile = "\000buf.c", '\0' <repeats 57 times>, lockAttemptLine =
>>0, lockAttemptPid = 0}
>>
>>
>>
>>Anybody else having problems w/ 2.2.94 or another fairly recent version???
>>
>>-----Burton
>>
>>_______________________________________________
>>Ntop mailing list
>>[EMAIL PROTECTED]
>>http://listgateway.unipi.it/mailman/listinfo/ntop
>>  
>>
>
>
>-- 
>Luca Deri <[EMAIL PROTECTED]>  http://luca.ntop.org/
>Hacker: someone who loves to program and enjoys being
>clever about it - Richard Stallman
>
>

