Hi Luca,
Sorry about the delay. Is there any way I can access the modified files other 
than through svn? Our server has no internet access, and our proxy denies 
access to ntop's svn host because of a problem with the digital certificate (I 
think it is self-signed).
As you may already know, defining MUTEX_DEBUG in globals-defines.h (and thus 
avoiding the r/w locks) works around the problem.
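For reference, the workaround is just this one definition (a sketch; the 
surrounding #ifdef plumbing in globals-defines.h is omitted here):

```c
/* globals-defines.h -- workaround sketch, not the full header.
   Defining MUTEX_DEBUG makes ntop fall back to plain pthread_mutex_t
   locking instead of the pthread_rwlock_* code path that misbehaves
   on this system. Any definition works, since the code only tests
   #ifdef MUTEX_DEBUG. */
#define MUTEX_DEBUG 1
```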
To test your fix, I need the files modified in revision 4359:
New Revision: 4359

Modified:
trunk/ntop/COPYING
trunk/ntop/INSTALL
trunk/ntop/config.sub
trunk/ntop/depcomp
trunk/ntop/globals-structtypes.h
trunk/ntop/install-sh
trunk/ntop/missing
trunk/ntop/plugins/netflowPlugin.c
Log:
Fix for systems with no r/w locks supported


        Rafael Sarres de Almeida 
        


-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On behalf of Luca Deri
Sent: Sunday, September 12, 2010 07:22
To: [email protected]
Subject: Re: [Ntop-dev] RES: RES: RES: Ntop processing only one packet

Rafael,
I think I have found the problem. For some reason your system does not support 
r/w locks, and as the warnings you reported point out, the compatibility code 
for systems like yours was the issue IMHO.

I have committed the fix. Please validate it and let me know whether the 
problem is now solved.

Regards Luca

On Sep 11, 2010, at 12:17 AM, Rafael Sarres de Almeida wrote:

> Hi all,
> Here are more things that I discovered. I am now quite sure that this is a 
> mutex problem. I analyzed the behavior of the 3.3.10 code and the 4.0.1 code 
> in GDB, so I could see clearly that the mutex is not being locked in 4.0.1:
> 
> GDB NTOP 4.0.1 does not lock mutex:
> 
> Breakpoint 1, queuePacket (_deviceId=<value optimized out>, h=0x45007050, 
> p=0x2aaaad590042 "") at pbuf.c:2510
> 2510      if(tryLockMutex(&myGlobals.device[deviceId].packetProcessMutex, 
> "queuePacket") == 0) {
> (gdb) print myGlobals.device[deviceId].packetProcessMutex
> $1 = {isInitialized = 1 '\001', mutex = {__data = {__lock = 0, __count = 0, 
> __owner = 0, __nusers = 0, __kind = 0, __spins = 0, 
>      __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 
> times>, __align = 0}}
> 
> 
> (gdb) next
> Fri Sep 10 18:53:21 2010  THREADMGMT[t1094719808]: SIH: Idle host scan thread 
> running [p21033]
> Fri Sep 10 18:53:21 2010  THREADMGMT[t1084229952]: SFP: Fingerprint scan 
> thread running [p21033]
> 2514        myGlobals.receivedPacketsProcessed++;
> (gdb) print myGlobals.device[deviceId].packetProcessMutex
> $2 = {isInitialized = 1 '\001', mutex = {__data = {__lock = 0, __count = 0, 
> __owner = 0, __nusers = 0, __kind = 0, __spins = 0, 
>      __list = {__prev = 0x523d, __next = 0x0}}, __size = '\000' <repeats 24 
> times>, "=R", '\000' <repeats 13 times>, __align = 0}}
> (gdb) 
> 
> ****************You can see that __lock = 0, i.e. the mutex is NOT locked 
> after calling tryLockMutex.
> 
> 
> GDB NTOP 3.3.10 locks mutex
> 
> Breakpoint 1, queuePacket (_deviceId=<value optimized out>, h=0x42802e50, 
> p=0x2aaaae08b042 "") at pbuf.c:2476
> 2476      if(tryLockMutex(&myGlobals.device[deviceId].packetProcessMutex, 
> "queuePacket") == 0) {
> (gdb) print myGlobals.device[deviceId].packetProcessMutex
> $1 = {mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, 
> __kind = 0, __spins = 0, __list = {__prev = 0x0, 
>        __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, 
> statedatamutex = {__data = {__lock = 0, __count = 0, 
>      __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 
> 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, 
>    __align = 0}, isLocked = 0 '\000', isInitialized = 1 '\001', numLocks = 0, 
> numReleases = 0, attempt = {time = {tv_sec = 0, 
>      tv_usec = 0}, pid = 0, thread = 0, line = 0, file = "\000\000\000\000"}, 
> lock = {time = {tv_sec = 0, tv_usec = 0}, pid = 0, 
>    thread = 0, line = 0, file = "\000\000\000\000"}, unlock = {time = {tv_sec 
> = 0, tv_usec = 0}, pid = 0, thread = 0, line = 0, 
>    file = "\000\000\000\000"}, max = {time = {tv_sec = 0, tv_usec = 0}, pid = 
> 0, thread = 0, line = 0, file = "\000\000\000\000"}, 
>  maxLockedDuration = 0}
> 
> 
> (gdb) next
> Fri Sep 10 18:55:09 2010  THREADMGMT[t1084229952]: SFP: Fingerprint scan 
> thread running [p25640]
> 2480        myGlobals.receivedPacketsProcessed++;
> (gdb) print myGlobals.device[deviceId].packetProcessMutex
> $2 = {mutex = {__data = {__lock = 1, __count = 0, __owner = 25653, __nusers = 
> 1, __kind = 0, __spins = 0, __list = {__prev = 0x0, 
>        __next = 0x0}}, __size = 
> "\001\000\000\000\000\000\000\000\065d\000\000\001", '\000' <repeats 26 
> times>, __align = 1}, 
>  statedatamutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 
> 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, 
>        __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, 
> isLocked = 1 '\001', isInitialized = 1 '\001', numLocks = 1, 
>  numReleases = 0, attempt = {time = {tv_sec = 0, tv_usec = 0}, pid = 0, 
> thread = 0, line = 0, file = "\000\000\000\000"}, lock = {
>    time = {tv_sec = 0, tv_usec = 0}, pid = 0, thread = 0, line = 0, file = 
> "\000\000\000\000"}, unlock = {time = {tv_sec = 0, 
>      tv_usec = 0}, pid = 0, thread = 0, line = 0, file = "\000\000\000\000"}, 
> max = {time = {tv_sec = 0, tv_usec = 0}, pid = 0, 
>    thread = 0, line = 0, file = "\000\000\000\000"}, maxLockedDuration = 0}
> (gdb) 
> 
> ****************You can see that __lock = 1, i.e. the mutex IS locked after 
> calling tryLockMutex.
> 
> 
> 
> After that, I recompiled the code more carefully and found some compile-time 
> warnings that I had ignored the first time: 
> 
> util.c: In function '_deleteMutex':
> util.c:1927: warning: passing argument 1 of 'pthread_rwlock_destroy' from 
> incompatible pointer type
> util.c: In function '_tryLockMutex':
> util.c:2078: warning: passing argument 1 of 'pthread_rwlock_trywrlock' from 
> incompatible pointer type
> 
> 
> Well, the good thing is that I think the problem is found. The bad thing is 
> that I checked the pointer type and it seems right: 
> "int pthread_rwlock_trywrlock(pthread_rwlock_t *rwlock)" is passed 
> &mutexId->mutex, and mutex is of type pthread_rwlock_t. I am not an expert C 
> programmer, but I think it is right.
> 
> Well, if anybody is still reading this: any ideas? Maybe a compiler problem? 
> Even if no one is out there, I hope this info can help a future wanderer 
> through the ntop-dev archives... :)
> 
> Thanks.
>        Rafael Sarres de Almeida 
> 
> 
> 
> -----Original Message-----
> From: [email protected] 
> [mailto:[email protected]] On behalf of Luca Deri
> Sent: Tuesday, August 31, 2010 06:03
> To: [email protected]
> Subject: Re: [Ntop-dev] RES: RES: Ntop processing only one packet
> 
> Rafael,
> thanks for debugging the code. The software works as follows:
> - Are we the only ones processing packets? If so (i.e. no other threads 
> are doing this), then process the packet immediately. This turns into:
> 
>  if(tryLockMutex(&myGlobals.device[deviceId].packetProcessMutex, 
> "queuePacket") == 0) {
>     /* Locked so we can process the packet now */
> .....
>   processPacket(_deviceId, h, p1);
>     releaseMutex(&myGlobals.device[deviceId].packetProcessMutex);
>     return;
> }
> 
> - If another thread is already processing packets, we need to queue the 
> packet:
>  /*
>     If we reach this point it means that somebody was already
>     processing a packet so we need to queue it.
>   */
>   if(myGlobals.device[deviceId].packetQueueLen >= 
> CONST_PACKET_QUEUE_LENGTH) {
> ...
> }
> 
> In this second case ntop notifies the dequeue thread that there's a 
> packet to process
> 
>  signalCondvar(&myGlobals.device[deviceId].queueCondvar);
> 
> Now my question is: are you sure that the dequeue thread isn't looping, or 
> isn't really awake, for some reason? Can you please check what happens in 
> (pbuf.c)
> 
> void* dequeuePacket(void* _deviceId) {
> }
> 
> Just enable the traces (around #ifdef DEBUG) to see what happens there.
> 
> Cheers Luca
> 
> 
> 
> On 08/30/2010 11:26 PM, Rafael Sarres de Almeida wrote:
>> Hi Luca,
>> Just to add more info to my previous mail:
>> I ran the code under GDB for the first packet; it seems that releaseMutex 
>> (pbuf.c:2538) is not releasing. I followed the code: it calls the 
>> releaseMutex function after it processes the first packet, but on the next 
>> loop tryLockMutex (pbuf.c:2510) fails, so the program thinks the mutex is 
>> not released. Here is the debug:
>> 
>> 
>> Breakpoint 4, queuePacket (_deviceId=<value optimized out>, h=0x45007050, 
>> p=0x2aaaad590042 "") at pbuf.c:2510
>> 2510      if(tryLockMutex(&myGlobals.device[deviceId].packetProcessMutex, 
>> "queuePacket") == 0) {
>> 
>> *********** It is going to process the first packet if the mutex is not locked.
>> 
>> 
>> (gdb) step
>> Mon Aug 30 18:09:35 2010  THREADMGMT[t1094719808]: SIH: Idle host scan 
>> thread running [p12154]
>> _tryLockMutex (mutexId=0x2aaaab7e1150, where=0x2aaaaad8d59d "queuePacket", 
>> fileName=0x2aaaaad8d41a "pbuf.c", fileLine=2510)
>>     at util.c:2078
>> 2078      return(pthread_rwlock_trywrlock(&mutexId->mutex));
>> (gdb)
>> [New Thread 0x45a08940 (LWP 12189)]
>> Mon Aug 30 18:09:42 2010  THREADMGMT[t1168148800]: RRD: Started thread for 
>> throughput data collection
>> Mon Aug 30 18:09:42 2010  THREADMGMT[t1147169088]: RRD: Data collection 
>> thread running [p12154]
>> Mon Aug 30 18:09:42 2010  THREADMGMT[t1168148800]: RRD: Throughput data 
>> collection: Thread starting [p12154]
>> Mon Aug 30 18:09:42 2010  THREADMGMT[t1168148800]: RRD: Throughput data 
>> collection: Thread running [p12154]
>> 0x000000304fa0a760 in pthread_rwlock_trywrlock () from /lib64/libpthread.so.0
>> (gdb)
>> Single stepping until exit from function pthread_rwlock_trywrlock,
>> which has no line number information.
>> queuePacket (_deviceId=<value optimized out>, h=0x45007050, p=0x2aaaad590042 
>> "") at pbuf.c:2514
>> 2514        myGlobals.receivedPacketsProcessed++;
>> (gdb) break 2538
>> Breakpoint 5 at 0x2aaaaad595a7: file pbuf.c, line 2538.
>> (gdb) continue
>> Continuing.
>> 
>> Breakpoint 5, queuePacket (_deviceId=<value optimized out>, h=0x45007050, 
>> p=0x2aaaad590042 "") at pbuf.c:2538
>> 2538        releaseMutex(&myGlobals.device[deviceId].packetProcessMutex);
>> 
>> ***************Releasing MUTEX
>> 
>> (gdb) step
>> _releaseMutex (mutexId=0x2aaaab7e1150, fileName=0x2aaaaad8d41a "pbuf.c", 
>> fileLine=2538) at util.c:2156
>> 2156      return(pthread_rwlock_unlock(&mutexId->mutex));
>> (gdb)
>> 0x000000304eedfa10 in pthread_mutex_unlock () from /lib64/libc.so.6
>> (gdb)
>> Single stepping until exit from function pthread_mutex_unlock,
>> which has no line number information.
>> 0x000000304fa0a020 in pthread_mutex_unlock () from /lib64/libpthread.so.0
>> (gdb)
>> Single stepping until exit from function pthread_mutex_unlock,
>> which has no line number information.
>> 0x000000304fa0a0d8 in _L_unlock_766 () from /lib64/libpthread.so.0
>> (gdb)
>> Single stepping until exit from function _L_unlock_766,
>> which has no line number information.
>> 0x000000304fa0d5e0 in __lll_unlock_wake () from /lib64/libpthread.so.0
>> (gdb)
>> Single stepping until exit from function __lll_unlock_wake,
>> which has no line number information.
>> 0x000000304fa0a0e7 in _L_unlock_766 () from /lib64/libpthread.so.0
>> (gdb)
>> Single stepping until exit from function _L_unlock_766,
>> which has no line number information.
>> 0x000000304fa0a04e in pthread_mutex_unlock () from /lib64/libpthread.so.0
>> (gdb)
>> 
>> 
>> Any ideas?
>> 
>>         Rafael Sarres de Almeida
>> 
>> 
> 
> _______________________________________________
> Ntop-dev mailing list
> [email protected]
> http://listgateway.unipi.it/mailman/listinfo/ntop-dev

---

"Debugging is twice as hard as writing the code in the first place. Therefore, 
if you write the code as cleverly as possible, you are, by definition, not 
smart enough to debug it." - Brian W. Kernighan

_______________________________________________
Ntop-dev mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop-dev