Hi Luca;
Today I had some time to continue debugging this problem. The workaround I
made may work, but I want to fix this properly.
About your questions:
void* dequeuePacket(void* _deviceId) never seems to be executed (or I am doing
something very wrong in gdb). I set several breakpoints in this function and
none of them is ever hit. Only the breakpoints in queuePacket are hit, so I
think dequeuePacket never runs.
I think the dequeue thread is not running. How can I verify this? I see a lot
of THREADMGMT logs; which one is the dequeue thread? I can see 10 threads
running when debugging:
(gdb) info thread
  10 Thread 0x45a08940 (LWP 10165) 0x000000304ee9a1a1 in nanosleep () from /lib64/libc.so.6
*  9 Thread 0x45007940 (LWP 10149) queuePacket (_deviceId=<value optimized out>, h=0x45007050, p=0x2aaaad594042 "") at pbuf.c:2564
   8 Thread 0x44606940 (LWP 10148) 0x000000304ee9a1a1 in nanosleep () from /lib64/libc.so.6
   7 Thread 0x43c05940 (LWP 10147) 0x000000304eeccfc2 in select () from /lib64/libc.so.6
   6 Thread 0x43204940 (LWP 10144) 0x000000304fa0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
   5 Thread 0x42803940 (LWP 10143) 0x000000304fa0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
   4 Thread 0x41e02940 (LWP 10142) 0x000000304fa0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
   3 Thread 0x41401940 (LWP 10141) 0x000000304ee9a1a1 in nanosleep () from /lib64/libc.so.6
   2 Thread 0x40a00940 (LWP 10140) 0x000000304ee9a1a1 in nanosleep () from /lib64/libc.so.6
   1 Thread 0x2aaaab7e02c0 (LWP 10127) 0x000000304ee9a1a1 in nanosleep () from /lib64/libc.so.6
If you tell me where in the code this thread should be started, I can try to
debug it and find out why it is not running (if that is the case).
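
One generic way to check whether a thread was ever started can be sketched in
C as follows. This is only an illustration under assumptions: start_flag_t,
worker and start_and_confirm are hypothetical names, not ntop functions, and
worker merely stands in for a thread body such as dequeuePacket. The idea is
to check the return value of pthread_create and to have the thread announce
itself at its entry point, so that a non-zero return code or a missing log
line shows that the thread never ran.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: records that the worker reached its entry point. */
typedef struct {
  pthread_mutex_t lock;
  pthread_cond_t  cond;
  int             started;
} start_flag_t;

/* Stand-in for a thread body such as dequeuePacket(). */
static void *worker(void *arg) {
  start_flag_t *f = (start_flag_t *)arg;

  /* Log at entry so the thread is visible in the trace output. */
  fprintf(stderr, "worker: thread running\n");

  pthread_mutex_lock(&f->lock);
  f->started = 1;
  pthread_cond_signal(&f->cond);
  pthread_mutex_unlock(&f->lock);
  return NULL;
}

/* Returns 0 once the worker has confirmed that it is running,
   or the pthread_create() error code if it could not be created. */
int start_and_confirm(start_flag_t *f, pthread_t *tid) {
  int rc;

  pthread_mutex_init(&f->lock, NULL);
  pthread_cond_init(&f->cond, NULL);
  f->started = 0;

  rc = pthread_create(tid, NULL, worker, f);
  if (rc != 0) {
    fprintf(stderr, "pthread_create failed: %s\n", strerror(rc));
    return rc;
  }

  /* Wait until the worker has set the flag. */
  pthread_mutex_lock(&f->lock);
  while (!f->started)
    pthread_cond_wait(&f->cond, &f->lock);
  pthread_mutex_unlock(&f->lock);
  return 0;
}
```

With a check like this in place, a breakpoint on the thread entry function
should be hit shortly after pthread_create returns 0. If it is not, the
creation call itself is probably never reached.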
Thanks.
Rafael Sarres de Almeida
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Luca Deri
Sent: Tuesday, August 31, 2010 06:03
To: [email protected]
Subject: Re: [Ntop-dev] RES: RES: Ntop processing only one packet
Rafael,
thanks for debugging the code. The software works as follows:
- Are we the only thread processing packets? If so (i.e. no other thread
is doing this), then process the packet immediately. In code this becomes:
  if(tryLockMutex(&myGlobals.device[deviceId].packetProcessMutex,
                  "queuePacket") == 0) {
    /* Locked so we can process the packet now */
    .....
    processPacket(_deviceId, h, p1);
    releaseMutex(&myGlobals.device[deviceId].packetProcessMutex);
    return;
  }
- if another thread is processing packets already, we need to queue the
packet
  /*
    If we reach this point it means that somebody was already
    processing a packet so we need to queue it.
  */
  if(myGlobals.device[deviceId].packetQueueLen >=
     CONST_PACKET_QUEUE_LENGTH) {
    ...
  }
In this second case ntop notifies the dequeue thread that there's a
packet to process
signalCondvar(&myGlobals.device[deviceId].queueCondvar);
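
The two cases above can be sketched as a small standalone C example. This is
a simplified illustration, not the ntop implementation: device_t,
device_init, queue_packet, dequeue_packet and QUEUE_LEN are hypothetical
names, packets are reduced to plain ints, and a plain pthread mutex is used
where the gdb trace in this thread shows ntop's wrappers calling
pthread_rwlock_trywrlock.

```c
#include <assert.h>
#include <pthread.h>

#define QUEUE_LEN 8  /* hypothetical; stands in for CONST_PACKET_QUEUE_LENGTH */

typedef struct {
  pthread_mutex_t process_mutex;  /* held while a packet is being processed */
  pthread_mutex_t queue_mutex;    /* protects queue, head and len */
  pthread_cond_t  queue_cond;     /* signalled when a packet is queued */
  int queue[QUEUE_LEN];           /* ring buffer of pending packets */
  int head, len;
  int processed, dropped;
} device_t;

void device_init(device_t *d) {
  pthread_mutex_init(&d->process_mutex, NULL);
  pthread_mutex_init(&d->queue_mutex, NULL);
  pthread_cond_init(&d->queue_cond, NULL);
  d->head = d->len = d->processed = d->dropped = 0;
}

/* Placeholder for the real work; called with process_mutex held. */
static void process_packet(device_t *d, int pkt) {
  (void)pkt;
  d->processed++;
}

/* Returns 0 if processed inline, 1 if queued, -1 if the queue was full. */
int queue_packet(device_t *d, int pkt) {
  /* Case 1: nobody else is processing, so handle the packet right now. */
  if (pthread_mutex_trylock(&d->process_mutex) == 0) {
    process_packet(d, pkt);
    pthread_mutex_unlock(&d->process_mutex);
    return 0;
  }

  /* Case 2: somebody is already processing, so queue the packet. */
  pthread_mutex_lock(&d->queue_mutex);
  if (d->len >= QUEUE_LEN) {
    d->dropped++;
    pthread_mutex_unlock(&d->queue_mutex);
    return -1;
  }
  d->queue[(d->head + d->len) % QUEUE_LEN] = pkt;
  d->len++;
  pthread_cond_signal(&d->queue_cond);  /* wake the dequeue thread */
  pthread_mutex_unlock(&d->queue_mutex);
  return 1;
}

/* One iteration of a dequeue thread: wait for a packet, then process it. */
int dequeue_packet(device_t *d) {
  int pkt;

  pthread_mutex_lock(&d->queue_mutex);
  while (d->len == 0)
    pthread_cond_wait(&d->queue_cond, &d->queue_mutex);
  pkt = d->queue[d->head];
  d->head = (d->head + 1) % QUEUE_LEN;
  d->len--;
  pthread_mutex_unlock(&d->queue_mutex);

  pthread_mutex_lock(&d->process_mutex);
  process_packet(d, pkt);
  pthread_mutex_unlock(&d->process_mutex);
  return pkt;
}
```

In this sketch, if no thread ever runs dequeue_packet, every packet that
arrives while process_mutex is held stays in the queue until it fills up,
after which packets are dropped.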
Now my question is: are you sure that for some reason the dequeue thread
isn't looping or isn't really awake? Can you please check what happens
in (pbuf.c)
void* dequeuePacket(void* _deviceId) {
}
Just enable the traces (around #ifdef DEBUG) to see what happens there.
Cheers Luca
On 08/30/2010 11:26 PM, Rafael Sarres de Almeida wrote:
> Hi Luca;
> Just to add more info to my previous mail:
> I ran the code under gdb for the first packet, and it seems that releaseMutex
> (pbuf.c:2538) is not releasing the mutex. I followed the code: it calls the
> releaseMutex function after it processes the first packet, but on the next
> loop tryLockMutex (pbuf.c:2510) fails, so the program thinks the mutex
> has not been released. Here is the debug session:
>
>
> Breakpoint 4, queuePacket (_deviceId=<value optimized out>, h=0x45007050,
> p=0x2aaaad590042 "") at pbuf.c:2510
> 2510 if(tryLockMutex(&myGlobals.device[deviceId].packetProcessMutex,
> "queuePacket") == 0) {
>
> *********** It will process the first packet if the mutex is not locked.
>
>
> (gdb) step
> Mon Aug 30 18:09:35 2010 THREADMGMT[t1094719808]: SIH: Idle host scan thread
> running [p12154]
> _tryLockMutex (mutexId=0x2aaaab7e1150, where=0x2aaaaad8d59d "queuePacket",
> fileName=0x2aaaaad8d41a "pbuf.c", fileLine=2510)
> at util.c:2078
> 2078 return(pthread_rwlock_trywrlock(&mutexId->mutex));
> (gdb)
> [New Thread 0x45a08940 (LWP 12189)]
> Mon Aug 30 18:09:42 2010 THREADMGMT[t1168148800]: RRD: Started thread for
> throughput data collection
> Mon Aug 30 18:09:42 2010 THREADMGMT[t1147169088]: RRD: Data collection
> thread running [p12154]
> Mon Aug 30 18:09:42 2010 THREADMGMT[t1168148800]: RRD: Throughput data
> collection: Thread starting [p12154]
> Mon Aug 30 18:09:42 2010 THREADMGMT[t1168148800]: RRD: Throughput data
> collection: Thread running [p12154]
> 0x000000304fa0a760 in pthread_rwlock_trywrlock () from /lib64/libpthread.so.0
> (gdb)
> Single stepping until exit from function pthread_rwlock_trywrlock,
> which has no line number information.
> queuePacket (_deviceId=<value optimized out>, h=0x45007050, p=0x2aaaad590042
> "") at pbuf.c:2514
> 2514 myGlobals.receivedPacketsProcessed++;
> (gdb) break 2538
> Breakpoint 5 at 0x2aaaaad595a7: file pbuf.c, line 2538.
> (gdb) continue
> Continuing.
>
> Breakpoint 5, queuePacket (_deviceId=<value optimized out>, h=0x45007050,
> p=0x2aaaad590042 "") at pbuf.c:2538
> 2538 releaseMutex(&myGlobals.device[deviceId].packetProcessMutex);
>
> ***************Releasing MUTEX
>
> (gdb) step
> _releaseMutex (mutexId=0x2aaaab7e1150, fileName=0x2aaaaad8d41a "pbuf.c",
> fileLine=2538) at util.c:2156
> 2156 return(pthread_rwlock_unlock(&mutexId->mutex));
> (gdb)
> 0x000000304eedfa10 in pthread_mutex_unlock () from /lib64/libc.so.6
> (gdb)
> Single stepping until exit from function pthread_mutex_unlock,
> which has no line number information.
> 0x000000304fa0a020 in pthread_mutex_unlock () from /lib64/libpthread.so.0
> (gdb)
> Single stepping until exit from function pthread_mutex_unlock,
> which has no line number information.
> 0x000000304fa0a0d8 in _L_unlock_766 () from /lib64/libpthread.so.0
> (gdb)
> Single stepping until exit from function _L_unlock_766,
> which has no line number information.
> 0x000000304fa0d5e0 in __lll_unlock_wake () from /lib64/libpthread.so.0
> (gdb)
> Single stepping until exit from function __lll_unlock_wake,
> which has no line number information.
> 0x000000304fa0a0e7 in _L_unlock_766 () from /lib64/libpthread.so.0
> (gdb)
> Single stepping until exit from function _L_unlock_766,
> which has no line number information.
> 0x000000304fa0a04e in pthread_mutex_unlock () from /lib64/libpthread.so.0
> (gdb)
>
>
> Any ideas?
>
> Rafael Sarres de Almeida
>
>
_______________________________________________
Ntop-dev mailing list
[email protected]
http://listgateway.unipi.it/mailman/listinfo/ntop-dev