Hi Eddie,

Sorry for the wait. Today I pulled the latest sources and rebuilt a
vanilla click kernel on a 64-bit 4-core Xeon machine (with the click e1000
driver installed). The locking seems to be working fine, but
'click-uninstall' results in a lockup. See the attached dumps (click
was started with click-install -t 3 in both cases). I still see the lockup
with -t 1.

Click uninstalls without problem with our version from a couple of weeks
back.

Unrelated, but I also saw:
../include/click/timestamp.hh: In constructor ‘Timestamp::Timestamp(double)’:
../include/click/timestamp.hh:938: warning: converting to ‘int64_t’ from ‘double’
all over when building userlevel click.
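
This kind of warning normally disappears once the narrowing is written out explicitly. A minimal sketch, assuming a made-up helper (seconds_to_nanos is hypothetical and is not the code at timestamp.hh:938):

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper, not Click's Timestamp code: converts a duration held
// in a double (seconds) to whole nanoseconds. The static_cast states the
// double -> int64_t narrowing explicitly instead of leaving it implicit.
inline std::int64_t seconds_to_nanos(double seconds) {
    // floor() rounds toward negative infinity; a plain cast would truncate
    // toward zero instead. Which one is wanted is a design choice.
    return static_cast<std::int64_t>(std::floor(seconds * 1e9));
}
```

Whether floor or truncation is the right rounding for Timestamp is not something I have checked.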

Sorry I don't have the time to dive into this further right now.

Thanks!
Kevin




> Kevin,
>
> Is there any chance you'd be able to try the newer Spinlock code any time
> soon?
>
> Eddie
>
>
> [EMAIL PROTECTED] wrote:
>> Hi,
>>
>> I think there is a concurrency issue with Spinlocks in linuxmodule
>> multi-threaded click (running a 2.6.19.2 patched kernel, the e1000-NAPI
>> driver and today's trunk). I've put together a temporary patch, but there
>> might be further issues. Sorry, I don't have the time to investigate more.
>>
>>
>> Problem:
>> The sequence of events is:
>> -Thread A is running on CPU 0 and thread B is running on CPU 1.
>> -A acquires the Spinlock and executes code in the critical section.
>> -Linux schedules Thread A to run on CPU 1 and thread B to run on CPU 0.
>> -B acquires the Spinlock (works because CPU 0 is the owner of the lock,
>> not the thread)
>> -A and B are both operating in the critical section
>> -(if you're lucky) A releases the Spinlock and generates a “releasing
>> someone else's lock” message
>> -(if you're unlucky) Oops
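
The sequence above can be modeled without real threads. This is only a sketch under stated assumptions: OwnerLock, kFree and second_thread_enters are hypothetical names, std::atomic stands in for Click's atomic types, and the integer tokens 100 and 200 stand in for thread_info pointers. It is not the Spinlock code itself.

```cpp
#include <atomic>
#include <cstdint>

// Sentinel meaning "nobody holds the lock" (hypothetical).
constexpr std::intptr_t kFree = -1;

// Toy recursive lock that records an owner token, loosely modeled on the
// behavior described above. Not Click's Spinlock.
class OwnerLock {
  public:
    // Returns true if `who` holds the lock afterwards: either the lock was
    // free, or `who` already owned it (recursive acquire).
    bool try_acquire(std::intptr_t who) {
        if (_owner.load(std::memory_order_relaxed) == who) {
            ++_depth;
            return true;
        }
        std::intptr_t expected = kFree;
        if (_owner.compare_exchange_strong(expected, who,
                                           std::memory_order_acquire)) {
            _depth = 1;
            return true;
        }
        return false;
    }

    // Returns false in the "releasing someone else's lock" case.
    bool release(std::intptr_t who) {
        if (_owner.load(std::memory_order_relaxed) != who)
            return false;
        if (--_depth == 0)
            _owner.store(kFree, std::memory_order_release);
        return true;
    }

  private:
    std::atomic<std::intptr_t> _owner{kFree};
    int _depth = 0;
};

// Replays the scenario: A acquires on CPU 0, the scheduler swaps the two
// threads, then B tries to acquire on CPU 0. Returns true if B gets in.
inline bool second_thread_enters(bool key_by_cpu) {
    const std::intptr_t thread_a = 100, thread_b = 200;  // stand-in tokens
    std::intptr_t cpu_of_a = 0, cpu_of_b = 1;
    OwnerLock lock;
    lock.try_acquire(key_by_cpu ? cpu_of_a : thread_a);
    cpu_of_a = 1;  // A migrates to CPU 1
    cpu_of_b = 0;  // B migrates to CPU 0
    (void) cpu_of_a;
    return lock.try_acquire(key_by_cpu ? cpu_of_b : thread_b);
}
```

With the CPU number as the owner token, B is treated as a recursive acquire by the owner and enters; with a per-thread token, B is refused.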
>>
>> Attached is a test element and a configuration to elicit the behavior. I
>> could only replicate the problem when using a polling input source under
>> heavy load.
>>
>> After fixing the above problem I was still seeing two threads inside the
>> critical section. For some reason atomic_uint32_t::swap was not working
>> atomically as expected. Changing this to
>> atomic_uint32_t::compare_and_swap solved the problem. We're running
>> x86_64 quad core Xeon CPUs.
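
For reference, the two acquire styles look roughly like this when written against std::atomic. This is an illustration only: std::atomic stands in for atomic_uint32_t, the function names are hypothetical, and the sketch does not reproduce or explain the failure described above.

```cpp
#include <atomic>
#include <cstdint>

// Acquire via swap: write 1 unconditionally and succeed only if the
// previous value was 0.
inline bool try_lock_swap(std::atomic<std::uint32_t> &word) {
    return word.exchange(1, std::memory_order_acquire) == 0;
}

// Acquire via compare-and-swap: write 1 only if the word is still 0.
inline bool try_lock_cas(std::atomic<std::uint32_t> &word) {
    std::uint32_t expected = 0;
    return word.compare_exchange_strong(expected, 1,
                                        std::memory_order_acquire);
}

// Release: mark the word free again.
inline void unlock_word(std::atomic<std::uint32_t> &word) {
    word.store(0, std::memory_order_release);
}
```

Both forms only exclude a second holder if the underlying read-modify-write really is a single atomic operation, which is the property that appeared to be missing for swap.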
>>
>>
>> Solution:
>> In the attached patch I added a new function, click_current_thread(),
>> which returns a thread_info pointer. Inside Spinlock I replaced the uses
>> of click_current_processor() with click_current_thread(). This way, even
>> if the thread is moved to another core, it still has the same
>> thread_info pointer.
>>
>> The drawback is that a thread_info pointer cannot serve directly as an
>> array index, so it cannot simply replace every use of
>> click_current_processor(). Other places in the code may make the same
>> incorrect use of click_current_processor(), but a different approach
>> will be needed where the value indexes into an array (such as in
>> ReadWriteLock).
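
One possible direction for the array-index case, purely as a sketch: keep a small table that hands each thread token a stable slot number on first use. ThreadSlots and slot_for are hypothetical names, the token is assumed to be a nonzero value unique to each thread (for example the thread_info pointer cast to an integer), and nothing here is taken from the patch or from ReadWriteLock.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical registry: maps a per-thread token to a small index that
// stays the same when the thread migrates between CPUs.
template <std::size_t N>
class ThreadSlots {
  public:
    ThreadSlots() {
        for (auto &s : _slots)
            s.store(0, std::memory_order_relaxed);  // 0 means "free"
    }

    // Returns the slot already owned by `token`, or claims the first free
    // slot for it. Returns -1 if all N slots belong to other tokens.
    int slot_for(std::uintptr_t token) {
        for (std::size_t i = 0; i < N; ++i) {
            std::uintptr_t seen = _slots[i].load(std::memory_order_acquire);
            if (seen == token)
                return static_cast<int>(i);
            if (seen == 0) {
                std::uintptr_t expected = 0;
                // Claim the slot; if another thread wins the race, move on.
                if (_slots[i].compare_exchange_strong(
                        expected, token, std::memory_order_acq_rel,
                        std::memory_order_acquire))
                    return static_cast<int>(i);
            }
        }
        return -1;
    }

  private:
    std::atomic<std::uintptr_t> _slots[N];
};
```

The linear scan costs more than reading a CPU number, so this would only make sense where the number of threads is small and fixed.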
>>
>>
>> Thanks!
>> Kevin Springborn
>>
>>
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> click mailing list
>> [email protected]
>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click
>
Sep  3 14:37:05 stock-inspector kernel: BUG: soft lockup detected on CPU#1!
Sep  3 14:37:05 stock-inspector kernel:
Sep  3 14:37:05 stock-inspector kernel: Call Trace:
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8026c697>] dump_trace+0xaa/0x403
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8026ca2c>] show_trace+0x3c/0x52
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8026ca57>] dump_stack+0x15/0x17
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff802b29ac>] softlockup_tick+0xc8/0xdd
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8024e0f8>] run_local_timers+0x13/0x15
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8028ea87>] update_process_times+0x4c/0x78
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff802750c7>] smp_local_timer_interrupt+0x34/0x54
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff80275660>] smp_apic_timer_interrupt+0x60/0x77
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8025fd6b>] apic_timer_interrupt+0x6b/0x70
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8020c608>] __delay+0x6/0x15
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff802077ba>] _raw_spin_lock+0x8a/0xf1
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff802656f4>] _spin_lock+0x2d/0x31
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8830784e>] :click:_ZN12RouterThread6driverEv+0x6c6/0x776
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8838250f>] :click:_Z11click_schedPv+0x145/0x1fa
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8025ff48>] child_rip+0xa/0x12
Sep  3 14:37:05 stock-inspector kernel:
Sep  3 14:37:05 stock-inspector kernel: BUG: soft lockup detected on CPU#3!
Sep  3 14:37:05 stock-inspector kernel:
Sep  3 14:37:05 stock-inspector kernel: Call Trace:
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8026c697>] dump_trace+0xaa/0x403
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8026ca2c>] show_trace+0x3c/0x52
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8026ca57>] dump_stack+0x15/0x17
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff802b29ac>] softlockup_tick+0xc8/0xdd
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8024e0f8>] run_local_timers+0x13/0x15
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8028ea87>] update_process_times+0x4c/0x78
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff802750c7>] smp_local_timer_interrupt+0x34/0x54
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff80275660>] smp_apic_timer_interrupt+0x60/0x77
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8025fd6b>] apic_timer_interrupt+0x6b/0x70
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff802077ac>] _raw_spin_lock+0x7c/0xf1
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff802656f4>] _spin_lock+0x2d/0x31
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff88306f1b>] :click:_ZN12RouterThread23unschedule_router_tasksEP6Router+0x2f/0x88
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8830f9d3>] :click:_ZN6Master11kill_routerEP6Router+0xcb/0x1de
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8830e6fb>] :click:_ZN6Router10initializeEP12ErrorHandler+0x627/0x6ea
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff883816ab>] :click:_Z12write_configRK6StringP7ElementPvP12ErrorHandler+0x151/0x1ba
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff883095d1>] :click:_ZNK7Handler10call_writeERK6StringP7ElementP12ErrorHandler+0x63/0xf6
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff88383a80>] :click:handler_do_write+0x240/0x546
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff88384799>] :click:handler_flush+0xe1/0x10e
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff80224ae6>] filp_close+0x3f/0x70
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8021e665>] sys_close+0x98/0xd7
Sep  3 14:37:05 stock-inspector kernel:  [<ffffffff8025f11e>] system_call+0x7e/0x83
Sep  3 14:37:05 stock-inspector kernel:  [<00000033466c42f0>]
Sep  3 14:45:21 stock-inspector kernel: BUG: soft lockup detected on CPU#0!
Sep  3 14:45:21 stock-inspector kernel:
Sep  3 14:45:21 stock-inspector kernel: Call Trace:
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8026c697>] dump_trace+0xaa/0x403
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8026ca2c>] show_trace+0x3c/0x52
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8026ca57>] dump_stack+0x15/0x17
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff802b29ac>] softlockup_tick+0xc8/0xdd
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8024e0f8>] run_local_timers+0x13/0x15
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8028ea87>] update_process_times+0x4c/0x78
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff802750c7>] smp_local_timer_interrupt+0x34/0x54
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff80275660>] smp_apic_timer_interrupt+0x60/0x77
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8025fd6b>] apic_timer_interrupt+0x6b/0x70
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff802077ac>] _raw_spin_lock+0x7c/0xf1
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff802656f4>] _spin_lock+0x2d/0x31
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff883098c6>] :click:_ZN12RouterThread6driverEv+0x73e/0x776
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8838450f>] :click:_Z11click_schedPv+0x145/0x1fa
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8025ff48>] child_rip+0xa/0x12
Sep  3 14:45:21 stock-inspector kernel:
Sep  3 14:45:21 stock-inspector kernel: BUG: soft lockup detected on CPU#2!
Sep  3 14:45:21 stock-inspector kernel:
Sep  3 14:45:21 stock-inspector kernel: Call Trace:
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8026c697>] dump_trace+0xaa/0x403
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8026ca2c>] show_trace+0x3c/0x52
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8026ca57>] dump_stack+0x15/0x17
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff802b29ac>] softlockup_tick+0xc8/0xdd
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8024e0f8>] run_local_timers+0x13/0x15
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8028ea87>] update_process_times+0x4c/0x78
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff802750c7>] smp_local_timer_interrupt+0x34/0x54
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff80275660>] smp_apic_timer_interrupt+0x60/0x77
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8025fd6b>] apic_timer_interrupt+0x6b/0x70
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8020c608>] __delay+0x6/0x15
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff802077ba>] _raw_spin_lock+0x8a/0xf1
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff802656f4>] _spin_lock+0x2d/0x31
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff88308f1b>] :click:_ZN12RouterThread23unschedule_router_tasksEP6Router+0x2f/0x88
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff883119d3>] :click:_ZN6Master11kill_routerEP6Router+0xcb/0x1de
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8830d513>] :click:_ZN6RouterD1Ev+0x6f/0x3da
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8830d89f>] :click:_ZN6Router5unuseEv+0x21/0x2e
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff883833c7>] :click:_Z11kill_routerv+0x1b/0x28
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff88383658>] :click:_Z12write_configRK6StringP7ElementPvP12ErrorHandler+0xfe/0x1ba
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8830b5d1>] :click:_ZNK7Handler10call_writeERK6StringP7ElementP12ErrorHandler+0x63/0xf6
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff88385a80>] :click:handler_do_write+0x240/0x546
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff88386799>] :click:handler_flush+0xe1/0x10e
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff80224ae6>] filp_close+0x3f/0x70
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8021e665>] sys_close+0x98/0xd7
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8025f11e>] system_call+0x7e/0x83
Sep  3 14:45:21 stock-inspector kernel:  [<00000033466c42f0>]
Sep  3 14:45:21 stock-inspector kernel:
Sep  3 14:45:21 stock-inspector kernel: BUG: soft lockup detected on CPU#3!
Sep  3 14:45:21 stock-inspector kernel:
Sep  3 14:45:21 stock-inspector kernel: Call Trace:
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8026c697>] dump_trace+0xaa/0x403
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8026ca2c>] show_trace+0x3c/0x52
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8026ca57>] dump_stack+0x15/0x17
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff802b29ac>] softlockup_tick+0xc8/0xdd
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8024e0f8>] run_local_timers+0x13/0x15
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8028ea87>] update_process_times+0x4c/0x78
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff802750c7>] smp_local_timer_interrupt+0x34/0x54
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff80275660>] smp_apic_timer_interrupt+0x60/0x77
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8025fd6b>] apic_timer_interrupt+0x6b/0x70
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff802077ac>] _raw_spin_lock+0x7c/0xf1
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff802656f4>] _spin_lock+0x2d/0x31
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff883098c6>] :click:_ZN12RouterThread6driverEv+0x73e/0x776
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8838450f>] :click:_Z11click_schedPv+0x145/0x1fa
Sep  3 14:45:21 stock-inspector kernel:  [<ffffffff8025ff48>] child_rip+0xa/0x12
Sep  3 14:45:21 stock-inspector kernel: