Hi all, I bisected tree for this issue, and found following commit is the first bad commit.
commit a7dd5c93a51b3b5b8cd4a177ef379492c27859bf click-combine documentation: Mention that RouterLink is fake. Any idea? Thanks, Joonwoo 2008/9/3 <[EMAIL PROTECTED]>: > Hi Eddie, > > Sorry for the wait. Today I pulled the latest sources and rebuilt a > vanilla click kernel on a 64-bit 4 core Xeon machine (with the click e1000 > driver installed). The locking seems to be working fine, but > 'click-uninstall' is resulting in a lockup. See the attached dumps (click > was started with click-install -t 3 in both cases). I still see the lockup > with -t 1. > > Click uninstalls without problem with our version from a couple of weeks > back. > > Unrelated, but I also saw: > ../include/click/timestamp.hh: In constructor 'Timestamp::Timestamp(double)': > ../include/click/timestamp.hh:938: warning: converting to 'int64_t' from > 'double' > all over when building userlevel click. > > Sorry I don't have the time to dive into this further right now. > > Thanks! > Kevin > > > > >> Kevin, >> >> Is there any chance you'd be able to try the newer Spinlock code any time >> soon? >> >> Eddie >> >> >> [EMAIL PROTECTED] wrote: >>> Hi, >>> >>> I think there is a concurrency issue with Spinlocks in linuxmodule >>> multi-threaded click (running a 2.6.19.2 patched kernel, the e1000-NAPI >>> driver and today's trunk). I've put together a temporary patch, but >>> there >>> might be further issues. Sorry, I don't have the time to investigate >>> more. >>> >>> >>> Problems: >>> The series of events are: >>> -Thread A is running on CPU 0 and thread B is running on CPU 1. >>> -A acquires the Spinlock and executes code in the critical section. >>> -Linux schedules Thread A to run on CPU 1 and thread B to run on CPU 0. >>> -B acquires the Spinlock (works because CPU 0 is the owner of the lock, >>> not the thread) >>> -A and B are both operating in the critical section >>> -(if you're lucky) A releases the Spinlock and generates a "releasing >>> someone else's lock" message >>> -(if you're unlucky) Oops >>> >>> Attached is a test element and a configuration to elicit the behavior. I >>> could only replicate the problem when using a polling input source under >>> heavy load. >>> >>> After fixing the above problem I was still seeing two threads inside the >>> CS. For some reason the atomic_uint32_t::swap was not working atomically >>> as expected. Changing this to atomic_uint32::compare_and_swap solved the >>> problem. We're running x86_64 quad core Xeon CPUs. >>> >>> >>> Solution: >>> In the patch attached I added a new function click_current_thread() >>> which >>> returns a thread_info pointer. Inside of Spinlock I replace >>> click_current_processor() uses with click_current_thread(). This way >>> even >>> if the thread is assigned to another core it will still have the same >>> thread_info pointer. >>> >>> The problem with this solution is that it does not work as an index into >>> an array nicely and cannot simply replace all uses of >>> click_current_processor(). There may be other places in the code which >>> make the same incorrect use of click_current_processor(), but a better >>> approach will be needed if the value is used to index into an array >>> (such >>> as in ReadWriteLock). >>> >>> >>> Thanks! >>> Kevin Springborn >>> >>> >>> ------------------------------------------------------------------------ >>> >>> _______________________________________________ >>> click mailing list >>> [email protected] >>> https://amsterdam.lcs.mit.edu/mailman/listinfo/click >> > > _______________________________________________ > click mailing list > [email protected] > https://amsterdam.lcs.mit.edu/mailman/listinfo/click > > _______________________________________________ click mailing list [email protected] https://amsterdam.lcs.mit.edu/mailman/listinfo/click
