On 03/01/2012 03:04 PM, Andrei Warkentin wrote:
> ----- Original Message -----
>> From: "Andrei Warkentin" <[email protected]>
>> To: "Jason Wessel" <[email protected]>
>> Cc: [email protected], [email protected],
>> "Andrei Warkentin" <[email protected]>, [email protected],
>> "Matt Mackall" <[email protected]>, "Andrei Warkentin" <[email protected]>
>> Sent: Tuesday, February 28, 2012 12:43:52 PM
>> Subject: Re: [PATCHv3 1/3] NETPOLL: Extend rx_hook support.
>>
>>> All that netpoll_poll() did was to call netpoll_poll_dev().  I have
>>> not yet looked at the differences between kgdboe and the netkdb code
>>> you proposed but I would have suspected it also falls victim to the
>>> ethernet preemption problem which prevented kgdboe from ever being
>>> considered for a mainline merge.  Certainly there are ways to fix this
>>> problem but most involved changes to scheduling, core net code, or
>>> substantial driver specific changes.
>>>
>> I see, I read up on the issues w.r.t. preemption. Could this be worked
>> around by modifying affected drivers to bypass locking if they are
>> used in KDB context? Make some accessor netdev-specific lock/unlocks
>> that won't do anything if running in KDB context.
>>
>>
> By the way, is there a good way to repro the preemption case? Hopefully this
> doesn't involve some crazy hardware...


I have several cases which will usually hang the machine fairly quickly, but 
they all involve using gdb and a target using SMP.  Most often it is as simple 
as this:

* Use an SMP system with at least 2 cores
* Start two shell loops that rapidly spawn processes:
     while [ 1 ] ; do date > /dev/null ; done &
     while [ 1 ] ; do date > /dev/null ; done &
* Connect with gdb to kgdb and set a breakpoint at do_fork
   Now do "c"
   Now do "c 1000"

Generally the system will hang long before you get 1000 breakpoint hits. The
hang will be a condition where a lock is needed to create an skb, or where the
ethernet driver or some part of the network stack is preempted (or holding a
lock) on the non-master cpu.

There is another condition that is hard to catch that involves a task migrating 
from one cpu to the next, but we'll stick to the simple test case I described 
above for now.
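
For convenience, the load half of that test case can be wrapped up roughly as
follows. This is only a sketch: the function names start_fork_load and
stop_fork_load and the FORK_LOAD_PIDS variable are made up for illustration
and are not part of kgdb or netkgdb. It just starts the same "date" loops in
the background and remembers their PIDs so they can be killed afterwards.

```shell
#!/bin/sh
# Sketch of the fork load from the test case above.
# All names here are hypothetical helpers, not part of kgdb or netkgdb.

FORK_LOAD_PIDS=""

# Start N background loops, each spawning `date` as fast as it can.
start_fork_load() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        ( while :; do date > /dev/null; done ) &
        FORK_LOAD_PIDS="$FORK_LOAD_PIDS $!"
        i=$((i + 1))
    done
}

# Kill every loop started by start_fork_load and reap it.
stop_fork_load() {
    for pid in $FORK_LOAD_PIDS; do
        kill "$pid" 2>/dev/null || true
    done
    for pid in $FORK_LOAD_PIDS; do
        wait "$pid" 2>/dev/null || true
    done
    FORK_LOAD_PIDS=""
}
```

On the target you would run "start_fork_load 2", then from gdb set the
breakpoint at do_fork and do "c" followed by "c 1000" as described above, and
run "stop_fork_load" afterwards if the machine survived.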

I did have a question, because it seems you were using qemu / kvm.  I have a
number of test cases that use kvm, but netkgdb does not seem to work with nc.
My question is: how am I supposed to actually use netkgdb?

Here is what I observe on the target system:

insmod netkgdb.ko netkgdb=@/,@10.0.2.2/
echo g > /proc/sysrq-trigger

On my host system:
nc.traditional -l -u -p 7777

I type help, and then netkgdb is toast.  It doesn't seem to respond
anymore.

Jason.

_______________________________________________
Kgdb-bugreport mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/kgdb-bugreport
