Re: RESEND, HTB(?) softlockup, vanilla 2.6.24

2008-02-17 Thread Jarek Poplawski
On Sun, Feb 17, 2008 at 02:03:33AM +0200, Denys Fedoryshchenko wrote:
 The server is fully redundant now, so I applied the patches (I applied both;
 probably it will make the system more reliable somehow) and enabled the
 required debug options in the kernel. So I will try to catch this bug a few
 more times; if it generates more detailed info over netconsole, it will
 probably be useful.

I guess you mean the patches mentioned in the "BUG/ spinlock lockup"
thread; they could be useful, but we are not sure this is the same
problem. Anyway, if these really are stack overflows, then any bug
report produced after the overflow is of little use: with the stack
data corrupted, it would show false problems. We need to find which
code overflows the stack and why. If you want to debug this, try to
make it more reproducible, e.g. with CONFIG_4KSTACKS; in any case, you
should always turn on these options for problems like this:
CONFIG_DEBUG_STACKOVERFLOW and CONFIG_DEBUG_STACK_USAGE.
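
A minimal sketch of how the two debug options named above could be checked
against a kernel .config before building. The helper names and the sample
config text are made up for illustration; they are not from this thread
and not Denys's real configuration.

```python
# Hypothetical helper: report which of the stack-debugging options
# named above are not set to "y" in the text of a kernel .config file.
REQUIRED = ("CONFIG_DEBUG_STACKOVERFLOW", "CONFIG_DEBUG_STACK_USAGE")


def enabled_options(config_text):
    """Return the set of CONFIG_* symbols set to 'y' in config_text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Lines such as "# CONFIG_FOO is not set" are comments; skip them.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and value == "y":
            enabled.add(name)
    return enabled


def missing_debug_options(config_text, required=REQUIRED):
    """Return the required options that are not enabled, in order."""
    on = enabled_options(config_text)
    return [opt for opt in required if opt not in on]


# Made-up sample for illustration only.
SAMPLE = """\
CONFIG_DEBUG_KERNEL=y
CONFIG_DEBUG_STACKOVERFLOW=y
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_4KSTACKS is not set
"""

if __name__ == "__main__":
    print(missing_debug_options(SAMPLE))
```

With the sample text, only CONFIG_DEBUG_STACK_USAGE is reported as missing.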

 Is there any project to dump console messages/kernel dump to disk? For 
...
I don't know, but there is probably something better: a project by
Intel to save this in some CPU memory (or something...). But again:
we don't need corrupted messages after a stack overflow, and, if we
prevent the overflow, maybe these netconsole messages would be printed
properly and would be quite enough...

 I noticed some code in MTD (CONFIG_MTD_OOPS), but I am not sure it is
 correct or that it will work if I set up MTD emulation for a block device.

I'm not sure what you mean by MTD emulation: it should be used with
MTD devices only, I presume?

Regards,
Jarek P.

PS: BTW, for HTB with actions I recommend my sch_htb: htb_requeue fix,
available in 2.6.25-rc.
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RESEND, HTB(?) softlockup, vanilla 2.6.24

2008-02-16 Thread Denys Fedoryshchenko
Thanks, I will try it.
Do you think lockdep can be buggy?

On Sat, 16 Feb 2008 09:00:36 +0100, Jarek Poplawski wrote
 Denys Fedoryshchenko wrote, On 02/13/2008 09:13 AM:
 
  It is very difficult to reproduce; it happened after running for about
  1 month. No changes were made to the classes at the time of the crash.
  
  Kernel 2.6.24 vanilla
 
 Hi,
 
 I could be wrong, but IMHO this looks like the stack was overwritten here,
 so my proposal is to try this:
 
 CONFIG_DEBUG_STACKOVERFLOW=y
 
 But, if you're not very interested in reproducing this, you could 
 also try to turn off some other debugging, especially lockdep.
 
 Regards,
 Jarek P.
 
 
 
  Feb 10 15:53:22 SHAPER [ 8271.778915] BUG: NMI Watchdog detected LOCKUP
  Feb 10 15:53:22 SHAPER on CPU1, eip c01f0e5d, registers:
 
 
 
  Feb 10 15:53:22 SHAPER [ 8271.779307] Pid: 0, comm: swapper Not tainted
  (2.6.24-build-0021 #26)
  Feb 10 15:53:22 SHAPER [ 8271.779327] EIP: 0060:[c01f0e5d] EFLAGS: 0082
  CPU: 1
  Feb 10 15:53:22 SHAPER [ 8271.779349] EIP is at __rb_rotate_right+0x5/0x50
  Feb 10 15:53:22 SHAPER [ 8271.779366] EAX: f76494a4 EBX: f76494a4 ECX:
  f76494a4 EDX: c1ff5f80
  Feb 10 15:53:22 SHAPER [ 8271.779386] ESI: f76494a4 EDI: c1ff5f80 EBP:
   ESP: f7c29c70
  Feb 10 15:53:22 SHAPER [ 8271.779406]  DS: 007b ES: 007b FS: 00d8 GS:
  SS: 0068
  Feb 10 15:53:22 SHAPER [ 8271.779425] Process swapper (pid: 0, ti=f7c28000
  task=f7c20a60 task.ti=f7c28000)
  Feb 10 15:53:22 SHAPER
  Feb 10 15:53:22 SHAPER [ 8271.779446] Stack:
  Feb 10 15:53:22 SHAPER f76494a4
  Feb 10 15:53:22 SHAPER f76494a4
  Feb 10 15:53:22 SHAPER f76494a4
  Feb 10 15:53:22 SHAPER c01f0ef4
  Feb 10 15:53:22 SHAPER c1ff5f80
  Feb 10 15:53:22 SHAPER f76494a4
  Feb 10 15:53:22 SHAPER f76494a8
  Feb 10 15:53:22 SHAPER c1ff5f78
  Feb 10 15:53:22 SHAPER
  Feb 10 15:53:22 SHAPER [ 8271.779493]
  Feb 10 15:53:22 SHAPER [ 8271.779307] Pid: 0, comm: swapper Not tainted
  (2.6.24-build-0021 #26)
  Feb 10 15:53:22 SHAPER [ 8271.779327] EIP: 0060:[c01f0e5d] EFLAGS: 0082
  CPU: 1
  Feb 10 15:53:22 SHAPER [ 8271.779349] EIP is at __rb_rotate_right+0x5/0x50
  Feb 10 15:53:22 SHAPER [ 8271.779366] EAX: f76494a4 EBX: f76494a4 ECX:
  f76494a4 EDX: c1ff5f80
  Feb 10 15:53:22 SHAPER [ 8271.779386] ESI: f76494a4 EDI: c1ff5f80 EBP:
   ESP: f7c29c70
  Feb 10 15:53:22 SHAPER [ 8271.779406]  DS: 007b ES: 007b FS: 00d8 GS:
  SS: 0068
  Feb 10 15:53:22 SHAPER [ 8271.779425] Process swapper (pid: 0, ti=f7c28000
  task=f7c20a60 task.ti=f7c28000)
  Feb 10 15:53:22 SHAPER
  Feb 10 15:53:22 SHAPER [ 8271.779446] Stack:
  Feb 10 15:53:22 SHAPER f76494a4
  Feb 10 15:53:22 SHAPER f76494a4
  Feb 10 15:53:22 SHAPER f76494a4
 
 


--
Denys Fedoryshchenko
Technical Manager
Virtual ISP S.A.L.



Re: RESEND, HTB(?) softlockup, vanilla 2.6.24

2008-02-16 Thread Jarek Poplawski
On Sat, Feb 16, 2008 at 12:25:31PM +0200, Denys Fedoryshchenko wrote:
 Thanks, I will try it.
 Do you think lockdep can be buggy?

Just like any code... But the main reason is that it has quite
significant overhead, so it could be justified in production only after
lockups happen. But if it doesn't report anything anyway...

Your report shows quite long call paths during softirqs, with some
actions (ipt + mirred here?) and qdiscs, so if I'm right about this
stack problem, this would need some optimization. And, of course,
there could be some additional bugs involved too: otherwise it seems
this should happen more often. But I don't expect you to debug this on
your servers, so I hope it will simply be found by the way some day...
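
A minimal sketch for putting numbers on the stack question, using the ESP
and ti values from the report. It assumes an i386 kernel with 8 KB thread
stacks (CONFIG_4KSTACKS not set) and thread_info located at the lowest
address of the stack; both are assumptions, not something stated in this
thread, and the function name is made up. The result describes only the
moment the NMI watchdog fired, not how deep the stack went earlier.

```python
THREAD_SIZE = 8192  # assumption: 8 KB stacks (CONFIG_4KSTACKS not set)


def stack_usage(esp, ti_base, thread_size=THREAD_SIZE):
    """Return (bytes_used, bytes_free) for a stack that grows downward.

    The stack is assumed to occupy [ti_base, ti_base + thread_size),
    with thread_info stored at ti_base, so bytes_free also includes
    the space taken by thread_info itself.
    """
    top = ti_base + thread_size
    if not ti_base <= esp <= top:
        raise ValueError("ESP is outside this thread's stack")
    return top - esp, esp - ti_base


if __name__ == "__main__":
    # Values copied from the report: "ESP: f7c29c70" and "ti=f7c28000".
    used, free = stack_usage(0xF7C29C70, 0xF7C28000)
    print("used: %d bytes, free: %d bytes" % (used, free))
```

Under these assumptions the dump shows 912 bytes in use and 7280 bytes
free at the time of the lockup report, so the dump alone neither confirms
nor rules out an earlier overflow; CONFIG_DEBUG_STACK_USAGE would be
needed to see the maximum depth reached.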

Regards,
Jarek P.


Re: RESEND, HTB(?) softlockup, vanilla 2.6.24

2008-02-15 Thread Jarek Poplawski
Denys Fedoryshchenko wrote, On 02/13/2008 09:13 AM:

 It is very difficult to reproduce; it happened after running for about
 1 month. No changes were made to the classes at the time of the crash.
 
 Kernel 2.6.24 vanilla


Hi,

I could be wrong, but IMHO this looks like the stack was overwritten here,
so my proposal is to try this:

CONFIG_DEBUG_STACKOVERFLOW=y

But, if you're not very interested in reproducing this, you could also
try to turn off some other debugging, especially lockdep.

Regards,
Jarek P.

...

 Feb 10 15:53:22 SHAPER [ 8271.778915] BUG: NMI Watchdog detected LOCKUP
 Feb 10 15:53:22 SHAPER on CPU1, eip c01f0e5d, registers:

...

 Feb 10 15:53:22 SHAPER [ 8271.779307] Pid: 0, comm: swapper Not tainted 
 (2.6.24-build-0021 #26)
 Feb 10 15:53:22 SHAPER [ 8271.779327] EIP: 0060:[c01f0e5d] EFLAGS: 0082 
 CPU: 1
 Feb 10 15:53:22 SHAPER [ 8271.779349] EIP is at __rb_rotate_right+0x5/0x50
 Feb 10 15:53:22 SHAPER [ 8271.779366] EAX: f76494a4 EBX: f76494a4 ECX: 
 f76494a4 EDX: c1ff5f80
 Feb 10 15:53:22 SHAPER [ 8271.779386] ESI: f76494a4 EDI: c1ff5f80 EBP: 
  ESP: f7c29c70
 Feb 10 15:53:22 SHAPER [ 8271.779406]  DS: 007b ES: 007b FS: 00d8 GS:  
 SS: 0068
 Feb 10 15:53:22 SHAPER [ 8271.779425] Process swapper (pid: 0, ti=f7c28000 
 task=f7c20a60 task.ti=f7c28000)
 Feb 10 15:53:22 SHAPER
 Feb 10 15:53:22 SHAPER [ 8271.779446] Stack:
 Feb 10 15:53:22 SHAPER f76494a4
 Feb 10 15:53:22 SHAPER f76494a4
 Feb 10 15:53:22 SHAPER f76494a4
 Feb 10 15:53:22 SHAPER c01f0ef4
 Feb 10 15:53:22 SHAPER c1ff5f80
 Feb 10 15:53:22 SHAPER f76494a4
 Feb 10 15:53:22 SHAPER f76494a8
 Feb 10 15:53:22 SHAPER c1ff5f78
 Feb 10 15:53:22 SHAPER
 Feb 10 15:53:22 SHAPER [ 8271.779493]
 Feb 10 15:53:22 SHAPER [ 8271.779307] Pid: 0, comm: swapper Not tainted 
 (2.6.24-build-0021 #26)
 Feb 10 15:53:22 SHAPER [ 8271.779327] EIP: 0060:[c01f0e5d] EFLAGS: 0082 
 CPU: 1
 Feb 10 15:53:22 SHAPER [ 8271.779349] EIP is at __rb_rotate_right+0x5/0x50
 Feb 10 15:53:22 SHAPER [ 8271.779366] EAX: f76494a4 EBX: f76494a4 ECX: 
 f76494a4 EDX: c1ff5f80
 Feb 10 15:53:22 SHAPER [ 8271.779386] ESI: f76494a4 EDI: c1ff5f80 EBP: 
  ESP: f7c29c70
 Feb 10 15:53:22 SHAPER [ 8271.779406]  DS: 007b ES: 007b FS: 00d8 GS:  
 SS: 0068
 Feb 10 15:53:22 SHAPER [ 8271.779425] Process swapper (pid: 0, ti=f7c28000 
 task=f7c20a60 task.ti=f7c28000)
 Feb 10 15:53:22 SHAPER
 Feb 10 15:53:22 SHAPER [ 8271.779446] Stack:
 Feb 10 15:53:22 SHAPER f76494a4
 Feb 10 15:53:22 SHAPER f76494a4
 Feb 10 15:53:22 SHAPER f76494a4

...


Re: RESEND, HTB(?) softlockup, vanilla 2.6.24

2008-02-13 Thread Jarek Poplawski
On 13-02-2008 09:13, Denys Fedoryshchenko wrote:
 It is very difficult to reproduce; it happened after running for about
 1 month. No changes were made to the classes at the time of the crash.
 
 Kernel 2.6.24 vanilla
 
 I will try to attach also .config
 

Hi Denys,

This report looks very interesting. I don't know about others, but I
plan to study it more soon (on the weekend?), and then maybe ask more
questions. Of course, some example tc rules would be helpful.

Thanks,
Jarek P.


HTB(?) softlockup, vanilla 2.6.24

2008-02-10 Thread Denys Fedoryshchenko
It is very difficult to reproduce; it happened after running for about
1 month. No changes were made to the classes at that time.

Feb 10 15:53:22 SHAPER [ 8271.778915] BUG: NMI Watchdog detected LOCKUP
Feb 10 15:53:22 SHAPER on CPU1, eip c01f0e5d, registers:
Feb 10 15:53:22 SHAPER [ 8271.778952] Modules linked in:
Feb 10 15:53:22 SHAPER netconsole
Feb 10 15:53:22 SHAPER configfs
Feb 10 15:53:22 SHAPER softdog
Feb 10 15:53:22 SHAPER nf_nat_pptp
Feb 10 15:53:22 SHAPER nf_conntrack_pptp
Feb 10 15:53:22 SHAPER nf_conntrack_proto_gre
Feb 10 15:53:22 SHAPER nf_nat_proto_gre
Feb 10 15:53:22 SHAPER xt_tcpudp
Feb 10 15:53:22 SHAPER ipt_TTL
Feb 10 15:53:22 SHAPER ipt_ttl
Feb 10 15:53:22 SHAPER xt_NOTRACK
Feb 10 15:53:22 SHAPER iptable_raw
Feb 10 15:53:22 SHAPER iptable_mangle
Feb 10 15:53:22 SHAPER ifb
Feb 10 15:53:22 SHAPER e1000e
Feb 10 15:53:22 SHAPER em_nbyte
Feb 10 15:53:22 SHAPER cls_tcindex
Feb 10 15:53:22 SHAPER act_gact
Feb 10 15:53:22 SHAPER cls_rsvp
Feb 10 15:53:22 SHAPER sch_htb
Feb 10 15:53:22 SHAPER cls_fw
Feb 10 15:53:22 SHAPER act_mirred
Feb 10 15:53:22 SHAPER em_u32
Feb 10 15:53:22 SHAPER sch_red
Feb 10 15:53:22 SHAPER sch_sfq
Feb 10 15:53:22 SHAPER sch_tbf
Feb 10 15:53:22 SHAPER sch_teql
Feb 10 15:53:22 SHAPER cls_basic
Feb 10 15:53:22 SHAPER act_police
Feb 10 15:53:22 SHAPER sch_gred
Feb 10 15:53:22 SHAPER act_pedit
Feb 10 15:53:22 SHAPER sch_hfsc
Feb 10 15:53:22 SHAPER cls_rsvp6
Feb 10 15:53:22 SHAPER sch_ingress
Feb 10 15:53:22 SHAPER em_meta
Feb 10 15:53:22 SHAPER em_text
Feb 10 15:53:22 SHAPER act_ipt
Feb 10 15:53:22 SHAPER sch_dsmark
Feb 10 15:53:22 SHAPER sch_prio
Feb 10 15:53:22 SHAPER sch_netem
Feb 10 15:53:22 SHAPER act_simple
Feb 10 15:53:22 SHAPER cls_u32
Feb 10 15:53:22 SHAPER em_cmp
Feb 10 15:53:22 SHAPER sch_cbq
Feb 10 15:53:22 SHAPER cls_route
Feb 10 15:53:22 SHAPER xt_TCPMSS
Feb 10 15:53:22 SHAPER iptable_nat
Feb 10 15:53:22 SHAPER nf_conntrack_ipv4
Feb 10 15:53:22 SHAPER ipt_LOG
Feb 10 15:53:22 SHAPER ipt_MASQUERADE
Feb 10 15:53:22 SHAPER ipt_REDIRECT
Feb 10 15:53:22 SHAPER nf_nat
Feb 10 15:53:22 SHAPER nf_conntrack
Feb 10 15:53:22 SHAPER nfnetlink
Feb 10 15:53:22 SHAPER iptable_filter
Feb 10 15:53:22 SHAPER ip_tables
Feb 10 15:53:22 SHAPER x_tables
Feb 10 15:53:22 SHAPER 8021q
Feb 10 15:53:22 SHAPER tun
Feb 10 15:53:22 SHAPER tulip
Feb 10 15:53:22 SHAPER r8169
Feb 10 15:53:22 SHAPER sky2
Feb 10 15:53:22 SHAPER via_velocity
Feb 10 15:53:22 SHAPER via_rhine
Feb 10 15:53:22 SHAPER sis900
Feb 10 15:53:22 SHAPER ne2k_pci
Feb 10 15:53:22 SHAPER 8390
Feb 10 15:53:22 SHAPER skge
Feb 10 15:53:22 SHAPER tg3
Feb 10 15:53:22 SHAPER 8139too
Feb 10 15:53:22 SHAPER e1000
Feb 10 15:53:22 SHAPER e100
Feb 10 15:53:22 SHAPER usb_storage
Feb 10 15:53:22 SHAPER mtdblock
Feb 10 15:53:22 SHAPER mtd_blkdevs
Feb 10 15:53:22 SHAPER usbhid
Feb 10 15:53:22 SHAPER uhci_hcd
Feb 10 15:53:22 SHAPER ehci_hcd
Feb 10 15:53:22 SHAPER ohci_hcd
Feb 10 15:53:22 SHAPER usbcore
Feb 10 15:53:22 SHAPER
Feb 10 15:53:22 SHAPER [ 8271.779291]
Feb 10 15:53:22 SHAPER [ 8271.779307] Pid: 0, comm: swapper Not tainted 
(2.6.24-build-0021 #26)
Feb 10 15:53:22 SHAPER [ 8271.779327] EIP: 0060:[c01f0e5d] EFLAGS: 0082 
CPU: 1
Feb 10 15:53:22 SHAPER [ 8271.779349] EIP is at __rb_rotate_right+0x5/0x50
Feb 10 15:53:22 SHAPER [ 8271.779366] EAX: f76494a4 EBX: f76494a4 ECX: 
f76494a4 EDX: c1ff5f80
Feb 10 15:53:22 SHAPER [ 8271.779386] ESI: f76494a4 EDI: c1ff5f80 EBP: 
 ESP: f7c29c70
Feb 10 15:53:22 SHAPER [ 8271.779406]  DS: 007b ES: 007b FS: 00d8 GS:  
SS: 0068
Feb 10 15:53:22 SHAPER [ 8271.779425] Process swapper (pid: 0, ti=f7c28000 
task=f7c20a60 task.ti=f7c28000)
Feb 10 15:53:22 SHAPER
Feb 10 15:53:22 SHAPER [ 8271.779446] Stack:
Feb 10 15:53:22 SHAPER f76494a4
Feb 10 15:53:22 SHAPER f76494a4
Feb 10 15:53:22 SHAPER f76494a4
Feb 10 15:53:22 SHAPER c01f0ef4
Feb 10 15:53:22 SHAPER c1ff5f80
Feb 10 15:53:22 SHAPER f76494a4
Feb 10 15:53:22 SHAPER f76494a8
Feb 10 15:53:22 SHAPER c1ff5f78
Feb 10 15:53:22 SHAPER
Feb 10 15:53:22 SHAPER [ 8271.779493]
Feb 10 15:53:22 SHAPER [ 8271.779307] Pid: 0, comm: swapper Not tainted 
(2.6.24-build-0021 #26)
Feb 10 15:53:22 SHAPER [ 8271.779327] EIP: 0060:[c01f0e5d] EFLAGS: 0082 
CPU: 1
Feb 10 15:53:22 SHAPER [ 8271.779349] EIP is at __rb_rotate_right+0x5/0x50
Feb 10 15:53:22 SHAPER [ 8271.779366] EAX: f76494a4 EBX: f76494a4 ECX: 
f76494a4 EDX: c1ff5f80
Feb 10 15:53:22 SHAPER [ 8271.779386] ESI: f76494a4 EDI: c1ff5f80 EBP: 
 ESP: f7c29c70
Feb 10 15:53:22 SHAPER [ 8271.779406]  DS: 007b ES: 007b FS: 00d8 GS:  
SS: 0068
Feb 10 15:53:22 SHAPER [ 8271.779425] Process swapper (pid: 0, ti=f7c28000 
task=f7c20a60 task.ti=f7c28000)
Feb 10 15:53:22 SHAPER
Feb 10 15:53:22 SHAPER [ 8271.779446] Stack:
Feb 10 15:53:22 SHAPER f76494a4
Feb 10 15:53:22 SHAPER f76494a4
Feb 10 15:53:22 SHAPER f76494a4
Feb 10 15:53:22 SHAPER c01f0ef4
Feb 10 15:53:22 SHAPER c1ff5f80
Feb 10 15:53:22 SHAPER f76494a4
Feb 10 15:53:22 SHAPER f76494a8
Feb 10 15:53:22 SHAPER c1ff5f78
Feb 10 15:53:22 SHAPER
Feb 10 15:53:22 SHAPER [