Re: [patch] genirq: fix simple and fasteoi irq handlers

2007-08-03 Thread Jarek Poplawski
On Fri, Aug 03, 2007 at 10:04:08AM +0200, Ingo Molnar wrote: * Jarek Poplawski [EMAIL PROTECTED] wrote: I can't guarantee this is all needed to fix this bug, but I think this patch is necessary here. hmmm ... very interesting! Now _this_ is something we'd like to see tested. Could

[patch (take 2)] genirq: fix simple and fasteoi irq handlers

2007-08-06 Thread Jarek Poplawski
, there should be at least possibility to turn this off for level types in config (it should be a visible option, so people could find try this before writing for help or changing a network card). Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp 2.6.23-rc1-/kernel/irq/chip.c 2.6.23

Re: [patch (take 2)] genirq: fix simple and fasteoi irq handlers

2007-08-06 Thread Jarek Poplawski
On Mon, Aug 06, 2007 at 08:14:59AM +0200, Ingo Molnar wrote: * Jarek Poplawski [EMAIL PROTECTED] wrote: Subject: genirq: fix simple and fasteoi irq handlers After the genirq: do not mask interrupts by default patch interrupts should be disabled not immediately upon request

[PATCH] docs: note about select in kconfig-language.txt

2007-08-06 Thread Jarek Poplawski
Ravnborg about kconfig's select evilness, dependencies and the future (slightly corrected). Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] Cc: Sam Ravnborg [EMAIL PROTECTED] --- diff -Nu9r 2.6.23-rc1-/Documentation/kbuild/kconfig-language.txt 2.6.23-rc1/Documentation/kbuild/kconfig

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-07 Thread Jarek Poplawski
On Mon, Aug 06, 2007 at 05:19:03PM -0400, Chuck Ebbert wrote: On 08/06/2007 04:42 PM, Jean-Baptiste Vignaud wrote: Mmm, bad news, after 4 hours of intensive network stressing, one of the 2 3com card failed with the latest fedora kernel. Aug 6 22:31:09 loki kernel: NETDEV WATCHDOG:

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-07 Thread Jarek Poplawski
On Tue, Aug 07, 2007 at 09:46:36AM +0200, Marcin Ślusarz wrote: 2007/8/6, Ingo Molnar [EMAIL PROTECTED]: (..) please try Jarek's second patch too - there was a missing unmask. Ingo -- Subject: genirq: fix simple and fasteoi irq handlers From: Jarek Poplawski

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-07 Thread Jarek Poplawski
On Tue, Aug 07, 2007 at 10:10:34AM +0200, Jean-Baptiste Vignaud wrote: BTW: Jean-Babtiste, could you send or point to you current configs? Oops! I'm very sorry for misspelling! I mean at least proc/interrupts, but with dmesg and .config it would be even better. (I assume this last report

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-07 Thread Jarek Poplawski
On Tue, Aug 07, 2007 at 11:21:07AM +0200, Jean-Baptiste Vignaud wrote: * interrupts (i use irqbalance, but problem was the same without) I wonder if you tried without SMP too? No i did not. Do you think that this can be a problem ? To test with no SMP, do i need to recompile kernel or

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-07 Thread Jarek Poplawski
On Tue, Aug 07, 2007 at 11:37:01AM +0200, Marcin Ślusarz wrote: 2007/8/7, Jarek Poplawski [EMAIL PROTECTED]: On Tue, Aug 07, 2007 at 09:46:36AM +0200, Marcin Ślusarz wrote: Network card still locks up (tested on 2.6.22.1). I had to upload more data than usual (~350 MB vs ~1-100 MB

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-07 Thread Jarek Poplawski
On Mon, Aug 06, 2007 at 01:43:48PM -0400, Chuck Ebbert wrote: On 08/06/2007 03:03 AM, Ingo Molnar wrote: But, since level types don't need this retriggers too much I think this don't mask interrupts by default idea should be rethinked: is there enough gain to risk such hard to diagnose

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-07 Thread Jarek Poplawski
On Tue, Aug 07, 2007 at 11:52:46AM +0200, Jarek Poplawski wrote: On Tue, Aug 07, 2007 at 11:37:01AM +0200, Marcin Ślusarz wrote: 2007/8/7, Jarek Poplawski [EMAIL PROTECTED]: On Tue, Aug 07, 2007 at 09:46:36AM +0200, Marcin Ślusarz wrote: Network card still locks up (tested on 2.6.22.1

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-07 Thread Jarek Poplawski
On Tue, Aug 07, 2007 at 02:13:39PM +0200, Jarek Poplawski wrote: On Tue, Aug 07, 2007 at 11:52:46AM +0200, Jarek Poplawski wrote: On Tue, Aug 07, 2007 at 11:37:01AM +0200, Marcin Ślusarz wrote: ... No, i don't need a break. I'll have more time in next weeks. Great! So, I'll try to send

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-08 Thread Jarek Poplawski
On Tue, Aug 07, 2007 at 07:16:33PM +0200, Jean-Baptiste Vignaud wrote: ... So this afternoon i compiled 2.6.23-rc2 with same options as 2.6.23-rc1 and edited grub.conf to add nosmp but after reboot the box did not responded. Back home, i saw that the kernel failed because it was unable to find

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-08 Thread Jarek Poplawski
On Wed, Aug 08, 2007 at 09:21:14AM +0200, Jarek Poplawski wrote: On Tue, Aug 07, 2007 at 07:16:33PM +0200, Jean-Baptiste Vignaud wrote: ... Marcin has done this with successfully using the most professional way: git bisect (which btw. I did learn yet), but, IMHO, it could be ... Let me say

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-08 Thread Jarek Poplawski
On Wed, Aug 08, 2007 at 10:59:22AM +0200, Jean-Baptiste Vignaud wrote: Jean-Baptiste: I'm not sure how much of this testing you can afford? If you can spare some time for this and your box isn't for 'production' it could be very precious to diagnose such reproducible bug. Well i can

[patch] genirq: temporary fix for level-triggered IRQ resend

2007-08-08 Thread Jarek Poplawski
should limit the range of warnings and changes in interrupt handling to minimum. Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] Cc: Marcin Slusarz [EMAIL PROTECTED] Cc: Jean-Baptiste Vignaud [EMAIL PROTECTED] Cc: Thomas Gleixner [EMAIL PROTECTED] Cc: Ingo Molnar [EMAIL PROTECTED] --- diff -Nurp

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-08 Thread Jarek Poplawski
Read below please: On Wed, Aug 08, 2007 at 01:09:36PM +0200, Marcin Ślusarz wrote: 2007/8/7, Jarek Poplawski [EMAIL PROTECTED]: So, the let's try this idea yet: modified Ingo's x86: activate HARDIRQS_SW_RESEND patch. (Don't forget about make oldconfig before make.) For testing only

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-08 Thread Jarek Poplawski
On Wed, Aug 08, 2007 at 01:42:43PM +0200, Jarek Poplawski wrote: ... So, it looks like x86_64 io_apic's IPI code was unused too long... To be fair it's x86_64 lapic's IPI code. Jarek P. - To unsubscribe from this list: send the line unsubscribe netdev in the body of a message to [EMAIL

Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-08 Thread Jarek Poplawski
On Wed, Aug 08, 2007 at 10:59:22AM +0200, Jean-Baptiste Vignaud wrote: ... If you would like to read something more about testing (then of course my suggestions could occur invalid - I'm a very bad tester myself...) you can try this: http://www.stardust.webpages.pl/files/handbook/ I'll

[patch (testing)] Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-09 Thread Jarek Poplawski
On Wed, Aug 08, 2007 at 01:42:43PM +0200, Jarek Poplawski wrote: Read below please: On Wed, Aug 08, 2007 at 01:09:36PM +0200, Marcin Ślusarz wrote: 2007/8/7, Jarek Poplawski [EMAIL PROTECTED]: So, the let's try this idea yet: modified Ingo's x86: activate HARDIRQS_SW_RESEND patch

[RFC] Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-09 Thread Jarek Poplawski
It seems, we can start to think about some preferred solutions, already. Here are some of my preliminary conclusions and suggestions. The problem of timeouts with some 'older' network cards seems to hit mainly x86_64 arch, and after diagnosing and testing (still being done) it's caused by

Re: [RFC] Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-09 Thread Jarek Poplawski
On Thu, Aug 09, 2007 at 06:04:34PM +0200, Andi Kleen wrote: Jarek Poplawski [EMAIL PROTECTED] writes: It seems, we can start to think about some preferred solutions, already. Here are some of my preliminary conclusions and suggestions. The problem of timeouts with some 'older' network

Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

2007-08-09 Thread Jarek Poplawski
On Thu, Aug 09, 2007 at 11:03:03AM -0400, John Stoffel wrote: Hi, Hi, read below, please... I'm opening this ticket as a new subject, even though it looks like it might be related to the thread Networking dies after random time. Sorry for the wide CC list, but since my network hasn't

Re: [patch (testing)] Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-10 Thread Jarek Poplawski
On Fri, Aug 10, 2007 at 08:33:27AM +0200, Marcin Ślusarz wrote: 2007/8/9, Jarek Poplawski [EMAIL PROTECTED]: ... diff -Nurp 2.6.23-rc1-/kernel/irq/chip.c 2.6.23-rc1/kernel/irq/chip.c --- 2.6.23-rc1-/kernel/irq/chip.c 2007-07-09 01:32:17.0 +0200 +++ 2.6.23-rc1/kernel/irq/chip.c

Re: [patch (testing)] Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-10 Thread Jarek Poplawski
On Fri, Aug 10, 2007 at 12:43:43PM +0200, Marcin Ślusarz wrote: 2007/8/10, Jarek Poplawski [EMAIL PROTECTED]: (..) I think, there is this one possible for your testing yet?: Subject: [patch] genirq: temporary fix for level-triggered IRQ resend Date: Wed, 8 Aug 2007 13:00:37 +0200 I think

Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

2007-08-10 Thread Jarek Poplawski
On Fri, Aug 10, 2007 at 11:33:53AM +0200, Ingo Molnar wrote: * Jarek Poplawski [EMAIL PROTECTED] wrote: + } #ifdef CONFIG_HARDIRQS_SW_RESEND we used the hw-resend method unconditionally, right? Right: unconditionally on a condition they are not edges

Re: [patch (testing)] Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-10 Thread Jarek Poplawski
On Fri, Aug 10, 2007 at 11:08:33AM +0200, Ingo Molnar wrote: * Jarek Poplawski [EMAIL PROTECTED] wrote: On 10-08-2007 10:05, Thomas Gleixner wrote: ... But suppressing the resend is not fixing the driver problem. The problem can show up with spurious interrupts and with interrupts

Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

2007-08-10 Thread Jarek Poplawski
On Fri, Aug 10, 2007 at 10:56:11AM +0200, Ingo Molnar wrote: ... this changes the picture completely and makes the IO-APIC/local-APIC hw retrigger code/logic the main suspect. I think you right that it's quite bogus to hw-retrigger level irqs, and that could be confusing the IO-APIC (or the

Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

2007-08-10 Thread Jarek Poplawski
On Fri, Aug 10, 2007 at 10:30:50AM +0200, Ingo Molnar wrote: * Jarek Poplawski [EMAIL PROTECTED] wrote: Hmm. This solution is still just pampering over the real problem. The delayed disable just re-sends level interrupts unnecessarily. I have a fix (needs some testing

Re: 2.6.23-rc2: WARNING: at kernel/irq/resend.c:70 check_irq_resend()

2007-08-10 Thread Jarek Poplawski
On Fri, Aug 10, 2007 at 10:05:40AM +0200, Thomas Gleixner wrote: On Thu, 2007-08-09 at 17:54 +0200, Jarek Poplawski wrote: I'm not sure I don't miss anything (a little in hurry now), but this warning's aim was purely diagnostical and nothing wrong is meant! Unless there is something wrong

Re: [patch (testing)] Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-10 Thread Jarek Poplawski
On Fri, Aug 10, 2007 at 10:15:53AM +0200, Jean-Baptiste Vignaud wrote: ... I was still testing on -rc2: Subject: [patch] genirq: temporary fix for level-triggered IRQ resend Date: Wed, 8 Aug 2007 13:00:37 +0200 For me after 1day 20hours, the network is still up, with more than 1To of

Re: [patch (testing)] Re: 2.6.20-2.6.21 - networking dies after random time

2007-08-10 Thread Jarek Poplawski
On Fri, Aug 10, 2007 at 10:48:41AM +0200, Ingo Molnar wrote: * Jarek Poplawski [EMAIL PROTECTED] wrote: On Fri, Aug 10, 2007 at 10:15:53AM +0200, Jean-Baptiste Vignaud wrote: ... I was still testing on -rc2: Subject: [patch] genirq: temporary fix for level-triggered IRQ resend

[PATCH 2.6.23-rc3-mm1] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state

2007-08-22 Thread Jarek Poplawski
for them context, and could be vulnerable (especially with softirqs, but probably hardirqs as well). Reported-by: Mariusz Kozlowski [EMAIL PROTECTED] Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp 2.6.23-rc3-mm1-/kernel/irq/manage.c 2.6.23-rc3-mm1/kernel/irq/manage.c --- 2.6.23

[PATCH (take 2)] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state

2007-08-23 Thread Jarek Poplawski
be enough, but it needs more checking on possible races or other special cases). This patch is recommended to all stable versions since 2.6.21, too. Reported-by: Mariusz Kozlowski [EMAIL PROTECTED] Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- diff -Nurp 2.6.23-rc3-git6-/kernel/irq

Re: [PATCH (take 2)] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state

2007-08-23 Thread Jarek Poplawski
On Thu, Aug 23, 2007 at 10:44:30AM +0200, Jarek Poplawski wrote: Andrew Morton pointed out that my changelog was unusable. Sorry! Here is a second try with the changelog and kernel version changed. ... (take 2) Subject: request_irq() - fix DEBUG_SHIRQ handling ... Signed-off

Re: [ANNOUNCE] iproute2-2.6.23-rc3

2007-08-24 Thread Jarek Poplawski
On 22-08-2007 20:08, Stephen Hemminger wrote: There have been a lot of changes for 2.6.23, so here is a test release of iproute2 that should capture all the submitted patches http://developer.osdl.org/shemminger/iproute2/download/iproute2-2.6.23-rc3.tar.gz But... isn't it forged, btw?!

Re: [PATCH 2.6.23-rc3-mm1] request_irq fix DEBUG_SHIRQ handling Re: 2.6.23-rc2-mm1: rtl8139 inconsistent lock state

2007-08-26 Thread Jarek Poplawski
On Sat, Aug 25, 2007 at 11:43:08AM +0200, Mariusz Kozlowski wrote: = [ INFO: inconsistent lock state ] 2.6.23-rc2-mm1 #7 - inconsistent {in-hardirq-W} - {hardirq-on-W} usage. ifconfig/5492 [HC0[0]:SC0[0]:HE1:SE1]

Re: net/ipv4/fib_trie.c - compile error (Re: 2.6.23-rc3-mm1)

2007-08-27 Thread Jarek Poplawski
On 22-08-2007 19:03, Paul E. McKenney wrote: On Wed, Aug 22, 2007 at 05:41:11PM +0200, Adrian Bunk wrote: On Wed, Aug 22, 2007 at 05:30:13PM +0200, Gabriel C wrote: Got it with a randconfig ( http://194.231.229.228/kernel/mm/2.6.23-rc3-mm1/r/randconfig-8 ) ... net/ipv4/fib_trie.c: In

Re: [ANNOUNCE] iproute2-2.6.23-rc3

2007-08-27 Thread Jarek Poplawski
On Fri, Aug 24, 2007 at 12:26:28PM -0700, Stephen Hemminger wrote: On Fri, 24 Aug 2007 12:10:44 +0200 Jarek Poplawski [EMAIL PROTECTED] wrote: On 22-08-2007 20:08, Stephen Hemminger wrote: There have been a lot of changes for 2.6.23, so here is a test release of iproute2 that should

Re: [PATCH] netdevice: kernel docbook addition

2007-08-27 Thread Jarek Poplawski
On 22-08-2007 21:33, Stephen Hemminger wrote: Add more kernel doc's for part of the network device API. This is only a start, and needs more work. Applies against net-2.6.24 ... +/** + * napi_disable - prevent NAPI from scheduling + * @n: napi context + * + * Resume NAPI from being

Re: PROBLEM: 2.6.23-rc NETDEV WATCHDOG: eth0: transmit timed out

2007-08-27 Thread Jarek Poplawski
On 21-08-2007 12:56, Karl Meyer wrote: fyi: I do not know whether it is related to the problem, but since using the version you told me there are these entries is my log: frege Hangcheck: hangcheck value past margin! ... BTW, I don't know whether it's related too, but I think you should try

Re: Tc bug (kernel crash) more info

2007-08-29 Thread Jarek Poplawski
On 29-08-2007 11:34, Badalian Vyacheslav wrote: Again crash. Need more posts of panic or this message have full info that needed to fix bug? Hi, Please, try to not create new threads each time: reply to the previous one if you have something new. And this one doesn't seem to show more. You

Re: Tc bug (kernel crash) more info

2007-08-29 Thread Jarek Poplawski
On Wed, Aug 29, 2007 at 01:34:47PM +0200, Jarek Poplawski wrote: On 29-08-2007 11:34, Badalian Vyacheslav wrote: Again crash. Need more posts of panic or this message have full info that needed to fix bug? ... If it's possible you can try it shortly without e.g. netconsole or even without

Re: Tc bug (kernel crash) more info

2007-08-29 Thread Jarek Poplawski
On Wed, Aug 29, 2007 at 04:53:52PM +0400, Badalian Vyacheslav wrote: ... we have this kernel panic (then delete HTB) at all 2.6.18-x versions. on older kernel (2.6.x) we have another panic (then delete tc filter)... summary we have TC panics 1 year ago ;) Sysctl option reboot on panic I'm

Re: Tc bug (kernel crash) more info

2007-08-30 Thread Jarek Poplawski
On Thu, Aug 30, 2007 at 12:16:32AM +0400, [EMAIL PROTECTED] wrote: Quoting Jarek Poplawski [EMAIL PROTECTED]: On Wed, Aug 29, 2007 at 04:53:52PM +0400, Badalian Vyacheslav wrote: ... we have this kernel panic (then delete HTB) at all 2.6.18-x versions. on older kernel (2.6.x) we have

Re: Tc bug (kernel crash) more info

2007-08-30 Thread Jarek Poplawski
On Thu, Aug 30, 2007 at 08:31:10AM +0200, Jarek Poplawski wrote: On Thu, Aug 30, 2007 at 12:16:32AM +0400, [EMAIL PROTECTED] wrote: ... PS. And also have we have strange bug in another computer (2.6.22-r5). Have computer XEON_CPUx2 (4 CPU) after boot have CPU0 and CPU3 SI = ~50% after

Re: [PATCH 4/5] Net: ath5k, license is GPLv2

2007-08-30 Thread Jarek Poplawski
On 29-08-2007 21:37, Michael Buesch wrote: On Wednesday 29 August 2007 21:33:43 Jon Smirl wrote: What if a patch spans both code that is pure GPL and code imported from BSD, how do you license it? I think it's a valid assumption, if we say that the author of the patch read the license

Re: [PATCH 4/5] Net: ath5k, license is GPLv2

2007-08-30 Thread Jarek Poplawski
On Thu, Aug 30, 2007 at 10:26:52AM +0200, Jarek Poplawski wrote: ... PS: there is probably some mess with gmail addresses in this thread. ...or maybe it's OK... Sorry. Jarek P.

Re: Tc bug (kernel crash) more info

2007-08-30 Thread Jarek Poplawski
On Thu, Aug 30, 2007 at 01:09:11PM +0400, Badalian Vyacheslav wrote: Jarek Poplawski wrote: ... On the other hand disabling local interrupts shouldn't be enough here, so it's really strange... Did you get this remotely? Are you sure LOC only? (Anyway this 2.6.23-rc4 should be interesting

Re: [PATCH 4/5] Net: ath5k, license is GPLv2

2007-08-30 Thread Jarek Poplawski
On 30-08-2007 13:59, Johannes Berg wrote: On Wed, 2007-08-29 at 15:13 +0200, Xavier Bestel wrote: How about asking for changes to be dual-licenced too ? In theory, that could work, but in practice relying on functions that the Linux kernel offers in GPLv2-only headers etc. will make the

Re: Tc bug (kernel crash) more info

2007-08-31 Thread Jarek Poplawski
On Fri, Aug 31, 2007 at 12:25:22PM +0400, Badalian Vyacheslav wrote: i not have testing mashine. we have 2 mashine and dynamic routing. if 1 mashine down - all traffic go to second mashine. I can test is on this mashines but i need that testing mashine will reboot on kernel panic (sysctl

Re: Tc bug (kernel crash) more info

2007-08-31 Thread Jarek Poplawski
On Fri, Aug 31, 2007 at 12:25:22PM +0400, Badalian Vyacheslav wrote: i not have testing mashine. we have 2 mashine and dynamic routing. if 1 mashine down - all traffic go to second mashine. I can test is on this mashines but i need that testing mashine will reboot on kernel panic (sysctl

Re: Tc bug (kernel crash) more info

2007-08-31 Thread Jarek Poplawski
On Fri, Aug 31, 2007 at 11:05:09AM +0200, Jarek Poplawski wrote: ... So, maybe you would better try this, 'less testing', version of my patch: Of course, the previous patch should be reverted (patch -p1 -R) or clean 2.6.22.5 used for this. Jarek P.

Re: Tc bug (kernel crash) more info

2007-08-31 Thread Jarek Poplawski
On Fri, Aug 31, 2007 at 01:33:04PM +0400, Badalian Vyacheslav wrote: i not have testing mashine. we have 2 mashine and dynamic routing. if 1 mashine down - all traffic go to second mashine. I can test is on this mashines but i need that testing mashine will reboot on kernel panic (sysctl

Re: Tc bug (kernel crash) more info

2007-08-31 Thread Jarek Poplawski
On Fri, Aug 31, 2007 at 02:59:55PM +0400, Badalian Vyacheslav wrote: May be this bug eq [PATCH] [NET_SCHED] sch_prio.c: remove duplicate call of tc_classify()? I get kernel panic on 2.6.23-rc4-git2 This is netconsole log! ... So, it looks like you have found a really new (unknown) HTB

Re: Tc bug (kernel crash) more info

2007-08-31 Thread Jarek Poplawski
On Fri, Aug 31, 2007 at 02:48:31PM +0400, Badalian Vyacheslav wrote: ... I can only see that say netconsole. If i look to monitor i look last lines. last line is Scrolling not work netconsole run as module and start after system do full load. Then netconsole is up - i run generator

Re: Tc bug (kernel crash) more info

2007-09-03 Thread Jarek Poplawski
On Fri, Aug 31, 2007 at 06:51:24PM +0400, Badalian Vyacheslav wrote: I found that bug in this place (gdb) l *0xc01c8973 0xc01c8973 is in rb_insert_color (lib/rbtree.c:80). ... if i not wrong understand message unable to handle kernel NULL pointer dereference at virtual address 0008 its

Re: Tc bug (kernel crash) more info

2007-09-03 Thread Jarek Poplawski
On Mon, Sep 03, 2007 at 12:31:39PM +0400, Badalian Vyacheslav wrote: May you also see that i need change to fix this: qdisc handle can = 10 000 i have more then 10 000 qdiscs =( As far as I know qdisc handle is hex, so you can have e.g.: handle 999a (or a999 too). But, does it mean your
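The remark about hex handles can be illustrated with a minimal sketch, assuming the major part of a qdisc handle is a 16-bit number written in hexadecimal. The helper name parse_handle_major() is made up for this example and is not part of tc or the kernel.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse the major part of a handle as hexadecimal.
 * Returns 0 and stores the value on success, -1 if the string is empty,
 * is not valid hex, or does not fit in 16 bits.
 */
static int parse_handle_major(const char *s, unsigned int *major)
{
	char *end;
	unsigned long v;

	if (s == NULL || *s == '\0')
		return -1;

	errno = 0;
	v = strtoul(s, &end, 16);	/* base 16: "999a" and "a999" are both valid */
	if (errno != 0 || end == s || *end != '\0' || v > 0xffffUL)
		return -1;

	*major = (unsigned int)v;
	return 0;
}
```

Read this way, the largest major is ffff (65535 decimal), so letters extend the usable range well past what decimal-looking handles alone would give, while the digits 10000 are rejected because as hex they no longer fit in 16 bits.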

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-15 Thread Jarek Poplawski
On 19-09-2007 16:38, Maciej W. Rozycki wrote: ... @@ -661,13 +664,22 @@ int phy_stop_interrupts(struct phy_devic if (err) phy_error(phydev); + free_irq(phydev-irq, phydev); + /* - * Finish any pending work; we might have been scheduled to be called -

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-16 Thread Jarek Poplawski
On Mon, Oct 15, 2007 at 06:03:20PM +0100, Maciej W. Rozycki wrote: On Mon, 15 Oct 2007, Jarek Poplawski wrote: Could you explain why cancel_work_sync() is better here than flush_scheduled_work() wrt. rtnl_lock()? Well, this is actually the bit that made cancel_work_sync() be written

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-17 Thread Jarek Poplawski
On Tue, Oct 16, 2007 at 06:19:32PM +0100, Maciej W. Rozycki wrote: ... Well, enable_irq() and disable_irq() themselves are nesting, so they are not a problem. OTOH, free_irq() does not seem to maintain the depth count correctly, which looks like a bug to me and which could trigger
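What "nesting" means for enable_irq() and disable_irq() can be sketched with a toy depth counter. This is only a model written for illustration: struct toy_irq and the toy_* functions are invented names, not the kernel's data structures or its actual implementation.

```c
#include <stdbool.h>

/* Toy model of one interrupt line; not the kernel's struct irq_desc. */
struct toy_irq {
	unsigned int depth;	/* number of outstanding disable calls */
	bool masked;		/* true while the line is held off */
};

/* Each disable bumps the depth; only the first one masks the line. */
static void toy_disable_irq(struct toy_irq *irq)
{
	if (irq->depth++ == 0)
		irq->masked = true;
}

/* Each enable drops the depth; only the last one unmasks the line. */
static void toy_enable_irq(struct toy_irq *irq)
{
	if (irq->depth == 0)
		return;		/* unbalanced enable: ignored in this sketch */
	if (--irq->depth == 0)
		irq->masked = false;
}
```

In this model two disables followed by one enable leave the line masked, which is why a path that tears the line down without accounting for the depth can leave later enable and disable calls unbalanced.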

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-17 Thread Jarek Poplawski
On Wed, Oct 17, 2007 at 10:58:09AM +0200, Jarek Poplawski wrote: ... 5) phy_stop_interrupts(): maybe I miss something, but it seems phy_stop() is required before this, so maybe there should be a comment on this? 6) phy_stop_interrupts(): if I'm not wrong with #3 calling Should be: 6

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Jarek Poplawski
On Wed, Oct 17, 2007 at 10:58:09AM +0200, Jarek Poplawski wrote: ... 8) phy_stop_interrupts(): I'm not sure this additional call from DEBUG_SHIRQ should be so dangerous, eg.: /* * status == PHY_HALTED * interrupts are stopped after phy_stop

[PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-18 Thread Jarek Poplawski
possibly abused by myself. - Subject: flush_work_sync as an alternative for flush_scheduled_work Similar to cancel_work_sync() but will only busy wait block (without cancel). Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- include/linux/workqueue.h |1 + kernel/workqueue.c

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Jarek Poplawski
On Fri, Oct 19, 2007 at 12:38:29PM +0100, Maciej W. Rozycki wrote: On Thu, 18 Oct 2007, Maciej W. Rozycki wrote: 1) phy_change() checks PHY_HALTED flag without lock; I think it's racy: eg. if it's done during phy_stop() it can check just before the flag is set and reenable interrupts

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Jarek Poplawski
On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote: On 10/18, Jarek Poplawski wrote: +/** + * flush_work_sync - block until a work_struct's callback has terminated ^^^ Hmm... + * Similar

Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Jarek Poplawski
On Thu, Oct 18, 2007 at 12:30:35PM +0100, Maciej W. Rozycki wrote: On Wed, 17 Oct 2007, Jarek Poplawski wrote: ... 2) phy_change() doesn't reenable irq line after it sees returns with errors; IMHO it should at least write some warning, but maybe try some safety plan, so enable_irq() and try

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-19 Thread Jarek Poplawski
On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote: ... sched_work_sync() with rtnl_lock(). It's only less probable to lockup with this than with flush_schedule_work(). ...But, not much less... Jarek P.

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-22 Thread Jarek Poplawski
On Fri, Oct 19, 2007 at 09:50:14AM +0200, Jarek Poplawski wrote: On Thu, Oct 18, 2007 at 07:48:19PM +0400, Oleg Nesterov wrote: On 10/18, Jarek Poplawski wrote: +/** + * flush_work_sync - block until a work_struct's callback has terminated

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-23 Thread Jarek Poplawski
On Mon, Oct 22, 2007 at 10:02:59PM +0400, Oleg Nesterov wrote: On 10/22, Jarek Poplawski wrote: ... OK, I know I'm dumber and dumber everyday, You are not alone. I have the same feeling about myself! Feeling is not the same, only true knowledge counts! these all flushes are rtnl

Re: [PATCH] flush_work_sync vs. flush_scheduled_work Re: [PATCH] PHYLIB: IRQ event workqueue handling fixes

2007-10-23 Thread Jarek Poplawski
On Mon, Oct 22, 2007 at 10:02:59PM +0400, Oleg Nesterov wrote: ... If this work doesn't rearm itself - yes. (otherwise, the same -func can run twice _at the same time_) But again, in this case wait_on_work() after try_to_grab_pending() == 1 doesn't block, so we can just do if

Re: Oops in 2.6.21-rc4, 2.6.23

2007-10-29 Thread Jarek Poplawski
On 15-10-2007 17:39, Darko K. wrote: Hello, after recent upgrade to kernel 2.6.23 (from 2.6.20) I have started seeing kernel oops-es in networking code. The problem is 100% reproducible in my environment. I've seen two slightly different backtraces but both seem to be caused by the same

Re: Oops in 2.6.21-rc4, 2.6.23

2007-10-29 Thread Jarek Poplawski
On Mon, Oct 29, 2007 at 01:41:47AM -0700, David Miller wrote: From: Jarek Poplawski [EMAIL PROTECTED] Date: Mon, 29 Oct 2007 09:42:32 +0100 I hope you've found this by yourself by now, but: 1. These are warnings only - not oopses. 2. It seems this patch you've found to be responsible

Re: Oops in 2.6.21-rc4, 2.6.23

2007-10-30 Thread Jarek Poplawski
On Mon, Oct 29, 2007 at 01:41:47AM -0700, David Miller wrote: ... Actually, this was caused by a real bug in the SKB_WITH_OVERHEAD macro definition, which Herbert Xu quickly spotted and fixed. Which I hope you've found this by yourself by now. ...Btw, of course you have to be right, and I

Re: Oops in 2.6.21-rc4, 2.6.23

2007-10-31 Thread Jarek Poplawski
On Tue, Oct 30, 2007 at 03:11:20PM +0100, Jarek Poplawski wrote: On Mon, Oct 29, 2007 at 01:41:47AM -0700, David Miller wrote: ... Actually, this was caused by a real bug in the SKB_WITH_OVERHEAD macro definition, which Herbert Xu quickly spotted and fixed. Which I hope you've found

Re: [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table

2007-11-01 Thread Jarek Poplawski
Hi, A few doubts below: Eric Dumazet wrote: As done two years ago on IP route cache table (commit 22c047ccbc68fa8f3fa57f0e8f906479a062c426) , we can avoid using one lock per hash bucket for the huge TCP/DCCP hash tables. ... diff --git a/include/net/inet_hashtables.h

Re: Endianness problem with u32 classifier hash masks

2007-11-02 Thread Jarek Poplawski
Radu Rendec wrote: Hi, While trying to implement u32 hashes in my shaping machine I ran into a possible bug in the u32 hash/bucket computing algorithm (net/sched/cls_u32.c). The problem occurs only with hash masks that extend over the octet boundary, on little endian machines (where
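The kind of effect being reported can be reproduced with a small model. This is a sketch under stated assumptions, not the code from net/sched/cls_u32.c: it assumes the key and the hash mask are both held in network byte order and that the bucket is computed by masking and then shifting right by the position of the mask's lowest set bit. All function names here are invented for the example, and the little-endian view is simulated with an explicit byte swap so the result does not depend on the host.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Index of the lowest set bit; 0 for an empty mask. */
static unsigned int lowest_set_bit(uint32_t mask)
{
	unsigned int n = 0;

	if (mask == 0)
		return 0;
	while ((mask & 1u) == 0) {
		mask >>= 1;
		n++;
	}
	return n;
}

/* Fold with key and mask interpreted as plain numbers (the intended result). */
static uint32_t fold_numeric(uint32_t key, uint32_t mask)
{
	return (key & mask) >> lowest_set_bit(mask);
}

/*
 * Fold as a little-endian CPU would see it when key and mask are stored
 * in network byte order and loaded without conversion: both values appear
 * byte-swapped, so a mask crossing an octet boundary is no longer contiguous.
 */
static uint32_t fold_unconverted_le(uint32_t key, uint32_t mask)
{
	uint32_t k = swap32(key);
	uint32_t m = swap32(mask);

	return (k & m) >> lowest_set_bit(m);
}
```

With key 0x12345678 and mask 0x000ff000 the numeric fold gives 0x45, while the unconverted little-endian view gives 0x5004, which is 0x04 once truncated to an 8-bit bucket index. A mask confined to one octet, such as 0x0000ff00, gives the same bucket either way in this model, consistent with the report that only masks crossing an octet boundary are affected.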

Re: urgent! linux 2.6.16 network bridge crash

2007-11-03 Thread Jarek Poplawski
auther_bin wrote, On 11/03/2007 12:11 PM: Hello friends, I have config my linux box works as a network bridge. and up/down switch is both cisco 35xx works with vlan. but when i connect it into the network the box crashed right now. Puzzled! btw, in the down(intranet) switch we connect 8

Re: Endianness problem with u32 classifier hash masks

2007-11-03 Thread Jarek Poplawski
jamal wrote, On 11/03/2007 12:23 AM: On Fri, 2007-02-11 at 18:31 +0100, Jarek Poplawski wrote: Radu Rendec wrote: Hi, While trying to implement u32 hashes in my shaping machine I ran into a possible bug in the u32 hash/bucket computing algorithm (net/sched/cls_u32.c). The problem occurs

Re: Endianness problem with u32 classifier hash masks

2007-11-03 Thread Jarek Poplawski
Jarek Poplawski wrote, On 11/04/2007 12:39 AM: ... OOPS!!! Went too early! I've tried to save not send. Probably my bad pronunciation... But, it seems this could be something like this (instead of Radu's change in u32_classify()). The change of hmask is needed. But it needs more checking

Re: Endianness problem with u32 classifier hash masks

2007-11-03 Thread Jarek Poplawski
Jarek Poplawski wrote, On 11/04/2007 12:58 AM: Jarek Poplawski wrote, On 11/04/2007 12:39 AM: ... OOPS!!! Went too early! I've tried to save not send. Probably my bad pronunciation... But, it seems this could be something like this (instead of Radu's change in u32_classify

Re: Endianness problem with u32 classifier hash masks

2007-11-03 Thread Jarek Poplawski
Jarek Poplawski wrote, On 11/04/2007 01:30 AM: Jarek Poplawski wrote, On 11/04/2007 12:58 AM: ... Other changes seem to be not needed. But it needs more checking... But not much more: it's a piece of fshit! So, even if not full ntohl(), some byte moving seems to be necessary here. Sorry

Re: [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table

2007-11-04 Thread Jarek Poplawski
Eric Dumazet wrote, On 11/04/2007 12:31 PM: David Miller wrote: From: Andi Kleen [EMAIL PROTECTED] Date: Sun, 4 Nov 2007 00:18:14 +0100 On Thursday 01 November 2007 11:16:20 Eric Dumazet wrote: ... Also the EHASH_LOCK_SZ == 0 special case is a little strange. Why did you add that? He

Re: [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table

2007-11-04 Thread Jarek Poplawski
Jarek Poplawski wrote, On 11/04/2007 06:58 PM: Eric Dumazet wrote, On 11/04/2007 12:31 PM: ... +static inline int inet_ehash_locks_alloc(struct inet_hashinfo *hashinfo) +{ ... +if (sizeof(rwlock_t) != 0) { ... +for (i = 0; i size; i++) +rwlock_init

Re: [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table

2007-11-04 Thread Jarek Poplawski
Eric Dumazet wrote, On 11/04/2007 10:23 PM: Jarek Poplawski wrote: Jarek Poplawski wrote, On 11/04/2007 06:58 PM: Eric Dumazet wrote, On 11/04/2007 12:31 PM: ... +static inline int inet_ehash_locks_alloc(struct inet_hashinfo *hashinfo) +{ ... + if (sizeof(rwlock_t) != 0

Re: Endianness problem with u32 classifier hash masks

2007-11-05 Thread Jarek Poplawski
On Sun, Nov 04, 2007 at 06:58:13PM -0500, jamal wrote: On Sun, 2007-04-11 at 02:17 +0100, Jarek Poplawski wrote: So, even if not full ntohl(), some byte moving seems to be necessary here. I think you were close. I am afraid my brain is congested, even the espresso didn't help my

Re: Endianness problem with u32 classifier hash masks

2007-11-05 Thread Jarek Poplawski
On Mon, Nov 05, 2007 at 02:59:21PM +0200, Radu Rendec wrote: ... Jamal, I am aware that any computation on the fast path involves some performance loss. However, I don't see any speed gain with your patch, because you just moved the ntohl() call inside u32_hash_fold(). Since u32_hash_fold() is

Re: Endianness problem with u32 classifier hash masks

2007-11-05 Thread Jarek Poplawski
On Mon, Nov 05, 2007 at 08:47:06AM -0500, jamal wrote: On Mon, 2007-05-11 at 10:12 +0100, Jarek Poplawski wrote: BTW: when looking around this I think, maybe, in u32_change(): 1) if (--divisor > 0x100) should be probably >=, Does it really matter? Divisor can be max of 0xff

Re: Endianness problem with u32 classifier hash masks

2007-11-05 Thread Jarek Poplawski
On Mon, Nov 05, 2007 at 08:43:32AM -0500, jamal wrote: On Mon, 2007-05-11 at 14:59 +0200, Radu Rendec wrote: Jarek, thanks for replying to my message on the list and pointing it in the right direction. Your example with 1 bits lying on exact nibble boundary is much easier to analyze than

Re: Endianness problem with u32 classifier hash masks

2007-11-05 Thread Jarek Poplawski
Radu Rendec wrote, On 11/05/2007 06:31 PM: On Mon, 2007-11-05 at 09:06 -0500, jamal wrote: On Mon, 2007-05-11 at 14:52 +0100, Jarek Poplawski wrote: ... If we manage to convince Jamal, IMHO a patch to something current like 2.6.24-rc1-git14 (or maybe -rc2 soon), should suffice (plus some

Re: Endianness problem with u32 classifier hash masks

2007-11-05 Thread Jarek Poplawski
Jarek Poplawski wrote, On 11/05/2007 10:06 PM: Radu Rendec wrote, On 11/05/2007 06:31 PM: ... Jarek, because I have to test anyway, I'll include ffs(mask) in my patch and have it tested too. Thanks! But, I did it wrong: + 1 is unnecessary. And since, ffs() checks for 0 anyway

Re: Endianness problem with u32 classifier hash masks

2007-11-05 Thread Jarek Poplawski
jamal wrote, On 11/05/2007 11:27 PM: On Mon, 2007-05-11 at 22:06 +0100, Jarek Poplawski wrote: Radu Rendec wrote, On 11/05/2007 06:31 PM: But still, Jamal, I need more explanations on what you meant by cutdown on the conversion in u32_change(). I meant that it didn't seem necessary to me

Re: Endianness problem with u32 classifier hash masks

2007-11-05 Thread Jarek Poplawski
Jarek Poplawski wrote, On 11/06/2007 01:02 AM: ... on little endian (net order): f0.0f.00.00 >> 4 gives: 0f.00.0f.00 then ntohl: 00.0f.00.0f with lsb: 0f should be: f0.0f.00.00 >> 4 gives: 0f.00.f0.00 then ntohl: 00.f0.00.0f with lsb: 0f Jarek Sleeping P. - To unsubscribe from this list: send
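
The point behind the byte-by-byte example above is that a right shift and a byte swap do not commute, so it matters whether the shift is applied to the value in net order or in host order. A minimal sketch of that, not taken from the thread: the helper names (`bswap32_sketch`, `shift_then_swap`, `swap_then_shift`) are hypothetical, and the swap is written by hand so that the result does not depend on the endianness of the machine running it.

```c
#include <stdint.h>

/* Unconditional 32-bit byte swap; this is what ntohl()/htonl() amount to
 * on a little-endian host. Hypothetical helper, written out by hand so
 * the example behaves the same on any machine. */
static uint32_t bswap32_sketch(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Shift the unswapped value first, then swap the bytes. */
static uint32_t shift_then_swap(uint32_t x, unsigned int shift)
{
    return bswap32_sketch(x >> shift);
}

/* Swap the bytes first, then shift the swapped value. */
static uint32_t swap_then_shift(uint32_t x, unsigned int shift)
{
    return bswap32_sketch(x) >> shift;
}
```

With `x = 0x00000ff0` and a shift of 4, the two orders give different words, because bits shifted across a byte boundary land in a different byte once the bytes are reordered.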

Re: Endianness problem with u32 classifier hash masks

2007-11-06 Thread Jarek Poplawski
On Tue, Nov 06, 2007 at 08:34:31AM -0500, jamal wrote: On Tue, 2007-06-11 at 10:09 +0200, Radu Rendec wrote: Yup, you're right. Bitwise anding is the same regardless of the byte ordering of the operands. As long as you don't have one operand in host order and the other in net order, it's
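
The claim that bitwise AND gives the same result in either byte order, provided both operands are in the same order, can be checked with a small sketch. This is an illustration only: `swab32_demo` and `and_commutes_with_swap` are hypothetical names, and the hand-written swap stands in for ntohl() on a little-endian host.

```c
#include <stdint.h>

/* Unconditional 32-bit byte swap (hypothetical helper standing in for
 * ntohl() on a little-endian host). */
static uint32_t swab32_demo(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Returns 1 when masking before the swap and masking after the swap
 * produce the same word. AND works on each bit position independently
 * and the swap only moves whole bytes, so this holds for any inputs. */
static int and_commutes_with_swap(uint32_t key, uint32_t mask)
{
    return swab32_demo(key & mask) ==
           (swab32_demo(key) & swab32_demo(mask));
}
```

Mixing orders is what breaks: `key & swab32_demo(mask)` selects different bits than `key & mask` unless the mask happens to be byte-symmetric.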

Re: Endianness problem with u32 classifier hash masks

2007-11-06 Thread Jarek Poplawski
Radu Rendec wrote, On 11/06/2007 06:00 PM: On Tue, 2007-11-06 at 09:43 -0500, jamal wrote: On Tue, 2007-06-11 at 15:25 +0100, Jarek Poplawski wrote: Yes, it saves one htonl() on the slow path! Would it feel better to say grew down exponentially from version 1 to 3? ;-> Sure, but I felt

Re: [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table

2007-11-07 Thread Jarek Poplawski
block anything?! I've written it's OK. So, I'm not sure it's useful or expected, but anyway: Acked-by: Jarek Poplawski [EMAIL PROTECTED] Thanks, Jarek P.

Re: Endianness problem with u32 classifier hash masks

2007-11-07 Thread Jarek Poplawski
On Wed, Nov 07, 2007 at 01:22:20AM -0800, David Miller wrote: From: Radu Rendec [EMAIL PROTECTED] Date: Tue, 06 Nov 2007 19:00:16 +0200 On Tue, 2007-11-06 at 09:43 -0500, jamal wrote: On Tue, 2007-06-11 at 15:25 +0100, Jarek Poplawski wrote: Yes, it saves one htonl() on the slow

Re: [PATCH] [PKT_SCHED] CLS_U32: Use ffs() instead of C code on hash mask to get first set bit.

2007-11-08 Thread Jarek Poplawski
in assembler). Using the conditional operator on hash mask before applying ntohl() also saves one ntohl() call if mask is 0. Signed-off-by: Radu Rendec [EMAIL PROTECTED] Signed-off-by: Jarek Poplawski [EMAIL PROTECTED] --- net/sched/cls_u32.c | 12 +--- 1 files changed, 1
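
As a rough userspace illustration of the idea in this patch description (not the kernel code itself), the sketch below compares a hand-written loop with the POSIX `ffs()` from `<strings.h>` for finding the lowest set bit of a mask. The function names are hypothetical, and the mask is assumed to be in host order already; in the patch as described, the zero check is made before ntohl() so that the conversion is skipped for an empty mask.

```c
#include <stdint.h>
#include <strings.h>   /* ffs(), POSIX */

/* Loop-based search for the index of the lowest set bit.
 * Returns 0 for an empty mask, by convention of this sketch. */
static int first_set_bit_loop(uint32_t mask)
{
    int shift = 0;

    if (mask == 0)
        return 0;
    while (!(mask & 1)) {
        mask >>= 1;
        shift++;
    }
    return shift;
}

/* Same result via ffs(). ffs() numbers bits from 1 and returns 0 when
 * no bit is set, so subtract 1 and handle the empty mask separately. */
static int first_set_bit_ffs(uint32_t mask)
{
    return mask ? ffs((int)mask) - 1 : 0;
}
```

For a mask such as `0x00000ff0` both functions return 4, the shift that moves the masked field down to bit 0.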

Re: [BUG] New Kernel Bugs

2007-11-13 Thread Jarek Poplawski
On 13-11-2007 12:15, Andrew Morton wrote: ... Zero responses from developers ... No response from developers ... Andreas did some work, seemed to lose interest. ... Rafael poked Thomas a week ago, to no effect. Thomas has been travelling. Looks like very reproducible! Maybe you should add

Re: [PATCH] via-velocity: don't oops on MTU change.

2007-11-15 Thread Jarek Poplawski
On 15-11-2007 04:38, Stephen Hemminger wrote: Simple mtu change when device is down. Fix http://bugzilla.kernel.org/show_bug.cgi?id=9382. Signed-off-by: Stephen Hemminger [EMAIL PROTECTED] --- a/drivers/net/via-velocity.c 2007-10-22 09:38:11.0 -0700 +++
