On Thursday 19 July 2007 21:56, Ingo Molnar wrote:
> nope - with this patch applied the box still has no network, symptoms
> are similar. (should i apply the WARN_ON() patch too?)
Yes, that would be nice. If that doesn't help, you can also throw in
the one below.
Olaf
Does the following help?
Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[EMAIL PROTECTED] |/ | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax
Test patch
---
Index: build-2.6/drivers/net/netconsole.c
timing, but just verifies whether my patch is to blame at all. Can
you give it a try?
Thanks,
Olaf
---
Test patch
---
 include/linux/netdevice.h |    2 +-
 1 file changed, 1
On Thursday 19 July 2007 19:36, Olaf Kirch wrote:
> Can you confirm this by spraying the laptop with arp packets
> or broadcast pings while it's booting?
Sorry for the noise - didn't see your other message where you
described just that.
This sounds more like a hardware issue - Rx interrupt seems
gs.
Can you confirm this by spraying the laptop with arp packets
or broadcast pings while it's booting?
Olaf
that this means anything under these circumstances ...)
Too bad. Now where do we take it from here? I'm currently thinking of
ways to do this patch differently. But that is kind of relying on a
good testbed to verify whether a different patch works better for you
or not...
Olaf
On Thursday 19 July 2007 14:52, Olaf Kirch wrote:
> On Thursday 19 July 2007 12:58, Ingo Molnar wrote:
> > i.e. it's the classic 'eth0 got stuck somehow' tx/rx state machine
> > hiccup symptoms, with no other bad symptoms such as lockups or crashes.
>
> Duh, I found it.
> The following patch should
not
remove the device from the poll list any longer - and another one
from net_rx_action.
I don't have a fix ready yet - I hope I'll have something later
this afternoon.
Olaf
scheduled while we're in poll_napi holding the poll_lock.
net_rx_action would try to take the poll_lock as well, and we'd
be hung for good. The patch with local_bh_disable/enable was
supposed to test that idea (this is the "trickle" patch)
Olaf
what
dmesg contains. If there's little to no debug output from the
driver, let it run for 10 seconds or so, in order to catch the
e1000 watchdog timer a few times.
Olaf
n't change the
timing in a way that makes the bug disappear.
Thanks
Olaf
---
 include/linux/netdevice.h |    4
 net/core/dev.c            |   14 ++
 2 files changed
t timed out
So, it seems as if for some reason, dev->poll isn't called frequently
enough.
Here's a debugging patch that tries to locate the problem - can you give it
a try, please?
Olaf
nsole output? I don't see any Tx Unit
Hang messages from e1000 or netdev watchdog messages present in your
earlier dmesg logs. So maybe these messages are there, but never
get logged?
Olaf
,
both with HZ=250 and HZ=1000?
Olaf
---
 drivers/net/e1000/e1000_main.c |    9 -
 1 file changed, 8 insertions(+), 1 deletion(-)
Index: build-2.6/drivers
	if (test_bit(__LINK_STATE_POLL_LIST_FROZEN, &dev->state)) {
		dev->quota = dev->weight;
		return;
	}
This is just a hack to make sure that we don't go to insanely
negative quotas while sending packets through netpoll.
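
To make the effect of that guard concrete, here is a small userspace model. It is only a sketch: struct toy_dev, toy_charge_quota() and toy_selfcheck() are hypothetical names, and the assumption that the code after the guard subtracts the work done from the quota is mine, since the snippet above does not show where in the kernel the check sits.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the few net_device fields the snippet touches.
 * Nothing here is kernel code; all names are hypothetical. */
struct toy_dev {
    bool poll_list_frozen; /* models __LINK_STATE_POLL_LIST_FROZEN */
    int quota;             /* remaining budget, may go negative */
    int weight;            /* per-round budget */
};

/* Charge `work` packets against the quota (assumed behaviour).
 * While the poll list is frozen, reset the quota to the weight
 * instead, so it cannot drift further and further below zero. */
void toy_charge_quota(struct toy_dev *dev, int work)
{
    if (dev->poll_list_frozen) {
        dev->quota = dev->weight;
        return;
    }
    dev->quota -= work;
}

/* Self-check: 100 frozen rounds leave the quota at the weight,
 * a single unfrozen round of 64 packets takes it to -48. */
void toy_selfcheck(void)
{
    struct toy_dev dev = { true, 16, 16 };
    int i;

    for (i = 0; i < 100; i++)
        toy_charge_quota(&dev, 64);
    assert(dev.quota == 16);

    dev.poll_list_frozen = false;
    toy_charge_quota(&dev, 64);
    assert(dev.quota == -48);
}
```

Without the early return, the 100 frozen rounds in this model would leave the quota at 16 - 6400, which is the kind of "insanely negative" value the hack is meant to avoid.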
Olaf
On Tuesday 17 July 2007 09:55, Olaf Kirch wrote:
> What I find more problematic about this portion of code though
> is that once a net_device is over quota, net_rx_action will
> loop for up to one jiffy, even if there's just this one device on
> the poll_list.
Duh, wrong. For every loop, it'll add
happen again, since it never goes away.
Sorry, I may be sitting on my brain this morning, but I don't understand
how skipping netif_rx_complete would affect ACKing of interrupts.
Olaf
this,
> and net_rx_action returns with __LINK_STATE_RX_SCHED bit set.
I don't think so. dev will remain on poll_list.
What I find more problematic about this portion of code though
is that once a net_device is over quota, net_rx_action will
loop for up to one jiffy, even if there's just this one device on
the poll_list.
Olaf
On Tuesday 17 July 2007 00:08, David Miller wrote:
> Sure, but I thought it would be nice to give Olaf a day or two to
> figure out what's going on rather than have the knee-jerk reaction to
> just revert.
Oh, reverting is fine with me. I'll just resubmit the patch.
Olaf
SMP machine? Are you still getting output
from netconsole, or is the network down completely?
Olaf
On Monday 16 July 2007 11:12, Ingo Molnar wrote:
> After a bisection session the bad commit turned out to be:
>
> 29578624e354f56143d92510fff33a8b2aaa2c03 is first bad commit
> commit 29578624e354f56143d92510fff33a8b2aaa2c03
> Author: Olaf Kirch <[EMAIL PROTECTED]>
> Date:   Wed Jul 11 19:32:02 2007
From: Olaf Kirch <[EMAIL PROTECTED]>
Make skb_seq_read unmap the last fragment
Having walked through the entire skbuff, skb_seq_read would leave the
last fragment mapped. As a consequence, the unwary caller would leak
kmaps, and proceed with preempt_count off by one. The only (kind
root file system mounted via NFS? Or does it mean you
booted, and started the NFS server?
Olaf
of code that still uses it, it's most likely something
that hasn't seen a compiler in years - and will likely continue to do
so.
Olaf
er PKTINFO cmsg when sending the reply. So it would be
much easier to just store the raw control message in the svc_rqst,
without looking at its contents, and send it out along with the reply,
unchanged.
Olaf
addresses
a little awkward too. And I think to be on the safe side, you
should check that you're really looking at a PKTINFO cmsg
rather than something else.
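
As an illustration of that check, the following userspace sketch walks the control messages of a msghdr and accepts only one whose level and type say IP_PKTINFO and whose length is large enough. It is not the sunrpc code under discussion: find_pktinfo() and pktinfo_selfcheck() are hypothetical names, and the kernel side would look different.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* assumption: makes struct in_pktinfo visible on glibc */
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Copy the first IP_PKTINFO control message of msg into *out.
 * Returns 1 if one was found, 0 otherwise. */
int find_pktinfo(struct msghdr *msg, struct in_pktinfo *out)
{
    struct cmsghdr *c;

    for (c = CMSG_FIRSTHDR(msg); c != NULL; c = CMSG_NXTHDR(msg, c)) {
        /* Make sure this really is a PKTINFO cmsg, not something else. */
        if (c->cmsg_level != IPPROTO_IP || c->cmsg_type != IP_PKTINFO)
            continue;
        /* Reject a payload that is too short to hold the struct. */
        if (c->cmsg_len < CMSG_LEN(sizeof(*out)))
            continue;
        memcpy(out, CMSG_DATA(c), sizeof(*out));
        return 1;
    }
    return 0;
}

/* Self-check: the same buffer is ignored when tagged IP_TTL
 * and accepted once it is tagged IP_PKTINFO. */
void pktinfo_selfcheck(void)
{
    union {
        struct cmsghdr align; /* forces cmsghdr alignment */
        char buf[CMSG_SPACE(sizeof(struct in_pktinfo))];
    } ctl;
    struct msghdr msg;
    struct in_pktinfo got;
    struct cmsghdr *c;

    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = IPPROTO_IP;
    c->cmsg_type = IP_TTL;
    c->cmsg_len = CMSG_LEN(sizeof(struct in_pktinfo));
    assert(find_pktinfo(&msg, &got) == 0);

    c->cmsg_type = IP_PKTINFO;
    assert(find_pktinfo(&msg, &got) == 1);
}
```

Copying the payload out with memcpy rather than casting CMSG_DATA() avoids relying on the alignment of the control buffer.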
Olaf
s - but to
do that you need a private key. And the whole point of this exercise is
that the user does not have access to that key.
So as far as support is concerned, you're back to square one.
You cannot tell a "genuine" oops produced on a supported kernel
from a doctored one produced on Joe Doe's Garage Kernel.
Olaf
On Fri, Feb 04, 2005 at 12:55:39PM +1100, Herbert Xu wrote:
> OK, here is the patch to do that. Let's get rid of kfree_skb_fast
> while we're at it since it's no longer used.
Thanks, I'll give that to the PPC folks and ask them to run with it.
Regards,
Olaf
to make that too slow.
Olaf
--
Olaf Kirch | Things that make Monday morning interesting, #2:
[EMAIL PROTECTED] |"We have 8,000 NFS mount points, why do we keep
---+ running out of privileged ports?"
On Wed, Dec 06, 2000 at 10:35:14AM -0500, James Antill wrote:
> I've just looked at it, but I'm pretty sure this is a bug in your
> code.
Ick. Thanks!
Olaf
l cases. It can contain multiple errors.
But it doesn't return POLLERR. If it was returning it, pollfd.revents
would be set. pollfd.events is the event mask that's being passed _into_
the poll() call.
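
The events/revents split can be seen in a few lines of userspace C. This is a sketch on a plain pipe rather than on the socket from this thread; poll_revents() and poll_selfcheck() are hypothetical helper names.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Wait up to timeout_ms for the `wanted` events on fd.
 * Returns the mask the kernel wrote to revents (0 on timeout),
 * or -1 if poll() itself failed. */
int poll_revents(int fd, short wanted, int timeout_ms)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = wanted; /* input: what the caller asks about */
    pfd.revents = 0;     /* output: what the kernel reports back */

    if (poll(&pfd, 1, timeout_ms) < 0)
        return -1;
    return pfd.revents;
}

/* Self-check on a pipe: an empty pipe times out with revents == 0,
 * and POLLIN shows up in revents once a byte has been written. */
void poll_selfcheck(void)
{
    int fds[2];
    int rc;
    ssize_t n;

    rc = pipe(fds);
    assert(rc == 0);

    rc = poll_revents(fds[0], POLLIN, 10);
    assert(rc == 0);

    n = write(fds[1], "x", 1);
    assert(n == 1);

    rc = poll_revents(fds[0], POLLIN, 10);
    assert(rc > 0 && (rc & POLLIN));

    close(fds[0]);
    close(fds[1]);
}
```

POSIX also specifies that POLLERR, POLLHUP and POLLNVAL are reported in revents whether or not they were requested in events, so a caller looking for errors has to inspect revents after the call.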
You're right about the IP_RETOPTS stuff though. I didn't look closely enough;
ip_cmsg_send does expect raw options.
Thanks,
Olaf
:43:02 poll([{fd=4, events=POLLERR}], 1, 5) = 0
...
I.e. the poll call returns as if it had timed out, but it
hasn't.
Any input from network kernel hackers would be greatly appreciated!
Cheers,
Olaf
ves you a single NUL byte
overflow. Whether it's dangerous or not depends on whether your
compiler reserves stack space for the *nls pointer or not...