Ayaz Abdulla wrote:
Attached fix has been submitted to netdev.
I've run my reproducer with this patch applied to the Debian 2.6.32
kernel and so far the problem with nodes becoming unresponsive hasn't
occurred.
NIC settings were left at the defaults, so this looks positive.
r...@node23:~#
Eric Dumazet wrote:
OK, it seems forcedeth has a problem with checksums?
Try changing the ethtool -k eth0 settings?
ethtool -K eth0 tso off tx off
Yes, that makes an unresponsive system responsive again immediately, nice!
Should the driver default to disabling this until the problem is
On Tuesday, 13 April 2010 at 11:03 +0100, stephen mulcahy wrote:
Eric Dumazet wrote:
OK, it seems forcedeth has a problem with checksums?
Try changing the ethtool -k eth0 settings?
ethtool -K eth0 tso off tx off
Yes, that makes an unresponsive system responsive again immediately, nice!
Eric Dumazet wrote:
On Tuesday, 13 April 2010 at 11:03 +0100, stephen mulcahy wrote:
Eric Dumazet wrote:
OK, it seems forcedeth has a problem with checksums?
Try changing the ethtool -k eth0 settings?
ethtool -K eth0 tso off tx off
Yes, that makes an unresponsive system responsive again
On Tue, 2010-04-13 at 12:00 +0100, stephen mulcahy wrote:
Eric Dumazet wrote:
On Tuesday, 13 April 2010 at 11:03 +0100, stephen mulcahy wrote:
Eric Dumazet wrote:
OK, it seems forcedeth has a problem with checksums?
Try changing the ethtool -k eth0 settings?
ethtool -K eth0 tso off tx
Ok, I've tried both of the following with my reproducer
1. ethtool -K eth0 tso off
RESULT: reproducer causes multiple hosts to become unresponsive on
first run.
2. ethtool -K eth0 tx off
RESULT: reproducer runs three times without any hosts becoming unresponsive.
-stephen
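
The two test cases above could be scripted roughly as follows. This is only a sketch: the interface name eth0 comes from the thread, but the helper name offload_cmd and the APPLY switch are made up for illustration, and the reproducer itself is not shown in the thread, so it appears only as a commented-out placeholder.

```shell
#!/bin/sh
# Sketch: build the ethtool command for each offload test case from the thread.
IFACE="${IFACE:-eth0}"   # assumption: interface name as used in the thread

# Print (do not run) the ethtool command for a named test case.
offload_cmd() {
    case "$1" in
        tso-off) echo "ethtool -K $IFACE tso off" ;;
        tx-off)  echo "ethtool -K $IFACE tx off" ;;
        *)       echo "unknown case: $1" >&2; return 1 ;;
    esac
}

# Dry run by default; set APPLY=1 (as root) to really change the NIC settings.
for c in tso-off tx-off; do
    cmd=$(offload_cmd "$c")
    echo "case $c: $cmd"
    if [ "${APPLY:-0}" = 1 ]; then
        $cmd
        # run_reproducer   # hypothetical placeholder, not part of the thread
    fi
done
```

Keeping the default as a dry run means the script can be checked on a machine without the affected NIC before it is used on the cluster.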
On Tuesday, 13 April 2010 at 15:27 +0100, stephen mulcahy wrote:
Ok, I've tried both of the following with my reproducer
1. ethtool -K eth0 tso off
RESULT: reproducer causes multiple hosts to become unresponsive on
first run.
2. ethtool -K eth0 tx off
RESULT: reproducer runs three
Eric Dumazet wrote:
On Tuesday, 13 April 2010 at 15:27 +0100, stephen mulcahy wrote:
Ok, I've tried both of the following with my reproducer
1. ethtool -K eth0 tso off
RESULT: reproducer causes multiple hosts to become unresponsive on
first run.
2. ethtool -K eth0 tx off
RESULT:
stephen mulcahy wrote:
Now some brave souls to check the 6410 lines of this driver? ;)
Question of the day: why is TSO broken in forcedeth?
Is it generically broken or is it broken for specific NICs?
Actually, it is only when tx-checksumming is turned off that the problem
doesn't occur
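
To confirm which offloads are actually in effect on a node, the output of ethtool -k can be filtered for a single feature. A minimal sketch, assuming the usual "name: state" line format of ethtool -k; the function name feature_state is made up for illustration.

```shell
# Read `ethtool -k IFACE`-style text on stdin and print the state of one feature.
# Assumes lines of the form "tx-checksumming: on", optionally indented.
feature_state() {
    awk -F': *' -v key="$1" '
        { name = $1; sub(/^[ \t]+/, "", name) }
        name == key { print $2; exit }
    '
}

# Usage on a node (needs ethtool installed):
#   ethtool -k eth0 | feature_state tx-checksumming
#   ethtool -k eth0 | feature_state tcp-segmentation-offload
```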
On Tuesday, 13 April 2010 at 15:49 +0100, stephen mulcahy wrote:
Eric Dumazet wrote:
On Tuesday, 13 April 2010 at 15:27 +0100, stephen mulcahy wrote:
Ok, I've tried both of the following with my reproducer
1. ethtool -K eth0 tso off
RESULT: reproducer causes multiple hosts to become
Eric Dumazet wrote:
I am scratching my head, but I thought you told me that
ethtool -K eth0 tso off
ethtool -K eth0 tx on
was working ?
No, sorry for the confusion.
ethtool -K eth0 tx off
fixes the problem.
Setting only
ethtool -K eth0 tso off
ethtool -K eth0 tx on
still results in
On Tuesday, 13 April 2010 at 16:08 +0100, stephen mulcahy wrote:
Eric Dumazet wrote:
I am scratching my head, but I thought you told me that
ethtool -K eth0 tso off
ethtool -K eth0 tx on
was working ?
No, sorry for the confusion.
ethtool -K eth0 tx off
fixes the problem.
Eric Dumazet wrote:
OK, thanks for the clarification.
Last question: did you try a vanilla kernel, 2.6.33.2 for
example?
I built a Debian package from the vanilla 2.6.33.2 and installed that on
all nodes and tried my reproducer with the same results - nodes becoming
unresponsive.
I
On Tuesday, 13 April 2010 at 16:25 +0100, stephen mulcahy wrote:
Eric Dumazet wrote:
OK, thanks for the clarification.
Last question: did you try a vanilla kernel, 2.6.33.2 for
example?
I built a Debian package from the vanilla 2.6.33.2 and installed that on
all nodes and tried my
On Tuesday, 13 April 2010 at 14:43 -0700, David Miller wrote:
Do you really come to the conclusion that TSO is broken with the above
test results?
I would conclude that there is a TX checksumming issue, since merely
turning TSO off does not fix the problem whereas turning TX
checksumming off
From: Eric Dumazet eric.duma...@gmail.com
Date: Tue, 13 Apr 2010 16:42:21 +0200
On Tuesday, 13 April 2010 at 15:27 +0100, stephen mulcahy wrote:
Ok, I've tried both of the following with my reproducer
1. ethtool -K eth0 tso off
RESULT: reproducer causes multiple hosts to become
Attached fix has been submitted to netdev.
Ayaz
Eric Dumazet wrote:
On Tuesday, 13 April 2010 at 14:43 -0700, David Miller wrote:
Do you really come to the conclusion that TSO is broken with the above
test results?
I would conclude that there is a TX checksumming issue, since merely
turning
From: Ayaz Abdulla aabdu...@nvidia.com
Date: Wed, 14 Apr 2010 01:33:15 -0400
Attached fix has been submitted to netdev.
Thanks!
I'll apply this soon.
--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Ben Hutchings wrote:
Stephen Mulcahy reported a regression in forcedeth at
http://bugs.debian.org/572201. The system information and some
diagnostic information can be found there. Anyone able to help?
Incidentally, I also tried the 2.6.33.2 kernel with
CONFIG_FORCEDETH_NAPI set to y to see
stephen mulcahy wrote:
It doesn't - further testing over the weekend saw 6 of 45 machines drop
off the network with this problem. Nothing in dmesg or system logs.
Happy to run more tests if someone can advise on what should be run.
I also just tried using the 2.6.30-2-amd64 (Debian) forcedeth
On Monday, 12 April 2010 at 13:39 +0100, stephen mulcahy wrote:
stephen mulcahy wrote:
It doesn't - further testing over the weekend saw 6 of 45 machines drop
off the network with this problem. Nothing in dmesg or system logs.
Happy to run more tests if someone can advise on what should
Eric Dumazet wrote:
On Monday, 12 April 2010 at 13:39 +0100, stephen mulcahy wrote:
I am not sure I understand. Are you saying that using the 2.6.30-2-amd64
kernel also makes your forcedeth adapter non-functional?
Hi Eric,
If I run my tests with the 2.6.30-2-amd64 kernel the network
stephen mulcahy wrote:
Are both directions non-functional (RX and TX), or only one side?
What's the best way of testing this? (tcpdump listening on both hosts and
then running pings between the systems?)
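
One way to run the test described above is to capture only the ICMP traffic exchanged with the peer on each host while pinging from the other side. This is a sketch, not a procedure from the thread: the helper name icmp_filter is made up, and node05 is used only because it appears as a host name elsewhere in the thread.

```shell
# Build a tcpdump filter matching ICMP traffic to or from one peer host.
icmp_filter() {
    echo "icmp and host $1"
}

# On each host, as root, in one terminal:
#   tcpdump -n -i eth0 "$(icmp_filter node05)"
# Then, from the other host:
#   ping -c 5 node05
```

Comparing the two captures shows in which direction packets go missing: for example, echo requests seen leaving one host but never seen arriving on the other point at that path rather than at the reply path.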
stephen mulcahy wrote:
Are both directions non-functional (RX and TX), or only one side?
What's
On Monday, 12 April 2010 at 14:19 +0100, stephen mulcahy wrote:
Does that help?
Well, yes, because it seems to be a TCP problem.
r...@node20:~# tcpdump host node20 and node05
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet),
Eric Dumazet wrote:
On Monday, 12 April 2010 at 14:19 +0100, stephen mulcahy wrote:
Do you have any netfilter rules?
Hi Eric,
I don't have any netfilter rules:
r...@node34:~# for table in filter nat mangle raw; do iptables -t $table -L; done
Chain INPUT (policy ACCEPT)
target
On Monday, 12 April 2010 at 17:11 +0100, stephen mulcahy wrote:
Eric Dumazet wrote:
On Monday, 12 April 2010 at 14:19 +0100, stephen mulcahy wrote:
Do you have any netfilter rules?
Hi Eric,
I don't have any netfilter rules:
r...@node34:~# for table in filter nat mangle raw;
Stephen Mulcahy reported a regression in forcedeth at
http://bugs.debian.org/572201. The system information and some
diagnostic information can be found there. Anyone able to help?
Ben.
stephen mulcahy wrote:
When running linux-image-2.6.32-trunk-amd64, the network stops
responding if large