Re: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang

2007-01-09 Thread Sami Farin
On Tue, Jan 09, 2007 at 17:48:04 -0800, Auke Kok wrote:
> Sami Farin wrote:
> >On Tue, Jan 09, 2007 at 15:59:30 -0800, Auke Kok wrote:
> >>Sami Farin wrote:
> >...
> >>>I do "ethtool -K eth0 tso off" now and check if I get the hang again. =)
> >>I'm unsure whether v7.2.x already automatically disables TSO for 100mbit 
> >>speed link, probably not. It should.
> >
> >It disabled it but I enabled it just for fun.
> > 
> >>Please try our updated driver from http://e1000.sf.net/ (7.3.20) against 
> >>the same kernel. There are some changes with regard to the ich8/TSO 
> >>driver that might affect this, so re-testing is worth it for us.
> >
> >I now run 7.3.20-NAPI.
> >
> >BTW. the Makefile is buggy: it does not get CC from kernel's Makefile.
> >Using wrong compiler can cause for example a reboot when loading the 
> >module.
> >(At least that's what happened with gcc-2.95.3 vs 3.x.x some years ago...)
> 
> I'll look into that, do you have any suggestions?

loop-AES has some hacks which figure out the correct CC
http://heanet.dl.sourceforge.net/sourceforge/loop-aes/loop-AES-v3.1e.tar.bz2

> >>also, please always include the full dmesg output. Feel free to CC 
> >>[EMAIL PROTECTED] on this.
> >
> >I enabled TSO again.  I write again if TSO causes problems.
> 
> There are known problems with that configuration, that's why the newer 
> drivers disable TSO for 10/100 speeds.
> 
> do you really think that you can see the performance gain fro musing TSO at 

No.

I was thinking that if TSO does not work at 100, why 1000 would be
any better.  But I can't test at 1000 speeds right now.

But if you say driver is supposed to hang at 100 speed,
I believe you.

Ohh... that was fast.

2007-01-10 04:07:42.303056500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-10 04:07:42.303081500 <4>  Tx Queue <0>
2007-01-10 04:07:42.303082500 <4>  TDH  <48>
2007-01-10 04:07:42.303083500 <4>  TDT  
2007-01-10 04:07:42.303084500 <4>  next_to_use  
2007-01-10 04:07:42.303085500 <4>  next_to_clean<48>
2007-01-10 04:07:42.303086500 <4>buffer_info[next_to_clean]
2007-01-10 04:07:42.303087500 <4>  time_stamp   <9e332d8>
2007-01-10 04:07:42.303088500 <4>  next_to_watch<49>
2007-01-10 04:07:42.303088500 <4>  jiffies  <9e336df>
2007-01-10 04:07:42.303094500 <4>  next_to_watch.status <0>
2007-01-10 04:07:43.302826500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-10 04:07:43.302850500 <4>  Tx Queue <0>
2007-01-10 04:07:43.302851500 <4>  TDH  <48>
2007-01-10 04:07:43.302852500 <4>  TDT  <34>
2007-01-10 04:07:43.302853500 <4>  next_to_use  <34>
2007-01-10 04:07:43.302854500 <4>  next_to_clean<48>
2007-01-10 04:07:43.302855500 <4>buffer_info[next_to_clean]
2007-01-10 04:07:43.302855500 <4>  time_stamp   <9e332d8>
2007-01-10 04:07:43.302856500 <4>  next_to_watch<49>
2007-01-10 04:07:43.302857500 <4>  jiffies  <9e33ac7>
2007-01-10 04:07:43.302862500 <4>  next_to_watch.status <0>
...

> those speeds anyway? we don't ;). In any case you should keep TSO off for 
> 10/100 speeds.

...
> >PS. please do not delete Mail-Followup-To header field.
> 
> I hit "reply-all" and I have no control over which field thunderbird 

I hit list-reply.

> removes or adds. I have to manually add your e-mail address too?

No.

> Maybe your mail client is broken instead?

No.  

> Don't you want to receive replies?

Yes.

I am subscribed to the mailing list.
That's why my email was not in Mail-Followup-To.

But if your thunderbird does not support Mail-Followup-To,
just ignore it.  I can add the header field manually.

-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang

2007-01-09 Thread Sami Farin
On Tue, Jan 09, 2007 at 15:59:30 -0800, Auke Kok wrote:
> Sami Farin wrote:
...
> >I do "ethtool -K eth0 tso off" now and check if I get the hang again. =)
> 
> I'm unsure whether v7.2.x already automatically disables TSO for 100mbit 
> speed link, probably not. It should.

It disabled it but I enabled it just for fun.
 
> Please try our updated driver from http://e1000.sf.net/ (7.3.20) against 
> the same kernel. There are some changes with regard to the ich8/TSO driver 
> that might affect this, so re-testing is worth it for us.

I now run 7.3.20-NAPI.

BTW. the Makefile is buggy: it does not get CC from kernel's Makefile.
Using wrong compiler can cause for example a reboot when loading the module.
(At least that's what happened with gcc-2.95.3 vs 3.x.x some years ago...)
 
> also, please always include the full dmesg output. Feel free to CC 
> [EMAIL PROTECTED] on this.

I enabled TSO again.  I write again if TSO causes problems.
Why shouldn't it work with 100 Mbps?  Not that it would help a lot,
but I ask this on principle.

  /* disable TSO for pcie and 10/100 speeds, to avoid
   * some hardware issues */

Issues on the motherboard or the NIC?

2007-01-10 02:39:51.889908500 <6>ACPI: PCI interrupt for device :00:19.0 
disabled
2007-01-10 02:39:54.545194500 <6>Intel(R) PRO/1000 Network Driver - version 
7.3.20-NAPI
2007-01-10 02:39:54.545198500 <6>Copyright (c) 1999-2006 Intel Corporation.
2007-01-10 02:39:54.545395500 <6>ACPI: PCI Interrupt :00:19.0[A] -> GSI 20 
(level, low) -> IRQ 22
2007-01-10 02:39:54.545435500 <7>PCI: Setting latency timer of device 
:00:19.0 to 64
2007-01-10 02:39:54.562905500 <6>e1000: :00:19.0: e1000_probe: (PCI 
Express:2.5Gb/s:Width x1) 00:19:d1:00:5f:01
2007-01-10 02:39:54.638093500 <6>e1000: eth0: e1000_probe: Intel(R) PRO/1000 
Network Connection
2007-01-10 02:40:07.513619500 <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
2007-01-10 02:40:07.614768500 <6>e1000: eth0: e1000_watchdog: NIC Link is Up 
100 Mbps Full Duplex, Flow Control: None
2007-01-10 02:40:07.614770500 <6>e1000: eth0: e1000_watchdog: 10/100 speed: 
disabling TSO
2007-01-10 02:40:07.614771500 <6>ADDRCONF(NETDEV_CHANGE): eth0: link becomes 
ready
2007-01-10 02:40:09.271631500 <3>e1000: eth0: e1000_reset: Hardware Error
2007-01-10 02:40:10.93500 <6>e1000: eth0: e1000_watchdog: NIC Link is Up 
100 Mbps Full Duplex, Flow Control: None
2007-01-10 02:40:10.930049500 <6>e1000: eth0: e1000_watchdog: 10/100 speed: 
disabling TSO

PS. please do not delete Mail-Followup-To header field.

-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang

2007-01-09 Thread Auke Kok

Sami Farin wrote:

On Tue, Jan 09, 2007 at 15:59:30 -0800, Auke Kok wrote:

Sami Farin wrote:

...

I do "ethtool -K eth0 tso off" now and check if I get the hang again. =)
I'm unsure whether v7.2.x already automatically disables TSO for 100mbit 
speed link, probably not. It should.


It disabled it but I enabled it just for fun.
 
Please try our updated driver from http://e1000.sf.net/ (7.3.20) against 
the same kernel. There are some changes with regard to the ich8/TSO driver 
that might affect this, so re-testing is worth it for us.


I now run 7.3.20-NAPI.

BTW. the Makefile is buggy: it does not get CC from kernel's Makefile.
Using wrong compiler can cause for example a reboot when loading the module.
(At least that's what happened with gcc-2.95.3 vs 3.x.x some years ago...)


I'll look into that, do you have any suggestions?

also, please always include the full dmesg output. Feel free to CC 
[EMAIL PROTECTED] on this.


I enabled TSO again.  I write again if TSO causes problems.


There are known problems with that configuration, that's why the newer drivers disable 
TSO for 10/100 speeds.


do you really think that you can see the performance gain fro musing TSO at those speeds 
anyway? we don't ;). In any case you should keep TSO off for 10/100 speeds.



Why shouldn't it work with 100 Mbps?  Not that it would help a lot,
but I ask this on principle.

  /* disable TSO for pcie and 10/100 speeds, to avoid
   * some hardware issues */

Issues on the motherboard or the NIC?


we (the e1000 team) don't write drivers for the motherboard, but only for the NIC 
component, so I hope that answers your question.



2007-01-10 02:39:51.889908500 <6>ACPI: PCI interrupt for device :00:19.0 
disabled
2007-01-10 02:39:54.545194500 <6>Intel(R) PRO/1000 Network Driver - version 
7.3.20-NAPI
2007-01-10 02:39:54.545198500 <6>Copyright (c) 1999-2006 Intel Corporation.
2007-01-10 02:39:54.545395500 <6>ACPI: PCI Interrupt :00:19.0[A] -> GSI 20 
(level, low) -> IRQ 22
2007-01-10 02:39:54.545435500 <7>PCI: Setting latency timer of device 
:00:19.0 to 64
2007-01-10 02:39:54.562905500 <6>e1000: :00:19.0: e1000_probe: (PCI 
Express:2.5Gb/s:Width x1) 00:19:d1:00:5f:01
2007-01-10 02:39:54.638093500 <6>e1000: eth0: e1000_probe: Intel(R) PRO/1000 
Network Connection
2007-01-10 02:40:07.513619500 <6>ADDRCONF(NETDEV_UP): eth0: link is not ready
2007-01-10 02:40:07.614768500 <6>e1000: eth0: e1000_watchdog: NIC Link is Up 
100 Mbps Full Duplex, Flow Control: None
2007-01-10 02:40:07.614770500 <6>e1000: eth0: e1000_watchdog: 10/100 speed: 
disabling TSO
2007-01-10 02:40:07.614771500 <6>ADDRCONF(NETDEV_CHANGE): eth0: link becomes 
ready
2007-01-10 02:40:09.271631500 <3>e1000: eth0: e1000_reset: Hardware Error
2007-01-10 02:40:10.93500 <6>e1000: eth0: e1000_watchdog: NIC Link is Up 
100 Mbps Full Duplex, Flow Control: None
2007-01-10 02:40:10.930049500 <6>e1000: eth0: e1000_watchdog: 10/100 speed: 
disabling TSO

PS. please do not delete Mail-Followup-To header field.


I hit "reply-all" and I have no control over which field thunderbird removes or adds. I 
have to manually add your e-mail address too? Maybe your mail client is broken instead? 
Don't you want to receive replies?


Cheers,

Auke
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang

2007-01-09 Thread Auke Kok

Sami Farin wrote:

Linux 2.6.19.1 SMP on Pentium D, Intel DQ965GF mobo.
Got this while bittorrenting knoppix:

2007-01-09 22:53:40.020693500 <4>NETFILTER drop IN=eth0 OUT= MAC=00:19:d1:00:5f:01:00:05:00:1c:58:1c:08:00 SRC=83.46.5.76 DST=80.223.106.128 LEN=121 TOS=0x00 PREC=0x00 TTL=112 ID=53273 PROTO=ICMP TYPE=3 CODE=3 [SRC=80.223.106.128 DST=192.168.1.37 LEN=93 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF PROTO=UDP SPT=6881 DPT=6895 LEN=73 ] 
2007-01-09 22:53:41.660249500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang

2007-01-09 22:53:41.660253500 <4>  Tx Queue <0>
2007-01-09 22:53:41.660254500 <4>  TDH  <3c>
2007-01-09 22:53:41.660255500 <4>  TDT  
2007-01-09 22:53:41.660255500 <4>  next_to_use  
2007-01-09 22:53:41.660256500 <4>  next_to_clean<3c>
2007-01-09 22:53:41.660257500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:41.660258500 <4>  time_stamp   <8c3b8e4>
2007-01-09 22:53:41.660259500 <4>  next_to_watch<3f>
2007-01-09 22:53:41.660274500 <4>  jiffies  <8c3bf13>
2007-01-09 22:53:41.660275500 <4>  next_to_watch.status <0>
2007-01-09 22:53:42.660365500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-09 22:53:42.660368500 <4>  Tx Queue <0>
2007-01-09 22:53:42.660369500 <4>  TDH  <3c>
2007-01-09 22:53:42.660370500 <4>  TDT  
2007-01-09 22:53:42.660370500 <4>  next_to_use  
2007-01-09 22:53:42.660371500 <4>  next_to_clean<3c>
2007-01-09 22:53:42.660372500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:42.660373500 <4>  time_stamp   <8c3b8e4>
2007-01-09 22:53:42.660374500 <4>  next_to_watch<3f>
2007-01-09 22:53:42.660389500 <4>  jiffies  <8c3c2fb>
2007-01-09 22:53:42.660390500 <4>  next_to_watch.status <0>
2007-01-09 22:53:43.660086500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-09 22:53:43.660089500 <4>  Tx Queue <0>
2007-01-09 22:53:43.660090500 <4>  TDH  <3c>
2007-01-09 22:53:43.660091500 <4>  TDT  
2007-01-09 22:53:43.660092500 <4>  next_to_use  
2007-01-09 22:53:43.660093500 <4>  next_to_clean<3c>
2007-01-09 22:53:43.660093500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:43.660094500 <4>  time_stamp       <8c3b8e4>
2007-01-09 22:53:43.660095500 <4>  next_to_watch<3f>
2007-01-09 22:53:43.660110500 <4>  jiffies  <8c3c6e3>
2007-01-09 22:53:43.660111500 <4>  next_to_watch.status <0>
2007-01-09 22:53:44.660001500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-09 22:53:44.660004500 <4>  Tx Queue <0>
2007-01-09 22:53:44.660005500 <4>  TDH  <3c>
2007-01-09 22:53:44.660006500 <4>  TDT  
2007-01-09 22:53:44.660007500 <4>  next_to_use  
2007-01-09 22:53:44.660008500 <4>  next_to_clean<3c>
2007-01-09 22:53:44.660009500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:44.660009500 <4>  time_stamp   <8c3b8e4>
2007-01-09 22:53:44.660010500 <4>  next_to_watch<3f>
2007-01-09 22:53:44.660026500 <4>  jiffies  <8c3cacb>
2007-01-09 22:53:44.660027500 <4>  next_to_watch.status <0>
2007-01-09 22:53:45.659906500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-09 22:53:45.659909500 <4>  Tx Queue <0>
2007-01-09 22:53:45.659909500 <4>  TDH  <3c>
2007-01-09 22:53:45.659910500 <4>  TDT      
2007-01-09 22:53:45.659911500 <4>  next_to_use  
2007-01-09 22:53:45.659912500 <4>  next_to_clean<3c>
2007-01-09 22:53:45.659913500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:45.659914500 <4>  time_stamp   <8c3b8e4>
2007-01-09 22:53:45.659915500 <4>  next_to_watch<3f>
2007-01-09 22:53:45.659930500 <4>  jiffies  <8c3ceb3>
2007-01-09 22:53:45.659931500 <4>  next_to_watch.status <0>
2007-01-09 22:53:46.659784500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-09 22:53:46.659787500 <4>  Tx Queue <0>
2007-01-09 22:53:46.659788500 <4>  TDH  <3c>
2007-01-09 22:53:46.659788500 <4>  TDT  
2007-01-09 22:53:46.659789500 <4>  next_to_use  
2007-01-09 22:53:46.659790500 <4>  next_to_clean<3c>
2007-01-09 22:53:46.659791500 <4>buffer_info[next_to_clean]
2007-01-09 22:53

e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang

2007-01-09 Thread Sami Farin
Linux 2.6.19.1 SMP on Pentium D, Intel DQ965GF mobo.
Got this while bittorrenting knoppix:

2007-01-09 22:53:40.020693500 <4>NETFILTER drop IN=eth0 OUT= 
MAC=00:19:d1:00:5f:01:00:05:00:1c:58:1c:08:00 SRC=83.46.5.76 DST=80.223.106.128 
LEN=121 TOS=0x00 PREC=0x00 TTL=112 ID=53273 PROTO=ICMP TYPE=3 CODE=3 
[SRC=80.223.106.128 DST=192.168.1.37 LEN=93 TOS=0x00 PREC=0x00 TTL=45 ID=0 DF 
PROTO=UDP SPT=6881 DPT=6895 LEN=73 ] 
2007-01-09 22:53:41.660249500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-09 22:53:41.660253500 <4>  Tx Queue <0>
2007-01-09 22:53:41.660254500 <4>  TDH  <3c>
2007-01-09 22:53:41.660255500 <4>  TDT  
2007-01-09 22:53:41.660255500 <4>  next_to_use  
2007-01-09 22:53:41.660256500 <4>  next_to_clean<3c>
2007-01-09 22:53:41.660257500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:41.660258500 <4>  time_stamp   <8c3b8e4>
2007-01-09 22:53:41.660259500 <4>  next_to_watch<3f>
2007-01-09 22:53:41.660274500 <4>  jiffies  <8c3bf13>
2007-01-09 22:53:41.660275500 <4>  next_to_watch.status <0>
2007-01-09 22:53:42.660365500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-09 22:53:42.660368500 <4>  Tx Queue <0>
2007-01-09 22:53:42.660369500 <4>  TDH  <3c>
2007-01-09 22:53:42.660370500 <4>  TDT  
2007-01-09 22:53:42.660370500 <4>  next_to_use  
2007-01-09 22:53:42.660371500 <4>  next_to_clean<3c>
2007-01-09 22:53:42.660372500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:42.660373500 <4>  time_stamp   <8c3b8e4>
2007-01-09 22:53:42.660374500 <4>  next_to_watch<3f>
2007-01-09 22:53:42.660389500 <4>  jiffies  <8c3c2fb>
2007-01-09 22:53:42.660390500 <4>  next_to_watch.status <0>
2007-01-09 22:53:43.660086500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-09 22:53:43.660089500 <4>  Tx Queue <0>
2007-01-09 22:53:43.660090500 <4>  TDH  <3c>
2007-01-09 22:53:43.660091500 <4>  TDT  
2007-01-09 22:53:43.660092500 <4>  next_to_use  
2007-01-09 22:53:43.660093500 <4>  next_to_clean<3c>
2007-01-09 22:53:43.660093500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:43.660094500 <4>  time_stamp       <8c3b8e4>
2007-01-09 22:53:43.660095500 <4>  next_to_watch<3f>
2007-01-09 22:53:43.660110500 <4>  jiffies  <8c3c6e3>
2007-01-09 22:53:43.660111500 <4>  next_to_watch.status <0>
2007-01-09 22:53:44.660001500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-09 22:53:44.660004500 <4>  Tx Queue <0>
2007-01-09 22:53:44.660005500 <4>  TDH  <3c>
2007-01-09 22:53:44.660006500 <4>  TDT  
2007-01-09 22:53:44.660007500 <4>  next_to_use  
2007-01-09 22:53:44.660008500 <4>  next_to_clean<3c>
2007-01-09 22:53:44.660009500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:44.660009500 <4>  time_stamp   <8c3b8e4>
2007-01-09 22:53:44.660010500 <4>  next_to_watch<3f>
2007-01-09 22:53:44.660026500 <4>  jiffies  <8c3cacb>
2007-01-09 22:53:44.660027500 <4>  next_to_watch.status <0>
2007-01-09 22:53:45.659906500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-09 22:53:45.659909500 <4>  Tx Queue <0>
2007-01-09 22:53:45.659909500 <4>  TDH  <3c>
2007-01-09 22:53:45.659910500 <4>  TDT      
2007-01-09 22:53:45.659911500 <4>  next_to_use  
2007-01-09 22:53:45.659912500 <4>  next_to_clean<3c>
2007-01-09 22:53:45.659913500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:45.659914500 <4>  time_stamp   <8c3b8e4>
2007-01-09 22:53:45.659915500 <4>  next_to_watch<3f>
2007-01-09 22:53:45.659930500 <4>  jiffies  <8c3ceb3>
2007-01-09 22:53:45.659931500 <4>  next_to_watch.status <0>
2007-01-09 22:53:46.659784500 <3>e1000: eth0: e1000_clean_tx_irq: Detected Tx 
Unit Hang
2007-01-09 22:53:46.659787500 <4>  Tx Queue <0>
2007-01-09 22:53:46.659788500 <4>  TDH  <3c>
2007-01-09 22:53:46.659788500 <4>  TDT  
2007-01-09 22:53:46.659789500 <4>  next_to_use  
2007-01-09 22:53:46.659790500 <4>  next_to_clean<3c>
2007-01-09 22:53:46.659791500 <4>buffer_info[next_to_clean]
2007-01-09 22:53:46.659792500 <