Lars,

You can change the driver's message level by running
'ethtool -s ethX msglvl 0x2c00' after the driver loads. When the issue occurs,
the driver will then log hardware ring info to the message log.
Please give it a try and send the log after the issue occurs.
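
For reference, here is a rough sketch of what the 0x2c00 mask selects. It
assumes the standard NETIF_MSG_* bit values from the kernel's
include/linux/netdevice.h; the helper name decode_msglvl is made up for
illustration and is not part of ethtool or the driver.

```shell
#!/bin/sh
# decode_msglvl MASK: print the names of the NETIF_MSG_* bits set in MASK.
# Bit values are assumed from include/linux/netdevice.h; this helper is
# only an illustration, not an ethtool feature.
decode_msglvl() {
    mask=$(( $1 ))
    out=""
    for entry in \
        0x0001:drv 0x0002:probe 0x0004:link 0x0008:timer \
        0x0010:ifdown 0x0020:ifup 0x0040:rx_err 0x0080:tx_err \
        0x0100:tx_queued 0x0200:intr 0x0400:tx_done 0x0800:rx_status \
        0x1000:pktdata 0x2000:hw 0x4000:wol
    do
        bit=${entry%%:*}     # text before the colon: the bit value
        name=${entry#*:}     # text after the colon: the flag name
        if [ $(( $mask & $bit )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    echo "$out"
}

decode_msglvl 0x2c00
```

So the mask should enable the tx_done, rx_status and hw message classes.
After setting it, 'ethtool ethX' should report the new value under
"Current message level".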

Also, is this the only board that has this issue? I looked in our lab but
couldn't find an S1200BTL for reproduction.

-Tushar

>-----Original Message-----
>From: Lars Maschke [mailto:[email protected]]
>Sent: Wednesday, March 06, 2013 1:58 AM
>To: Dave, Tushar N
>Cc: [email protected]
>Subject: Re-4: e1000e detected hardware unit hang problem
>
>
>
>Hi Tushar,
>
>I tried the built-in driver of the kernels 3.3.8, 3.4.32 and
>3.7.9, but had no success.
>
>I know this because I have seen the error message three or four times on
>the screen. Once it was logged in kern.log, as you can see here:
>
>
>Mar  1 00:53:36 lightning kernel: e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
>Mar  1 00:53:36 lightning kernel:   TDH                  <a7>
>Mar  1 00:53:36 lightning kernel:   TDT                  <c6>
>Mar  1 00:53:36 lightning kernel:   next_to_use          <c6>
>Mar  1 00:53:36 lightning kernel:   next_to_clean        <a5>
>Mar  1 00:53:36 lightning kernel: buffer_info[next_to_clean]:
>Mar  1 00:53:36 lightning kernel:   time_stamp           <5a0f4f>
>Mar  1 00:53:36 lightning kernel:   next_to_watch        <a7>
>Mar  1 00:53:36 lightning kernel:   jiffies              <5a1199>
>Mar  1 00:53:36 lightning kernel:   next_to_watch.status <0>
>Mar  1 00:53:36 lightning kernel: MAC Status             <40080083>
>Mar  1 00:53:36 lightning kernel: PHY Status             <796d>
>Mar  1 00:53:36 lightning kernel: PHY 1000BASE-T Status  <3c00>
>Mar  1 00:53:36 lightning kernel: PHY Extended Status    <3000>
>Mar  1 00:53:36 lightning kernel: PCI Status             <10>
>Mar  1 00:53:38 lightning kernel: e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
>Mar  1 00:53:38 lightning kernel:   TDH                  <a7>
>Mar  1 00:53:38 lightning kernel:   TDT                  <c6>
>Mar  1 00:53:38 lightning kernel:   next_to_use          <c6>
>Mar  1 00:53:38 lightning kernel:   next_to_clean        <a5>
>Mar  1 00:53:38 lightning kernel: buffer_info[next_to_clean]:
>Mar  1 00:53:38 lightning kernel:   time_stamp           <5a0f4f>
>Mar  1 00:53:38 lightning kernel:   next_to_watch        <a7>
>Mar  1 00:53:38 lightning kernel:   jiffies              <5a138d>
>Mar  1 00:53:38 lightning kernel:   next_to_watch.status <0>
>Mar  1 00:53:38 lightning kernel: MAC Status             <40080083>
>Mar  1 00:53:38 lightning kernel: PHY Status             <796d>
>Mar  1 00:53:38 lightning kernel: PHY 1000BASE-T Status  <3c00>
>Mar  1 00:53:38 lightning kernel: PHY Extended Status    <3000>
>Mar  1 00:53:38 lightning kernel: PCI Status             <10>
>Mar  1 00:53:40 lightning kernel: e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
>Mar  1 00:53:40 lightning kernel:   TDH                  <a7>
>Mar  1 00:53:40 lightning kernel:   TDT                  <c6>
>Mar  1 00:53:40 lightning kernel:   next_to_use          <c6>
>Mar  1 00:53:40 lightning kernel:   next_to_clean        <a5>
>Mar  1 00:53:40 lightning kernel: buffer_info[next_to_clean]:
>Mar  1 00:53:40 lightning kernel:   time_stamp           <5a0f4f>
>Mar  1 00:53:40 lightning kernel:   next_to_watch        <a7>
>Mar  1 00:53:40 lightning kernel:   jiffies              <5a1581>
>Mar  1 00:53:40 lightning kernel:   next_to_watch.status <0>
>Mar  1 00:53:40 lightning kernel: MAC Status             <40080083>
>Mar  1 00:53:40 lightning kernel: PHY Status             <796d>
>Mar  1 00:53:40 lightning kernel: PHY 1000BASE-T Status  <3c00>
>Mar  1 00:53:40 lightning kernel: PHY Extended Status    <3000>
>Mar  1 00:53:40 lightning kernel: PCI Status             <10>
>Mar  1 00:53:42 lightning kernel: e1000e 0000:00:19.0 eth0: Detected Hardware Unit Hang:
>Mar  1 00:53:42 lightning kernel:   TDH                  <a7>
>Mar  1 00:53:42 lightning kernel:   TDT                  <c6>
>Mar  1 00:53:42 lightning kernel:   next_to_use          <c6>
>Mar  1 00:53:42 lightning kernel:   next_to_clean        <a5>
>Mar  1 00:53:42 lightning kernel: buffer_info[next_to_clean]:
>Mar  1 00:53:42 lightning kernel:   time_stamp           <5a0f4f>
>Mar  1 00:53:42 lightning kernel:   next_to_watch        <a7>
>Mar  1 00:53:42 lightning kernel:   jiffies              <5a1775>
>Mar  1 00:53:42 lightning kernel:   next_to_watch.status <0>
>Mar  1 00:53:42 lightning kernel: MAC Status             <40080083>
>Mar  1 00:53:42 lightning kernel: PHY Status             <796d>
>Mar  1 00:53:42 lightning kernel: PHY 1000BASE-T Status  <3c00>
>Mar  1 00:53:42 lightning kernel: PHY Extended Status    <3000>
>Mar  1 00:53:42 lightning kernel: PCI Status             <10>
>Mar  1 00:53:43 lightning kernel: ------------[ cut here ]------------
>Mar  1 00:53:43 lightning kernel: WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x1da/0x1f0()
>Mar  1 00:53:43 lightning kernel: Hardware name: S1200BTL
>Mar  1 00:53:43 lightning kernel: NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
>Mar  1 00:53:43 lightning kernel: Modules linked in: af_packet xt_REDIRECT
>xt_DSCP xt_dscp xt_statistic
>xt_CT xt_NFLOG nfnetlink_log nfnetlink ipt_ULOG xt_LOG xt_time
>xt_connlimit xt_helper xt_realm xt_NFQUEUE xt_tcpmss xt_tcpudp xt_addrtype
>xt_pkttype iptable_raw xt_TPROXY nf_tproxy_core xt_CLASSIFY xt_mark
>xt_hashlimit xt_comment ipt_REJECT xt_length xt_connmark xt_owner
>xt_recent xt_iprange xt_physdev xt_policy iptable_mangle xt_nat
>xt_multiport xt_conntrack nf_nat_ftp iptable_nat nf_conntrack_ipv4
>nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack_ftp nf_conntrack
>iptable_filter ip_tables x_tables ipmi_poweroff ipmi_devintf ipmi_si
>ipmi_watchdog ipmi_msghandler minix ppp_mppe ppp_generic slhc tun e1000
>usb_storage mousedev usbhid dm_mod microcode i2c_i801 i2c_core sr_mod
>ehci_hcd e1000e(O) firmware_class cdrom usbcore usb_common evdev unix
>Mar  1 00:53:43 lightning kernel: Pid: 0, comm: swapper/2 Tainted: G O 3.7.9 #2
>Mar  1 00:53:43 lightning kernel: Call Trace:
>Mar  1 00:53:43 lightning kernel:  [<c013696d>] warn_slowpath_common+0x6d/0xa0
>Mar  1 00:53:43 lightning kernel:  [<c046a06a>] ? dev_watchdog+0x1da/0x1f0
>Mar  1 00:53:43 lightning kernel:  [<c046a06a>] ? dev_watchdog+0x1da/0x1f0
>Mar  1 00:53:43 lightning kernel:  [<c0136a1e>] warn_slowpath_fmt+0x2e/0x30
>Mar  1 00:53:43 lightning kernel:  [<c046a06a>] dev_watchdog+0x1da/0x1f0
>Mar  1 00:53:43 lightning kernel:  [<c0469e90>] ? pfifo_fast_dequeue+0xe0/0xe0
>Mar  1 00:53:43 lightning kernel:  [<c0142e7d>] call_timer_fn.isra.32+0x1d/0x80
>Mar  1 00:53:43 lightning kernel:  [<c0143051>] run_timer_softirq+0x171/0x180
>Mar  1 00:53:43 lightning kernel:  [<c0469e90>] ? pfifo_fast_dequeue+0xe0/0xe0
>Mar  1 00:53:43 lightning kernel:  [<c013d6c0>] __do_softirq+0x90/0x140
>Mar  1 00:53:43 lightning kernel:  [<c013d630>] ? __tasklet_schedule+0x60/0x60
>Mar  1 00:53:43 lightning kernel:  <IRQ> [<c013d875>] ? irq_exit+0x65/0x70
>Mar  1 00:53:43 lightning kernel:  [<c0127894>] ? smp_apic_timer_interrupt+0x54/0x90
>Mar  1 00:53:43 lightning kernel:  [<c04eee3d>] ? apic_timer_interrupt+0x2d/0x34
>Mar  1 00:53:43 lightning kernel:  [<c02e45b6>] ? acpi_idle_enter_bm+0x251/0x286
>Mar  1 00:53:43 lightning kernel:  [<c042a335>] ? cpuidle_enter+0x15/0x20
>Mar  1 00:53:43 lightning kernel:  [<c042a8fe>] ? cpuidle_idle_call+0x6e/0xd0
>Mar  1 00:53:43 lightning kernel:  [<c0111305>] ? cpu_idle+0x55/0xa0
>Mar  1 00:53:43 lightning kernel:  [<c04e522b>] ? start_secondary+0x19b/0x1a1
>Mar  1 00:53:43 lightning kernel: ---[ end trace a49ef186404ae76d ]---
>Mar  1 00:53:43 lightning kernel: e1000e 0000:00:19.0 eth0: Reset adapter unexpectedly
>
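
A quick sanity check on the numbers in the dump above, as a rough sketch
only. It assumes the tick rate can be inferred from two consecutive hang
reports (the log does not state HZ, and syslog timestamps only have
one-second resolution), and that the TX ring has 256 descriptors, which I
believe is the e1000e default but which nothing in this thread confirms.
The variable names are just for illustration.

```shell
#!/bin/sh
# Values copied from the first two hang reports in the quoted log.
time_stamp=0x5a0f4f   # time_stamp of buffer_info[next_to_clean]
jiffies_1=0x5a1199    # jiffies in the report logged at 00:53:36
jiffies_2=0x5a138d    # jiffies in the report logged at 00:53:38
tdh=0xa7              # TDH, transmit descriptor head
tdt=0xc6              # TDT, transmit descriptor tail
ring_size=256         # ASSUMPTION: default TX ring size, not from the log

# Ticks per second, inferred from two reports logged about 2 seconds apart.
hz=$(( ($jiffies_2 - $jiffies_1) / 2 ))

# How long the oldest uncleaned buffer had been waiting at the first report.
stalled_ticks=$(( $jiffies_1 - $time_stamp ))
stalled_ms=$(( $stalled_ticks * 1000 / $hz ))

# Descriptors handed to the hardware but not yet processed (TDT - TDH,
# wrapped around the ring).
pending=$(( ($tdt - $tdh + $ring_size) % $ring_size ))

echo "inferred HZ:         $hz"
echo "stalled for:         $stalled_ticks ticks (~$stalled_ms ms)"
echo "descriptors pending: $pending"
```

With the values from the log this gives 250 ticks per second, a stall of
about 2.3 seconds at the first report, and 31 descriptors outstanding. TDH
stays at a7 in all four reports, so the hardware head is not advancing.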
>Greets
>Lars
>
>
>
>Original Message processed by david(r)
>RE: Re-2: e1000e detected hardware unit hang problem (06-Mrz-2013 0:17)
>From:   Dave, Tushar N
>To:Lars Maschke
>Cc:[email protected]
>
>
>
>>-----Original Message-----
>>From: Lars Maschke [mailto:[email protected]]
>>Sent: Monday, March 04, 2013 3:11 PM
>>To: Dave, Tushar N
>>Cc: [email protected]
>>Subject: Re-2: e1000e detected hardware unit hang problem
>>
>>Hello Tushar,
>>
>>First of all, thanks for your quick reply.
>>
>>That's the point. I don't know why this occurs. If I get the chance, I
>>see a failure of the e1000e driver on the console. The server is
>>completely down and I can't log on to get any other information.
>>
>>The error isn't logged in dmesg.log or syslog on my Debian system.
>>There is no logging at all after the crash.
>
>We need at least the dmesg output from the kernel to start with. How do
>you know that you're getting the "Detected Hardware Unit Hang" message? Is it
>possible to take a picture or connect a serial console to the server to
>retrieve the message after the crash/hang?
>>
>>Only a full reset solves the problem. I get this error every two, three
>>or four days. At the time of the crash no special cron job is running. It
>>occurs only at night, between 0:00h and 2:30h.
>>From now on I will try to reset networking with the following shell script
>>every night, and I hope that it's a good idea:
>
>>---
>>#!/bin/sh
>>
>>/etc/init.d/networking stop
>>/sbin/rmmod e1000e
>>/sbin/modprobe e1000e RxIntDelay=0,0 IntMode=1,1
>>/etc/init.d/networking start
>>/sbin/ethtool -K eth0 tso off
>>/sbin/shorewall restart
>>---
>How many e1000e devices do you have in the system?
>
>>Do You think that the problem can occur when the other Intel "e1000"
>>driver is also loaded on the machine?
>I don't think so; however, if you can, give it a try and let me know if
>anything changes.
>Have you tried the in-kernel e1000e driver?
>
>-Tushar
>
>To: [email protected]
>Cc: [email protected]
>


_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired
