I was able to consistently reproduce this issue on Debian Wheezy by directly
connecting two Dell PowerEdge R620 servers, running constant scp transfers of
large files in both directions, and bringing the interface down and up in a
loop until it breaks (while true; do ip link set "${DEV}" down; sleep 1;
ip link set "${DEV}" up; sleep 9; done).
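For anyone who wants to repeat the test, the loop above could be wrapped in a
small script that stops once the watchdog message appears. This is only a
sketch, not the exact thing that was run: the variable names (IP_CMD, LOG_CMD,
DOWN_SECS, UP_SECS, MAX_CYCLES), the function names and the grep pattern are
assumptions, and it needs root plus a clean kernel log, since an old timeout
message still in dmesg would make it stop immediately.

```shell
#!/bin/sh
# Sketch only: flap an interface until the kernel logs a transmit timeout.
# IP_CMD, LOG_CMD, DOWN_SECS, UP_SECS and MAX_CYCLES are hypothetical knobs.

IP_CMD="${IP_CMD:-ip}"
LOG_CMD="${LOG_CMD:-dmesg}"
DOWN_SECS="${DOWN_SECS:-1}"
UP_SECS="${UP_SECS:-9}"
MAX_CYCLES="${MAX_CYCLES:-0}"   # 0 means keep going until the timeout shows up

# Succeeds if the kernel log contains the watchdog message for device $1.
saw_tx_timeout() {
    $LOG_CMD | grep -q "NETDEV WATCHDOG: $1 .*transmit queue [0-9]* timed out"
}

# Take device $1 down and up repeatedly until the timeout is logged
# or MAX_CYCLES (if non-zero) is reached.
flap_until_timeout() {
    dev="$1"
    cycles=0
    while ! saw_tx_timeout "$dev"; do
        if [ "$MAX_CYCLES" -gt 0 ] && [ "$cycles" -ge "$MAX_CYCLES" ]; then
            echo "no timeout after $cycles cycles"
            return 1
        fi
        $IP_CMD link set "$dev" down
        sleep "$DOWN_SECS"
        $IP_CMD link set "$dev" up
        sleep "$UP_SECS"
        cycles=$((cycles + 1))
    done
    echo "transmit timeout seen after $cycles cycles"
}

# Run only when a device name is given, e.g.: sh flap.sh eth0
if [ "$#" -gt 0 ]; then
    flap_until_timeout "$1"
fi
```

The scp transfers still have to be started separately on both machines; the
script only handles the link flapping and the log check.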
I then updated the tg3 driver to version 3.137h, but the issue was still
reproducible.
I also tried Red Hat 7.1, and it works fine there (kernel 3.10.0-229.el7.x86_64,
tg3 3.137).
Finally, I tried Debian Jessie (kernel 3.16, tg3 3.137), and the issue is not
reproducible there either.
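To compare systems the same way, the kernel release and the driver version can
be read with uname -r and ethtool -i. A minimal sketch follows; ETHTOOL_CMD and
driver_info are hypothetical names used only for illustration, and the parsing
assumes the usual "driver:" and "version:" lines in the ethtool -i output.

```shell
#!/bin/sh
# Sketch only: print kernel release and NIC driver/version for a bug report.
# ETHTOOL_CMD is a hypothetical override so the parsing can be tried without hardware.
ETHTOOL_CMD="${ETHTOOL_CMD:-ethtool}"

# Print "driver version" for device $1, taken from `ethtool -i` output.
driver_info() {
    $ETHTOOL_CMD -i "$1" | awk -F': ' '
        $1 == "driver"  { drv = $2 }
        $1 == "version" { ver = $2 }
        END { print drv, ver }'
}

# Run only when a device name is given, e.g.: sh nicinfo.sh eth0
if [ "$#" -gt 0 ]; then
    echo "kernel: $(uname -r)"
    echo "driver: $(driver_info "$1")"
fi
```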
Logs from Debian Wheezy:
Aug 28 14:16:27 bond-111 kernel: [ 519.336495] ------------[ cut here ]------------
Aug 28 14:16:27 bond-111 kernel: [ 519.336505] WARNING: at /build/linux-l1NKWv/linux-3.2.68/net/sched/sch_generic.c:256 dev_watchdog+0xf2/0x151()
Aug 28 14:16:27 bond-111 kernel: [ 519.336508] Hardware name: PowerEdge R620
Aug 28 14:16:27 bond-111 kernel: [ 519.336510] NETDEV WATCHDOG: eth0 (tg3): transmit queue 0 timed out
Aug 28 14:16:27 bond-111 kernel: [ 519.336513] Modules linked in: drbd lru_cache nf_conntrack_tftp nf_conntrack virtio_net virtio_blk virtio_pci virtio_ring virtio kvm bonding sb_edac coretemp snd_pcm crc32c_intel ghash_clmulni_intel aesni_intel snd_page_alloc snd_timer snd soundcore aes_x86_64 edac_core joydev pcspkr shpchp iTCO_wdt iTCO_vendor_support evdev dcdbas aes_generic cryptd processor button thermal_sys acpi_power_meter wmi ext3 mbcache jbd microcode usbhid hid sg sr_mod sd_mod cdrom crc_t10dif ahci libahci libata ehci_hcd megaraid_sas scsi_mod usbcore usb_common tg3 libphy [last unloaded: drbd]
Aug 28 14:16:27 bond-111 kernel: [ 519.336580] Pid: 29511, comm: scp Not tainted 3.2.0-4-amd64 #1 Debian 3.2.68-1+deb7u2
Aug 28 14:16:27 bond-111 kernel: [ 519.336583] Call Trace:
Aug 28 14:16:27 bond-111 kernel: [ 519.336585] <IRQ> [<ffffffff81046dbd>] ? warn_slowpath_common+0x78/0x8c
Aug 28 14:16:27 bond-111 kernel: [ 519.336599] [<ffffffff81046e69>] ? warn_slowpath_fmt+0x45/0x4a
Aug 28 14:16:27 bond-111 kernel: [ 519.336605] [<ffffffff812a8c91>] ? netif_tx_lock+0x40/0x75
Aug 28 14:16:27 bond-111 kernel: [ 519.336609] [<ffffffff812a8e01>] ? dev_watchdog+0xf2/0x151
Aug 28 14:16:27 bond-111 kernel: [ 519.336613] [<ffffffff810525f4>] ? run_timer_softirq+0x19a/0x261
Aug 28 14:16:27 bond-111 kernel: [ 519.336615] [<ffffffff812a8d0f>] ? netif_tx_unlock+0x49/0x49
Aug 28 14:16:27 bond-111 kernel: [ 519.336618] [<ffffffff8104c46a>] ? __do_softirq+0xb9/0x177
Aug 28 14:16:27 bond-111 kernel: [ 519.336622] [<ffffffff813583ec>] ? call_softirq+0x1c/0x30
Aug 28 14:16:27 bond-111 kernel: [ 519.336627] [<ffffffff8100fa91>] ? do_softirq+0x3c/0x7b
Aug 28 14:16:27 bond-111 kernel: [ 519.336630] [<ffffffff8104c6d2>] ? irq_exit+0x3c/0x99
Aug 28 14:16:27 bond-111 kernel: [ 519.336632] [<ffffffff8100f66a>] ? do_IRQ+0x82/0x98
Aug 28 14:16:27 bond-111 kernel: [ 519.336637] [<ffffffff813513ee>] ? common_interrupt+0x6e/0x6e
Aug 28 14:16:27 bond-111 kernel: [ 519.336638] <EOI> [<ffffffff811b43cd>] ? copy_user_generic_string+0x2d/0x40
Aug 28 14:16:27 bond-111 kernel: [ 519.336647] [<ffffffff810b4d6a>] ? iov_iter_copy_from_user_atomic+0x70/0x93
Aug 28 14:16:27 bond-111 kernel: [ 519.336651] [<ffffffff810b580f>] ? generic_file_buffered_write+0x143/0x259
Aug 28 14:16:27 bond-111 kernel: [ 519.336656] [<ffffffff810b6618>] ? __generic_file_aio_write+0x248/0x278
Aug 28 14:16:27 bond-111 kernel: [ 519.336659] [<ffffffff81036638>] ? should_resched+0x5/0x23
Aug 28 14:16:27 bond-111 kernel: [ 519.336663] [<ffffffff810b66a5>] ? generic_file_aio_write+0x5d/0xb5
Aug 28 14:16:27 bond-111 kernel: [ 519.336668] [<ffffffff810fae68>] ? do_sync_write+0xb4/0xec
Aug 28 14:16:27 bond-111 kernel: [ 519.336672] [<ffffffff811656d1>] ? security_file_permission+0x16/0x2d
Aug 28 14:16:27 bond-111 kernel: [ 519.336674] [<ffffffff810fb559>] ? vfs_write+0xa2/0xe9
Aug 28 14:16:27 bond-111 kernel: [ 519.336677] [<ffffffff810fb736>] ? sys_write+0x45/0x6b
Aug 28 14:16:27 bond-111 kernel: [ 519.336680] [<ffffffff813561b2>] ? system_call_fastpath+0x16/0x1b
Aug 28 14:16:27 bond-111 kernel: [ 519.336682] ---[ end trace 5a2a84798c0630dd ]---
--
https://bugs.launchpad.net/bugs/1331513
Title:
14e4:165f tg3 eth1: transmit timed out, resetting on BCM5720