Hi, I found that with the recent pve kernel the network controller simply crashes and cannot be brought back to life without powering the system off and on.
This occurs on an interface with almost no traffic, after 1-2 days. I found the following conversations on the net describing exactly the same situation:

https://bugzilla.redhat.com/show_bug.cgi?id=625776
https://lkml.org/lkml/2012/3/17/48
http://lists.centos.org/pipermail/centos/2011-September/118027.html
http://sourceforge.net/p/e1000/bugs/358/

Background info:
latest Proxmox 3.1, fresh install, up to date
no heavy traffic, almost nothing
6 x gigabit LAN cards:

07:00.0 Ethernet controller: Intel Corporation 82574L Gigabit Network Connection
        Subsystem: Intel Corporation Device 0000
        Flags: bus master, fast devsel, latency 0, IRQ 40
        Memory at e8500000 (32-bit, non-prefetchable) [size=128K]
        I/O ports at 3000 [size=32]
        Memory at e8520000 (32-bit, non-prefetchable) [size=16K]
        Expansion ROM at e8d00000 [disabled] [size=2K]
        Capabilities: [c8] Power Management version 2
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [e0] Express Endpoint, MSI 00
        Capabilities: [a0] MSI-X: Enable- Count=1 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number 00-03-1d-ff-ff-0b-8a-e7
        Kernel driver in use: e1000e

ethtool -i eth1
driver: e1000e
version: 2.4.14-NAPI
firmware-version: 3.1-1
bus-info: 0000:03:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

ifconfig eth1
eth1      Link encap:Ethernet  HWaddr 00:03:1d:0b:8a:e3
          inet6 addr: fe80::203:1dff:fe0b:8ae3/64 Scope:Link
          UP BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:70200 errors:354515190463890 dropped:59085865077315 overruns:0 frame:236343460309260
          TX packets:13254 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:28083621 (26.7 MiB)  TX bytes:2211516 (2.1 MiB)
          Interrupt:17 Memory:e8900000-e8920000

Kernel Command line: BOOT_IMAGE=/vmlinuz-2.6.32-23-pve root=UUID=dd4c475c-b71d-4497-aeb1-c36a06a8c46f ro quiet

dmesg report:

------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:267 dev_watchdog+0x28a/0x2a0() (Tainted: P --------------- )
Hardware name: HuronRiver Platform
NETDEV WATCHDOG: eth1 (e1000e): transmit queue 0 timed out
Modules linked in: fuse vzethdev vznetdev pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop simfs vzrst nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 vzcpt nf_conntrack vzdquota vzmon vzdev ip6t_REJECT ip6table_mangle ip6table_filter ip6_tables xt_length xt_hl xt_tcpmss xt_TCPMSS iptable_mangle iptable_filter xt_multiport xt_limit xt_dscp ipt_REJECT ip_tables vhost_net tun macvtap macvlan kvm_intel kvm vzevent ib_iser rdma_cm ib_addr iw_cm ib_cm ib_sa ib_mad ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nfsd nfs nfs_acl auth_rpcgss fscache lockd sunrpc ipv6 ext3 jbd snd_hda_codec_realtek i915 snd_pcsp snd_hda_intel snd_hda_codec snd_hwdep zfs(P) zunicode(P) snd_pcm snd_page_alloc drm_kms_helper zavl(P) zcommon(P) parport_pc i2c_i801 video snd_timer iTCO_wdt iTCO_vendor_support drm i2c_algo_bit serio_raw parport shpchp snd i2c_core output soundcore ext4 jbd2 mbcache znvpair(P) spl zlib_deflate sg e1000e ahci [last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper veid: 0 Tainted: P --------------- 2.6.32-23-pve #1
Call Trace:
 [] ? warn_slowpath_common+0x87/0xe0
 [] ? warn_slowpath_fmt+0x46/0x50
 [] ? dev_watchdog+0x28a/0x2a0
 [] ? internal_add_timer+0xcb/0x130
 [] ? dev_watchdog+0x0/0x2a0
 [] ? run_timer_softirq+0x176/0x370
 [] ? native_apic_msr_write+0x35/0x40
 [] ? __do_softirq+0x11b/0x260
 [] ? tick_dev_program_event+0x65/0xc0
 [] ? tick_program_event+0x2a/0x30
 [] ? call_softirq+0x1c/0x30
 [] ? do_softirq+0x75/0xb0
 [] ? irq_exit+0xc5/0xd0
 [] ? smp_apic_timer_interrupt+0x70/0x9b
 [] ? apic_timer_interrupt+0x13/0x20
 [] ? intel_idle+0xdb/0x160
 [] ? intel_idle+0xb9/0x160
 [] ? cpuidle_idle_call+0x94/0x130
 [] ? cpu_idle+0xa9/0x100
 [] ? rest_init+0x85/0x94
 [] ? start_kernel+0x40b/0x417
 [] ? x86_64_start_reservations+0x126/0x12a
 [] ? x86_64_start_kernel+0xf7/0x106
---[ end trace c5f8a6b8504af481 ]---
e1000e 0000:03:00.0: eth1: Reset adapter unexpectedly
e1000e 0000:03:00.0: eth1: Timesync Tx Control register not set as expected
e1000e 0000:03:00.0: eth1: Error reading PHY register
e1000e: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
....
e1000e 0000:03:00.0: eth1: Reset adapter unexpectedly
e1000e 0000:03:00.0: eth1: Timesync Tx Control register not set as expected
vmbr1: port 1(eth1) entering disabled state

Any idea?

Thanks,
István
