------- Comment From [email protected] 2015-03-11 19:15 EDT-------
This looks fixed with 3.19.0-8-generic #8-Ubuntu.
The adapter was able to recover from EEH:

[ 2694.622586] EEH: Notify device drivers to shutdown
[ 2694.622587] mlx4_core 0004:01:00.0: device was reset successfully
[ 2694.622589] mlx4_core 0004:01:00.0: mlx4_pci_err_detected was called
[ 2694.622594] mlx4_en 0004:01:00.0: Internal error detected, restarting device
[ 2694.622786] mlx4_en: eth14: Close port called
[ 2694.846830] mlx4_en 0004:01:00.0: removed PHC
[ 2694.874036] EEH: Collect temporary log
[ 2694.879101] EEH: of node=/pciex@3fffe42000000/pci@0/ethernet@0
[ 2694.879465] EEH: PCI device/vendor: 100715b3
[ 2694.879478] EEH: PCI cmd/status register: 00100142
[ 2694.879479] EEH: PCI-E capabilities and status follow:
[ 2694.879544] EEH: PCI-E 00: 00020010 10008e02 0020204e 0843f483
[ 2694.879597] EEH: PCI-E 10: 10830040 00000000 00000000 00000000
[ 2694.879598] EEH: PCI-E 20: 00000000
[ 2694.879599] EEH: PCI-E AER capability register set follows:
[ 2694.879666] EEH: PCI-E AER 00: 18c20001 00000000 00000000 00062010
[ 2694.879719] EEH: PCI-E AER 10: 00000000 00002000 000001e0 00000000
[ 2694.879772] EEH: PCI-E AER 20: 00000000 00000000 00000000 00000000
[ 2694.879785] EEH: PCI-E AER 30: 00000000 00000000
[ 2694.879787] PHB3 PHB#4 Diag-data (Version: 1)
[ 2694.879789] brdgCtl:     00000002
[ 2694.879790] UtlSts:      00200000 00000000 00000000
[ 2694.879791] RootSts:     00000040 00400000 f0830048 00100147 00000000
[ 2694.879792] PhbSts:      0000001c00000000 0000001c00000000
[ 2694.879793] Lem:         0000000000100000 42498e327f502eae 0000000000000000
[ 2694.879795] InAErr:      8000000000000000 8000000000000000 0402008000000000 0000000000000000
[ 2694.879796] PE[  1] A/B: 8480002b00000000 8000000000000000
[ 2694.879797] PE[  2] A/B: 8000000000000000 8000000000000000
[ 2694.879798] PE[  3] A/B: 8000000000000000 8000000000000000
[ 2694.879799] PE[  4] A/B: 8000000000000000 8000000000000000
[ 2694.879800] PE[  5] A/B: 8000000000000000 8000000000000000
[ 2694.879801] EEH: Reset without hotplug activity
[ 2698.898176] EEH: Notify device drivers the completion of reset
[ 2698.898181] mlx4_core 0004:01:00.0: mlx4_pci_slot_reset was called
[ 2698.898218] mlx4_core 0004:01:00.0: enabling device (0140 -> 0142)
[ 2705.396286] mlx4_core 0004:01:00.0: PCIe link speed is 8.0GT/s, device supports 8.0GT/s
[ 2705.396288] mlx4_core 0004:01:00.0: PCIe link width is x8, device supports x8
[ 2706.143789] mlx4_en 0004:01:00.0: registered PHC clock
[ 2706.143864] mlx4_en 0004:01:00.0: Activating port:1
[ 2706.159496] mlx4_en: eth11: Using 256 TX rings
[ 2706.159504] mlx4_en: eth11: Using 8 RX rings
[ 2706.159506] mlx4_en: eth11:   frag:0 - size:1518 prefix:0 stride:1536
[ 2706.159722] mlx4_en: eth11: Initializing port
[ 2706.160022] mlx4_en 0004:01:00.0: Activating port:2
[ 2706.165214] mlx4_core 0004:01:00.0 eth14: renamed from eth11
[ 2706.188419] mlx4_en: eth11: Using 256 TX rings
[ 2706.188427] mlx4_en: eth11: Using 8 RX rings
[ 2706.188430] mlx4_en: eth11:   frag:0 - size:1518 prefix:0 stride:1536
[ 2706.188660] mlx4_en: eth11: Initializing port
[ 2706.197316] EEH: Notify device driver to resume
[ 2706.525987] mlx4_core 0004:01:00.0 eth16: renamed from eth11
[ 2707.487156] mlx4_en: eth14: Link Up
[ 2707.542052] mlx4_en: eth16: Link Up
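
For anyone re-testing this, the recovery sequence in the log above could be checked with a small script rather than by eye. This is only a rough sketch: the marker strings are copied from this log, and the function names (eeh_recovered, links_up) are made up for illustration. Other kernel versions may word these messages differently.

```python
import re

# EEH messages copied from the log above, in the order they appeared.
# Assumption: the kernel under test prints the same wording.
EEH_RECOVERY_MARKERS = [
    "EEH: Notify device drivers to shutdown",
    "EEH: Notify device drivers the completion of reset",
    "EEH: Notify device driver to resume",
]

# Matches a dmesg-style line such as "[ 2694.622586] message".
TIMESTAMPED_LINE = re.compile(r"^\[\s*\d+\.\d+\]\s*(.*)$")


def eeh_recovered(log_text):
    """Return True if every EEH marker occurs, in order, in log_text."""
    remaining = list(EEH_RECOVERY_MARKERS)
    for line in log_text.splitlines():
        match = TIMESTAMPED_LINE.match(line)
        message = match.group(1) if match else line
        # Only the next expected marker is checked, which enforces ordering.
        if remaining and remaining[0] in message:
            remaining.pop(0)
    return not remaining


def links_up(log_text):
    """Return the mlx4_en interface names that reported 'Link Up'."""
    return re.findall(r"mlx4_en: (\w+): Link Up", log_text)
```

With the dmesg output saved to a file, eeh_recovered(open(path).read()) would report whether the shutdown, reset and resume notifications all appeared, and links_up() lists the ports that came back.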

thanks.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1422481

Title:
  mlx4 not recovering from EEH in Ubuntu 15.04 (Mellanox)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1422481/+subscriptions

-- 
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
