On 06/13/2016 07:40 PM, Lutz Vieweg wrote:
> On 06/13/2016 04:46 AM, Wan ZongShun wrote:
>> Firstly, I need to know if your ethernet card works well now or not
>> after you set iommu=pt.
>
> Too early to tell - the NIC worked for the last 4 days now without
> failing, however, that is only about the same time as it took after
> the upgrade to linux-4.6.1 before the bug was encountered, first.

I can now say that after using the option iommu=pt with linux-4.6.1,
the machine ran for more than 2 months without problems.

For other reasons (btrfs-stuff) I had to upgrade the machine to
linux-4.7.2 last week, and the "iommu=pt" option wasn't active
after this upgrade.

It only took 4 days until the "AMD-Vi: Event logged IO_PAGE_FAULT...
ixgbe Detected Tx Unit Hang" issue occurred again.

So this evening, I'll reboot linux-4.7.2 with "iommu=pt" again,
as that really seemed to help.

Regards,

Lutz Vieweg

>> If your ethernet card with 64bit(not 32bit) DMA addressable cap, that
>> is ok, you will not be impacted by bounce buffer.
>
>> But iommu=pt is a terrible option, that make all devices bypass the iommu.
>
> Why is that terrible? The documentation I found on what iommu=pt actually
> means were pretty scarce, but I noticed how many places recommended to use
> this option for 10G NICs.
>
>> If you want to get further help, Please try:
>>
>> (1)Please add 'amd_iommu_dump' option in your kernel boot option, and
>> send your full kernel logs, lspci info, don't add iommu=pt.
>> (2) Add amd_iommu=fullflush option to kernel boot option, just try it.
>
> Will try that when the NIC becomes unavailable again.
>
>>> One more thing I find curious, but this didn't change with "iommu=pt":
>>>>
>>>> [ 0.000000] AGP: Checking aperture...
>>>> [ 0.000000] AGP: No AGP bridge found
>>>> [ 0.000000] AGP: Node 0: aperture [bus addr 0x00000000-0x01ffffff]
>>>> (32MB)
>>>> [ 0.000000] AGP: Your BIOS doesn't leave an aperture memory hole
>>>> [ 0.000000] AGP: Please enable the IOMMU option in the BIOS setup
>>>> [ 0.000000] AGP: This costs you 64MB of RAM
>>>> [ 0.000000] AGP: Mapping aperture over RAM [mem 0xcc000000-0xcfffffff]
>>>> (65536KB)
>>>
>>> I checked and the IOMMU-option is definitely enabled in the BIOS setup.
>>> So I assume right that these message are irrelevant (since AGP as a whole
>>> is irrelevant on this server)?
>>
>> Please cat /proc/iomem, send the information.
>
> Here it is:
>> 00000000-00000fff : reserved
>> 00001000-00097bff : System RAM
>> 00097c00-0009ffff : reserved
>> 000a0000-000bffff : PCI Bus 0000:00
>> 000c0000-000c7fff : Video ROM
>> 000ce800-000d43ff : Adapter ROM
>> 000d4800-000d57ff : Adapter ROM
>> 000e6000-000fffff : reserved
>> 000f0000-000fffff : System ROM
>> 00100000-d7e7ffff : System RAM
>> 01000000-01688c05 : Kernel code
>> 01688c06-01d4f53f : Kernel data
>> 01eea000-02174fff : Kernel bss
>> d7e80000-d7e8dfff : RAM buffer
>> d7e8e000-d7e8ffff : reserved
>> d7e90000-d7eb3fff : ACPI Tables
>> d7eb4000-d7edffff : ACPI Non-volatile Storage
>> d7ee0000-d7ffffff : reserved
>> d9000000-daffffff : PCI Bus 0000:40
>> d9000000-d90003ff : IOAPIC 2
>> d9010000-d9013fff : amd_iommu
>> db000000-dcffffff : PCI Bus 0000:00
>> db000000-dbffffff : PCI Bus 0000:01
>> db000000-dbffffff : 0000:01:04.0
>> db000000-dbffffff : mgadrmfb_vram
>> dcd00000-dcffffff : PCI Bus 0000:04
>> dcdfc000-dcdfffff : 0000:04:00.0
>> dcdfc000-dcdfffff : ixgbe
>> dce00000-dcffffff : 0000:04:00.0
>> dce00000-dcffffff : ixgbe
>> dd000000-dfffffff : PCI Bus 0000:00
>> def00000-df7fffff : PCI Bus 0000:01
>> deffc000-deffffff : 0000:01:04.0
>> deffc000-deffffff : mgadrmfb_mmio
>> df000000-df7fffff : 0000:01:04.0
>> dfaf6000-dfaf6fff : 0000:00:12.1
>> dfaf6000-dfaf6fff : ohci_hcd
>> dfaf7000-dfaf7fff : 0000:00:12.0
>> dfaf7000-dfaf7fff : ohci_hcd
>> dfaf8400-dfaf87ff : 0000:00:11.0
>> dfaf8400-dfaf87ff : ahci
>> dfaf8800-dfaf88ff : 0000:00:12.2
>> dfaf8800-dfaf88ff : ehci_hcd
>> dfaf8c00-dfaf8cff : 0000:00:13.2
>> dfaf8c00-dfaf8cff : ehci_hcd
>> dfaf9000-dfaf9fff : 0000:00:13.1
>> dfaf9000-dfaf9fff : ohci_hcd
>> dfafa000-dfafafff : 0000:00:13.0
>> dfafa000-dfafafff : ohci_hcd
>> dfafb000-dfafbfff : 0000:00:14.5
>> dfafb000-dfafbfff : ohci_hcd
>> dfb00000-dfbfffff : PCI Bus 0000:02
>> dfb1c000-dfb1ffff : 0000:02:00.1
>> dfb1c000-dfb1ffff : igb
>> dfb20000-dfb3ffff : 0000:02:00.1
>> dfb40000-dfb5ffff : 0000:02:00.1
>> dfb40000-dfb5ffff : igb
>> dfb60000-dfb7ffff : 0000:02:00.1
>> dfb60000-dfb7ffff : igb
>> dfb9c000-dfb9ffff : 0000:02:00.0
>> dfb9c000-dfb9ffff : igb
>> dfba0000-dfbbffff : 0000:02:00.0
>> dfbc0000-dfbdffff : 0000:02:00.0
>> dfbc0000-dfbdffff : igb
>> dfbe0000-dfbfffff : 0000:02:00.0
>> dfbe0000-dfbfffff : igb
>> dfc00000-dfcfffff : PCI Bus 0000:03
>> dfc3c000-dfc3ffff : 0000:03:00.0
>> dfc3c000-dfc3ffff : mpt2sas
>> dfc40000-dfc7ffff : 0000:03:00.0
>> dfc40000-dfc7ffff : mpt2sas
>> dfc80000-dfcfffff : 0000:03:00.0
>> dfd00000-dfdfffff : PCI Bus 0000:04
>> dfd80000-dfdfffff : 0000:04:00.0
>> dfe00000-dfffffff : PCI Bus 0000:05
>> dfeb0000-dfebffff : 0000:05:00.0
>> dfeb0000-dfebffff : mpt2sas
>> dfec0000-dfefffff : 0000:05:00.0
>> dfec0000-dfefffff : mpt2sas
>> dff00000-dfffffff : 0000:05:00.0
>> e0000000-efffffff : PCI MMCONFIG 0000 [bus 00-ff]
>> e0000000-efffffff : reserved
>> e0000000-efffffff : pnp 00:0a
>> f6000000-f6003fff : amd_iommu
>> fec00000-fec003ff : IOAPIC 0
>> fec10000-fec1001f : pnp 00:04
>> fec20000-fec203ff : IOAPIC 1
>> fed00000-fed003ff : HPET 2
>> fed00000-fed003ff : PNP0103:00
>> fed40000-fed44fff : PCI Bus 0000:00
>> fee00000-fee00fff : Local APIC
>> fee00000-fee00fff : pnp 00:03
>> ffb80000-ffbfffff : pnp 00:04
>> ffe00000-ffffffff : reserved
>> ffe50000-ffe5e05f : pnp 00:04
>> 100000000-2026ffffff : System RAM
>> 2027000000-2027ffffff : RAM buffer
>
> Regards,
>
> Lutz Vieweg
>
------------------------------------------------------------------------------
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
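
Since the "iommu=pt" option was silently lost across the upgrade to linux-4.7.2 described at the top of this message, a quick check of the running kernel's command line after each reboot may help. The following is only a minimal sketch and not something from the thread: the function name iommu_passthrough_requested is made up for illustration, and it assumes that "iommu=" accepts comma-separated values and that no quoted parameters containing spaces are in use.

```python
from pathlib import Path


def iommu_passthrough_requested(cmdline: str) -> bool:
    """Return True if an `iommu=` token on the command line includes `pt`.

    Hypothetical helper for illustration; it only inspects the text of the
    command line and says nothing about what the IOMMU driver actually did.
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        # Compare the exact key so that amd_iommu=fullflush is not counted.
        # Assumption: values may be comma-separated, e.g. iommu=soft,pt.
        if sep and key == "iommu" and "pt" in value.split(","):
            return True
    return False


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel was booted with.
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        active = iommu_passthrough_requested(proc_cmdline.read_text())
        print("iommu=pt on kernel command line:", active)
```

Running this once after a kernel upgrade would show whether the bootloader entry for the new kernel still carries the option, before waiting days for the Tx hang to reappear.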