Re: EEH error in doing DMA with PEX 8619
On Tue, Apr 11, 2017 at 5:37 PM, Benjamin Herrenschmidt [via linuxppc] wrote:
>
> Another possibility would be if the requests from the PLX have a
> different initiator ID on the bus than the device you are setting up
> the DMA for.
>

Here is the problem, I think. lspci lists three PEX 8619 devices, and they are supported by two different modules:

[root@localhost PlxSdk]# lspci -nnn | grep 8619
0001:01:00.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8619 16-lane, 16-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA [10b5:8619] (rev ba)
0001:01:00.1 System peripheral [0880]: PLX Technology, Inc. PEX 8619 16-lane, 16-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA [10b5:8619] (rev ba)
0001:02:01.0 Bridge [0680]: PLX Technology, Inc. PEX 8619 16-lane, 16-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA [10b5:8619] (rev ba)

[root@localhost PlxSdk]# lsmod | grep 8000
Plx8000_DMA 69021 0
Plx8000_NT 73848 0

[root@localhost ~]# dmesg | grep Probe
[ 1875.493576] Plx8000_NT: Probe: 8619 10B5 [D1 01:00.1]
[ 1875.493584] Plx8000_NT: Probe: -- Unsupported Device --
...
[ 1875.493867] Plx8000_NT: Probe: 8619 10B5 [D1 02:01.0]
[ 1876.973489] Plx8000_DMA: Probe: 8619 10B5 [D1 01:00.1]

In my test, the DMA buffers are allocated for (bus 2, device 1, function 0) in module Plx8000_NT, but the DMA is issued by (bus 1, device 0, function 1) in module Plx8000_DMA. EEH reports the error against (bus 1, device 0, function 1).

[ 1908.426579] Plx8000_DMA: Ch 0 - DMA _6060 --> _6080 (65536 bytes)

[root@localhost ~]# dmesg | grep Bus\ Phy
[ 1875.495524] Plx8000_NT: Bus Phys Addr: 605f
[ 1878.096744] Plx8000_DMA: Bus Phys Addr: 6001
[ 1892.745698] Plx8000_NT: Bus Phys Addr: 6060
[ 1892.746348] Plx8000_NT: Bus Phys Addr: 6080

[root@localhost ~]# dmesg | grep bus
[ 1875.495463] Debug Plx_dma_buffer_alloc: bus 2, device 1, function 0
[ 1876.973699] Debug AddDevice: Device bus 1 device 0 function 1
[ 1876.975155] Debug AddDevice: Device bus 1 device 0 function 1
[ 1876.976641] Debug AddDevice: Device bus 1 device 0 function 2
[ 1877.360606] Debug AddDevice: Device bus 1 device 0 function 3
[ 1877.763869] Debug AddDevice: Device bus 1 device 0 function 4
[ 1878.069865] Debug Plx_dma_buffer_alloc: bus 1, device 0, function 1
[ 1892.745446] Debug Plx_dma_buffer_alloc: bus 2, device 1, function 0
[ 1892.746109] Debug Plx_dma_buffer_alloc: bus 2, device 1, function 0
[ 1908.426649] Debug PlxDmaTransferBlock: DMA device bus 1 device 0 function 1
[ 1908.428483] Debug plx_err_detected: Device bus 1 device 0 function 1
[ 1917.490481] Debug plx_slot_reset: Device bus 1 device 0 function 1
[ 1917.490625] Debug plx_resume: Device bus 1 device 0 function 1

--
View this message in context: http://linuxppc.10917.n7.nabble.com/EEH-error-in-doing-DMA-with-PEX-8619-tp121121p121259.html
Sent from the linuxppc-dev mailing list archive at Nabble.com.
Re: EEH error in doing DMA with PEX 8619
On Tue, Apr 11, 2017 at 5:37 PM, Benjamin Herrenschmidt [via linuxppc] wrote:
> Another possibility would be if the requests from the PLX have a
> different initiator ID on the bus than the device you are setting up
> the DMA for.

Is there a way to check the initiator ID from within the driver? I would like to verify this.
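For reference, the initiator (requester) ID of a PCIe function is its 16-bit bus/device/function number: the bus in the high byte, then 5 bits of device and 3 bits of function. The sketch below is a user-space illustration of that encoding only; the helper names devfn_pack() and requester_id() are hypothetical and are not part of the PLX SDK or the kernel. In a driver, the same fields are available from struct pci_dev as bus->number and devfn. Note that this gives the ID a function is expected to use, not proof of what the hardware actually puts on the bus.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pack device (slot) and function numbers into the
 * 8-bit devfn field, using the same layout as the kernel's PCI_DEVFN().
 */
static uint8_t devfn_pack(unsigned slot, unsigned func)
{
    assert(slot < 32 && func < 8);   /* 5-bit device, 3-bit function */
    return (uint8_t)((slot << 3) | func);
}

/*
 * Hypothetical helper: build the 16-bit requester ID, with the bus number
 * in the high byte and devfn in the low byte.
 */
static uint16_t requester_id(unsigned bus, unsigned slot, unsigned func)
{
    assert(bus < 256);
    return (uint16_t)((bus << 8) | devfn_pack(slot, func));
}
```

With the devices from this thread, requester_id(1, 0, 1) yields 0x0101 for 0001:01:00.1 (the DMA function) and requester_id(2, 1, 0) yields 0x0208 for 0001:02:01.0 (the NT function), so the two functions do not share an ID.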
Re: EEH error in doing DMA with PEX 8619
I did another test:

- Call dma_set_mask_and_coherent(&pPciDev->dev, DMA_BIT_MASK(32)) in probe;
- Use the DMA address or the bus address in the DMA transfer.

But the EEH error remains. All sources are based on PLX SDK 7.25.

Note: The sample test runs in user space. It allocates memory and starts the DMA through the PLX API. The original sample, NT_DmaTest, does DMA between BARx and host memory. I simplified it: it allocates two host memory buffers and does DMA between them.

Device probe
===
(Driver/Source.Plx8000_DMA/Driver.c)

int
AddDevice(
    DRIVER_OBJECT  *pDriverObject,
    struct pci_dev *pPciDev
    )
{
    U8                channel;
    int               status;
    U32               RegValue;
    DEVICE_OBJECT    *fdo;
    DEVICE_OBJECT    *pDevice;
    DEVICE_EXTENSION *pdx;

    // Allocate memory for the device object
    fdo = kmalloc( sizeof(DEVICE_OBJECT), GFP_KERNEL );
    if (fdo == NULL)
    {
        ErrorPrintf(("ERROR - memory allocation for device object failed\n"));
        return (-ENOMEM);
    }

    // Initialize device object
    RtlZeroMemory( fdo, sizeof(DEVICE_OBJECT) );
    fdo->DriverObject    = pDriverObject;       // Save parent driver object
    fdo->DeviceExtension = &(fdo->DeviceInfo);

    // Enable the device
    if (pci_enable_device( pPciDev ) == 0)
    {
        DebugPrintf(("Enabled PCI device\n"));
    }
    else
    {
        ErrorPrintf(("WARNING - PCI device enable failed\n"));
    }

#if 1 /* Newly added: set DMA mask as suggested on linuxppc */
    {
        int err;

        printk("Debug %s: dma_set_mask_and_coherent()...\n", __func__);
        err = dma_set_mask_and_coherent(&pPciDev->dev, DMA_BIT_MASK(32));
        if (err != 0)
        {
            printk("Error %s: Failed dma_set_mask_and_coherent(). ret = %d\n", __func__, err);
            // Release the device object allocated above before bailing out
            kfree( fdo );
            return err;
        }
    }
#endif

    // Enable bus mastering
    pci_set_master( pPciDev );

    //
    // Initialize the device extension
    //

    pdx = fdo->DeviceExtension;

    // Clear device extension
    RtlZeroMemory( pdx, sizeof(DEVICE_EXTENSION) );

    // Store parent device object
    pdx->pDeviceObject = fdo;

    // Save the OS-supplied PCI object
    pdx->pPciDevice = pPciDev;

    // Set initial device state
    pdx->State = PLX_STATE_STOPPED;

    // Set initial power state
    pdx->PowerState = PowerDeviceD0;

    // Store device location information
    pdx->Key.domain       = pci_domain_nr(pPciDev->bus);
    pdx->Key.bus          = pPciDev->bus->number;
    pdx->Key.slot         = PCI_SLOT(pPciDev->devfn);
    pdx->Key.function     = PCI_FUNC(pPciDev->devfn);
    pdx->Key.DeviceId     = pPciDev->device;
    pdx->Key.VendorId     = pPciDev->vendor;
    pdx->Key.SubVendorId  = pPciDev->subsystem_vendor;
    pdx->Key.SubDeviceId  = pPciDev->subsystem_device;
    pdx->Key.DeviceNumber = pDriverObject->DeviceCount;

    // Set API access mode
    pdx->Key.ApiMode = PLX_API_MODE_PCI;

    // Update Revision ID
    PLX_PCI_REG_READ( pdx, PCI_REG_CLASS_REV, &RegValue );
    pdx->Key.Revision = (U8)(RegValue & 0xFF);

    // Set device mode
    pdx->Key.DeviceMode = PLX_CHIP_MODE_STANDARD;

    // Set PLX-specific port type
    pdx->Key.PlxPortType = PLX_SPEC_PORT_DMA;

    // Build device name
    sprintf( pdx->LinkName, PLX_DRIVER_NAME "-%d", pDriverObject->DeviceCount );

    // Initialize work queue for ISR DPC queueing
    PLX_INIT_WORK(
        &(pdx->Task_DpcForIsr),
        DpcForIsr,                // DPC routine
        &(pdx->Task_DpcForIsr)    // DPC parameter (pre-2.6.20 only)
        );

    // Initialize ISR spinlock
    spin_lock_init( &(pdx->Lock_Isr) );

    // Initialize interrupt wait list
    INIT_LIST_HEAD( &(pdx->List_WaitObjects) );
    spin_lock_init( &(pdx->Lock_WaitObjectsList) );

    // Initialize physical memories list
    INIT_LIST_HEAD( &(pdx->List_PhysicalMem) );
    spin_lock_init( &(pdx->Lock_PhysicalMemList) );

    // Set the DMA mask
    if (Plx_dma_set_mask( pdx, PLX_DMA_BIT_MASK(48) ) == 0)
    {
        DebugPrintf(("Set DMA bit mask to 48-bits\n"));
    }
    else
    {
        DebugPrintf(("ERROR - Unable to set DMA mask to 48-bits, revert to 32-bit\n"));
        Plx_dma_set_mask( pdx, PLX_DMA_BIT_MASK(32) );
    }

    // Set buffer allocation mask
    if (Plx_dma_set_coherent_mask( pdx, PLX_DMA_BIT_MASK(32) ) != 0)
    {
        ErrorPrintf(("WARNING - Set DMA coherent mask failed\n"));
    }

    // Initialize DMA spinlocks
    for (channel = 0; channel < MAX_DMA_CHANNELS; channel++)
    {
        spin_lock_init( &(pdx->Lock_Dma[channel]) );
    }

    //
    // Add to driver device list
    //

    // Acquire Device List lock
    spin_lock( &(pDriverObject->Lock_DeviceList) );

    // Get device list head
    pDevice = pDriverObject->DeviceObject;

    if (pDevice == NULL)
    {
        // Add device as first in list
        pDriverObject->DeviceObject = fdo;
    }
    else
    {
        // Go to end of list
        while (pDevice->NextDevice != NULL)
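
As a side note on the masks in the probe code above: it first requests a 32-bit mask with dma_set_mask_and_coherent() and later requests a 48-bit streaming mask plus a 32-bit coherent mask through the PLX wrappers. The user-space sketch below only illustrates what such a mask means for a buffer. dma_bit_mask() reproduces the arithmetic of the kernel's DMA_BIT_MASK(n); fits_mask() and the example addresses used with it are hypothetical and are not taken from the logs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same arithmetic as the kernel's DMA_BIT_MASK(n): the low n bits set. */
static uint64_t dma_bit_mask(unsigned n)
{
    assert(n >= 1 && n <= 64);
    return (n == 64) ? ~0ULL : ((1ULL << n) - 1);
}

/*
 * Hypothetical helper: true if every byte of [addr, addr + len) can be
 * addressed under the given contiguous mask.
 */
static bool fits_mask(uint64_t addr, uint64_t len, uint64_t mask)
{
    uint64_t last;

    assert(len > 0);
    last = addr + len - 1;
    if (last < addr)                 /* address range wrapped around */
        return false;
    return (last & ~mask) == 0;      /* no bits set above the mask */
}
```

For a 65536-byte transfer like the one in this thread, a buffer placed at the hypothetical address 0x100000000 would pass fits_mask() with dma_bit_mask(48) but fail with dma_bit_mask(32). A mask only limits which addresses the device may be handed; it does not by itself say anything about which function's IOMMU mapping a given address belongs to.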
Re: EEH error in doing DMA with PEX 8619
Thanks for your reply. I fixed my test according to your suggestion. The CPU physical addresses (0x 1f9e40 and 0x 1f82c0), obtained with virt_to_phys(), are now used instead of the DMA addresses, that is, the bus physical addresses (0x 60a0 and 0x 60c0). However, EEH still reports an error.

Memory info.
==
[130508.050783] Plx8000_NT: Received PLX message ===>
[130508.050784] Plx8000_NT: PLX_IOCTL_PHYSICAL_MEM_ALLOCATE
[130508.050785] Plx8000_NT: Attempt to allocate physical memory (1953KB)
[130508.051165] Plx8000_NT: Allocated physical memory...
[130508.051167] Plx8000_NT: CPU Phys Addr: 1f9e40
[130508.051168] Plx8000_NT: Bus Phys Addr: 60a0
[130508.051170] Plx8000_NT: Kernel VA: c01f9e40
[130508.051171] Plx8000_NT: Size : 1E8480h (1MB)
[130508.051173] Plx8000_NT: ...Completed message
[130508.051184] Plx8000_NT:
[130508.051185] Plx8000_NT: Received message ===> MMAP
[130508.051187] Plx8000_NT: Mapped Phys (1f9e40) ==> User VA (3fff83ad)
[130508.051189] Plx8000_NT: ...Completed message
[130508.051196] Plx8000_NT:
[130508.051198] Plx8000_NT: Received PLX message ===>
[130508.051199] Plx8000_NT: PLX_IOCTL_PHYSICAL_MEM_ALLOCATE
[130508.051200] Plx8000_NT: Attempt to allocate physical memory (1953KB)
[130508.051562] Plx8000_NT: Allocated physical memory...
[130508.051564] Plx8000_NT: CPU Phys Addr: 1f82c0
[130508.051565] Plx8000_NT: Bus Phys Addr: 60c0
[130508.051566] Plx8000_NT: Kernel VA: c01f82c0
[130508.051568] Plx8000_NT: Size : 1E8480h (1MB)
[130508.051569] Plx8000_NT: ...Completed message
[130508.051580] Plx8000_NT:
[130508.051581] Plx8000_NT: Received message ===> MMAP
[130508.051583] Plx8000_NT: Mapped Phys (1f82c0) ==> User VA (3fff838e)
[130508.051585] Plx8000_NT: ...Completed message
[130508.051600] Plx8000_NT:

EEH info.
[130515.365924] Plx8000_DMA: Received PLX message ===>
[130515.365972] Plx8000_DMA: PLX_IOCTL_DMA_TRANSFER_BLOCK
[130515.366033] PLX DMA[PlxDmaTransferBlock-2479]
[130515.366084] PLX DMA[PlxDmaTransferBlock-2488]
[130515.366131] PLX DMA[PlxDmaTransferBlock-2495]
[130515.366181] Plx8000_DMA: Ch 0 - DMA 001F_9E40 --> 001F_82C0 (65536 bytes)
[130515.366250] PLX DMA[PlxDmaTransferBlock-2503]
[130515.366296] PLX DMA[PlxDmaTransferBlock-2511]
[130515.366343] PLX DMA[PlxDmaTransferBlock-2516]
[130515.366392] PLX DMA[PlxDmaTransferBlock-2521]
[130515.366440] PLX DMA[PlxDmaTransferBlock-2532]
[130515.366487] PLX DMA[PlxDmaTransferBlock-2535]
[130515.366537] PLX DMA[PlxDmaTransferBlock-2539]
[130515.366584] PLX DMA[PlxDmaTransferBlock-2550]
[130515.366632] PLX DMA[PlxDmaTransferBlock-2557]
[130515.366681] PLX DMA[PlxDmaTransferBlock-2562]
[130515.366728] Plx8000_DMA: Start DMA transfer...
[130515.366775] PLX DMA[PlxDmaTransferBlock-2565]
[130515.366826] PLX DMA[PlxDmaTransferBlock-2569]
[130515.366868] EEH: Frozen PE#1 on PHB#1 detected
[130515.366872] EEH: PE location: Slot4, PHB location: N/A
[130515.367997] EEH: This PCI device has failed 1 times in the last hour
[130515.367997] EEH: Notify device drivers to shutdown
[130515.368006] EEH: Collect temporary log
[130515.368072] EEH: of node=0001:01:00:0
[130515.368075] EEH: PCI device/vendor: 861910b5
[130515.368077] EEH: PCI cmd/status register: 00100547
[130515.368079] EEH: Bridge secondary status:
[130515.368081] EEH: Bridge control: 0002
[130515.368081] EEH: PCI-E capabilities and status follow:
[130515.368091] EEH: PCI-E 00: 0052a410 8004 0046 cc82
[130515.368098] EEH: PCI-E 10: 0082
[130515.368099] EEH: PCI-E 20:
[130515.368100] EEH: PCI-E AER capability register set follows:
[130515.368109] EEH: PCI-E AER 00: 13810001 00062030
[130515.368116] EEH: PCI-E AER 10: 2000 00ff
[130515.368122] EEH: PCI-E AER 20:
[130515.368125] EEH: PCI-E AER 30: 0e0e0e0e
[130515.368127] EEH: of node=0001:01:00:1
[130515.368294] Plx8000_DMA: ...Completed message
[130515.368295] PLX DMA[Dispatch_IoControl-1053]
[130515.368295] PLX DMA[Dispatch_IoControl-1061]
[130515.368297] Plx8000_DMA:
[130515.368298] Plx8000_DMA: Received PLX message ===>
[130515.368298] Plx8000_DMA: PLX_IOCTL_NOTIFICATION_WAIT
[130515.368299] Plx8000_DMA: Waiting for Interrupt wait object (c03c0705f880) to wake-up
[130515.369283] EEH: PCI device/vendor: 861910b5
[130515.369336] EEH: PCI cmd/status register: 10100546
[130515.369384] EEH: PCI-E capabilities and status follow:
[130515.369440] EEH: PCI-E 00: 0002a410 8fe4 0020204e cc82
[130515.369506] EEH: PCI-E 10: 0082
[130515.369564] EEH: PCI-E 20:
[130515.393162] EEH: PCI-E AER capability register set follows:
[130515.420590] EEH: PCI-E AER 00: 1f410001 00062030
[130515.441475] EEH: PCI-E AER 10: 2000 01ff
[130515.454700] EEH: PCI-E AER 20:
EEH error in doing DMA with PEX 8619
Hi all!

I am porting the PLX driver for the PEX 8619 to a POWER8 machine running CentOS 7.3. The PEX 8619 is used as an NTB (Non-Transparent Bridge).

First, two DMA buffers are allocated with dma_alloc_coherent(), and their physical addresses are:

src: 0x _6060
dst: 0x _6080

Then a DMA transfer is started, and an EEH error is reported in dmesg. The same DMA test passes on an x86_64 platform.

Here are the details. Any suggestions are appreciated!

[root@localhost ~]# uname -r
3.10.0-514.10.2.el7.ppc64le
[root@localhost ~]# cat /etc/system-release
CentOS Linux release 7.3.1611 (AltArch)
[root@localhost ~]# dmesg --clear
[root@localhost ~]# dmesg -w
[72579.982217] usb 1-1.3: USB disconnect, device number 61
[72581.516186] usb 1-1.3: new low-speed USB device number 62 using xhci_hcd
[72581.643767] usb 1-1.3: New USB device found, idVendor=04ca, idProduct=0061
[72581.644045] usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[72581.644135] usb 1-1.3: Product: USB Optical Mouse
[72581.644184] usb 1-1.3: Manufacturer: PixArt
[72581.680383] input: PixArt USB Optical Mouse as /devices/pci0003:00/0003:00:00.0/0003:01:00.0/0003:02:09.0/0003:0d:00.0/usb1/1-1/1-1.3/1-1.3:1.0/input/input1246
[72581.680806] hid-generic 0003:04CA:0061.04DF: input,hidraw1: USB HID v1.11 Mouse [PixArt USB Optical Mouse] on usb-0003:0d:00.0-1.3/input0
[72582.424769] Plx8000_NT: < >
[72582.425013] Plx8000_NT: PLX 8000_NT driver v7.25 (64-bit)
[72582.425058] Plx8000_NT: Supports Linux kernel v3.10.0-514.10.2.el7.ppc64le
[72582.425115] Plx8000_NT: Allocated global driver object (c03c8427cc00)
[72582.425120] Plx8000_NT: Registered driver (MajorID = 247)
[72582.425161] Plx8000_NT:
[72582.425167] Plx8000_NT: Probe: 8619 10B5 [D1 01:00.1]
[72582.425180] Plx8000_NT: Probe: -- Unsupported Device --
[72582.425204] Plx8000_NT:
[72582.425206] Plx8000_NT: Probe: 8619 10B5 [D1 02:01.0]
[72582.425222] Plx8000_NT: Enabled PCI device
[72582.425233] Plx8000_NT: Created Device (Plx8000_NT-0)
[72582.425235] Plx8000_NT: Start: 8619 10B5 [D1 02:01.0]
[72582.425237] Debug StartDevice 723: Reading PCI header command...
[72582.425385] Debug StartDevice 725: Reading PCI header command... = 0x100146
[72582.425445] Plx8000_NT:Resource 00
[72582.425447] Plx8000_NT: Type : Memory
[72582.425452] Plx8000_NT: PCI BAR 0: 8100
[72582.425454] Plx8000_NT: Phys Addr: 3FE08100
[72582.425456] Plx8000_NT: Size : 2h (128KB)
[72582.425458] Plx8000_NT: Property : Non-Prefetchable 32-bit
[72582.425475] Plx8000_NT: Kernel VA: d8008148
[72582.425478] Debug StartDevice 841: Read BAR0[0xd8008148] after map...
[72582.425551] Debug StartDevice 843: Read BAR0[0xd8008148] after map... = 0x861910b5
[72582.425621] Plx8000_NT:Resource 01
[72582.425622] Plx8000_NT: Type : Memory
[72582.425627] Plx8000_NT: PCI BAR 2: 8000
[72582.425629] Plx8000_NT: Phys Addr: 3FE08000
[72582.425631] Plx8000_NT: Size : 40h (4MB)
[72582.425633] Plx8000_NT: Property : Non-Prefetchable 32-bit
[72582.425639] Plx8000_NT: Kernel VA: d8008400
[72582.425641] Debug StartDevice 849: Read BAR2[0xd8008400] after map...
[72582.425727] Debug StartDevice 851: Read BAR2[0xd8008400] after map... = 0xf000eef3
[72582.425798] Plx8000_NT:Resource 02
[72582.425799] Plx8000_NT: Type : Memory
[72582.425804] Plx8000_NT: PCI BAR 3: 8040
[72582.425806] Plx8000_NT: Phys Addr: 3FE08040
[72582.425808] Plx8000_NT: Size : 40h (4MB)
[72582.425809] Plx8000_NT: Property : Non-Prefetchable 32-bit
[72582.425813] Plx8000_NT: Kernel VA: d8008480
[72582.425815] Plx8000_NT:Resource 03
[72582.425816] Plx8000_NT: Type : Memory
[72582.425821] Plx8000_NT: PCI BAR 4: 8080
[72582.425822] Plx8000_NT: Phys Addr: 3FE08080
[72582.425824] Plx8000_NT: Size : 40h (4MB)
[72582.425826] Plx8000_NT: Property : Non-Prefetchable 32-bit
[72582.425830] Plx8000_NT: Kernel VA: d8008500
[72582.425831] Plx8000_NT:Resource 04
[72582.425832] Plx8000_NT: Type : Memory
[72582.425837] Plx8000_NT: PCI BAR 5: 80C0
[72582.425839] Plx8000_NT: Phys Addr: 3FE080C0
[72582.425841] Plx8000_NT: Size : 40h (4MB)
[72582.425842] Plx8000_NT: Property : Non-Prefetchable 32-bit
[72582.425846] Plx8000_NT: Kernel VA: d8008580
[72582.425848] Debug StartDevice 862: Reading PCI header command...
[72582.425911] Debug StartDevice 864: Reading PCI header