Re: EEH error in doing DMA with PEX 8619

2017-04-12 Thread IanJiang
On Tue, Apr 11, 2017 at 5:37 PM, Benjamin Herrenschmidt [via linuxppc]
 wrote:
>
> Another possibility would be if the requests from the PLX have a
> different initiator ID on the bus than the device you are setting up
> the DMA for.
>

I think this is the problem.
lspci lists three PEX 8619 devices, and they are handled by two
different modules:

[root@localhost PlxSdk]# lspci -nnn | grep 8619
0001:01:00.0 PCI bridge [0604]: PLX Technology, Inc. PEX 8619 16-lane,
16-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA [10b5:8619] (rev
ba)
0001:01:00.1 System peripheral [0880]: PLX Technology, Inc. PEX 8619
16-lane, 16-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA
[10b5:8619] (rev ba)
0001:02:01.0 Bridge [0680]: PLX Technology, Inc. PEX 8619 16-lane,
16-Port PCI Express Gen 2 (5.0 GT/s) Switch with DMA [10b5:8619] (rev
ba)

[root@localhost PlxSdk]# lsmod | grep 8000
Plx8000_DMA            69021  0
Plx8000_NT             73848  0


[root@localhost ~]# dmesg | grep Probe
[ 1875.493576] Plx8000_NT: Probe: 8619 10B5 [D1 01:00.1]
[ 1875.493584] Plx8000_NT: Probe: -- Unsupported Device --
...
[ 1875.493867] Plx8000_NT: Probe: 8619 10B5 [D1 02:01.0]
[ 1876.973489] Plx8000_DMA: Probe: 8619 10B5 [D1 01:00.1]

In my test, the DMA buffers are allocated against (bus 2, device 1,
function 0) in module Plx8000_NT, but the DMA is issued by (bus 1,
device 0, function 1) in module Plx8000_DMA. The error that EEH reports
is on (bus 1, device 0, function 1).
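
To make the mismatch concrete, here is a minimal user-space sketch that builds the 16-bit PCIe requester ID from bus/device/function, assuming the standard encoding (bus in bits 15:8, device in bits 7:3, function in bits 2:0). The helper names requester_id() and same_requester() are made up for illustration and are not part of the PLX SDK or the kernel. Whether the platform really keys its DMA mappings on this ID is the hypothesis under discussion, not something this code proves.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helper: standard PCIe requester ID encoding.
 * bus -> bits 15:8, device -> bits 7:3, function -> bits 2:0 */
static uint16_t requester_id(unsigned bus, unsigned dev, unsigned fn)
{
    return (uint16_t)(((bus & 0xffu) << 8) |
                      ((dev & 0x1fu) << 3) |
                      (fn & 0x07u));
}

/* Hypothetical helper: true if two functions would present the same
 * requester ID on the bus. */
static bool same_requester(unsigned bus_a, unsigned dev_a, unsigned fn_a,
                           unsigned bus_b, unsigned dev_b, unsigned fn_b)
{
    return requester_id(bus_a, dev_a, fn_a) ==
           requester_id(bus_b, dev_b, fn_b);
}
```

With the values from the debug output, requester_id(2, 1, 0) is 0x0208 for the function the buffers were allocated against, and requester_id(1, 0, 1) is 0x0101 for the function that issues the DMA, so same_requester(2, 1, 0, 1, 0, 1) is false.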


[ 1908.426579] Plx8000_DMA: Ch 0 - DMA _6060 --> _6080 (65536 bytes)

[root@localhost ~]# dmesg | grep Bus\ Phy
[ 1875.495524] Plx8000_NT: Bus Phys Addr: 605f
[ 1878.096744] Plx8000_DMA: Bus Phys Addr: 6001
[ 1892.745698] Plx8000_NT: Bus Phys Addr: 6060
[ 1892.746348] Plx8000_NT: Bus Phys Addr: 6080


[root@localhost ~]# dmesg | grep bus
[ 1875.495463] Debug Plx_dma_buffer_alloc: bus 2, device 1, function 0
[ 1876.973699] Debug AddDevice: Device bus 1 device 0 function 1
[ 1876.975155] Debug AddDevice: Device bus 1 device 0 function 1
[ 1876.976641] Debug AddDevice: Device bus 1 device 0 function 2
[ 1877.360606] Debug AddDevice: Device bus 1 device 0 function 3
[ 1877.763869] Debug AddDevice: Device bus 1 device 0 function 4
[ 1878.069865] Debug Plx_dma_buffer_alloc: bus 1, device 0, function 1
[ 1892.745446] Debug Plx_dma_buffer_alloc: bus 2, device 1, function 0
[ 1892.746109] Debug Plx_dma_buffer_alloc: bus 2, device 1, function 0
[ 1908.426649] Debug PlxDmaTransferBlock: DMA device bus 1 device 0 function 1
[ 1908.428483] Debug plx_err_detected: Device bus 1 device 0 function 1
[ 1917.490481] Debug plx_slot_reset: Device bus 1 device 0 function 1
[ 1917.490625] Debug plx_resume: Device bus 1 device 0 function 1




--
View this message in context: 
http://linuxppc.10917.n7.nabble.com/EEH-error-in-doing-DMA-with-PEX-8619-tp121121p121259.html
Sent from the linuxppc-dev mailing list archive at Nabble.com.

Re: EEH error in doing DMA with PEX 8619

2017-04-11 Thread IanJiang
On Tue, Apr 11, 2017 at 5:37 PM, Benjamin Herrenschmidt [via linuxppc]
 wrote:

> Another possibility would be if the requests from the PLX have a 
> different initiator ID on the bus than the device you are setting up 
> the DMA for. 

Is there a way to check the initiator ID from within the driver? I'd like
to verify this.





Re: EEH error in doing DMA with PEX 8619

2017-04-11 Thread IanJiang
I did another test:
- Call dma_set_mask_and_coherent(&pPciDev->dev, DMA_BIT_MASK(32)) in probe;
- Use either the DMA address or the BUS address for the DMA transfer.
But the EEH error remains.
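
As a reminder of what the mask in that call expresses, here is a small user-space sketch. DMA_BIT_MASK below is a copy of the kernel macro's definition; fits_dma_mask() is a hypothetical helper written only for this example, and the addresses used in the note after the code are made-up values, not the ones from the logs.

```c
#include <stdint.h>
#include <stdbool.h>

/* User-space copy of the kernel's DMA_BIT_MASK(n) definition. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: true if every byte of [addr, addr + len) can be
 * addressed by a device limited to the given mask. */
static bool fits_dma_mask(uint64_t addr, uint64_t len, uint64_t mask)
{
    uint64_t last;

    if (len == 0)
        return false;           /* nothing to transfer */

    last = addr + (len - 1);    /* address of the final byte */
    if (last < addr)
        return false;           /* range wraps around 2^64 */

    return last <= mask;
}
```

For a 65536-byte transfer, an illustrative buffer at 0x60600000 fits under DMA_BIT_MASK(32), while one at 0x100000000 does not and would need at least a 33-bit mask.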

All sources are based on PLX SDK 7.25.
Note: The sample test runs in user space. It allocates memory and starts DMA
through the PLX API.
The original sample NT_DmaTest does DMA between BARx and host memory.
I changed this for simplicity: it now allocates two host memory buffers and
does DMA between them.

Device probe
===
(Driver/Source.Plx8000_DMA/Driver.c)

int
AddDevice(
DRIVER_OBJECT  *pDriverObject,
struct pci_dev *pPciDev
)
{
U8 channel;
int status;
U32 RegValue;
DEVICE_OBJECT *fdo;
DEVICE_OBJECT *pDevice;
DEVICE_EXTENSION *pdx;


// Allocate memory for the device object
fdo =
kmalloc(
sizeof(DEVICE_OBJECT),
GFP_KERNEL
);

if (fdo == NULL)
{
ErrorPrintf(("ERROR - memory allocation for device object failed\n"));
return (-ENOMEM);
}

// Initialize device object
RtlZeroMemory( fdo, sizeof(DEVICE_OBJECT) );

fdo->DriverObject = pDriverObject; // Save parent driver object
fdo->DeviceExtension = &(fdo->DeviceInfo);

// Enable the device
if (pci_enable_device( pPciDev ) == 0)
{
DebugPrintf(("Enabled PCI device\n"));
}
else
{
ErrorPrintf(("WARNING - PCI device enable failed\n"));
}

#if 1
/* Newly added: set the DMA mask as suggested on linuxppc */
{
int err;
printk("Debug %s: dma_set_mask_and_coherent()...\n", __func__);
err = dma_set_mask_and_coherent(&pPciDev->dev, DMA_BIT_MASK(32));
if (err != 0) {
printk("Error %s: Failed dma_set_mask_and_coherent(). ret = %d\n",
__func__, err);
kfree(fdo); /* do not leak the device object on failure */
return err;
}

}
#endif
// Enable bus mastering
pci_set_master( pPciDev );

//
// Initialize the device extension
//

pdx = fdo->DeviceExtension;

// Clear device extension
RtlZeroMemory( pdx, sizeof(DEVICE_EXTENSION) );

// Store parent device object
pdx->pDeviceObject = fdo;

// Save the OS-supplied PCI object
pdx->pPciDevice = pPciDev;

// Set initial device state
pdx->State = PLX_STATE_STOPPED;

// Set initial power state
pdx->PowerState = PowerDeviceD0;

// Store device location information
pdx->Key.domain   = pci_domain_nr(pPciDev->bus);
pdx->Key.bus  = pPciDev->bus->number;
pdx->Key.slot = PCI_SLOT(pPciDev->devfn);
pdx->Key.function = PCI_FUNC(pPciDev->devfn);
pdx->Key.DeviceId = pPciDev->device;
pdx->Key.VendorId = pPciDev->vendor;
pdx->Key.SubVendorId  = pPciDev->subsystem_vendor;
pdx->Key.SubDeviceId  = pPciDev->subsystem_device;
pdx->Key.DeviceNumber = pDriverObject->DeviceCount;

// Set API access mode
pdx->Key.ApiMode = PLX_API_MODE_PCI;

// Update Revision ID
PLX_PCI_REG_READ( pdx, PCI_REG_CLASS_REV, &RegValue );
pdx->Key.Revision = (U8)(RegValue & 0xFF);

// Set device mode
pdx->Key.DeviceMode = PLX_CHIP_MODE_STANDARD;

// Set PLX-specific port type
pdx->Key.PlxPortType = PLX_SPEC_PORT_DMA;

// Build device name
sprintf(
pdx->LinkName,
PLX_DRIVER_NAME "-%d",
pDriverObject->DeviceCount
);

// Initialize work queue for ISR DPC queueing
PLX_INIT_WORK(
&(pdx->Task_DpcForIsr),
DpcForIsr,                // DPC routine
&(pdx->Task_DpcForIsr)    // DPC parameter (pre-2.6.20 only)
);

// Initialize ISR spinlock
spin_lock_init( &(pdx->Lock_Isr) );

// Initialize interrupt wait list
INIT_LIST_HEAD( &(pdx->List_WaitObjects) );
spin_lock_init( &(pdx->Lock_WaitObjectsList) );

// Initialize physical memories list
INIT_LIST_HEAD( &(pdx->List_PhysicalMem) );
spin_lock_init( &(pdx->Lock_PhysicalMemList) );

// Set the DMA mask
if (Plx_dma_set_mask( pdx, PLX_DMA_BIT_MASK(48) ) == 0)
{
DebugPrintf(("Set DMA bit mask to 48-bits\n"));
}
else
{
DebugPrintf(("ERROR - Unable to set DMA mask to 48-bits, revert to 32-bit\n"));
Plx_dma_set_mask( pdx, PLX_DMA_BIT_MASK(32) );
}

// Set buffer allocation mask
if (Plx_dma_set_coherent_mask( pdx, PLX_DMA_BIT_MASK(32) ) != 0)
{
ErrorPrintf(("WARNING - Set DMA coherent mask failed\n"));
}

// Initialize DMA spinlocks
for (channel = 0; channel < MAX_DMA_CHANNELS; channel++)
{
spin_lock_init( &(pdx->Lock_Dma[channel]) );
}

//
// Add to driver device list
//

// Acquire Device List lock
spin_lock( &(pDriverObject->Lock_DeviceList) );

// Get device list head
pDevice = pDriverObject->DeviceObject;

if (pDevice == NULL)
{
// Add device as first in list
pDriverObject->DeviceObject = fdo;
}
else
{
// Go to end of list
while (pDevice->NextDevice != NULL)
 

Re: EEH error in doing DMA with PEX 8619

2017-04-10 Thread IanJiang
Thanks for your reply.

I fixed my test according to your suggestion. The CPU physical addresses (0x
1f9e40 and 0x 1f82c0) obtained with virt_to_phys() are now used
instead of the DMA addresses, i.e. the BUS physical addresses (0x 60a0 and 0x
60c0). However, EEH still reports an error.

Memory info.
==

[130508.050783] Plx8000_NT: Received PLX message ===> 
[130508.050784] Plx8000_NT: PLX_IOCTL_PHYSICAL_MEM_ALLOCATE
[130508.050785] Plx8000_NT: Attempt to allocate physical memory (1953KB)
[130508.051165] Plx8000_NT: Allocated physical memory...
[130508.051167] Plx8000_NT: CPU Phys Addr: 1f9e40
[130508.051168] Plx8000_NT: Bus Phys Addr: 60a0
[130508.051170] Plx8000_NT: Kernel VA: c01f9e40
[130508.051171] Plx8000_NT: Size : 1E8480h (1MB)
[130508.051173] Plx8000_NT: ...Completed message
[130508.051184] Plx8000_NT: 
[130508.051185] Plx8000_NT: Received message ===> MMAP
[130508.051187] Plx8000_NT: Mapped Phys (1f9e40) ==> User VA (3fff83ad)
[130508.051189] Plx8000_NT: ...Completed message
[130508.051196] Plx8000_NT: 
[130508.051198] Plx8000_NT: Received PLX message ===> 
[130508.051199] Plx8000_NT: PLX_IOCTL_PHYSICAL_MEM_ALLOCATE
[130508.051200] Plx8000_NT: Attempt to allocate physical memory (1953KB)
[130508.051562] Plx8000_NT: Allocated physical memory...
[130508.051564] Plx8000_NT: CPU Phys Addr: 1f82c0
[130508.051565] Plx8000_NT: Bus Phys Addr: 60c0
[130508.051566] Plx8000_NT: Kernel VA: c01f82c0
[130508.051568] Plx8000_NT: Size : 1E8480h (1MB)
[130508.051569] Plx8000_NT: ...Completed message
[130508.051580] Plx8000_NT: 
[130508.051581] Plx8000_NT: Received message ===> MMAP
[130508.051583] Plx8000_NT: Mapped Phys (1f82c0) ==> User VA (3fff838e)
[130508.051585] Plx8000_NT: ...Completed message
[130508.051600] Plx8000_NT: 

EEH info.
==

[130515.365924] Plx8000_DMA: Received PLX message ===> 
[130515.365972] Plx8000_DMA: PLX_IOCTL_DMA_TRANSFER_BLOCK
[130515.366033] PLX DMA[PlxDmaTransferBlock-2479]
[130515.366084] PLX DMA[PlxDmaTransferBlock-2488]
[130515.366131] PLX DMA[PlxDmaTransferBlock-2495]
[130515.366181] Plx8000_DMA: Ch 0 - DMA 001F_9E40 --> 001F_82C0 (65536 bytes)
[130515.366250] PLX DMA[PlxDmaTransferBlock-2503]
[130515.366296] PLX DMA[PlxDmaTransferBlock-2511]
[130515.366343] PLX DMA[PlxDmaTransferBlock-2516]
[130515.366392] PLX DMA[PlxDmaTransferBlock-2521]
[130515.366440] PLX DMA[PlxDmaTransferBlock-2532]
[130515.366487] PLX DMA[PlxDmaTransferBlock-2535]
[130515.366537] PLX DMA[PlxDmaTransferBlock-2539]
[130515.366584] PLX DMA[PlxDmaTransferBlock-2550]
[130515.366632] PLX DMA[PlxDmaTransferBlock-2557]
[130515.366681] PLX DMA[PlxDmaTransferBlock-2562]
[130515.366728] Plx8000_DMA: Start DMA transfer...
[130515.366775] PLX DMA[PlxDmaTransferBlock-2565]
[130515.366826] PLX DMA[PlxDmaTransferBlock-2569]
[130515.366868] EEH: Frozen PE#1 on PHB#1 detected
[130515.366872] EEH: PE location: Slot4, PHB location: N/A
[130515.367997] EEH: This PCI device has failed 1 times in the last hour
[130515.367997] EEH: Notify device drivers to shutdown
[130515.368006] EEH: Collect temporary log
[130515.368072] EEH: of node=0001:01:00:0
[130515.368075] EEH: PCI device/vendor: 861910b5
[130515.368077] EEH: PCI cmd/status register: 00100547
[130515.368079] EEH: Bridge secondary status: 
[130515.368081] EEH: Bridge control: 0002
[130515.368081] EEH: PCI-E capabilities and status follow:
[130515.368091] EEH: PCI-E 00: 0052a410 8004 0046 cc82 
[130515.368098] EEH: PCI-E 10: 0082    
[130515.368099] EEH: PCI-E 20:  
[130515.368100] EEH: PCI-E AER capability register set follows:
[130515.368109] EEH: PCI-E AER 00: 13810001   00062030 
[130515.368116] EEH: PCI-E AER 10:  2000 00ff  
[130515.368122] EEH: PCI-E AER 20:     
[130515.368125] EEH: PCI-E AER 30:  0e0e0e0e 
[130515.368127] EEH: of node=0001:01:00:1
[130515.368294] Plx8000_DMA: ...Completed message
[130515.368295] PLX DMA[Dispatch_IoControl-1053]
[130515.368295] PLX DMA[Dispatch_IoControl-1061]
[130515.368297] Plx8000_DMA: 
[130515.368298] Plx8000_DMA: Received PLX message ===> 
[130515.368298] Plx8000_DMA: PLX_IOCTL_NOTIFICATION_WAIT
[130515.368299] Plx8000_DMA: Waiting for Interrupt wait object (c03c0705f880) to wake-up
[130515.369283] EEH: PCI device/vendor: 861910b5
[130515.369336] EEH: PCI cmd/status register: 10100546
[130515.369384] EEH: PCI-E capabilities and status follow:
[130515.369440] EEH: PCI-E 00: 0002a410 8fe4 0020204e cc82 
[130515.369506] EEH: PCI-E 10: 0082    
[130515.369564] EEH: PCI-E 20:  
[130515.393162] EEH: PCI-E AER capability register set follows:
[130515.420590] EEH: PCI-E AER 00: 1f410001   00062030 
[130515.441475] EEH: PCI-E AER 10:  2000 01ff  
[130515.454700] EEH: PCI-E AER 20: 
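
The two "PCI cmd/status register" values in the dump above can be split into their halves with a short user-space sketch. It assumes the value EEH prints is the 32-bit dword at config offset 0x04, with the command register in the low 16 bits and the status register in the high 16 bits. The bit values are the standard PCI ones (the same numbers as in linux/pci_regs.h); cmd_of(), status_of() and status_errors() are hypothetical helper names. This only decodes bits and is not a diagnosis.

```c
#include <stdint.h>

/* Standard PCI command/status bits (values as in linux/pci_regs.h). */
#define PCI_COMMAND_MASTER           0x0004u
#define PCI_STATUS_SIG_TARGET_ABORT  0x0800u
#define PCI_STATUS_REC_TARGET_ABORT  0x1000u
#define PCI_STATUS_REC_MASTER_ABORT  0x2000u
#define PCI_STATUS_SIG_SYSTEM_ERROR  0x4000u
#define PCI_STATUS_DETECTED_PARITY   0x8000u

/* Low half of the dword at offset 0x04: command register. */
static uint16_t cmd_of(uint32_t reg)
{
    return (uint16_t)(reg & 0xffffu);
}

/* High half of the dword at offset 0x04: status register. */
static uint16_t status_of(uint32_t reg)
{
    return (uint16_t)(reg >> 16);
}

/* Only the error-reporting bits of the status register. */
static uint16_t status_errors(uint32_t reg)
{
    const uint32_t mask = PCI_STATUS_SIG_TARGET_ABORT |
                          PCI_STATUS_REC_TARGET_ABORT |
                          PCI_STATUS_REC_MASTER_ABORT |
                          PCI_STATUS_SIG_SYSTEM_ERROR |
                          PCI_STATUS_DETECTED_PARITY;

    return (uint16_t)(status_of(reg) & mask);
}
```

Under that assumption, 0x00100547 (node 0001:01:00:0) has no error bits set in its status half, while 0x10100546 (node 0001:01:00:1) has status 0x1010, where bit 12 is Received Target Abort. Both values have the bus master bit set in the command half.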

EEH error in doing DMA with PEX 8619

2017-04-10 Thread IanJiang
Hi all!

I am porting the PLX driver for the PEX 8619 to a POWER8 machine running
CentOS 7.3. The PEX 8619 is used as an NTB (Non-Transparent Bridge).

First, two DMA buffers are allocated with dma_alloc_coherent(), and their
physical addresses are:
src: 0x _6060
dst: 0x _6080
Then a DMA transfer is started, and an EEH error is reported in dmesg.

The same DMA test works on an x86_64 platform.

Here are the details. Any suggestion is appreciated! 

[root@localhost ~]# uname -r
3.10.0-514.10.2.el7.ppc64le
[root@localhost ~]# cat /etc/system-release
CentOS Linux release 7.3.1611 (AltArch)
[root@localhost ~]# dmesg --clear
[root@localhost ~]# dmesg -w
[72579.982217] usb 1-1.3: USB disconnect, device number 61
[72581.516186] usb 1-1.3: new low-speed USB device number 62 using xhci_hcd
[72581.643767] usb 1-1.3: New USB device found, idVendor=04ca, idProduct=0061
[72581.644045] usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[72581.644135] usb 1-1.3: Product: USB Optical Mouse
[72581.644184] usb 1-1.3: Manufacturer: PixArt
[72581.680383] input: PixArt USB Optical Mouse as /devices/pci0003:00/0003:00:00.0/0003:01:00.0/0003:02:09.0/0003:0d:00.0/usb1/1-1/1-1.3/1-1.3:1.0/input/input1246
[72581.680806] hid-generic 0003:04CA:0061.04DF: input,hidraw1: USB HID v1.11 Mouse [PixArt USB Optical Mouse] on usb-0003:0d:00.0-1.3/input0

[72582.424769] Plx8000_NT: < >
[72582.425013] Plx8000_NT: PLX 8000_NT driver v7.25 (64-bit)
[72582.425058] Plx8000_NT: Supports Linux kernel v3.10.0-514.10.2.el7.ppc64le
[72582.425115] Plx8000_NT: Allocated global driver object (c03c8427cc00)
[72582.425120] Plx8000_NT: Registered driver (MajorID = 247)
[72582.425161] Plx8000_NT:
[72582.425167] Plx8000_NT: Probe: 8619 10B5 [D1 01:00.1]
[72582.425180] Plx8000_NT: Probe: -- Unsupported Device --
[72582.425204] Plx8000_NT:
[72582.425206] Plx8000_NT: Probe: 8619 10B5 [D1 02:01.0]
[72582.425222] Plx8000_NT: Enabled PCI device
[72582.425233] Plx8000_NT: Created Device (Plx8000_NT-0)
[72582.425235] Plx8000_NT: Start: 8619 10B5 [D1 02:01.0]
[72582.425237] Debug StartDevice 723: Reading PCI header command...
[72582.425385] Debug StartDevice 725: Reading PCI header command... = 0x100146
[72582.425445] Plx8000_NT:Resource 00
[72582.425447] Plx8000_NT:  Type : Memory
[72582.425452] Plx8000_NT:  PCI BAR 0: 8100
[72582.425454] Plx8000_NT:  Phys Addr: 3FE08100
[72582.425456] Plx8000_NT:  Size : 2h (128KB)
[72582.425458] Plx8000_NT:  Property : Non-Prefetchable 32-bit
[72582.425475] Plx8000_NT:  Kernel VA: d8008148
[72582.425478] Debug StartDevice 841: Read BAR0[0xd8008148] after map...
[72582.425551] Debug StartDevice 843: Read BAR0[0xd8008148] after map...   = 0x861910b5
[72582.425621] Plx8000_NT:Resource 01
[72582.425622] Plx8000_NT:  Type : Memory
[72582.425627] Plx8000_NT:  PCI BAR 2: 8000
[72582.425629] Plx8000_NT:  Phys Addr: 3FE08000
[72582.425631] Plx8000_NT:  Size : 40h (4MB)
[72582.425633] Plx8000_NT:  Property : Non-Prefetchable 32-bit
[72582.425639] Plx8000_NT:  Kernel VA: d8008400
[72582.425641] Debug StartDevice 849: Read BAR2[0xd8008400] after map...
[72582.425727] Debug StartDevice 851: Read BAR2[0xd8008400] after map...   = 0xf000eef3
[72582.425798] Plx8000_NT:Resource 02
[72582.425799] Plx8000_NT:  Type : Memory
[72582.425804] Plx8000_NT:  PCI BAR 3: 8040
[72582.425806] Plx8000_NT:  Phys Addr: 3FE08040
[72582.425808] Plx8000_NT:  Size : 40h (4MB)
[72582.425809] Plx8000_NT:  Property : Non-Prefetchable 32-bit
[72582.425813] Plx8000_NT:  Kernel VA: d8008480
[72582.425815] Plx8000_NT:Resource 03
[72582.425816] Plx8000_NT:  Type : Memory
[72582.425821] Plx8000_NT:  PCI BAR 4: 8080
[72582.425822] Plx8000_NT:  Phys Addr: 3FE08080
[72582.425824] Plx8000_NT:  Size : 40h (4MB)
[72582.425826] Plx8000_NT:  Property : Non-Prefetchable 32-bit
[72582.425830] Plx8000_NT:  Kernel VA: d8008500
[72582.425831] Plx8000_NT:Resource 04
[72582.425832] Plx8000_NT:  Type : Memory
[72582.425837] Plx8000_NT:  PCI BAR 5: 80C0
[72582.425839] Plx8000_NT:  Phys Addr: 3FE080C0
[72582.425841] Plx8000_NT:  Size : 40h (4MB)
[72582.425842] Plx8000_NT:  Property : Non-Prefetchable 32-bit
[72582.425846] Plx8000_NT:  Kernel VA: d8008580
[72582.425848] Debug StartDevice 862: Reading PCI header command...
[72582.425911] Debug StartDevice 864: Reading PCI header