Michael,

Frank found the cause of the problem in the implementation of arch/ppc/kernel/pci.c and asked the IBM kernel group to send a bug fix to the Linux kernel group.
The problem is:
1. This bug fix will not enter SLES10, as it is closed.
2. It also will not enter SLES9 :-) or Red Hat AS4 U4.

So we need a bug fix that enables the use of mstflint on js21 PPC64, plus a backport to old systems.

Frank's fix is based on two points (if I understand the code correctly):
1. It opens /proc/bus/pci... and not /sys/bus/pci/...
2. It performs an ioctl(fd, PCIIOC_MMAP_IS_MEM);

Frank - am I right?

Can we make these two small changes in mstflint to have it working on the PPC64 js21?

Moshe
____________________________________________________________
Moshe Katzir | +972-9971-8639 (o) | +972-52-860-6042 (m)
Voltaire - The Grid Backbone
www.voltaire.com

-----Original Message-----
From: Michael S. Tsirkin [mailto:[EMAIL PROTECTED]
Sent: Thursday, September 28, 2006 4:41 PM
To: Moshe Kazir
Cc: Tseng-Hui (Frank) Lin; [EMAIL PROTECTED]; [email protected]
Subject: Re: FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD

Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> > Quoting r. Moshe Kazir <[EMAIL PROTECTED]>:
> > Subject: RE: FW: Mstflint - not working on ppc64 and when driver is not loaded on AMD
> >
> >
> > # ls /sys/class/infiniband/mthca0/device/resource0
> > /sys/class/infiniband/mthca0/device/resource0
> > OK, so can you try this please:
> > strace -f -v -o log mstflint -d /sys/class/infiniband/mthca0/device/resource0 q
> > cat log
> > --
> MST
> 30463 open("/sys/class/infiniband/mthca0/device/resource0", O_RDWR|O_SYNC|O_LARGEFILE) = 3
> 30463 mmap2(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = -1 EINVAL (Invalid argument)

So we see that mmap is failing with EINVAL. But why? We seem to be passing all valid parameters to it.

I'm looking at arch/ppc/kernel/pci.c at the moment. It seems that EINVAL is returned if __pci_mmap_make_offset fails, and that seems to be only looking for a valid resource size.

Are you up to finding the root cause of the problem in arch/ppc/kernel/pci.c?
Maybe the resource offsets are wrong? What does
cat /sys/class/infiniband/mthca0/device/resource
show?

Maybe there is a problem mapping a full megabyte?
Here's a test that only maps 4K. Could you strace it please?

>>>>>>>>>>>

#define _XOPEN_SOURCE 500
#define _FILE_OFFSET_BITS 64

#include <stdio.h>
#include <unistd.h>
#include <netinet/in.h>
#include <endian.h>
#include <byteswap.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <stdlib.h>
#include <sys/pci.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/stat.h>
/* #include <sys/ioctl.h>
 * #include <sys/types.h> */

int main()
{
	int fd;
	unsigned value;
	volatile void *ptr;

	fd = open("/proc/bus/pci/00/00.0", O_RDWR | O_SYNC);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	/* ioctl(fd, PCIIOC_MMAP_IS_MEM); */

	/* Map only one 4K page instead of the full megabyte. */
	ptr = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0xf0000);
	if (ptr == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Read one word at offset 0x14 of the mapping and print it. */
	memcpy(&value, (const void *)((const char *)ptr + 0x14), sizeof value);
	printf("0x%x\n", value);
	return 0;
}

-- 
MST

_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
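
The two changes described at the top of this message (open the /proc/bus/pci node instead of the sysfs resource file, then issue PCIIOC_MMAP_IS_MEM before calling mmap) could be sketched roughly as follows. This is only an illustration under stated assumptions, not Frank's actual patch and not mstflint code: the helper name map_pci_mem and its parameters are hypothetical, and the offset to pass depends on the device and platform.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>
#include <linux/pci.h>	/* PCIIOC_MMAP_IS_MEM */

/*
 * Hypothetical helper (name and signature are assumptions, not mstflint API):
 * map `len` bytes of PCI memory space at `offset` through a
 * /proc/bus/pci/<bus>/<dev>.<fn> node. Returns NULL on any failure.
 */
volatile void *map_pci_mem(const char *proc_path, off_t offset, size_t len)
{
	void *ptr;
	int fd;

	if (proc_path == NULL)
		return NULL;

	/* Point 1: open the /proc/bus/pci node, not /sys/bus/pci/... */
	fd = open(proc_path, O_RDWR | O_SYNC);
	if (fd < 0)
		return NULL;

	/* Point 2: declare that the mmap below targets memory space, not I/O space. */
	if (ioctl(fd, PCIIOC_MMAP_IS_MEM) < 0) {
		close(fd);
		return NULL;
	}

	ptr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
	close(fd);	/* an established mapping stays valid after close */

	return ptr == MAP_FAILED ? NULL : (volatile void *)ptr;
}
```

A caller would pass the node for the HCA (for example a path of the form /proc/bus/pci/00/00.0, as in the test above) and check the result for NULL before reading from it. Whether this avoids the EINVAL on js21 is exactly what still needs to be confirmed.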
