Re: [PATCH v5 09/34] cxlflash: Correct naming of limbo state and waitq

2015-10-02 Thread Tomas Henzl
On 1.10.2015 17:55, Matthew R. Ochs wrote:
> Limbo is not an accurate representation of this state and is
> also not consistent with the terminology that other drivers
> use to represent this concept. Rename the state and its
> associated waitq to 'reset'.
>
> Signed-off-by: Matthew R. Ochs 
> Signed-off-by: Manoj N. Kumar 
> Reviewed-by: Brian King 
> Reviewed-by: Daniel Axtens 

Reviewed-by: Tomas Henzl 

Tomas

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v5 03/34] cxlflash: Fix read capacity timeout

2015-10-02 Thread Tomas Henzl
On 1.10.2015 17:54, Matthew R. Ochs wrote:
> From: Manoj Kumar 
>
> The timeout value for read capacity is too small. Certain devices
> may take longer to respond and thus the command may prematurely
> timeout. Additionally the literal used for the timeout is stale.
>
> Update the timeout to 30 seconds (matches the value used in sd.c)
> and rework the timeout literal to a more appropriate description.
>
> Signed-off-by: Matthew R. Ochs 
> Signed-off-by: Manoj N. Kumar 
> Reviewed-by: Brian King 

Reviewed-by: Tomas Henzl 

Tomas


Re: [PATCH v5 04/34] cxlflash: Fix potential oops following LUN removal

2015-10-02 Thread Tomas Henzl
On 1.10.2015 17:55, Matthew R. Ochs wrote:
> When a LUN is removed, the sdev that is associated with the LUN
> remains intact until its reference count drops to 0. In order
> to prevent an sdev from being removed while a context is still
> associated with it, obtain an additional reference per-context
> for each LUN attached to the context.
>
> This resolves a potential Oops in the release handler when
> dealing with a LUN that has already been removed.
>
> Signed-off-by: Matthew R. Ochs 
> Signed-off-by: Manoj N. Kumar 
> Reviewed-by: Brian King 

Reviewed-by: Tomas Henzl 

Tomas


Re: [PATCH v5 07/34] cxlflash: Fix context encode mask width

2015-10-02 Thread Tomas Henzl
On 1.10.2015 17:55, Matthew R. Ochs wrote:
> The context encode mask covers more than 32-bits, making it
> a long integer. This should be noted by appending the ULL
> width suffix to the mask.
>
> Signed-off-by: Matthew R. Ochs 
> Signed-off-by: Manoj N. Kumar 
> Reviewed-by: Brian King 
> Reviewed-by: Daniel Axtens 

Reviewed-by: Tomas Henzl 

Tomas
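
A minimal sketch of the point made in the commit message above, assuming a
hypothetical 36-bit mask (CTX_MASK) and a hypothetical helper (encode_ctxid);
neither is taken from the cxlflash source. A C99 compiler already gives a
hexadecimal constant that does not fit in 32 bits a wider type, so the ULL
suffix mainly documents the intended width and keeps it the same on every
target:

```c
#include <stdint.h>

/* Hypothetical mask: 36 bits wide, so it cannot be held in 32 bits. */
#define CTX_MASK 0xFFFFFFFF0ULL

/* Hypothetical encoder: masked context shifted up, id in the low bits. */
uint64_t encode_ctxid(uint64_t ctx, uint64_t id)
{
	return ((ctx & CTX_MASK) << 28) | id;
}
```

With the suffix in place the mask is an unsigned long long regardless of how
wide long is on the target.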


Re: [PATCH V4 0/6] Redesign SR-IOV on PowerNV

2015-10-02 Thread Alexey Kardashevskiy

On 08/19/2015 12:01 PM, Wei Yang wrote:

The original design tries to group VFs so that more VFs can be enabled in the
system when the VF BAR is bigger than 64MB. This design has a flaw: an error
on one VF will interfere with the other VFs in the same group.

This patch series changes this design by using an M64 BAR in Single PE mode to
cover only one VF BAR. Doing so gives absolute isolation between VFs.

v4:
* rebase the code on top of v4.2-rc7
* switch back to use the dynamic version of pe_num_map and m64_map
* split the memory allocation and PE assignment of pe_num_map to make it
  easier to read
* check pe_num_map value before freeing the PE
* add the rename reason for pe_num_map and m64_map in change log
v3:
* return -ENOSPC when a VF has non-64bit prefetchable BAR
* rename offset to pe_num_map and define it statically
* change commit log based on comments
* define m64_map statically
v2:
* clean up iov bar alignment calculation
* change m64s to m64_bars
* add a field to represent M64 Single PE mode will be used
* change m64_wins to m64_map
* calculate the gate instead of hard coded
* dynamically allocate m64_map
* dynamically allocate PE#
* add a case to calculate iov bar alignment when M64 Single PE is used
* when M64 Single PE is used, compare num_vfs with M64 BAR available number
  in system at first



Wei Yang (6):
   powerpc/powernv: don't enable SRIOV when VF BAR has non
 64bit-prefetchable BAR
   powerpc/powernv: simplify the calculation of iov resource alignment
   powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR
   powerpc/powernv: replace the hard coded boundary with gate
   powerpc/powernv: boundary the total VF BAR size instead of the
 individual one
   powerpc/powernv: allocate sparse PE# when using M64 BAR in Single PE
 mode

  arch/powerpc/include/asm/pci-bridge.h |7 +-
  arch/powerpc/platforms/powernv/pci-ioda.c |  328 +++--
  2 files changed, 175 insertions(+), 160 deletions(-)



I have posted a few comments but in general the patchset makes things simpler
by removing a compound PE and does not seem to make things worse so:


Acked-by: Alexey Kardashevskiy 



--
Alexey

Re: [PATCH v5 08/34] cxlflash: Fix to avoid CXL services during EEH

2015-10-02 Thread Tomas Henzl
On 1.10.2015 17:55, Matthew R. Ochs wrote:
> During an EEH freeze event, certain CXL services should not be
> called until after the hardware reset has taken place. Doing so
> can result in unnecessary failures and possibly cause other ill
> effects by triggering hardware accesses. This translates to a
> requirement to quiesce all threads that may potentially use CXL
> runtime service during this window. In particular, multiple ioctls
> make use of the CXL services when acting on contexts on behalf of
> the user. Thus, it is essential to 'drain' running ioctls _before_
> proceeding with handling the EEH freeze event.
>
> Create the ability to drain ioctls by wrapping the ioctl handler
> call in a read semaphore and then implementing a small routine that
> obtains the write semaphore, effectively creating a wait point for
> all currently executing ioctls.
>
> Signed-off-by: Matthew R. Ochs 
> Signed-off-by: Manoj N. Kumar 
> Reviewed-by: Brian King 
> Reviewed-by: Daniel Axtens 

Reviewed-by: Tomas Henzl 

Tomas


[PATCH] cxl: Fix number of allocated pages in SPA

2015-10-02 Thread Christophe Lombard
This moves the initialisation of the num_procs to before the SPA
allocation.

Signed-off-by: Christophe Lombard 
---
 drivers/misc/cxl/native.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/native.c b/drivers/misc/cxl/native.c
index b37f2e8..d2e75c8 100644
--- a/drivers/misc/cxl/native.c
+++ b/drivers/misc/cxl/native.c
@@ -457,6 +457,7 @@ static int activate_afu_directed(struct cxl_afu *afu)
 
 	dev_info(&afu->dev, "Activating AFU directed mode\n");
 
+	afu->num_procs = afu->max_procs_virtualised;
 	if (afu->spa == NULL) {
 		if (cxl_alloc_spa(afu))
 			return -ENOMEM;
@@ -468,7 +469,6 @@ static int activate_afu_directed(struct cxl_afu *afu)
 	cxl_p1n_write(afu, CXL_PSL_ID_An, CXL_PSL_ID_An_F | CXL_PSL_ID_An_L);
 
 	afu->current_mode = CXL_MODE_DIRECTED;
-	afu->num_procs = afu->max_procs_virtualised;
 
 	if ((rc = cxl_chardev_m_afu_add(afu)))
 		return rc;
-- 
1.9.1
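
Why the ordering matters can be illustrated with a small sketch. It assumes
that the allocation size is derived from num_procs, which the patch subject
suggests but the thread does not spell out. The entry and page sizes and the
helper spa_pages_needed() are hypothetical and do not reflect the driver's
real layout:

```c
#include <stddef.h>

/* Hypothetical sizes, chosen only for illustration. */
#define PAGE_BYTES   4096u
#define ENTRY_BYTES  128u

/* Smallest power-of-two page count whose capacity covers num_procs entries. */
unsigned int spa_pages_needed(unsigned int num_procs)
{
	unsigned int pages = 1;

	while ((size_t)pages * PAGE_BYTES / ENTRY_BYTES < num_procs)
		pages *= 2;
	return pages;
}
```

If num_procs is still zero when such a sizing routine runs, the result is the
minimum allocation no matter how many processes are configured afterwards.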


Re: [PATCH v5 01/34] cxlflash: Fix to avoid invalid port_sel value

2015-10-02 Thread Tomas Henzl
On 1.10.2015 17:54, Matthew R. Ochs wrote:
> From: Manoj Kumar 
>
> If two concurrent MANAGE_LUN ioctls are issued with the same
> WWID parameter, it would result in an incorrect value of port_sel.
>
> This is because port_sel is modified without any locks being
> held. If the first caller stalls after the return from
> find_and_create_lun(), the value of port_sel will be set
> incorrectly to indicate a single port, though in this case
> it should have been set to both ports.
>
> To fix, use the global mutex to serialize the lookup of the
> WWID and the subsequent modification of port_sel.
>
> Signed-off-by: Matthew R. Ochs 
> Signed-off-by: Manoj N. Kumar 
> Reviewed-by: Brian King 

Reviewed-by: Tomas Henzl 

Tomas


Re: [PATCH v5 02/34] cxlflash: Replace magic numbers with literals

2015-10-02 Thread Tomas Henzl
On 1.10.2015 17:54, Matthew R. Ochs wrote:
> From: Manoj Kumar 
>
> Magic numbers are not meaningful and can create confusion. As a
> remedy, replace them with descriptive literals.
>
> Replace 512 with literal MAX_SECTOR_UNIT.
> Replace 5 with literal CMD_RETRIES.
>
> Signed-off-by: Matthew R. Ochs 
> Signed-off-by: Manoj N. Kumar 
> Reviewed-by: Brian King 
> Reviewed-by: Andrew Donnellan 

Reviewed-by: Tomas Henzl 

Tomas


Re: [PATCH v5 10/34] cxlflash: Make functions static

2015-10-02 Thread Tomas Henzl
On 1.10.2015 17:55, Matthew R. Ochs wrote:
> Found during code inspection, that the following functions are not
> being used outside of the file where they are defined. Make them static.
>
> int cxlflash_send_cmd(struct afu *, struct afu_cmd *);
> void cxlflash_wait_resp(struct afu *, struct afu_cmd *);
> int cxlflash_afu_reset(struct cxlflash_cfg *);
> struct afu_cmd *cxlflash_cmd_checkout(struct afu *);
> void cxlflash_cmd_checkin(struct afu_cmd *);
> void init_pcr(struct cxlflash_cfg *);
> int init_global(struct cxlflash_cfg *);
>
> Signed-off-by: Matthew R. Ochs 
> Signed-off-by: Manoj N. Kumar 
> Reviewed-by: Brian King 

Reviewed-by: Tomas Henzl 

Tomas


[PATCH] powerpc: Fix _ALIGN_* errors due to type difference.

2015-10-02 Thread Aneesh Kumar K.V
This avoids errors like

unsigned int usize = 1 << 30;
int size = 1 << 30;
unsigned long addr = 64UL << 30 ;

value = _ALIGN_DOWN(addr, usize); -> 0
value = _ALIGN_DOWN(addr, size);  -> 0x1000000000

Signed-off-by: Aneesh Kumar K.V 
---
 arch/powerpc/boot/page.h| 4 ++--
 arch/powerpc/include/asm/page.h | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/boot/page.h b/arch/powerpc/boot/page.h
index 14eca30fef64..87c42d7d283d 100644
--- a/arch/powerpc/boot/page.h
+++ b/arch/powerpc/boot/page.h
@@ -22,8 +22,8 @@
 #define PAGE_MASK  (~(PAGE_SIZE-1))
 
 /* align addr on a size boundary - adjust address up/down if needed */
-#define _ALIGN_UP(addr,size)   (((addr)+((size)-1))&(~((size)-1)))
-#define _ALIGN_DOWN(addr,size) ((addr)&(~((size)-1)))
+#define _ALIGN_UP(addr, size)   (((addr)+((size)-1))&(~((typeof(addr))(size)-1)))
+#define _ALIGN_DOWN(addr, size) ((addr)&(~((typeof(addr))(size)-1)))
 
 /* align addr on a size boundary - adjust address up if needed */
 #define _ALIGN(addr,size) _ALIGN_UP(addr,size)
diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h
index 71294a6e976e..1dd69774a31c 100644
--- a/arch/powerpc/include/asm/page.h
+++ b/arch/powerpc/include/asm/page.h
@@ -240,8 +240,8 @@ extern long long virt_phys_offset;
 #endif
 
 /* align addr on a size boundary - adjust address up/down if needed */
-#define _ALIGN_UP(addr,size)   (((addr)+((size)-1))&(~((size)-1)))
-#define _ALIGN_DOWN(addr,size) ((addr)&(~((size)-1)))
+#define _ALIGN_UP(addr, size)   (((addr)+((size)-1))&(~((typeof(addr))(size)-1)))
+#define _ALIGN_DOWN(addr, size) ((addr)&(~((typeof(addr))(size)-1)))
 
 /* align addr on a size boundary - adjust address up if needed */
 #define _ALIGN(addr,size) _ALIGN_UP(addr,size)
-- 
2.5.0
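
The failure mode from the commit message can be reproduced in user space with
the sketch below. The macro bodies follow the old and new definitions in the
diff; the macro names, the wrapper functions and the use of __typeof__ (the
GCC/Clang spelling that also works in strict ISO mode) are assumptions made
for illustration:

```c
#include <stdint.h>

/* Pre-patch form: the mask is computed in the type of `size`. */
#define ALIGN_DOWN_OLD(addr, size) ((addr) & (~((size) - 1)))

/* Post-patch form: `size` is cast to the type of `addr` before masking. */
#define ALIGN_DOWN_NEW(addr, size) \
	((addr) & (~((__typeof__(addr))(size) - 1)))

/* 32-bit unsigned mask is zero-extended, wiping the upper address bits. */
uint64_t align_down_old_unsigned(uint64_t addr, unsigned int size)
{
	return ALIGN_DOWN_OLD(addr, size);
}

/* 32-bit signed mask is sign-extended, so the upper bits happen to survive. */
uint64_t align_down_old_signed(uint64_t addr, int size)
{
	return ALIGN_DOWN_OLD(addr, size);
}

/* The mask is built at full width, independent of the type of `size`. */
uint64_t align_down_new(uint64_t addr, unsigned int size)
{
	return ALIGN_DOWN_NEW(addr, size);
}
```

With addr = 64UL << 30 and size = 1 << 30, the unsigned case builds the mask
0xC0000000, which becomes 0x00000000C0000000 once widened and clears bit 36 of
the address. The signed case only works because ~(size - 1) is negative and
sign-extends to 0xFFFFFFFFC0000000.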


Re: [v2, 2/2] powerpc/vdso: Avoid link stack corruption in __get_datapage()

2015-10-02 Thread Benjamin Herrenschmidt
On Fri, 2015-10-02 at 17:47 +1000, Michael Ellerman wrote:
> On Fri, 2015-25-09 at 04:01:40 UTC, Michael Neuling wrote:
> > powerpc has a link register (lr) used for calling functions. We "bl
> > <func>" to call a function, and "blr" to return back to the call
> > site.
> 
> 
> 
> > For the benchmark in
> > tools/testing/selftests/powerpc/benchmarks/gettimeofday.c
> >   POWER8:
> > 64bit gets ~4% improvement
> > 32bit gets ~9% improvement
> >   POWER7:
> > 64bit gets ~7% improvement
> > 
> > Signed-off-by: Michael Neuling 
> > Reported-by: Aaron Sawdey 
> 
> Applied to powerpc next, thanks.
> 
> https://git.kernel.org/powerpc/c/c974809a26a13e40254dbe3c

I'd argue this is a bug fix and should hit stable too..

Cheers,
Ben.


Re: [PATCH V4 6/6] powerpc/powernv: allocate sparse PE# when using M64 BAR in Single PE mode

2015-10-02 Thread Alexey Kardashevskiy

On 08/19/2015 12:01 PM, Wei Yang wrote:

When M64 BAR is set to Single PE mode, the PE# assigned to VF could be
sparse.

This patch restructures the patch to allocate sparse PE# for VFs when M64


This patch restructures the code ;)



BAR is set to Single PE mode. It also renames offset to pe_num_map to
reflect that the content is the PE number.

Signed-off-by: Wei Yang 
---
  arch/powerpc/include/asm/pci-bridge.h |2 +-
  arch/powerpc/platforms/powernv/pci-ioda.c |   79 ++---
  2 files changed, 61 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h 
b/arch/powerpc/include/asm/pci-bridge.h
index 8aeba4c..b3a226b 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -213,7 +213,7 @@ struct pci_dn {
  #ifdef CONFIG_PCI_IOV
u16 vfs_expanded;   /* number of VFs IOV BAR expanded */
u16 num_vfs;/* number of VFs enabled*/
-   int offset; /* PE# for the first VF PE */
+   int *pe_num_map;/* PE# for the first VF PE or array */
boolm64_single_mode;/* Use M64 BAR in Single Mode */
  #define IODA_INVALID_M64(-1)
int (*m64_map)[PCI_SRIOV_NUM_BARS];
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 4bc83b8..779f52a 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1243,7 +1243,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
u16 num_vfs)

/* Map the M64 here */
if (pdn->m64_single_mode) {
-   pe_num = pdn->offset + j;
+   pe_num = pdn->pe_num_map[j];
rc = opal_pci_map_pe_mmio_window(phb->opal_id,
pe_num, OPAL_M64_WINDOW_TYPE,
pdn->m64_map[j][i], 0);
@@ -1347,7 +1347,7 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev)
struct pnv_phb*phb;
struct pci_dn *pdn;
struct pci_sriov  *iov;
-   u16 num_vfs;
+   u16num_vfs, i;

bus = pdev->bus;
hose = pci_bus_to_host(bus);
@@ -1361,14 +1361,21 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev)

if (phb->type == PNV_PHB_IODA2) {
if (!pdn->m64_single_mode)
-   pnv_pci_vf_resource_shift(pdev, -pdn->offset);
+   pnv_pci_vf_resource_shift(pdev, -*pdn->pe_num_map);

/* Release M64 windows */
pnv_pci_vf_release_m64(pdev, num_vfs);

/* Release PE numbers */
-   bitmap_clear(phb->ioda.pe_alloc, pdn->offset, num_vfs);
-   pdn->offset = 0;
+   if (pdn->m64_single_mode) {
+   for (i = 0; i < num_vfs; i++) {
+   if (pdn->pe_num_map[i] != IODA_INVALID_PE)
+   pnv_ioda_free_pe(phb, 
pdn->pe_num_map[i]);
+   }
+   } else
+   bitmap_clear(phb->ioda.pe_alloc, *pdn->pe_num_map, 
num_vfs);
+   /* Releasing pe_num_map */
+   kfree(pdn->pe_num_map);
}
  }

@@ -1394,7 +1401,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, 
u16 num_vfs)

/* Reserve PE for each VF */
for (vf_index = 0; vf_index < num_vfs; vf_index++) {
-   pe_num = pdn->offset + vf_index;
+   if (pdn->m64_single_mode)
+   pe_num = pdn->pe_num_map[vf_index];
+   else
+   pe_num = *pdn->pe_num_map + vf_index;

pe = &phb->ioda.pe_array[pe_num];
pe->pe_number = pe_num;
@@ -1436,6 +1446,7 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 
num_vfs)
struct pnv_phb*phb;
struct pci_dn *pdn;
intret;
+   u16i;

bus = pdev->bus;
hose = pci_bus_to_host(bus);
@@ -1458,20 +1469,42 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 
num_vfs)
return -EBUSY;
}

+   /* Allocating pe_num_map */
+   if (pdn->m64_single_mode)
+   pdn->pe_num_map = kmalloc(sizeof(*pdn->pe_num_map) * 
num_vfs,
+   GFP_KERNEL);
+   else
+   pdn->pe_num_map = kmalloc(sizeof(*pdn->pe_num_map), 
GFP_KERNEL);
+
+   if (!pdn->pe_num_map)
+   return -ENOMEM;


[*]


+
/* Calculate available PE for required VFs */
-   mutex_lock(&phb->ioda.pe_alloc_mutex);
-   pdn->offset = bitmap_find_next_zero_area(
-   

Re: [PATCH v3] powerpc: msi: mark bitmap with kmemleak_not_leak()

2015-10-02 Thread Denis Kirjanov
On 9/17/15, Catalin Marinas  wrote:
> On Wed, Sep 16, 2015 at 10:26:14PM +0300, Denis Kirjanov wrote:
>> During the MSI bitmap test on boot kmemleak spews the following trace:
>>
>> unreferenced object 0xc0016e86c900 (size 64):
>> comm "swapper/0", pid 1, jiffies 4294893173 (age 518.024s)
>> hex dump (first 32 bytes):
>>  00 00 01 ff 7f ff 7f 37 00 00 00 00 00 00 00 00
>>  ...7
>>  ff ff ff ff ff ff ff ff 01 ff ff ff ff
>>  ff ff ff
>>  
>>  backtrace:
>>  [] .zalloc_maybe_bootmem+0x3c/0x380
>>  [] .msi_bitmap_alloc+0x3c/0xb0
>>  [] .msi_bitmap_selftest+0x30/0x2b4
>>  [] .do_one_initcall+0xd4/0x270
>>  [] .kernel_init_freeable+0x1a0/0x280
>>  [] .kernel_init+0x1c/0x120
>>  [] .ret_from_kernel_thread+0x58/0x9c
>>
>> Added a flag to msi_bitmap for tracking allocations
>> from slab and memblock so we can properly free/handle
>> memory in msi_bitmap_free().
>>
>> Signed-off-by: Denis Kirjanov 
>
> Reviewed-by: Catalin Marinas 
>
Hi Michael,
could you please apply the patch to the fixes branch as well?

Thanks!

Re: [PATCH] agp/uninorth: fix a memleak in create_gatt_table

2015-10-02 Thread Denis Kirjanov
On 9/8/15, Michael Ellerman  wrote:
> On Mon, 2015-09-07 at 13:39 +0300, Denis Kirjanov wrote:
>> On 9/7/15, Michael Ellerman  wrote:
>> > On Mon, 2015-09-07 at 10:30 +0300, Denis Kirjanov wrote:
>> >> On 6/19/15, Benjamin Herrenschmidt  wrote:
>> >> > On Thu, 2015-06-18 at 17:34 +0300, Denis Kirjanov wrote:
>> >> >> On 6/12/15, Denis Kirjanov  wrote:
>> >> >> > Fix the memory leak in create_gatt_table:
>> >> >> > we've lost a kfree on the exit path for the pages array allocated
>> >> >> > in uninorth_create_gatt_table
>> >> >> >
>> >> >> > Signed-off-by: Denis Kirjanov 
>> >> >>
>> >> >> Hi Ben, Michael
>> >> >>
>> >> >> Will you take the patch through your trees or do I need to send it
>> >> >> to
>> >> >> Dave Airlie?
>> >> >
>> >> > I haven't had a chance to review yet...
>> >>
>> >> Ping...
>> >
>> > Hi Denis,
>> >
>> > Have you built and/or boot tested this?
>>
>> Hi,
>>
>> yes, sure. Actually I've spotted this through the kmemleak. With the
>> patch applied scanner is happy :)
>
> OK thanks. I'll merge it then, and if Ben ever reviews it and dislikes your
> changes he can send patches to fix it :)

Hi Michael,
could you please apply the patch to the fixes branch?

Thanks!
>
> cheers
>
>
>

Re: [PATCH V4 1/6] powerpc/powernv: don't enable SRIOV when VF BAR has non 64bit-prefetchable BAR

2015-10-02 Thread Alexey Kardashevskiy

On 08/19/2015 12:01 PM, Wei Yang wrote:

On PHB_IODA2, we enable SRIOV devices by mapping IOV BAR with M64 BARs. If
a SRIOV device's IOV BAR is not 64bit-prefetchable, this is not assigned
from 64bit prefetchable window, which means M64 BAR can't work on it.



Please change the commit log to explain what limit came from where.
Something like:

PCI bridges support only 2 windows, and the kernel code programs bridges so
that one window is 32bit-nonprefetchable and the other is
64bit-prefetchable. So if a device's IOV BAR is 64bit and non-prefetchable,
it will be mapped into 32bit space and therefore M64 cannot be used for it.





This patch makes this explicit.

Signed-off-by: Wei Yang 
Reviewed-by: Gavin Shan 
---
  arch/powerpc/platforms/powernv/pci-ioda.c |   25 +
  1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 85cbc96..8c031b5 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -908,9 +908,6 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, 
int offset)
if (!res->flags || !res->parent)
continue;

-   if (!pnv_pci_is_mem_pref_64(res->flags))
-   continue;
-
/*
 * The actual IOV BAR range is determined by the start address
 * and the actual size for num_vfs VFs BAR.  This check is to
@@ -939,9 +936,6 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, 
int offset)
if (!res->flags || !res->parent)
continue;

-   if (!pnv_pci_is_mem_pref_64(res->flags))
-   continue;
-
size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES);
res2 = *res;
res->start += size * offset;
@@ -1221,9 +1215,6 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
u16 num_vfs)
if (!res->flags || !res->parent)
continue;

-   if (!pnv_pci_is_mem_pref_64(res->flags))
-   continue;
-
for (j = 0; j < vf_groups; j++) {
do {
win = find_next_zero_bit(&phb->ioda.m64_bar_alloc,
@@ -1510,6 +1501,12 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 
num_vfs)
pdn = pci_get_pdn(pdev);

if (phb->type == PNV_PHB_IODA2) {
+   if (!pdn->vfs_expanded) {


The patch claims it does make the limitation explicit but it is not clear 
at all how to trace from vfs_expanded==0 to "non 64bit-prefetchable IOV BAR".




+   dev_info(&pdev->dev, "don't support this SRIOV device"
+   " with non 64bit-prefetchable IOV BAR\n");
+   return -ENOSPC;
+   }
+
/* Calculate available PE for required VFs */
mutex_lock(&phb->ioda.pe_alloc_mutex);
pdn->offset = bitmap_find_next_zero_area(
@@ -2775,9 +2772,10 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
if (!res->flags || res->parent)
continue;
if (!pnv_pci_is_mem_pref_64(res->flags)) {
-   dev_warn(&pdev->dev, " non M64 VF BAR%d: %pR\n",
+   dev_warn(&pdev->dev, "Don't support SR-IOV with"
+   " non M64 VF BAR%d: %pR. \n",
 i, res);
-   continue;
+   return;
}

size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
@@ -2796,11 +2794,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
res = &pdev->resource[i + PCI_IOV_RESOURCES];
if (!res->flags || res->parent)
continue;
-   if (!pnv_pci_is_mem_pref_64(res->flags)) {



And this check was quite clear. I'd keep this one.



-   dev_warn(&pdev->dev, "Skipping expanding VF BAR%d: %pR\n",
-i, res);
-   continue;
-   }

dev_dbg(&pdev->dev, " Fixing VF BAR%d: %pR to\n", i, res);
size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);




--
Alexey

RE: [v3 1/8] devres: add devm_alloc_percpu()

2015-10-02 Thread Madalin-Cristian Bucur
> -----Original Message-----
> From: Wood Scott-B07421
> Sent: Friday, October 02, 2015 4:01 AM
> 
> On Thu, Sep 24, 2015 at 06:00:12PM +0300, Madalin Bucur wrote:
> > Introduce managed counterparts for alloc_percpu() and free_percpu().
> > Add devm_alloc_percpu() and devm_free_percpu() into the managed
> > interfaces list.
> >
> > Signed-off-by: Madalin Bucur 
> > Tested-by: Madalin-Cristian Bucur 
> > ---
> >  Documentation/driver-model/devres.txt |  4 +++
>  drivers/base/devres.c | 64 +++
> >  include/linux/device.h| 19 +++
> >  3 files changed, 87 insertions(+)
> 
> Greg KH needs to be CCed on any drivers/base changes.
> 
> -Scott

Thank you, I will add him.

Madalin

Re: [PATCH V4 3/6] powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR

2015-10-02 Thread Alexey Kardashevskiy

On 08/19/2015 12:01 PM, Wei Yang wrote:

In current implementation, when VF BAR is bigger than 64MB, it uses 4 M64
BARs in Single PE mode to cover the number of VFs required to be enabled.
By doing so, several VFs would be in one VF Group and leads to interference
between VFs in the same group.

And in this patch, m64_wins is renamed to m64_map, which means index number
of the M64 BAR used to map the VF BAR. Based on Gavin's comments.

This patch changes the design by using one M64 BAR in Single PE mode for
one VF BAR. This gives absolute isolation for VFs.

Signed-off-by: Wei Yang 
---
  arch/powerpc/include/asm/pci-bridge.h |5 +-
  arch/powerpc/platforms/powernv/pci-ioda.c |  178 -
  2 files changed, 74 insertions(+), 109 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h 
b/arch/powerpc/include/asm/pci-bridge.h
index 712add5..8aeba4c 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -214,10 +214,9 @@ struct pci_dn {
u16 vfs_expanded;   /* number of VFs IOV BAR expanded */
u16 num_vfs;/* number of VFs enabled*/
int offset; /* PE# for the first VF PE */
-#define M64_PER_IOV 4
-   int m64_per_iov;
+   boolm64_single_mode;/* Use M64 BAR in Single Mode */
  #define IODA_INVALID_M64(-1)
-   int m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
+   int (*m64_map)[PCI_SRIOV_NUM_BARS];
  #endif /* CONFIG_PCI_IOV */
  #endif
struct list_head child_list;
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index e3e0acb..de7db1d 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1148,29 +1148,36 @@ static void pnv_pci_ioda_setup_PEs(void)
  }

  #ifdef CONFIG_PCI_IOV
-static int pnv_pci_vf_release_m64(struct pci_dev *pdev)
+static int pnv_pci_vf_release_m64(struct pci_dev *pdev, u16 num_vfs)
  {
struct pci_bus*bus;
struct pci_controller *hose;
struct pnv_phb*phb;
struct pci_dn *pdn;
inti, j;
+   intm64_bars;

bus = pdev->bus;
hose = pci_bus_to_host(bus);
phb = hose->private_data;
pdn = pci_get_pdn(pdev);

+   if (pdn->m64_single_mode)
+   m64_bars = num_vfs;
+   else
+   m64_bars = 1;
+
for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
-   for (j = 0; j < M64_PER_IOV; j++) {
-   if (pdn->m64_wins[i][j] == IODA_INVALID_M64)
+   for (j = 0; j < m64_bars; j++) {
+   if (pdn->m64_map[j][i] == IODA_INVALID_M64)
continue;
opal_pci_phb_mmio_enable(phb->opal_id,
-   OPAL_M64_WINDOW_TYPE, pdn->m64_wins[i][j], 0);
-   clear_bit(pdn->m64_wins[i][j], &phb->ioda.m64_bar_alloc);
-   pdn->m64_wins[i][j] = IODA_INVALID_M64;
+   OPAL_M64_WINDOW_TYPE, pdn->m64_map[j][i], 0);
+   clear_bit(pdn->m64_map[j][i], &phb->ioda.m64_bar_alloc);
+   pdn->m64_map[j][i] = IODA_INVALID_M64;
}

+   kfree(pdn->m64_map);
return 0;
  }

@@ -1187,8 +1194,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
u16 num_vfs)
inttotal_vfs;
resource_size_tsize, start;
intpe_num;
-   intvf_groups;
-   intvf_per_group;
+   intm64_bars;

bus = pdev->bus;
hose = pci_bus_to_host(bus);
@@ -1196,26 +1202,26 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
u16 num_vfs)
pdn = pci_get_pdn(pdev);
total_vfs = pci_sriov_get_totalvfs(pdev);

-   /* Initialize the m64_wins to IODA_INVALID_M64 */
-   for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
-   for (j = 0; j < M64_PER_IOV; j++)
-   pdn->m64_wins[i][j] = IODA_INVALID_M64;
+   if (pdn->m64_single_mode)
+   m64_bars = num_vfs;
+   else
+   m64_bars = 1;
+
+   pdn->m64_map = kmalloc(sizeof(*pdn->m64_map) * m64_bars, GFP_KERNEL);
+   if (!pdn->m64_map)
+   return -ENOMEM;
+   /* Initialize the m64_map to IODA_INVALID_M64 */
+   for (i = 0; i < m64_bars ; i++)
+   for (j = 0; j < PCI_SRIOV_NUM_BARS; j++)
+   pdn->m64_map[i][j] = IODA_INVALID_M64;

-   if (pdn->m64_per_iov == M64_PER_IOV) {
-   vf_groups = (num_vfs <= M64_PER_IOV) ? num_vfs: M64_PER_IOV;
-   vf_per_group = (num_vfs <= M64_PER_IOV)? 1:
-   roundup_pow_of_two(num_vfs) / pdn->m64_per_iov;
-   } else {
-

Re: [PATCH V4 2/6] powerpc/powernv: simplify the calculation of iov resource alignment

2015-10-02 Thread Alexey Kardashevskiy

On 08/19/2015 12:01 PM, Wei Yang wrote:

The alignment of IOV BAR on PowerNV platform is the total size of the IOV
BAR. No matter whether the IOV BAR is extended with number of
roundup_pow_of_two(total_vfs) or number of max PE number (256), the total
size could be calculated by (vfs_expanded * VF_BAR_size).

This patch simplifies the pnv_pci_iov_resource_alignment() by removing the
first case.

Signed-off-by: Wei Yang 
Reviewed-by: Gavin Shan 
---
  arch/powerpc/platforms/powernv/pci-ioda.c |   14 +-
  1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 8c031b5..e3e0acb 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2988,12 +2988,16 @@ static resource_size_t 
pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
  int resno)
  {
struct pci_dn *pdn = pci_get_pdn(pdev);
-   resource_size_t align, iov_align;
-
-   iov_align = resource_size(&pdev->resource[resno]);
-   if (iov_align)
-   return iov_align;
+   resource_size_t align;

+   /*
+* On PowerNV platform, IOV BAR is mapped by M64 BAR to enable the
+* SR-IOV. While from hardware perspective, the range mapped by M64
+* BAR should be size aligned.



Out of curiosity - IOV BAR does NOT have to be aligned on other platforms?



+*
+* This function returns the total IOV BAR size if expanded or just the
+* individual size if not.


Expanded vs. non-expanded means "using shared M64" (when it is split by 256 
segments) vs. "using entire M64"?




+*/
align = pci_iov_resource_size(pdev, resno);
if (pdn->vfs_expanded)
return pdn->vfs_expanded * align;




--
Alexey

Re: [PATCH V4 5/6] powerpc/powernv: boundary the total VF BAR size instead of the individual one

2015-10-02 Thread Alexey Kardashevskiy

On 08/19/2015 12:01 PM, Wei Yang wrote:

Each VF could have 6 BARs at most. When the total BAR size exceeds the
gate, after expanding it will also exhaust the M64 Window.

This patch limits the boundary by checking the total VF BAR size instead of
the individual BAR.


The gate is the biggest segment size in PE in shared mode, right? And this 
is 64MB. Also, BARs with the same number of all VFs of the same physical 
adapter will be mapped contiguously (as one huge IOV BAR), for example, 2 
VFs, 2 BARs each, mapping will look like:

VF0-BAR0, VF1-BAR0, VF0-BAR1, VF1-BAR1
but not like this:
VF0-BAR0, VF0-BAR1, VF1-BAR0, VF1-BAR1
Is this correct?





Signed-off-by: Wei Yang 
Reviewed-by: Gavin Shan 
---
  arch/powerpc/platforms/powernv/pci-ioda.c |   14 --
  1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index b8bc51f..4bc83b8 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2701,7 +2701,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
const resource_size_t gate = phb->ioda.m64_segsize >> 2;
struct resource *res;
int i;
-   resource_size_t size;
+   resource_size_t size, total_vf_bar_sz;
struct pci_dn *pdn;
int mul, total_vfs;

@@ -2714,6 +2714,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)

total_vfs = pci_sriov_get_totalvfs(pdev);
mul = phb->ioda.total_pe;
+   total_vf_bar_sz = 0;

for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
res = &pdev->resource[i + PCI_IOV_RESOURCES];
@@ -2726,7 +2727,8 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
return;
}

-   size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
+   total_vf_bar_sz += pci_iov_resource_size(pdev,
+   i + PCI_IOV_RESOURCES);



Is @pdev a physical device in this context? I assume it is so 
pci_iov_resource_size() returns the entire IOV BAR size.
For example, I have a Mellanox card with 16 VFs, each has a single 32MB BAR 
so total_vf_bar_sz will be 16*32=512MB and this will exceed the @gate size 
and we end up having m64_single_mode = true. What do I miss here?
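
The arithmetic in the question can be checked with a small sketch. It mirrors only what the hunk shows (gate = m64_segsize >> 2, a strict "greater than" comparison, and rounding the VF count up to a power of two). The helper names and the 256MB segment size are assumptions chosen so that the gate comes out at the 64MB mentioned earlier; the total is passed in directly, since what pci_iov_resource_size() returns is exactly the open question.

```c
#include <stdbool.h>
#include <stdint.h>

#define MiB (1024ULL * 1024ULL)

/*
 * Smallest power of two >= n, for n >= 1.
 * Stand-in for the kernel's roundup_pow_of_two(); not the kernel code.
 */
static unsigned int roundup_pow2(unsigned int n)
{
    unsigned int p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Same comparison as in the patch: the accumulated VF BAR size is
 * checked against a quarter of the M64 segment size.
 */
static bool exceeds_gate(uint64_t total_vf_bar_sz, uint64_t m64_segsize)
{
    uint64_t gate = m64_segsize >> 2;

    return total_vf_bar_sz > gate;
}
```

Taking the numbers above at face value, 16 * 32MB = 512MB is well over a 64MB gate, so the single-mode branch would be taken; a total of exactly 64MB would not trip it, because the comparison is strict.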





/*
 * If bigger than quarter of M64 segment size, just round up
@@ -2740,11 +2742,11 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 * limit the system flexibility.  This is a design decision to
 * set the boundary to quarter of the M64 segment size.
 */
-   if (size > gate) {
-   dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size "
-   "is bigger than %lld, roundup power2\n",
-i, res, gate);
+   if (total_vf_bar_sz > gate) {
mul = roundup_pow_of_two(total_vfs);
+   dev_info(&pdev->dev,
+   "VF BAR Total IOV size %llx > %llx, roundup to %d VFs\n",
+   total_vf_bar_sz, gate, mul);
pdn->m64_single_mode = true;
break;
}




--
Alexey

Missing operand for tlbie instruction on Power7

2015-10-02 Thread Laura Abbott

Hi,

We received a report (https://bugzilla.redhat.com/show_bug.cgi?id=1267395) of
bad assembly when compiling on powerpc with little endian:

[labbott@labbott-redhat-machine linux_upstream]$ make ARCH=powerpc CROSS_COMPILE=powerpc64-linux-gnu-
  CHK include/config/kernel.release
  CHK include/generated/uapi/linux/version.h
  CHK include/generated/utsrelease.h
  CHK include/generated/bounds.h
  CHK include/generated/timeconst.h
  CHK include/generated/asm-offsets.h
  CALL scripts/checksyscalls.sh
  CHK include/generated/compile.h
  CALL arch/powerpc/kernel/systbl_chk.sh
  AS  arch/powerpc/kernel/swsusp_asm64.o
arch/powerpc/kernel/swsusp_asm64.S: Assembler messages:
arch/powerpc/kernel/swsusp_asm64.S:188: Error: missing operand
scripts/Makefile.build:294: recipe for target 'arch/powerpc/kernel/swsusp_asm64.o' failed
make[1]: *** [arch/powerpc/kernel/swsusp_asm64.o] Error 1
Makefile:941: recipe for target 'arch/powerpc/kernel' failed
make: *** [arch/powerpc/kernel] Error 2

This problem started happening after a binutils update:

[labbott@labbott-redhat-machine linux_upstream]$ powerpc64-linux-gnu-as --version
GNU assembler version 2.25.1-1.fc22
Copyright (C) 2014 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `powerpc64-linux-gnu'.
[labbott@labbott-redhat-machine linux_upstream]$

After some discussion with the binutils folks, it turns out that the tlbie
instruction actually requires another operand and binutils was updated to
check for this https://sourceware.org/ml/binutils/2015-05/msg00133.html .

The code sequence in arch/powerpc/include/asm/ppc_asm.h now needs to be updated:

#if !defined(CONFIG_4xx) && !defined(CONFIG_8xx)
#define tlbia                           \
        li      r4,1024;                \
        mtctr   r4;                     \
        lis     r4,KERNELBASE@h;        \
0:      tlbie   r4;                     \
        addi    r4,r4,0x1000;           \
        bdnz    0b
#endif

I don't know enough ppc assembly to properly fix this but I can test.

Thanks,
Laura


Re: [PATCH 2/2] drivers/nvme: default to the IOMMU page size on Power

2015-10-02 Thread Nishanth Aravamudan
On 02.10.2015 [10:25:44 -0700], Christoph Hellwig wrote:
> Hi Nishanth,
> 
> please expose this value through the generic DMA API instead of adding
> architecture specific hacks to drivers.

Ok, I'm happy to do that instead -- what I struggled with is that I
don't have enough knowledge of the various architectures to provide the
right default implementation. It should be sufficient for the default to
return PAGE_SHIFT, and on Power just override that to return the IOMMU
table's page size? Since the only user will be the NVMe driver
currently, that should be fine?

Sorry for the less-than-ideal patch!

-Nish


[PATCH 0/2] Fix NVMe driver support on Power with 32-bit DMA

2015-10-02 Thread Nishanth Aravamudan
We received a bug report recently when DDW (64-bit direct DMA on Power)
is not enabled for NVMe devices. In that case, we fall back to 32-bit
DMA via the IOMMU, which is always done via 4K TCEs (Translation Control
Entries).

The NVMe device driver, though, assumes that the DMA alignment for the
PRP entries will match the device's page size, and that the DMA alignment
matches the kernel's page alignment. On Power, the IOMMU page size,
as mentioned above, can be 4K, while the device can have a page size of
8K, while the kernel has a page size of 64K. This eventually trips the
BUG_ON in nvme_setup_prps(), as we have a 'dma_len' that is a multiple
of 4K but not 8K (e.g., 0xF000).

In this particular case, and generally, we want to use the IOMMU's page
size for the default device page size, rather than the kernel's page
size.
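
The 0xF000 example can be verified with a short userspace sketch. The helper name is hypothetical and this is not the driver's check, only the arithmetic behind it: a length is a whole number of pages exactly when its low page_shift bits are zero.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when `len` is a whole number of pages of
 * size (1 << page_shift).
 */
static bool is_multiple_of_page(uint64_t len, unsigned int page_shift)
{
    uint64_t page_size = 1ULL << page_shift;

    /* For a power-of-two page size, the remainder is the low bits. */
    return (len & (page_size - 1)) == 0;
}
```

For len = 0xF000 this holds with page_shift = 12 (4K, 15 pages) but not with page_shift = 13 (8K, 7.5 pages) or page_shift = 16 (64K).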

This series consists of two patches, one of which exposes the IOMMU's
page shift on Power (currently only the page size is exposed, and it
seems unnecessary to ilog2 that value in the driver). The second patch
leverages this value on Power in the NVMe driver.

With these patches, an NVMe device survives our internal hardware
exerciser; the kernel BUGs within a few seconds without the patches.


[PATCH 2/2] drivers/nvme: default to the IOMMU page size on Power

2015-10-02 Thread Nishanth Aravamudan
We received a bug report recently when DDW (64-bit direct DMA on Power)
is not enabled for NVMe devices. In that case, we fall back to 32-bit
DMA via the IOMMU, which is always done via 4K TCEs (Translation Control
Entries).

The NVMe device driver, though, assumes that the DMA alignment for the
PRP entries will match the device's page size, and that the DMA alignment
matches the kernel's page alignment. On Power, the IOMMU page size,
as mentioned above, can be 4K, while the device can have a page size of
8K, while the kernel has a page size of 64K. This eventually trips the
BUG_ON in nvme_setup_prps(), as we have a 'dma_len' that is a multiple
of 4K but not 8K (e.g., 0xF000).

In this particular case, and generally, we want to use the IOMMU's page
size for the default device page size, rather than the kernel's page
size.

With this patch, an NVMe device survives our internal hardware
exerciser; the kernel BUGs within a few seconds without the patch.

Signed-off-by: Nishanth Aravamudan 

diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 7920c27..969a95e 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define NVME_MINORS (1U << MINORBITS)
 #define NVME_Q_DEPTH   1024
@@ -1680,6 +1681,11 @@ static int nvme_configure_admin_queue(struct nvme_dev *dev)
unsigned page_shift = PAGE_SHIFT;
unsigned dev_page_min = NVME_CAP_MPSMIN(cap) + 12;
unsigned dev_page_max = NVME_CAP_MPSMAX(cap) + 12;
+#ifdef CONFIG_PPC64
+   struct iommu_table *tbl = get_iommu_table_base(dev->dev);
+   if (tbl)
+   page_shift = IOMMU_PAGE_SHIFT(tbl);
+#endif
 
if (page_shift < dev_page_min) {
dev_err(dev->dev,


Re: [PATCH 2/2] drivers/nvme: default to the IOMMU page size on Power

2015-10-02 Thread Christoph Hellwig
On Fri, Oct 02, 2015 at 10:39:47AM -0700, Nishanth Aravamudan wrote:
> Ok, I'm happy to do that instead -- what I struggled with is that I
> don't have enough knowledge of the various architectures to provide the
> right default implementation. It should be sufficient for the default to
> return PAGE_SHIFT, and on Power just override that to return the IOMMU
> table's page size? Since the only user will be the NVMe driver
> currently, that should be fine?

I think that's fine.

> Sorry for the less-than-ideal patch!

Np, it's a reasonable first attempt that we just need to refine.

Re: [PATCH v5 00/34] cxlflash: Miscellaneous bug fixes and corrections

2015-10-02 Thread Matthew R. Ochs
Hi James,

This series has been fairly well vetted. With cxlflash being a new driver, the
majority of these patches fix critical bugs. Is there anything else you're
looking for in order to get this set pulled into 4.3-rc?


-matt

> On Oct 1, 2015, at 10:52 AM, Matthew R. Ochs wrote:
> 
> This patch set contains various fixes and corrections for issues that
> were found during test and code review. The series is based upon the
> code upstreamed in 4.3 and is intended for the rc phase. The entire
> set is bisectable. Please reference the changelog below for details
> on what has been altered from previous versions of this patch set.
> 
> v5 Changes:
> - Incorporate comments from Daniel Axtens
> - Incorporate comments from Andrew Donnellan 
> - Added additional clarifications to several commit messages
> - Specified some return codes as failures in "Fix function prolog..."
> - Made port online failure noisier in "Remove dual port online..."
> - Added patch to properly cleanup when encountering an unsupported AFU
> - Added patch to escalate a link reset on login timeout
> 
> v4 Changes:
> - Incorporate comments from Brian King
> - Removed unnecessary check_state() parameter from "Fix to avoid CXL..."
> - Added patch to fix potential deadlock on EEH
> - Removed patch to avoid state change collision
> - Changed fops initialization location in "Fix to avoid corrupting..."
> 
> v3 Changes:
> - Rebased the series on top of patch by Dan Carpenter ("a couple off...")
> - Incorporate comments from David Laight
> - Incorporate comments from Tomas Henzl
> - Incorporate comments from Brian King
> - Removed patch to stop interrupt processing on remove
> - Removed double scsi_device_put() from "Fix potential oops"
> - Fixed usage of scnprintf() in "Refine host/device attributes"
> - Removed unnecessary parenthesis from "Fix read capacity timeout"
> - Added patch to use correct operator for doubling delay
> - Changed location of cancel_work_sync() in "Fix to prevent workq..."
> - Removed local mutex from cxlflash_afu_sync() in "Fix to avoid state..."
> - Added patch to correctly identify a failed function in a trace
> - Added patch to fix a fops corruption bug
> 
> v2 Changes:
> - Incorporate comments from Ian Munsie
> - Rework commit messages to be more descriptive
> - Add state change serialization patch
> 
> Manoj Kumar (5):
>  cxlflash: Fix to avoid invalid port_sel value
>  cxlflash: Replace magic numbers with literals
>  cxlflash: Fix read capacity timeout
>  cxlflash: Fix to double the delay each time
>  cxlflash: Fix to escalate to LINK_RESET on login timeout
> 
> Matthew R. Ochs (29):
>  cxlflash: Fix potential oops following LUN removal
>  cxlflash: Fix data corruption when vLUN used over multiple cards
>  cxlflash: Fix to avoid sizeof(bool)
>  cxlflash: Fix context encode mask width
>  cxlflash: Fix to avoid CXL services during EEH
>  cxlflash: Correct naming of limbo state and waitq
>  cxlflash: Make functions static
>  cxlflash: Refine host/device attributes
>  cxlflash: Fix to avoid spamming the kernel log
>  cxlflash: Fix to avoid stall while waiting on TMF
>  cxlflash: Fix location of setting resid
>  cxlflash: Fix host link up event handling
>  cxlflash: Fix async interrupt bypass logic
>  cxlflash: Remove dual port online dependency
>  cxlflash: Fix AFU version access/storage and add check
>  cxlflash: Correct usage of scsi_host_put()
>  cxlflash: Fix to prevent workq from accessing freed memory
>  cxlflash: Correct behavior in device reset handler following EEH
>  cxlflash: Remove unnecessary scsi_block_requests
>  cxlflash: Fix function prolog parameters and return codes
>  cxlflash: Fix MMIO and endianness errors
>  cxlflash: Fix to prevent EEH recovery failure
>  cxlflash: Correct spelling, grammar, and alignment mistakes
>  cxlflash: Fix to prevent stale AFU RRQ
>  MAINTAINERS: Add cxlflash driver
>  cxlflash: Fix to avoid corrupting adapter fops
>  cxlflash: Correct trace string
>  cxlflash: Fix to avoid potential deadlock on EEH
>  cxlflash: Fix to avoid leaving dangling interrupt resources
> 
> MAINTAINERS   |9 +
> drivers/scsi/cxlflash/common.h|   30 +-
> drivers/scsi/cxlflash/lunmgt.c|9 +-
> drivers/scsi/cxlflash/main.c  | 1549 -
> drivers/scsi/cxlflash/main.h  |1 +
> drivers/scsi/cxlflash/sislite.h   |8 +-
> drivers/scsi/cxlflash/superpipe.c |  204 +++--
> drivers/scsi/cxlflash/superpipe.h |   13 +-
> drivers/scsi/cxlflash/vlun.c  |   68 +-
> 9 files changed, 1055 insertions(+), 836 deletions(-)
> 
> -- 
> 2.1.0
> 


[PATCH 1/2] powerpc/iommu: expose IOMMU page shift

2015-10-02 Thread Nishanth Aravamudan
We will leverage this macro in the NVMe driver, which needs to know the
configured IOMMU page shift to properly configure its device's page
size.

Signed-off-by: Nishanth Aravamudan 

---
Given this is available, it seems reasonable to expose -- and it doesn't
really make sense to make the driver do a log2 call on the existing
IOMMU_PAGE_SIZE() value.
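
As a rough illustration of why exposing the shift directly is cheaper than recomputing it, here is a userspace sketch. Everything prefixed demo_ or DEMO_ is a stand-in invented for this example (the real definitions live in asm/iommu.h), and demo_ilog2() is a naive loop, not the kernel's ilog2().

```c
/* Stand-in for the one field of struct iommu_table used here. */
struct demo_iommu_table {
    unsigned long it_page_shift;
};

/* Shift is read straight from the table; size is derived from it. */
#define DEMO_IOMMU_PAGE_SHIFT(tblptr) ((tblptr)->it_page_shift)
#define DEMO_IOMMU_PAGE_SIZE(tblptr)  (1UL << (tblptr)->it_page_shift)

/* Naive floor(log2(v)) for v >= 1; only here to show the round trip. */
static unsigned int demo_ilog2(unsigned long v)
{
    unsigned int r = 0;

    while (v >>= 1)
        r++;
    return r;
}
```

Going from the size back to the shift with a log2 only recovers the value the table already stores, which is the round trip the note above wants to avoid.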

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index ca18cff..6fdf857 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -36,6 +36,7 @@
 #define IOMMU_PAGE_MASK_4K   (~((1 << IOMMU_PAGE_SHIFT_4K) - 1))
 #define IOMMU_PAGE_ALIGN_4K(addr) _ALIGN_UP(addr, IOMMU_PAGE_SIZE_4K)
 
+#define IOMMU_PAGE_SHIFT(tblptr) (tblptr)->it_page_shift
 #define IOMMU_PAGE_SIZE(tblptr) (ASM_CONST(1) << (tblptr)->it_page_shift)
 #define IOMMU_PAGE_MASK(tblptr) (~((1 << (tblptr)->it_page_shift) - 1))
 #define IOMMU_PAGE_ALIGN(addr, tblptr) _ALIGN_UP(addr, IOMMU_PAGE_SIZE(tblptr))


Re: [PATCH 2/2] drivers/nvme: default to the IOMMU page size on Power

2015-10-02 Thread Christoph Hellwig
Hi Nishanth,

please expose this value through the generic DMA API instead of adding
architecture specific hacks to drivers.

[PATCH 3/5 v2] powerpc/dma: implement per-platform dma_get_page_shift

2015-10-02 Thread Nishanth Aravamudan
The IOMMU page size is not always stored in struct iommu_table on Power.
Specifically, if a device is configured for DDW (Dynamic DMA Windows, a.k.a.
64-bit direct DMA), the used TCE (Translation Control Entry) size is
stored in a special device property created at run-time by the DDW
configuration code. DDW is a pseries-specific feature, so allow
platforms to override the implementation of dma_get_page_shift if
desired.

Signed-off-by: Nishanth Aravamudan 

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index cab6753..5c372e3 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -78,9 +78,10 @@ struct machdep_calls {
 #endif
 #endif /* CONFIG_PPC64 */
 
-   /* Platform set_dma_mask and dma_get_required_mask overrides */
+   /* Platform overrides */
int (*dma_set_mask)(struct device *dev, u64 dma_mask);
u64 (*dma_get_required_mask)(struct device *dev);
+   unsigned long   (*dma_get_page_shift)(struct device *dev);
 
int (*probe)(void);
void (*setup_arch)(void); /* Optional, may be NULL */
diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index e805af2..c363896 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -338,6 +338,8 @@ EXPORT_SYMBOL(dma_set_mask);
 unsigned long dma_get_page_shift(struct device *dev)
 {
struct iommu_table *tbl = get_iommu_table_base(dev);
+   if (ppc_md.dma_get_page_shift)
+   return ppc_md.dma_get_page_shift(dev);
if (tbl)
return tbl->it_page_shift;
return PAGE_SHIFT;


[PATCH 0/5 v2] Fix NVMe driver support on Power with 32-bit DMA

2015-10-02 Thread Nishanth Aravamudan
We received a bug report recently when DDW (64-bit direct DMA on Power)
is not enabled for NVMe devices. In that case, we fall back to 32-bit
DMA via the IOMMU, which is always done via 4K TCEs (Translation Control
Entries).
 
The NVMe device driver, though, assumes that the DMA alignment for the
PRP entries will match the device's page size, and that the DMA alignment
matches the kernel's page alignment. On Power, the IOMMU page size,
as mentioned above, can be 4K, while the device can have a page size of
8K, while the kernel has a page size of 64K. This eventually trips the
BUG_ON in nvme_setup_prps(), as we have a 'dma_len' that is a multiple
of 4K but not 8K (e.g., 0xF000).
 
In this particular case, and generally, we want to use the IOMMU's page
size for the default device page size, rather than the kernel's page
size.
 
This series consists of five patches:

1) add a generic dma_get_page_shift implementation that just returns
PAGE_SHIFT
2) override the generic implementation on Power to use the IOMMU table's
page shift if available
3) allow further specific overriding on Power with machdep platform
overrides
4) use the machdep override on pseries, as the DDW code puts the TCE
shift in a special property and there is no IOMMU table available
5) leverage the new API in the NVMe driver
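
The layering in steps 1) to 4) can be sketched in plain C as follows. This is a userspace model, not the kernel code: all demo_ names are invented for the example, DEMO_PAGE_SHIFT assumes 64K kernel pages, and the value returned by demo_ddw_shift() is arbitrary.

```c
#include <stddef.h>

#define DEMO_PAGE_SHIFT 16UL    /* assumption: 64K kernel pages */

/* Stand-in for the one field of struct iommu_table used here. */
struct demo_iommu_table {
    unsigned long it_page_shift;
};

struct demo_device {
    struct demo_iommu_table *tbl;   /* NULL when there is no IOMMU table */
};

/* Optional platform hook, playing the role of a machdep override. */
static unsigned long (*demo_platform_get_page_shift)(struct demo_device *dev);

/* Platform hook first, then the IOMMU table, then the kernel page shift. */
static unsigned long demo_dma_get_page_shift(struct demo_device *dev)
{
    if (demo_platform_get_page_shift)
        return demo_platform_get_page_shift(dev);
    if (dev->tbl)
        return dev->tbl->it_page_shift;
    return DEMO_PAGE_SHIFT;
}

/* Example platform hook; the returned shift is an arbitrary value. */
static unsigned long demo_ddw_shift(struct demo_device *dev)
{
    (void)dev;
    return 24;
}
```

A driver in step 5) would then only ever call the single lookup function and never needs to know which of the three sources answered.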
 
With these patches, an NVMe device survives our internal hardware
exerciser; the kernel BUGs within a few seconds without the patch.

 arch/powerpc/include/asm/dma-mapping.h   |  3 +++
 arch/powerpc/include/asm/machdep.h   |  3 ++-
 arch/powerpc/kernel/dma.c| 11 +++
 arch/powerpc/platforms/pseries/iommu.c   | 36
 drivers/block/nvme-core.c|  3 ++-
 include/asm-generic/dma-mapping-common.h |  7 +++
 6 files changed, 61 insertions(+), 2 deletions(-)

v1 -> v2:
  Based upon feedback from Christoph Hellwig, rather than using an
  arch-specific hack, expose the DMA page shift via a generic DMA API and
  override it on Power as needed.


[PATCH 4/5 v2] pseries/iommu: implement DDW-aware dma_get_page_shift

2015-10-02 Thread Nishanth Aravamudan
When DDW (Dynamic DMA Windows) are present for a device, we have stored
the TCE (Translation Control Entry) size in a special device tree
property. Check if we have enabled DDW for the device and return the TCE
size from that property if present. If the property isn't present,
fall back to looking the value up in struct iommu_table. If we don't find
an iommu_table, fall back to the kernel's page size.
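
A minimal userspace sketch of that three-step fallback, including the big-endian decode of the stored shift, might look like this. The demo_ names and the flat-argument interface are assumptions made for the example; the kernel code walks device tree nodes and uses be32_to_cpu() instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a 32-bit big-endian value from raw bytes, on any host. */
static uint32_t demo_be32_to_cpu(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/*
 * Hypothetical lookup order:
 *   1. property bytes present    -> decoded TCE shift
 *   2. table shift known (!= 0)  -> table shift
 *   3. otherwise                 -> kernel page shift
 */
static unsigned long demo_tce_shift(const uint8_t *prop,
                                    unsigned long tbl_shift,
                                    unsigned long page_shift)
{
    if (prop)
        return demo_be32_to_cpu(prop);
    if (tbl_shift)
        return tbl_shift;
    return page_shift;
}
```

The explicit byte decode matters because device tree properties are stored big-endian, while the kernel may be running little-endian.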

Signed-off-by: Nishanth Aravamudan 

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 0946b98..1bf6471 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1292,6 +1292,40 @@ static u64 dma_get_required_mask_pSeriesLP(struct device *dev)
return dma_iommu_ops.get_required_mask(dev);
 }
 
+static unsigned long dma_get_page_shift_pSeriesLP(struct device *dev)
+{
+   struct iommu_table *tbl;
+
+   if (!disable_ddw && dev_is_pci(dev)) {
+   struct pci_dev *pdev = to_pci_dev(dev);
+   struct device_node *dn;
+
+   dn = pci_device_to_OF_node(pdev);
+
+   /* search upwards for ibm,dma-window */
+   for (; dn && PCI_DN(dn) && !PCI_DN(dn)->table_group;
+   dn = dn->parent)
+   if (of_get_property(dn, "ibm,dma-window", NULL))
+   break;
+   /*
+* if there is a DDW configuration, the TCE shift is stored in
+* the property
+*/
+   if (dn && PCI_DN(dn)) {
+   const struct dynamic_dma_window_prop *direct64 =
+   of_get_property(dn, DIRECT64_PROPNAME, NULL);
+   if (direct64)
+   return be32_to_cpu(direct64->tce_shift);
+   }
+   }
+
+   tbl = get_iommu_table_base(dev);
+   if (tbl)
+   return tbl->it_page_shift;
+
+   return PAGE_SHIFT;
+}
+
 #else  /* CONFIG_PCI */
 #define pci_dma_bus_setup_pSeries  NULL
 #define pci_dma_dev_setup_pSeries  NULL
@@ -1299,6 +1333,7 @@ static u64 dma_get_required_mask_pSeriesLP(struct device *dev)
 #define pci_dma_dev_setup_pSeriesLP NULL
 #define dma_set_mask_pSeriesLP NULL
 #define dma_get_required_mask_pSeriesLP NULL
+#define dma_get_page_shift_pSeriesLP   NULL
 #endif /* !CONFIG_PCI */
 
 static int iommu_mem_notifier(struct notifier_block *nb, unsigned long action,
@@ -1395,6 +1430,7 @@ void iommu_init_early_pSeries(void)
pseries_pci_controller_ops.dma_dev_setup = pci_dma_dev_setup_pSeriesLP;
ppc_md.dma_set_mask = dma_set_mask_pSeriesLP;
ppc_md.dma_get_required_mask = dma_get_required_mask_pSeriesLP;
+   ppc_md.dma_get_page_shift = dma_get_page_shift_pSeriesLP;
} else {
pseries_pci_controller_ops.dma_bus_setup = pci_dma_bus_setup_pSeries;
pseries_pci_controller_ops.dma_dev_setup = pci_dma_dev_setup_pSeries;


Re: Missing operand for tlbie instruction on Power7

2015-10-02 Thread Denis Kirjanov
On 10/2/15, Laura Abbott  wrote:
> Hi,
>
> We received a report (https://bugzilla.redhat.com/show_bug.cgi?id=1267395)
> of bad assembly
> when compiling on powerpc with little endian
>
> [labbott@labbott-redhat-machine linux_upstream]$ make ARCH=powerpc
> CROSS_COMPILE=powerpc64-linux-gnu-
>CHK include/config/kernel.release
>CHK include/generated/uapi/linux/version.h
>CHK include/generated/utsrelease.h
>CHK include/generated/bounds.h
>CHK include/generated/timeconst.h
>CHK include/generated/asm-offsets.h
>CALL scripts/checksyscalls.sh
>CHK include/generated/compile.h
>CALL arch/powerpc/kernel/systbl_chk.sh
>AS  arch/powerpc/kernel/swsusp_asm64.o
> arch/powerpc/kernel/swsusp_asm64.S: Assembler messages:
> arch/powerpc/kernel/swsusp_asm64.S:188: Error: missing operand
> scripts/Makefile.build:294: recipe for target
> 'arch/powerpc/kernel/swsusp_asm64.o' failed
> make[1]: *** [arch/powerpc/kernel/swsusp_asm64.o] Error 1
> Makefile:941: recipe for target 'arch/powerpc/kernel' failed
> make: *** [arch/powerpc/kernel] Error 2
>
> This problem started happening after a binutils update:
>
> [labbott@labbott-redhat-machine linux_upstream]$ powerpc64-linux-gnu-as
> --version
> GNU assembler version 2.25.1-1.fc22
> Copyright (C) 2014 Free Software Foundation, Inc.
> This program is free software; you may redistribute it under the terms of
> the GNU General Public License version 3 or later.
> This program has absolutely no warranty.
> This assembler was configured for a target of `powerpc64-linux-gnu'.
> [labbott@labbott-redhat-machine linux_upstream]$
>
> After some discussion with the binutils folks, it turns out that the tlbie
> instruction actually requires another operand and binutils was updated to
> check for this https://sourceware.org/ml/binutils/2015-05/msg00133.html .
>
> The code sequence in arch/powerpc/include/asm/ppc_asm.h now needs to be
> updated:
>
> #if !defined(CONFIG_4xx) && !defined(CONFIG_8xx)
> #define tlbia                         \
>       li      r4,1024;                \
>       mtctr   r4;                     \
>       lis     r4,KERNELBASE@h;        \
> 0:    tlbie   r4;                     \
>       addi    r4,r4,0x1000;           \
>       bdnz    0b
> #endif
>
> I don't know enough ppc assembly to properly fix this but I can test.

Could you please test the patch attached?



>
> Thanks,
> Laura
>
diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index dd0fc18..240557a 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -445,7 +445,7 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,945)
 	li	r4,1024;			\
 	mtctr	r4;\
 	lis	r4,KERNELBASE@h;		\
-0:	tlbie	r4;\
+0:	tlbie	r4, 0;\
 	addi	r4,r4,0x1000;			\
 	bdnz	0b
 #endif

[PATCH 2/5 v2] powerpc/dma-mapping: override dma_get_page_shift

2015-10-02 Thread Nishanth Aravamudan
On Power, the kernel's page size can differ from the IOMMU's page size,
so we need to override the generic implementation, which always returns
the kernel's page size. Lookup the IOMMU's page size from struct
iommu_table, if available. Fallback to the kernel's page size,
otherwise.

Signed-off-by: Nishanth Aravamudan 

diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
index 7f522c0..c5638f4 100644
--- a/arch/powerpc/include/asm/dma-mapping.h
+++ b/arch/powerpc/include/asm/dma-mapping.h
@@ -125,6 +125,9 @@ static inline void set_dma_offset(struct device *dev, dma_addr_t off)
 #define HAVE_ARCH_DMA_SET_MASK 1
 extern int dma_set_mask(struct device *dev, u64 dma_mask);
 
+#define HAVE_ARCH_DMA_GET_PAGE_SHIFT 1
+extern unsigned long dma_get_page_shift(struct device *dev);
+
 #include 
 
 extern int __dma_set_mask(struct device *dev, u64 dma_mask);
diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index 59503ed..e805af2 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -335,6 +335,15 @@ int dma_set_mask(struct device *dev, u64 dma_mask)
 }
 EXPORT_SYMBOL(dma_set_mask);
 
+unsigned long dma_get_page_shift(struct device *dev)
+{
+   struct iommu_table *tbl = get_iommu_table_base(dev);
+   if (tbl)
+   return tbl->it_page_shift;
+   return PAGE_SHIFT;
+}
+EXPORT_SYMBOL(dma_get_page_shift);
+
 u64 __dma_get_required_mask(struct device *dev)
 {
struct dma_map_ops *dma_ops = get_dma_ops(dev);


[PATCH 1/5 v2] dma-mapping: add generic dma_get_page_shift API

2015-10-02 Thread Nishanth Aravamudan
Drivers like NVMe need to be able to determine the page size used for
DMA transfers. Add a new API that defaults to returning PAGE_SHIFT on all
architectures.

Signed-off-by: Nishanth Aravamudan 

diff --git a/include/asm-generic/dma-mapping-common.h b/include/asm-generic/dma-mapping-common.h
index b1bc954..86e4e97 100644
--- a/include/asm-generic/dma-mapping-common.h
+++ b/include/asm-generic/dma-mapping-common.h
@@ -355,4 +355,11 @@ static inline int dma_set_mask(struct device *dev, u64 mask)
 }
 #endif
 
+#ifndef HAVE_ARCH_DMA_GET_PAGE_SHIFT
+static inline unsigned long dma_get_page_shift(struct device *dev)
+{
+   return PAGE_SHIFT;
+}
+#endif
+
 #endif


Re: Missing operand for tlbie instruction on Power7

2015-10-02 Thread Peter Bergner
On Fri, 2015-10-02 at 22:03 +0300, Denis Kirjanov wrote:
>> arch/powerpc/kernel/swsusp_asm64.S: Assembler messages:
>> arch/powerpc/kernel/swsusp_asm64.S:188: Error: missing operand
>> scripts/Makefile.build:294: recipe for target
>> 'arch/powerpc/kernel/swsusp_asm64.o' failed
>> make[1]: *** [arch/powerpc/kernel/swsusp_asm64.o] Error 1
>> Makefile:941: recipe for target 'arch/powerpc/kernel' failed
>> make: *** [arch/powerpc/kernel] Error 2
[snip]
>> I don't know enough ppc assembly to properly fix this but I can test.
> 
> Could you please test the patch attached?
[snip]
> -0: tlbie   r4; \
> +0: tlbie   r4, 0;  \

This isn't correct.  With POWER7 and later (which this compile
is, since it's on LE), the tlbie instruction takes two register
operands:

tlbie RB, RS

The tlbie instruction on pre POWER7 cpus had one required register
operand (RB) and an optional second L operand, where if you omitted
it, it was the same as using "0":

tlbie RB, L

This is a POWER7 and later build, so your change which adds the "0"
above is really adding r0 for RS.  The new tlbie instruction doesn't
treat r0 specially, so you'll be using whatever random bits
happen to be in r0, which I don't think is what you want.


Peter




[PATCH 5/5 v2] drivers/nvme: default to the IOMMU page size

2015-10-02 Thread Nishanth Aravamudan
We received a bug report recently when DDW (64-bit direct DMA on Power)
is not enabled for NVMe devices. In that case, we fall back to 32-bit
DMA via the IOMMU, which is always done via 4K TCEs (Translation Control
Entries).

The NVMe device driver, though, assumes that the DMA alignment for the
PRP entries will match the device's page size, and that the DMA alignment
matches the kernel's page alignment. On Power, the IOMMU page size,
as mentioned above, can be 4K, while the device can have a page size of
8K, while the kernel has a page size of 64K. This eventually trips the
BUG_ON in nvme_setup_prps(), as we have a 'dma_len' that is a multiple
of 4K but not 8K (e.g., 0xF000).

In this particular case of page sizes, we clearly want to use the
IOMMU's page size in the driver. And generally, the NVMe driver in this
function should be using the IOMMU's page size for the default device
page size, rather than the kernel's page size.
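
To see how the starting page shift interacts with the device limits, here is a hedged sketch of the selection logic. The "+ 12" conversion from the MPSMIN/MPSMAX capability fields and the "too small" check appear in the diff context; clamping down to the device maximum is an assumption about the surrounding driver code, and demo_pick_page_shift() is a name invented for this example.

```c
/*
 * Hypothetical model of choosing the device page shift.
 * Returns the shift to program, or -1 when the device's minimum page
 * size is larger than what the host side offers.
 */
static int demo_pick_page_shift(unsigned int host_shift,
                                unsigned int mpsmin, unsigned int mpsmax)
{
    /* The capability fields encode page sizes as powers of two from 4K. */
    unsigned int dev_page_min = mpsmin + 12;
    unsigned int dev_page_max = mpsmax + 12;

    if (host_shift < dev_page_min)
        return -1;
    /* Assumption: a larger host shift is clamped to the device maximum. */
    if (host_shift > dev_page_max)
        host_shift = dev_page_max;
    return (int)host_shift;
}
```

Under these assumptions, starting from a 64K kernel page shift (16) with a device that tops out at 8K yields 13, which is the mismatch with 4K TCEs described above; starting from the IOMMU shift (12) yields 12.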

With this patch, an NVMe device survives our internal hardware
exerciser; the kernel BUGs within a few seconds without the patch.

---
v1 -> v2:
  Based upon feedback from Christoph Hellwig, implement the IOMMU page
  size lookup as a generic DMA API, rather than an architecture-specific
  hack.

diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index b97fc3f..c561137 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme-core.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1713,7 +1714,7 @@ static int nvme_configure_admin_queue(struct nvme_dev *dev)
u32 aqa;
u64 cap = readq(&dev->bar->cap);
struct nvme_queue *nvmeq;
-   unsigned page_shift = PAGE_SHIFT;
+   unsigned page_shift = dma_get_page_shift(dev->dev);
unsigned dev_page_min = NVME_CAP_MPSMIN(cap) + 12;
unsigned dev_page_max = NVME_CAP_MPSMAX(cap) + 12;
 


Re: [PATCH 0/5 v2] Fix NVMe driver support on Power with 32-bit DMA

2015-10-02 Thread Benjamin Herrenschmidt
On Fri, 2015-10-02 at 13:09 -0700, Nishanth Aravamudan wrote:

> 1) add a generic dma_get_page_shift implementation that just returns
> PAGE_SHIFT

So you chose to return the granularity of the iommu to the driver
rather than providing a way for the driver to request a specific
alignment for DMA mappings. Any specific reason ?

Cheers,
Ben.


Re: [PATCH] powerpc/pseries: hibernation/migration should honor topology update policy

2015-10-02 Thread Tyrel Datwyler
Ping?

I don't think I ever saw anything happen with this patch. On another note,
I notice in retrospect that the patch subject isn't really stated in the
imperative mood. Maybe it would be better as:

powerpc/pseries: make hibernation/migration honor topology update policy

-Tyrel

On 05/05/2015 10:53 AM, Tyrel Datwyler wrote:
> From: Tyrel Datwyler 
> 
> The suspend call paths for hibernation and migration operations call
> stop_topology_update() and start_topology_update() respectively prior to
> suspending the LPAR and upon resume. Topology updating can be
> enabled/disabled from userspace and no check is currently done to determine
> the current policy. This results in topology updates being started upon
> resume from hibernation/migration even in the case where topology updates
> were disabled initially.
> 
> This fixes the issue by storing the current policy and only calling
> start_topology_update() in the case where either PRRN/VPHN were enabled to
> start with.
> 
> Fixes: e04fa61214a3 (powerpc/pseries: Add /proc interface to control topology updates)
> 
> Signed-off-by: Tyrel Datwyler 
> Cc: Nathan Fontenot 
> Cc: Nishanth Aravamudan 
> Cc: sta...@vger.kernel.org
> ---
>  arch/powerpc/include/asm/topology.h  | 5 +
>  arch/powerpc/kernel/rtas.c   | 4 +++-
>  arch/powerpc/mm/numa.c   | 5 +
>  arch/powerpc/platforms/pseries/suspend.c | 5 -
>  4 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/topology.h 
> b/arch/powerpc/include/asm/topology.h
> index 5f1048e..44f6519 100644
> --- a/arch/powerpc/include/asm/topology.h
> +++ b/arch/powerpc/include/asm/topology.h
> @@ -63,6 +63,7 @@ static inline void sysfs_remove_device_from_node(struct 
> device *dev,
>  extern int start_topology_update(void);
>  extern int stop_topology_update(void);
>  extern int prrn_is_enabled(void);
> +extern int vphn_is_enabled(void);
>  #else
>  static inline int start_topology_update(void)
>  {
> @@ -76,6 +77,10 @@ static inline int prrn_is_enabled(void)
>  {
>   return 0;
>  }
> +static inline int vphn_is_enabled(void)
> +{
> + return 0;
> +}
>  #endif /* CONFIG_NUMA && CONFIG_PPC_SPLPAR */
> 
>  #include 
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 7a488c1..7ae9992 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -906,6 +906,7 @@ int rtas_ibm_suspend_me(u64 handle)
>   DECLARE_COMPLETION_ONSTACK(done);
>   cpumask_var_t offline_mask;
>   int cpuret;
> + int restart_topology_updates = (prrn_is_enabled() || vphn_is_enabled());
> 
>   if (!rtas_service_present("ibm,suspend-me"))
>   return -ENOSYS;
> @@ -957,7 +958,8 @@ int rtas_ibm_suspend_me(u64 handle)
>   if (atomic_read(&data.error) != 0)
>   printk(KERN_ERR "Error doing global join\n");
> 
> - start_topology_update();
> + if (restart_topology_updates)
> + start_topology_update();
> 
>   /* Take down CPUs not online prior to suspend */
>   cpuret = rtas_offline_cpus_mask(offline_mask);
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 5e80621..885d950 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -1593,6 +1593,11 @@ int prrn_is_enabled(void)
>   return prrn_enabled;
>  }
> 
> +int vphn_is_enabled(void)
> +{
> + return vphn_enabled;
> +}
> +
>  static int topology_read(struct seq_file *file, void *v)
>  {
>   if (vphn_enabled || prrn_enabled)
> diff --git a/arch/powerpc/platforms/pseries/suspend.c 
> b/arch/powerpc/platforms/pseries/suspend.c
> index e76aefa..b5f92e2 100644
> --- a/arch/powerpc/platforms/pseries/suspend.c
> +++ b/arch/powerpc/platforms/pseries/suspend.c
> @@ -147,6 +147,7 @@ static ssize_t store_hibernate(struct device *dev,
>  {
>   cpumask_var_t offline_mask;
>   int rc;
> + int restart_topology_updates = (prrn_is_enabled() || vphn_is_enabled());
> 
>   if (!capable(CAP_SYS_ADMIN))
>   return -EPERM;
> @@ -175,7 +176,9 @@ static ssize_t store_hibernate(struct device *dev,
> 
>   stop_topology_update();
>   rc = pm_suspend(PM_SUSPEND_MEM);
> - start_topology_update();
> +
> + if (restart_topology_updates)
> + start_topology_update();
> 
>   /* Take down CPUs not online prior to suspend */
>   if (!rtas_offline_cpus_mask(offline_mask))
> 
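
For illustration, the policy-preserving logic described in the quoted patch can be modelled in userspace C roughly as follows. This is only a sketch: the flag variables, the stubbed start/stop functions and the name suspend_and_resume_sketch() are assumptions standing in for the kernel code, not the patch itself.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's policy state (assumed for illustration). */
static bool prrn_enabled;
static bool vphn_enabled;
static bool updates_running;

static int prrn_is_enabled(void) { return prrn_enabled; }
static int vphn_is_enabled(void) { return vphn_enabled; }

static void start_topology_update(void) { updates_running = true; }
static void stop_topology_update(void)  { updates_running = false; }

/*
 * Hypothetical model of the suspend path: sample the policy before
 * stopping updates, and restart them afterwards only if either
 * PRRN or VPHN was enabled to begin with.
 */
static void suspend_and_resume_sketch(void)
{
	int restart = (prrn_is_enabled() || vphn_is_enabled());

	stop_topology_update();
	/* ... the LPAR would be suspended and resumed here ... */
	if (restart)
		start_topology_update();
}
```

With both flags clear, updates stay stopped after the call; with either flag set, they are restarted.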


Re: [PATCH 0/5 v2] Fix NVMe driver support on Power with 32-bit DMA

2015-10-02 Thread Benjamin Herrenschmidt
On Fri, 2015-10-02 at 14:04 -0700, Nishanth Aravamudan wrote:
> Right, I did start with your advice and tried that approach, but it
> turned out I was wrong about the actual issue at the time. The problem
> for NVMe isn't actually the starting address alignment (which it can
> handle not being aligned to the device's page size). It doesn't handle
> ((addr + len) % dev_page_size != 0). That is, it's really a length
> alignment issue.
> 
> It seems incredibly device specific to have an API into the DMA code
> to request an end alignment -- no other device seems to have this
> issue/design. If you think that's better, I can fiddle with that
> instead.
> 
> Sorry, I should have called this out better as an alternative
> consideration.

Nah it's fine. Ok. Also adding the alignment requirement to the API
would have been a much more complex patch since it would have had to
be implemented for all archs.

I think your current solution is fine.

Cheers,
Ben.


Re: [PATCH 0/5 v2] Fix NVMe driver support on Power with 32-bit DMA

2015-10-02 Thread Nishanth Aravamudan
On 03.10.2015 [07:35:09 +1000], Benjamin Herrenschmidt wrote:
> On Fri, 2015-10-02 at 14:04 -0700, Nishanth Aravamudan wrote:
> > Right, I did start with your advice and tried that approach, but it
> > turned out I was wrong about the actual issue at the time. The problem
> > for NVMe isn't actually the starting address alignment (which it can
> > handle not being aligned to the device's page size). It doesn't handle
> > ((addr + len) % dev_page_size != 0). That is, it's really a length
> > alignment issue.
> > 
> > It seems incredibly device specific to have an API into the DMA code
> > to request an end alignment -- no other device seems to have this
> > issue/design. If you think that's better, I can fiddle with that
> > instead.
> > 
> > Sorry, I should have called this out better as an alternative
> > consideration.
> 
> Nah it's fine. Ok. Also adding the alignment requirement to the API
> would have been a much more complex patch since it would have had to
> be implemented for all archs.
> 
> I think your current solution is fine.

Great, thanks. Also, while it's possible an alignment API would be more
performant...we're already not using DDW on Power in this case,
performance is not a primary concern. We want to simply be
functional/correct in this configuration.

-Nish


Re: [PATCH 0/5 v2] Fix NVMe driver support on Power with 32-bit DMA

2015-10-02 Thread Nishanth Aravamudan
On 03.10.2015 [06:51:06 +1000], Benjamin Herrenschmidt wrote:
> On Fri, 2015-10-02 at 13:09 -0700, Nishanth Aravamudan wrote:
> 
> > 1) add a generic dma_get_page_shift implementation that just returns
> > PAGE_SHIFT
> 
> So you chose to return the granularity of the iommu to the driver
> rather than providing a way for the driver to request a specific
> alignment for DMA mappings. Any specific reason ?

Right, I did start with your advice and tried that approach, but it
turned out I was wrong about the actual issue at the time. The problem
for NVMe isn't actually the starting address alignment (which it can
handle not being aligned to the device's page size). It doesn't handle
((addr + len) % dev_page_size != 0). That is, it's really a length
alignment issue.
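
As a rough illustration of the distinction between start alignment and end alignment, here is a minimal sketch in C. The helper names are hypothetical and do not come from the NVMe driver or the DMA API; it also assumes dev_page_size is a power of two.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when addr is a multiple of the (power-of-two) device page size. */
static bool start_is_aligned(uint64_t addr, uint64_t dev_page_size)
{
	return (addr & (dev_page_size - 1)) == 0;
}

/*
 * True when the segment [addr, addr + len) ends on a device page
 * boundary, i.e. (addr + len) % dev_page_size == 0.
 */
static bool end_is_aligned(uint64_t addr, uint64_t len, uint64_t dev_page_size)
{
	return ((addr + len) & (dev_page_size - 1)) == 0;
}
```

For example, with a 4K device page a segment at 0x1200 of length 0xe00 has an unaligned start but an aligned end, while a segment at 0x1000 of length 0x800 has an aligned start but an unaligned end.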

It seems incredibly device specific to have an API into the DMA code
to request an end alignment -- no other device seems to have this
issue/design. If you think that's better, I can fiddle with that
instead.

Sorry, I should have called this out better as an alternative
consideration.

-Nish



Re: Missing operand for tlbie instruction on Power7

2015-10-02 Thread Denis Kirjanov
On 10/2/15, Peter Bergner  wrote:
> On Fri, 2015-10-02 at 22:03 +0300, Denis Kirjanov wrote:
>> arch/powerpc/kernel/swsusp_asm64.S: Assembler messages:
>>> arch/powerpc/kernel/swsusp_asm64.S:188: Error: missing operand
>>> scripts/Makefile.build:294: recipe for target
>>> 'arch/powerpc/kernel/swsusp_asm64.o' failed
>>> make[1]: *** [arch/powerpc/kernel/swsusp_asm64.o] Error 1
>>> Makefile:941: recipe for target 'arch/powerpc/kernel' failed
>>> make: *** [arch/powerpc/kernel] Error 2
> [snip]
>>> I don't know enough ppc assembly to properly fix this but I can test.
>>
>> Could you please test the patch attached?
> [snip]
>> -0: tlbie   r4; \
>> +0: tlbie   r4, 0;  \
>
> This isn't correct.  With POWER7 and later (which this compile
> is, since it's on LE), the tlbie instruction takes two register
> operands:
>
> tlbie RB, RS
>
> The tlbie instruction on pre POWER7 cpus had one required register
> operand (RB) and an optional second L operand, where if you omitted
> it, it was the same as using "0":
>
> tlbie RB, L
>
> This is a POWER7 and later build, so your change which adds the "0"
> above is really adding r0 for RS.  The new tlbie instruction doesn't
> treat r0 specially, so you'll be using whatever random bits which
> happen to be in r0 which I don't think that is what you want.

Ok, then we can just zero out r5 for example and use it in tlbie as RS,
right?


>
>
> Peter
>
>
>
>
diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index dd0fc18..cb0f627 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -443,9 +443,10 @@ END_FTR_SECTION_NESTED(CPU_FTR_HAS_PPR,CPU_FTR_HAS_PPR,945)
 #if !defined(CONFIG_4xx) && !defined(CONFIG_8xx)
 #define tlbia	\
 	li	r4,1024;			\
+	li  r5,0;\
 	mtctr	r4;\
 	lis	r4,KERNELBASE@h;		\
-0:	tlbie	r4;\
+0:	tlbie	r4, r5;\
 	addi	r4,r4,0x1000;			\
 	bdnz	0b
 #endif

Re: Missing operand for tlbie instruction on Power7

2015-10-02 Thread Laura Abbott

On 10/02/2015 03:00 PM, Segher Boessenkool wrote:
> On Sat, Oct 03, 2015 at 12:37:35AM +0300, Denis Kirjanov wrote:
> > >> -0: tlbie   r4; \
> > >> +0: tlbie   r4, 0;  \
> > >
> > > This isn't correct.  With POWER7 and later (which this compile
> > > is, since it's on LE), the tlbie instruction takes two register
> > > operands:
> > >
> > > tlbie RB, RS
> > >
> > > The tlbie instruction on pre POWER7 cpus had one required register
> > > operand (RB) and an optional second L operand, where if you omitted
> > > it, it was the same as using "0":
> > >
> > > tlbie RB, L
> > >
> > > This is a POWER7 and later build, so your change which adds the "0"
> > > above is really adding r0 for RS.  The new tlbie instruction doesn't
> > > treat r0 specially, so you'll be using whatever random bits which
> > > happen to be in r0 which I don't think that is what you want.
> >
> > Ok, then we can just zero out r5 for example and use it in tlbie as RS,
> > right?
>
> That won't assemble _unless_ your assembler is in POWER7 mode.  It also
> won't do the right thing at run time on older machines.
>
> Where is this tlbia macro used at all, for 64-bit machines?




[labbott@labbott-redhat-machine linux_upstream]$ make ARCH=powerpc CROSS_COMPILE=powerpc64-linux-gnu-
  CHK     include/config/kernel.release
  CHK     include/generated/uapi/linux/version.h
  CHK     include/generated/utsrelease.h
  CHK     include/generated/bounds.h
  CHK     include/generated/timeconst.h
  CHK     include/generated/asm-offsets.h
  CALL    scripts/checksyscalls.sh
  CHK     include/generated/compile.h
  CALL    arch/powerpc/kernel/systbl_chk.sh
  AS      arch/powerpc/kernel/swsusp_asm64.o
arch/powerpc/kernel/swsusp_asm64.S: Assembler messages:
arch/powerpc/kernel/swsusp_asm64.S:188: Error: missing operand
scripts/Makefile.build:294: recipe for target 'arch/powerpc/kernel/swsusp_asm64.o' failed
make[1]: *** [arch/powerpc/kernel/swsusp_asm64.o] Error 1
Makefile:941: recipe for target 'arch/powerpc/kernel' failed
make: *** [arch/powerpc/kernel] Error 2

This is a piece of code protected by CONFIG_PPC_BOOK3S_64.
 


Segher




Re: [PATCH] powerpc/ps3: Remove unused os_area_db_id_video_mode

2015-10-02 Thread Geoff Levand
On Fri, 2015-09-25 at 12:14 +1000, Michael Ellerman wrote:
> This struct is unused, which is now a build error with gcc 6:
> 
>   error: 'os_area_db_id_video_mode' defined but not used
> 
> There doesn't seem to be any good reason to keep it around so remove it,
> it's in the history if anyone needs it.

Looks OK.

Acked-by: Geoff Levand 




Re: [v2,1/2] powerpc/selftest: Add gettimeofday() benchmark

2015-10-02 Thread Michael Ellerman
On Fri, 2015-25-09 at 04:01:39 UTC, Michael Neuling wrote:
> This adds a benchmark directory to the powerpc selftests and adds a
> gettimeofday() benchmark to it.
> 
> Suggested-by: Michael Ellerman 
> Signed-off-by: Michael Neuling 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/d17475d906fde8e9fe39fff3

cheers

Re: [RFC, 1/2] scripts/kconfig/Makefile: Allow KBUILD_DEFCONFIG to be a target

2015-10-02 Thread Michael Ellerman
On Wed, 2015-23-09 at 05:40:34 UTC, Michael Ellerman wrote:
> Arch Makefiles can set KBUILD_DEFCONFIG to tell kbuild the name of the
> defconfig that should be built by default.
> 
> However currently there is an assumption that KBUILD_DEFCONFIG points to
> a file at arch/$(SRCARCH)/configs/$(KBUILD_DEFCONFIG).
> 
> We would like to use a target, using merge_config, as our defconfig, so
> adapt the logic in scripts/kconfig/Makefile to allow that.
> 
> To minimise the chance of breaking anything, we first check if
> KBUILD_DEFCONFIG is a file, and if so we do the old logic. If it's not a
> file, then we call the top-level Makefile with KBUILD_DEFCONFIG as the
> target.
> 
> Signed-off-by: Michael Ellerman 
> Acked-by: Michal Marek 

Applied to powerpc next.

https://git.kernel.org/powerpc/c/d2036f30cfe1daa19e63ce75

cheers

Re: [1/3] drivers/ps3: Fix ps3-lpm white space

2015-10-02 Thread Michael Ellerman
On Mon, 2015-14-09 at 19:35:04 UTC, Rudhresh Kumar J wrote:
> Fixed a coding style issue.
> 
> Signed-off-by: Rudhresh Kumar J 
> Signed-off-by: Geoff Levand 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/12a509336701132f521c8fc2

cheers

Re: [v2, 2/2] powerpc/vdso: Avoid link stack corruption in __get_datapage()

2015-10-02 Thread Michael Ellerman
On Fri, 2015-25-09 at 04:01:40 UTC, Michael Neuling wrote:
> powerpc has a link register (lr) used for calling functions. We "bl
> <func>" to call a function, and "blr" to return back to the call site.



> For the benchmark in tools/testing/selftests/powerpc/benchmarks/gettimeofday.c
>   POWER8:
> 64bit gets ~4% improvement
> 32bit gets ~9% improvement
>   POWER7:
> 64bit gets ~7% improvement
> 
> Signed-off-by: Michael Neuling 
> Reported-by: Aaron Sawdey 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/c974809a26a13e40254dbe3c

cheers

Re: [2/3] drivers/ps3: Fix ps3-vuart null dereference

2015-10-02 Thread Michael Ellerman
On Mon, 2015-14-09 at 19:35:04 UTC, Colin King wrote:
> On the unlikely event that drv is null, the current code will
> perform a null pointer dereference with it when printing a dev_dbg
> message.  Instead, the BUG_ON check on drv should be performed
> before we emit the dev_dbg message.
> 
> Signed-off-by: Colin Ian King 
> Signed-off-by: Geoff Levand 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/6a6120bc5ec9e54d3cc06e73

cheers
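
To illustrate the ordering issue the commit message describes, here is a minimal userspace sketch. The structure and function names are hypothetical, not the real ps3-vuart types; abort() stands in for BUG_ON() and fprintf() for dev_dbg().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver structure; not the real ps3 type. */
struct sketch_driver {
	const char *name;
};

/*
 * Validate the pointer first, then log. Logging first would
 * dereference drv->name before the NULL check had a chance to fire.
 */
static const char *probe_sketch(const struct sketch_driver *drv)
{
	if (drv == NULL)	/* models BUG_ON(!drv) */
		abort();

	fprintf(stderr, "probing %s\n", drv->name);	/* models dev_dbg() */
	return drv->name;
}
```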

Re: [RFC,2/2] powerpc: Add ppc64le_defconfig

2015-10-02 Thread Michael Ellerman
On Wed, 2015-23-09 at 05:40:35 UTC, Michael Ellerman wrote:
> Based directly on ppc64_defconfig using merge_config.
> 
> Signed-off-by: Michael Ellerman 

Applied to powerpc next.

https://git.kernel.org/powerpc/c/2adc48a691866fbb3134dd3a

cheers

Re: [V2,3/3] powerpc/ps3: Refresh ps3_defconfig

2015-10-02 Thread Michael Ellerman
On Mon, 2015-14-09 at 21:36:35 UTC, Geoff Levand wrote:
> Refresh and remove obsolete CONFIG_EXT3_FS.
> 
> Signed-off-by: Geoff Levand 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/336382c78b926af7f3b22f73

cheers

Re: powerpc/mm: Add virt_to_pfn and use this instead of opencoding

2015-10-02 Thread Michael Ellerman
On Thu, 2015-03-09 at 07:50:56 UTC, "Aneesh Kumar K.V" wrote:
> This adds the helper virt_to_pfn and replaces the open-coded uses of
> the same.
> 
> Signed-off-by: Aneesh Kumar K.V 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/65d3223a853ac8598694064c

cheers
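
For readers unfamiliar with the idiom, a helper of this kind typically converts a kernel virtual address to a physical address and then shifts by the page size. The sketch below is a simplified userspace model: the SKETCH_* constants (4K pages, a linear map starting at 0xc000000000000000) and the function names are assumptions for illustration, not the definitions from the patch.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT  12				/* assumed 4K pages */
#define SKETCH_PAGE_OFFSET 0xc000000000000000ULL	/* assumed linear map base */

/* Simplified stand-in for __pa(): strip the linear map offset. */
static inline uint64_t sketch_virt_to_phys(uint64_t vaddr)
{
	return vaddr - SKETCH_PAGE_OFFSET;
}

/* Page frame number = physical address divided by the page size. */
static inline uint64_t sketch_virt_to_pfn(uint64_t vaddr)
{
	return sketch_virt_to_phys(vaddr) >> SKETCH_PAGE_SHIFT;
}
```

Any address within the same page maps to the same frame number, which is why callers that open-code the shift can be replaced by one helper.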

Re: powerpc: Kconfig: remove BE-only platforms from LE kernel build

2015-10-02 Thread Michael Ellerman
On Sun, 2015-06-09 at 23:58:00 UTC, Boqun Feng wrote:
> Currently, little endian is only supported on powernv and pseries.
> However, the Kconfigs still allow other platforms to be included in an
> LE kernel, which may waste space or even cause build errors if some
> BE-only platforms assume they are always built for a BE kernel. So
> modify the Kconfigs of BE-only platforms to exclude them from an LE
> kernel build.
> 
> For 32bit only platforms, nothing needs to be done, because
> CPU_LITTLE_ENDIAN depends on PPC64. For 64bit supported platforms, add
> CPU_BIG_ENDIAN to dependencies explicitly, so that these platforms will
> be disabled for LE [Suggested-by: Cédric Le Goater ].
> 
> Signed-off-by: Boqun Feng 
> Acked-by: Geoff Levand 

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/e5e16d8f3ec6973af2068897

cheers

Re: [v2] powerpc/slb: Define an enum for the bolted indexes

2015-10-02 Thread Michael Ellerman
On Thu, 2015-13-08 at 07:07:54 UTC, Michael Ellerman wrote:
> From: Anshuman Khandual 
> 
> This patch defines macros for the three bolted SLB indexes we use.
> Switch the functions that take the indexes as an argument to use the
> enum.
> 
> Signed-off-by: Anshuman Khandual 
> Signed-off-by: Michael Ellerman 

Applied to powerpc next.

https://git.kernel.org/powerpc/c/1d15010c349a26640e8f2495

cheers

Re: powerpc/vdso: Emit GNU & SysV hashes

2015-10-02 Thread Michael Ellerman
On Fri, 2015-07-08 at 03:05:42 UTC, Michael Ellerman wrote:
> Andy Lutomirski says:
> 
>   Some dynamic loaders may be slightly faster if a GNU hash is
>   available.
> 
>   This is unlikely to have any measurable effect on the time it takes
>   to resolve vdso symbols (since there are so few of them).  In some
>   contexts, it can be a win for a different reason: if every DSO has a
>   GNU hash section, then libc can avoid calculating SysV hashes at
>   all. Both musl and glibc appear to have this optimization.
> 
> Signed-off-by: Michael Ellerman 

Applied to powerpc next.

https://git.kernel.org/powerpc/c/787b393c9f6300c343600d39

cheers
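
As background on the two hash styles mentioned in the quoted text, here is a small sketch of the two symbol-name hash functions as they are commonly documented for ELF. The function names are illustrative and nothing here is taken from the patch itself.

```c
#include <stdint.h>

/* Classic SysV ELF hash, used to look up names in a DT_HASH table. */
static uint32_t sysv_hash(const char *name)
{
	uint32_t h = 0, g;

	while (*name) {
		h = (h << 4) + (unsigned char)*name++;
		g = h & 0xf0000000u;
		if (g)
			h ^= g >> 24;
		h &= ~g;
	}
	return h;
}

/* GNU hash, used with DT_GNU_HASH: h = h * 33 + c, seeded with 5381. */
static uint32_t gnu_hash(const char *name)
{
	uint32_t h = 5381;

	while (*name)
		h = h * 33 + (unsigned char)*name++;
	return h;
}
```

A loader that finds a GNU hash section in every DSO only needs to compute gnu_hash() once per symbol lookup and can skip sysv_hash() entirely, which is the optimization the quoted text refers to.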

Re: powerpc/slb: Use a local to avoid multiple calls to get_slb_shadow()

2015-10-02 Thread Michael Ellerman
On Thu, 2015-13-08 at 07:11:18 UTC, Michael Ellerman wrote:
> For no reason other than it looks ugly.
> 
> Signed-off-by: Michael Ellerman 

Applied to powerpc next.

https://git.kernel.org/powerpc/c/26cd835ef8bdc9ca6db03374

cheers

Re: [PATCH 4/5 v2] pseries/iommu: implement DDW-aware dma_get_page_shift

2015-10-02 Thread kbuild test robot
Hi Nishanth,

[auto build test results on v4.3-rc3 -- if it's inappropriate base, please 
ignore]

config: powerpc-defconfig (attached as .config)
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc

All error/warnings (new ones prefixed by >>):

   arch/powerpc/platforms/pseries/iommu.c: In function 'iommu_init_early_pSeries':
>> arch/powerpc/platforms/pseries/iommu.c:1433:9: error: 'struct machdep_calls' has no member named 'dma_get_page_shift'
      ppc_md.dma_get_page_shift = dma_get_page_shift_pSeriesLP;
            ^

vim +1433 arch/powerpc/platforms/pseries/iommu.c

  1427
  1428          if (firmware_has_feature(FW_FEATURE_LPAR)) {
  1429                  pseries_pci_controller_ops.dma_bus_setup = pci_dma_bus_setup_pSeriesLP;
  1430                  pseries_pci_controller_ops.dma_dev_setup = pci_dma_dev_setup_pSeriesLP;
  1431                  ppc_md.dma_set_mask = dma_set_mask_pSeriesLP;
  1432                  ppc_md.dma_get_required_mask = dma_get_required_mask_pSeriesLP;
> 1433                  ppc_md.dma_get_page_shift = dma_get_page_shift_pSeriesLP;
  1434          } else {
  1435                  pseries_pci_controller_ops.dma_bus_setup = pci_dma_bus_setup_pSeries;
  1436                  pseries_pci_controller_ops.dma_dev_setup = pci_dma_dev_setup_pSeries;

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation



Re: [PATCH 5/5 v2] drivers/nvme: default to the IOMMU page size

2015-10-02 Thread kbuild test robot
Hi Nishanth,

[auto build test results on v4.3-rc3 -- if it's inappropriate base, please 
ignore]

config: sparc64-allyesconfig (attached as .config)
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=sparc64

All error/warnings (new ones prefixed by >>):

   drivers/block/nvme-core.c: In function 'nvme_configure_admin_queue':
>> drivers/block/nvme-core.c:1717:2: error: implicit declaration of function 'dma_get_page_shift' [-Werror=implicit-function-declaration]
     unsigned page_shift = dma_get_page_shift(dev->dev);
     ^
   cc1: some warnings being treated as errors

vim +/dma_get_page_shift +1717 drivers/block/nvme-core.c

  1711  static int nvme_configure_admin_queue(struct nvme_dev *dev)
  1712  {
  1713          int result;
  1714          u32 aqa;
  1715          u64 cap = readq(&dev->bar->cap);
  1716          struct nvme_queue *nvmeq;
> 1717          unsigned page_shift = dma_get_page_shift(dev->dev);
  1718          unsigned dev_page_min = NVME_CAP_MPSMIN(cap) + 12;
  1719          unsigned dev_page_max = NVME_CAP_MPSMAX(cap) + 12;
  1720  

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation



Re: [PATCH 2/2] drivers/nvme: default to the IOMMU page size on Power

2015-10-02 Thread kbuild test robot
Hi Nishanth,

[auto build test results on v4.3-rc3 -- if it's inappropriate base, please 
ignore]

config: arm64-allmodconfig (attached as .config)
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=arm64

All error/warnings (new ones prefixed by >>):

>> drivers/block/nvme-core.c:45:23: fatal error: asm/iommu.h: No such file or directory
    #include <asm/iommu.h>
                          ^
   compilation terminated.

vim +45 drivers/block/nvme-core.c

39  #include 
40  #include 
41  #include 
42  #include 
43  #include 
44  #include 
  > 45  #include <asm/iommu.h>
46  
47  #define NVME_MINORS (1U << MINORBITS)
48  #define NVME_Q_DEPTH1024

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation



Re: Missing operand for tlbie instruction on Power7

2015-10-02 Thread Peter Bergner
On Fri, 2015-10-02 at 17:00 -0500, Segher Boessenkool wrote:
> On Sat, Oct 03, 2015 at 12:37:35AM +0300, Denis Kirjanov wrote:
> > >> -0: tlbie   r4; \
> > >> +0: tlbie   r4, 0;  \
> > >
> > > This isn't correct.  With POWER7 and later (which this compile
> > > is, since it's on LE), the tlbie instruction takes two register
> > > operands:
> > >
> > > tlbie RB, RS
> > >
> > > The tlbie instruction on pre POWER7 cpus had one required register
> > > operand (RB) and an optional second L operand, where if you omitted
> > > it, it was the same as using "0":
> > >
> > > tlbie RB, L
> > >
> > > This is a POWER7 and later build, so your change which adds the "0"
> > > above is really adding r0 for RS.  The new tlbie instruction doesn't
> > > treat r0 specially, so you'll be using whatever random bits which
> > > happen to be in r0 which I don't think that is what you want.
> > 
> > Ok, then we can just zero out r5 for example and use it in tlbie as RS,
> > right?
> 
> That won't assemble _unless_ your assembler is in POWER7 mode.  It also
> won't do the right thing at run time on older machines.

Correct, getting this to work on both pre-power7 and power7 and later
is tricky.  One really horrible hack would be to do:

  li r0,0
  tlbie r4,0

On pre-power7, the "0" will be taken as a zero L operand and on
power7 and later, it'll be r0, but with a zero value we loaded in
the insn before.  I know, really ugly. :-)

Peter

