Re: [RFC PATCH V1 0/8] KASAN ppc64 support

2015-08-18 Thread Aneesh Kumar K.V
Andrey Ryabinin ryabinin@gmail.com writes:

 2015-08-18 8:42 GMT+03:00 Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com:
 Andrey Ryabinin ryabinin@gmail.com writes:


 But that is introducing conditionals in core code for no real benefit.
 This also will break when we eventually end up tracking vmalloc?

 Ok, that's a very good reason to not do this.

 I see one potential problem in the way you use kasan_zero_page, though.
 memset/memcpy of large portions of memory (> 8 * PAGE_SIZE) will end up
 in overflowing kasan_zero_page when we check shadow in memory_is_poisoned_n()


Any suggestion on how to fix that? I guess we definitely don't want to
check for addr and size in memset/memcpy. The other option is to
do zero page mapping as is done for other architectures, that is, we map
a zero page via the page table. But we still have the issue of the memory
needed to map the entire vmalloc range (page table memory). I was hoping to
avoid all those complexities.
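
A minimal sketch of the zero-page shortcut under discussion, assuming the
ppc64 region macros and a statically allocated kasan_zero_page; the shadow
offset/shift constants here are placeholders, not the actual RFC code:

/* Illustrative only: point every non-linear-map address at one
 * always-zero shadow page so vmalloc/vmemmap need no shadow mapping. */
extern unsigned char kasan_zero_page[PAGE_SIZE];

static inline void *kasan_mem_to_shadow(const void *addr)
{
	if (REGION_ID((unsigned long)addr) != KERNEL_REGION_ID)
		return (void *)kasan_zero_page;	/* never poisoned */

	return (void *)(((unsigned long)addr >> KASAN_SHADOW_SCALE_SHIFT)
			+ KASAN_SHADOW_OFFSET);
}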


-aneesh


RE: [PATCH V2] powerpc/85xx: Remove unused pci fixup hooks on c293pcie

2015-08-18 Thread Hou Zhiqiang
Hi Scott,

Removed both pcibios_fixup_phb and pcibios_fixup_bus.
Could you please help to apply it?

 -Original Message-
 From: Zhiqiang Hou [mailto:b48...@freescale.com]
 Sent: 10 August 2015 17:40
 To: ga...@kernel.crashing.org; linuxppc-dev@lists.ozlabs.org; Wood Scott-
 B07421
 Cc: Hu Mingkai-B21284; Wang Dongsheng-B40534; Hou Zhiqiang-B48286
 Subject: [PATCH V2] powerpc/85xx: Remove unused pci fixup hooks on
 c293pcie
 
 From: Hou Zhiqiang b48...@freescale.com
 
 The c293pcie board is an endpoint device and it doesn't need PM, so
 remove hooks pcibios_fixup_phb and pcibios_fixup_bus.
 
 Signed-off-by: Hou Zhiqiang b48...@freescale.com
 ---
 Tested on c293pcie board:
 V2:
 Rename the title of this patch.
 Remove pcibios_fixup_bus that isn't used in EP.
 
  arch/powerpc/platforms/85xx/c293pcie.c | 4 ----
  1 file changed, 4 deletions(-)
 
 diff --git a/arch/powerpc/platforms/85xx/c293pcie.c
 b/arch/powerpc/platforms/85xx/c293pcie.c
 index 84476b6..61bc851 100644
 --- a/arch/powerpc/platforms/85xx/c293pcie.c
 +++ b/arch/powerpc/platforms/85xx/c293pcie.c
 @@ -66,10 +66,6 @@ define_machine(c293_pcie) {
   .probe  = c293_pcie_probe,
   .setup_arch = c293_pcie_setup_arch,
   .init_IRQ   = c293_pcie_pic_init,
 -#ifdef CONFIG_PCI
 - .pcibios_fixup_bus  = fsl_pcibios_fixup_bus,
 - .pcibios_fixup_phb  = fsl_pcibios_fixup_phb,
 -#endif
   .get_irq= mpic_get_irq,
   .restart= fsl_rstcr_restart,
   .calibrate_decr = generic_calibrate_decr,
 --
 2.1.0.27.g96db324

Thanks,
Zhiqiang


Re: [05/27] macintosh: therm_windtunnel: Export I2C module alias information

2015-08-18 Thread Javier Martinez Canillas
Hello Michael,

On 08/18/2015 12:24 PM, Michael Ellerman wrote:
 On Thu, 2015-30-07 at 16:18:30 UTC, Javier Martinez Canillas wrote:
 The I2C core always reports the MODALIAS uevent as i2c:<client name>
 regardless of whether the driver was matched using the I2C id_table or the
 of_match_table. So the driver needs to export the I2C table, and this
 needs to be built into the module, or udev won't have the necessary
 information to auto-load the correct module when the device is added.

 Signed-off-by: Javier Martinez Canillas jav...@osg.samsung.com
 ---

  drivers/macintosh/therm_windtunnel.c | 1 +
  1 file changed, 1 insertion(+)
 
 Who are you expecting to merge this?
 

I was expecting Benjamin Herrenschmidt since he is listed in MAINTAINERS
for drivers/macintosh. I cc'ed him on the patch, but now in your answer I
don't see him in the cc list, which is strange.

But I'll be happy to re-post if there is another person who is handling
the patches for this driver now.

BTW there is another patch [0] for the same driver to export the OF id
table information, that was not picked up either.
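
For context, both missing exports are one-line additions of roughly this
form; the table identifiers below are placeholders, not necessarily the
names used in therm_windtunnel.c:

/* i2c id table export -- the patch quoted above */
MODULE_DEVICE_TABLE(i2c, therm_windtunnel_id);

/* OF match table export -- the other pending patch [0] */
MODULE_DEVICE_TABLE(of, therm_windtunnel_of_match);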

 cheers
 

[0]: https://lkml.org/lkml/2015/7/30/503

Best regards,
-- 
Javier Martinez Canillas
Open Source Group
Samsung Research America

Re: [PATCH] cxl: Allow release of contexts which have been OPENED but not STARTED

2015-08-18 Thread Michael Ellerman
On Tue, 2015-08-18 at 16:30 +1000, Andrew Donnellan wrote:
 If we open a context but do not start it (either because we do not attempt
 to start it, or because it fails to start for some reason), we are left
 with a context in state OPENED. Previously, cxl_release_context() only
 allowed releasing contexts in state CLOSED, so attempting to release an
 OPENED context would fail.
 
 In particular, this bug causes available contexts to run out after some EEH
 failures, where drivers attempt to release contexts that have failed to
 start.
 
 Allow releasing contexts in any state other than STARTED, i.e. OPENED or
 CLOSED (we can't release a STARTED context as it's currently using the
 hardware).
 
 Cc: sta...@vger.kernel.org
 Fixes: 6f7f0b3df6d4 (cxl: Add AFU virtual PHB and kernel API)
 Signed-off-by: Andrew Donnellan andrew.donnel...@au1.ibm.com
 Signed-off-by: Daniel Axtens d...@axtens.net
 ---
  drivers/misc/cxl/api.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
 index 6a768a9..1c520b8 100644
 --- a/drivers/misc/cxl/api.c
 +++ b/drivers/misc/cxl/api.c
 @@ -59,7 +59,7 @@ EXPORT_SYMBOL_GPL(cxl_get_phys_dev);
  
  int cxl_release_context(struct cxl_context *ctx)
  {
 - if (ctx->status != CLOSED)
 + if (ctx->status == STARTED)
   return -EBUSY;

So that this doesn't break when you add a new state, is it worth writing it as:

if (ctx->status >= STARTED)
	return -EBUSY;

?
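
For reference, the >= comparison relies on the ordering of the context
states; the status enum in cxl.h is roughly the following (paraphrased,
check the header for the authoritative definition):

enum cxl_context_status {
	CLOSED,		/* released, or never opened */
	OPENED,		/* initialised but not attached to the hardware */
	STARTED,	/* attached and using the hardware */
};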

cheers



[PATCH] cxl: Allow release of contexts which have been OPENED but not STARTED

2015-08-18 Thread Andrew Donnellan
If we open a context but do not start it (either because we do not attempt
to start it, or because it fails to start for some reason), we are left
with a context in state OPENED. Previously, cxl_release_context() only
allowed releasing contexts in state CLOSED, so attempting to release an
OPENED context would fail.

In particular, this bug causes available contexts to run out after some EEH
failures, where drivers attempt to release contexts that have failed to
start.

Allow releasing contexts in any state other than STARTED, i.e. OPENED or
CLOSED (we can't release a STARTED context as it's currently using the
hardware).

Cc: sta...@vger.kernel.org
Fixes: 6f7f0b3df6d4 (cxl: Add AFU virtual PHB and kernel API)
Signed-off-by: Andrew Donnellan andrew.donnel...@au1.ibm.com
Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/api.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 6a768a9..1c520b8 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -59,7 +59,7 @@ EXPORT_SYMBOL_GPL(cxl_get_phys_dev);
 
 int cxl_release_context(struct cxl_context *ctx)
 {
-   if (ctx->status != CLOSED)
+   if (ctx->status == STARTED)
return -EBUSY;
 
 put_device(&ctx->afu->dev);
-- 
Andrew Donnellan  Software Engineer, OzLabs
andrew.donnel...@au1.ibm.com  Australia Development Lab, Canberra
+61 2 6201 8874 (work)IBM Australia Limited


Re: [RFC PATCH V1 7/8] powerpc/mm: kasan: Add kasan support for ppc64

2015-08-18 Thread Andrey Ryabinin
2015-08-18 8:36 GMT+03:00 Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com:
 Andrey Ryabinin ryabinin@gmail.com writes:

 2015-08-17 15:13 GMT+03:00 Andrey Ryabinin ryabinin@gmail.com:

 Did you disable stack instrumentation (in scripts/Makefile.kasan),
 or does your version of gcc not support it (e.g. like 4.9.x on x86)?

 Because this can't work with stack instrumentation as you don't have shadow 
 for stack in early code.

 But this should be doable, I think. All you need is to set up shadow for the
 init task's stack before executing any instrumented function.

 And you also need to define CONFIG_KASAN_SHADOW_OFFSET, so it will be
 passed to GCC
 via -fasan-shadow-offset= option.

 I am using the KASAN minimal config, hence this was not needed. Do we need
 to pass that option for outline instrumentation? If not it would be a
 good idea to split that out and make it depend on KASAN_INLINE.


We need to pass this for stack instrumentation too.

 -aneesh


Re: [RFC PATCH V1 0/8] KASAN ppc64 support

2015-08-18 Thread Andrey Ryabinin
2015-08-18 12:21 GMT+03:00 Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com:
 Andrey Ryabinin ryabinin@gmail.com writes:

 2015-08-18 8:42 GMT+03:00 Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com:
 Andrey Ryabinin ryabinin@gmail.com writes:


 But that is introducing conditionals in core code for no real benefit.
 This also will break when we eventually end up tracking vmalloc?

 Ok, that's a very good reason to not do this.

 I see one potential problem in the way you use kasan_zero_page, though.
 memset/memcpy of large portions of memory (> 8 * PAGE_SIZE) will end up
 in overflowing kasan_zero_page when we check shadow in memory_is_poisoned_n()


 Any suggestion on how to fix that? I guess we definitely don't want to

Wait, I was wrong, we should be fine.
In memory_is_poisoned_n():

ret = memory_is_zero(kasan_mem_to_shadow((void *)addr),
kasan_mem_to_shadow((void *)addr + size - 1) + 1);

So this will be: memory_is_zero(kasan_zero_page, (char *)kasan_zero_page + 1);
Which means that we will access only 1 byte of kasan_zero_page.
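
Spelled out (an illustrative expansion of the call above, not new code):
with the ppc64 shortcut both shadow lookups collapse to the same page, so
the checked range is a single byte.

	/* start == kasan_zero_page, end == kasan_zero_page + 1 */
	const void *start = kasan_mem_to_shadow((void *)addr);
	const void *end = kasan_mem_to_shadow((void *)addr + size - 1) + 1;

	ret = memory_is_zero(start, end);	/* touches exactly one byte */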


 check for addr and size in memset/memcpy. The other option is to
 do zero page mapping as is done for other architectures, that is, we map
 a zero page via the page table. But we still have the issue of the memory
 needed to map the entire vmalloc range (page table memory). I was hoping to
 avoid all those complexities.


 -aneesh


Re: [v8,3/3] leds/powernv: Add driver for PowerNV platform

2015-08-18 Thread Michael Ellerman
On Sat, 2015-25-07 at 05:21:10 UTC, Vasant Hegde wrote:
 This patch implements an LED driver for the PowerNV platform using the
 existing generic LED class framework.
 
 PowerNV platform has the below types of LEDs:
   - System attention
   Indicates there is a problem with the system that needs attention.
   - Identify
   Helps the user locate/identify a particular FRU or resource in the
   system.
   - Fault
   Indicates there is a problem with the FRU or resource at the
   location with which the indicator is associated.

Hi Vasant,

I'm waiting for a respin of this based on the discussion between you and
Jacek.

If I don't see it soon it will miss v4.3.

cheers

Re: [v2, 1/2] Move the pt_regs_offset struct definition from arch to common include file

2015-08-18 Thread Michael Ellerman
On Mon, 2015-27-07 at 04:39:33 UTC, David A. Long wrote:
 From: David A. Long dave.l...@linaro.org
 
 The pt_regs_offset structure is used for the HAVE_REGS_AND_STACK_ACCESS_API
 feature and has identical definitions in four different arch ptrace.h
 include files. It seems unlikely that definition would ever need to be
 changed regardless of architecture so let's move it into
 include/linux/ptrace.h, along with macros commonly used to access it.

Thanks for cleaning this up. Tested successfully on powerpc.

Acked-by: Michael Ellerman m...@ellerman.id.au

cheers

Re: [RFC PATCH V1 4/8] kasan: Don't use kasan shadow pointer in generic functions

2015-08-18 Thread Andrey Ryabinin
2015-08-18 8:29 GMT+03:00 Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com:
 Andrey Ryabinin ryabinin@gmail.com writes:

 On 08/17/2015 09:36 AM, Aneesh Kumar K.V wrote:
 We can't use generic functions like print_hex_dump to access the kasan
 shadow region. This requires us to set up another kasan shadow region
 for the address passed (the kasan shadow address). Most architectures won't
 be able to do that. Hence remove the kasan shadow region dump. If
 we really want to do this we will have to have a kasan-internal
 implementation of print_hex_dump for which we will disable address
 sanitizer operation.


 I didn't understand that.
 Yes, you don't have shadow for shadow. But, for shadow addresses you
 return (void *)kasan_zero_page in kasan_mem_to_shadow(), so we
 should be fine to access shadow in generic code.


 But in general IMHO it is not correct to pass a shadow address to generic
 functions, because that requires the arch to set up shadow for the shadow.

Yes, we have this shadow for shadow in x86_64/arm64.

 With one of the initial implementations of ppc64 support, I had page
 table entries set up for the vmalloc and vmemmap shadow, and that is when I
 hit the issue. We cannot expect the arch to set up shadow regions like what
 is expected here. If we really need to print the shadow memory content, we
 could possibly make a copy of print_hex_dump in kasan_init.c. Let me
 know whether you think printing the shadow area content is needed.


It was quite useful sometimes, so I think we should keep it.
But I agree with you that it would be better to avoid accesses to shadow memory
in generic code.
Another way to deal with this would be to copy the shadow content into a buffer,
and then print_hex_dump() it.
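
A rough sketch of that idea; the report address variable, the row size and
the stack buffer here are chosen purely for illustration:

	/* Copy the shadow bytes into an ordinary buffer first, so that
	 * print_hex_dump() never dereferences shadow memory itself (which
	 * may have no shadow-of-shadow mapping on some architectures). */
	u8 buf[64];

	memcpy(buf, kasan_mem_to_shadow((void *)report_addr), sizeof(buf));
	print_hex_dump(KERN_ERR, "", DUMP_PREFIX_NONE, 16, 1,
		       buf, sizeof(buf), false);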

 -aneesh


Re: [05/27] macintosh: therm_windtunnel: Export I2C module alias information

2015-08-18 Thread Michael Ellerman
On Thu, 2015-30-07 at 16:18:30 UTC, Javier Martinez Canillas wrote:
 The I2C core always reports the MODALIAS uevent as i2c:<client name>
 regardless of whether the driver was matched using the I2C id_table or the
 of_match_table. So the driver needs to export the I2C table, and this
 needs to be built into the module, or udev won't have the necessary
 information to auto-load the correct module when the device is added.
 
 Signed-off-by: Javier Martinez Canillas jav...@osg.samsung.com
 ---
 
  drivers/macintosh/therm_windtunnel.c | 1 +
  1 file changed, 1 insertion(+)

Who are you expecting to merge this?

cheers

Re: provide more common DMA API functions V2

2015-08-18 Thread Stephen Rothwell
Hi Andrew,

On Tue, 18 Aug 2015 07:53:15 +0200 Christoph Hellwig h...@lst.de wrote:

 On Mon, Aug 17, 2015 at 10:45:52PM -0700, Andrew Morton wrote:

I'll merge these 5 patches for 4.3.  That means I'll release them into
linux-next after 4.2 is released.
   
   So you only add for-4.3 code to -next after 4.2 is out?  Isn't that the
   wrong way around?
  
  Linus will be releasing 4.2 in 1-2 weeks and until then, linux-next is
  supposed to contain only 4.2 material.  Once 4.2 is released,
  linux-next is open for 4.3 material.
 
 Hmm, I'm pretty sure there's tons of 4.3 material in linux-next at the
 moment, at least I got merge warning messages from Stephen about
 some yesterday.

Yeah, we are at v4.2-rc7 so linux-next is full of stuff to be merged by
Linus for v4.3. Nothing for v4.4 should be in linux-next until after
v4.3-rc1 is released in 3-4 weeks i.e. after the next merge window
closes.

-- 
Cheers,
Stephen Rothwell  s...@canb.auug.org.au

Re: [PATCH] cxl: Allow release of contexts which have been OPENED but not STARTED

2015-08-18 Thread Ian Munsie
Acked-by: Ian Munsie imun...@au1.ibm.com


[PATCH v2] powerpc/e6500: hw tablewalk: make sure we invalidate and write to the same tlb entry

2015-08-18 Thread Kevin Hao
In order to work around Erratum A-008139, we have to invalidate the
tlb entry with tlbilx before overwriting it. For performance reasons,
we don't add any memory barrier when acquiring/releasing the tcd lock.
This means the two load instructions for esel_next do have the
possibility of returning different values. This is definitely not
acceptable due to Erratum A-008139. We have two options to fix this
issue:
  a) Add a memory barrier when acquiring/releasing the tcd lock to order
 the load/store to esel_next.
  b) Just make sure to invalidate and write to the same tlb entry and
 tolerate the race that we may get the wrong value and overwrite
 the tlb entry just updated by the other thread.

We observe better performance using option b. So reserve an additional
register to save the value of esel_next.

Signed-off-by: Kevin Hao haoke...@gmail.com
---
v2: Use an additional register for saving the value of esel_next instead of 
lwsync.

 arch/powerpc/include/asm/exception-64e.h | 11 ++-
 arch/powerpc/mm/tlb_low_64e.S| 26 ++
 2 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64e.h 
b/arch/powerpc/include/asm/exception-64e.h
index a8b52b61043f..d53575becbed 100644
--- a/arch/powerpc/include/asm/exception-64e.h
+++ b/arch/powerpc/include/asm/exception-64e.h
@@ -69,13 +69,14 @@
 #define EX_TLB_ESR ( 9 * 8) /* Level 0 and 2 only */
 #define EX_TLB_SRR0    (10 * 8)
 #define EX_TLB_SRR1    (11 * 8)
+#define EX_TLB_R7  (12 * 8)
 #ifdef CONFIG_BOOK3E_MMU_TLB_STATS
-#define EX_TLB_R8  (12 * 8)
-#define EX_TLB_R9  (13 * 8)
-#define EX_TLB_LR  (14 * 8)
-#define EX_TLB_SIZE    (15 * 8)
+#define EX_TLB_R8  (13 * 8)
+#define EX_TLB_R9  (14 * 8)
+#define EX_TLB_LR  (15 * 8)
+#define EX_TLB_SIZE    (16 * 8)
 #else
-#define EX_TLB_SIZE    (12 * 8)
+#define EX_TLB_SIZE    (13 * 8)
 #endif
 
 #define START_EXCEPTION(label)                                      \
diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
index e4185581c5a7..3a5b89dfb5a1 100644
--- a/arch/powerpc/mm/tlb_low_64e.S
+++ b/arch/powerpc/mm/tlb_low_64e.S
@@ -68,11 +68,21 @@ END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
ld  r14,PACAPGD(r13)
std r15,EX_TLB_R15(r12)
std r10,EX_TLB_CR(r12)
+#ifdef CONFIG_PPC_FSL_BOOK3E
+BEGIN_FTR_SECTION
+   std r7,EX_TLB_R7(r12)
+END_FTR_SECTION_IFSET(CPU_FTR_SMT)
+#endif
TLB_MISS_PROLOG_STATS
 .endm
 
 .macro tlb_epilog_bolted
ld  r14,EX_TLB_CR(r12)
+#ifdef CONFIG_PPC_FSL_BOOK3E
+BEGIN_FTR_SECTION
+   ld  r7,EX_TLB_R7(r12)
+END_FTR_SECTION_IFSET(CPU_FTR_SMT)
+#endif
ld  r10,EX_TLB_R10(r12)
ld  r11,EX_TLB_R11(r12)
ld  r13,EX_TLB_R13(r12)
@@ -297,6 +307,7 @@ itlb_miss_fault_bolted:
  * r13 = PACA
  * r11 = tlb_per_core ptr
  * r10 = crap (free to use)
+ * r7  = esel_next
  */
 tlb_miss_common_e6500:
crmove  cr2*4+2,cr0*4+2 /* cr2.eq != 0 if kernel address */
@@ -334,8 +345,8 @@ BEGIN_FTR_SECTION   /* CPU_FTR_SMT */
 * with tlbilx before overwriting.
 */
 
-   lbz r15,TCD_ESEL_NEXT(r11)
-   rlwinm  r10,r15,16,0xff
+   lbz r7,TCD_ESEL_NEXT(r11)
+   rlwinm  r10,r7,16,0xff
 oris    r10,r10,MAS0_TLBSEL(1)@h
mtspr   SPRN_MAS0,r10
isync
@@ -429,15 +440,14 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_SMT)
mtspr   SPRN_MAS2,r15
 
 tlb_miss_huge_done_e6500:
-   lbz r15,TCD_ESEL_NEXT(r11)
lbz r16,TCD_ESEL_MAX(r11)
lbz r14,TCD_ESEL_FIRST(r11)
-   rlwimi  r10,r15,16,0x00ff   /* insert esel_next into MAS0 */
-   addi    r15,r15,1   /* increment esel_next */
+   rlwimi  r10,r7,16,0x00ff    /* insert esel_next into MAS0 */
+   addi    r7,r7,1     /* increment esel_next */
 mtspr   SPRN_MAS0,r10
-   cmpw    r15,r16
-   iseleq  r15,r14,r15 /* if next == last use first */
-   stb r15,TCD_ESEL_NEXT(r11)
+   cmpw    r7,r16
+   iseleq  r7,r14,r7   /* if next == last use first */
+   stb r7,TCD_ESEL_NEXT(r11)
 
tlbwe
 
-- 
2.1.0


Re: [RFC PATCH V1 0/8] KASAN ppc64 support

2015-08-18 Thread Andrey Ryabinin
2015-08-18 8:42 GMT+03:00 Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com:
 Andrey Ryabinin ryabinin@gmail.com writes:

 2015-08-17 12:50 GMT+03:00 Aneesh Kumar K.V 
 aneesh.ku...@linux.vnet.ibm.com:
 Because of the above I concluded that we may not be able to do
 inline instrumentation. Now if we are not doing inline instrumentation,
 we can simplify kasan support by not creating a shadow mapping at all
 for vmalloc and vmemmap region. Hence the idea of returning the address
 of a zero page for anything other than kernel linear map region.


 Yes, mapping the zero page is needed only for inline instrumentation.
 You simply don't need to check shadow for vmalloc/vmemmap.

 So, instead of redefining kasan_mem_to_shadow() I'd suggest to
 add one more arch hook. Something like:

 bool kasan_tracks_vaddr(unsigned long addr)
 {
  return REGION_ID(addr) == KERNEL_REGION_ID;
 }

 And in check_memory_region():
    if (!(kasan_enabled() && kasan_tracks_vaddr(addr)))
return;


 But that is introducing conditionals in core code for no real benefit.
 This also will break when we eventually end up tracking vmalloc?

Ok, that's a very good reason to not do this.

I see one potential problem in the way you use kasan_zero_page, though.
memset/memcpy of large portions of memory (> 8 * PAGE_SIZE) will end up
in overflowing kasan_zero_page when we check shadow in memory_is_poisoned_n()

 In that case our mem_to_shadow will essentially be a switch
 statement returning different offsets for the kernel region and the vmalloc
 region. As far as core kernel code is concerned, it just needs to
 ask the arch for the shadow address of a memory location, and instead of
 adding conditionals in core, my suggestion is we handle this in an arch
 function.

 -aneesh


Re: provide more common DMA API functions V2

2015-08-18 Thread Ingo Molnar

* Andrew Morton a...@linux-foundation.org wrote:

 On Tue, 18 Aug 2015 07:38:25 +0200 Christoph Hellwig h...@lst.de wrote:
 
  On Mon, Aug 17, 2015 at 02:24:29PM -0700, Andrew Morton wrote:
   110254 bytes saved, shrinking the kernel by a whopping 0.17%. 
   Thoughts?
  
  Sounds fine to me.
 
 OK, I'll clean it up a bit, check that each uninlining actually makes
 sense and then I'll see how it goes.
 
   
   I'll merge these 5 patches for 4.3.  That means I'll release them into
   linux-next after 4.2 is released.
  
  So you only add for-4.3 code to -next after 4.2 is out?  Isn't that the
  wrong way around?
 
 Linus will be releasing 4.2 in 1-2 weeks and until then, linux-next is
 supposed to contain only 4.2 material.  Once 4.2 is released,
 linux-next is open for 4.3 material.

Isn't that off by one?

I.e. shouldn't this be:

 I'll merge these 5 patches for 4.4.  That means I'll release them into 
 linux-next after 4.2 is released.

 [...]
 
 Linus will be releasing 4.2 in 1-2 weeks and until then, linux-next is 
 supposed 
 to contain only 4.3 material.  Once 4.2 is released and the 4.3 merge window 
 opens, linux-next is open for 4.4 material.

?

Thanks,

Ingo

Re: [1/1] powerpc/xmon: Paged output for paca display

2015-08-18 Thread Michael Ellerman
On Fri, 2015-14-08 at 02:55:14 UTC, Sam bobroff wrote:
 The paca display is already more than 24 lines, which can be problematic
 if you have an old school 80x24 terminal, or more likely you are on a
 virtual terminal which does not scroll for whatever reason.
 
 This adds an optional letter to the dp and dpa xmon commands
 (dpp and dppa), which will enable a per-page display (with 16
 line pages): the first page  will be displayed and if there was data
 that didn't fit, it will display a message indicating that the user can
 use enter to display the next page. The intent is that this feels
 similar to the way the memory display functions work.
 
 This is implemented by running over the entire output both for the
 initial command and for each subsequent page: the visible part is
 clipped out by checking line numbers. Handling the empty command as
 more is done by writing a special command into a static buffer that
 indicates where to move the sliding visibility window. This is similar
 to the approach used for the memory dump commands except that the
 state data is encoded into the last_cmd string, rather than a set of
 static variables. The memory dump commands could probably be rewritten
 to make use of the same buffer and remove their other static
 variables.
 
 Sample output:
 
 0:mon dpp1
 paca for cpu 0x1 @ cfdc0480:
  possible = yes
  present  = yes
  online   = yes
  lock_token   = 0x8000(0x8)
  paca_index   = 0x1   (0xa)
  kernel_toc   = 0xc0eb2400(0x10)
  kernelbase   = 0xc000(0x18)
  kernel_msr   = 0xb0001032(0x20)
  emergency_sp = 0xc0003ffe8000(0x28)
  mc_emergency_sp  = 0xc0003ffe4000(0x2e0)
  in_mce   = 0x0   (0x2e8)
  data_offset  = 0x7f17(0x30)
  hw_cpu_id= 0x8   (0x38)
  cpu_start= 0x1   (0x3a)
  kexec_state  = 0x0   (0x3b)
 [Enter for next page]
 0:mon
  __current= 0xc0007e696620(0x290)
  kstack   = 0xc0007e6ebe30(0x298)
  stab_rr  = 0xb   (0x2a0)
  saved_r1 = 0xc0007ef37860(0x2a8)
  trap_save= 0x0   (0x2b8)
  soft_enabled = 0x0   (0x2ba)
  irq_happened = 0x1   (0x2bb)
  io_sync  = 0x0   (0x2bc)
  irq_work_pending = 0x0   (0x2bd)
  nap_state_lost   = 0x0   (0x2be)
 0:mon
 
 (Based on a similar patch by Michael Ellerman m...@ellerman.id.au
 [v2] powerpc/xmon: Allow limiting the size of the paca display.
 This patch is an alternative and cannot coexist with the original.)


So this is nice, but ... the diff is twice the size of my version, plus 128
bytes of BSS, so I'm not sure the added benefit is sufficient to justify the
added code complexity.

But you can convince me otherwise if you feel strongly about it.

cheers

[PATCH v4 0/7] dax: I/O path enhancements

2015-08-18 Thread Ross Zwisler
The goal of this series is to enhance the DAX I/O path so that all operations
that store data (I/O writes, zeroing blocks, punching holes, etc.) properly
synchronize the stores to media using the PMEM API.  This ensures that the data
DAX is writing is durable on media before the operation completes.
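
In rough terms, the pattern the series adds to the DAX write path looks like
the sketch below; the helper names come from the patch titles, but the
wrapper function and exact signatures here are assumptions, not code from
the series:

/* Sketch only: copy user data into a __pmem mapping and make it durable
 * before returning, using the PMEM API helpers named in patches 5 and 6. */
static size_t dax_copy_and_flush(void __pmem *addr, size_t bytes,
				 struct iov_iter *iter)
{
	size_t len = copy_from_iter_pmem(addr, bytes, iter); /* flushed copy */

	wmb_pmem();	/* fence: the stores above are durable after this */
	return len;
}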

Patches 1-4 are a few random cleanups.

Changes from v3 (all in patch 5):
 - moved linux/uio.h include from x86 pmem.h to linux/pmem.h (Christoph)
 - made some local void* variables where appropriate to cut down on __force
   casts from __pmem (Christoph)
 - made a __iter_needs_pmem_wb() helper and added a TODO to move to
   non-temporal stores (Christoph)

Ross Zwisler (7):
  brd: make rd_size static
  pmem, x86: move x86 PMEM API to new pmem.h header
  pmem: remove layer when calling arch_has_wmb_pmem()
  pmem, x86: clean up conditional pmem includes
  pmem: add copy_from_iter_pmem() and clear_pmem()
  dax: update I/O path to do proper PMEM flushing
  pmem, dax: have direct_access use __pmem annotation

 Documentation/filesystems/Locking |   3 +-
 MAINTAINERS   |   1 +
 arch/powerpc/sysdev/axonram.c |   7 +-
 arch/x86/include/asm/cacheflush.h |  71 -
 arch/x86/include/asm/pmem.h   | 158 ++
 drivers/block/brd.c   |   6 +-
 drivers/nvdimm/pmem.c |   4 +-
 drivers/s390/block/dcssblk.c  |  10 ++-
 fs/block_dev.c|   2 +-
 fs/dax.c  |  68 +---
 include/linux/blkdev.h|   8 +-
 include/linux/pmem.h  |  79 +++
 12 files changed, 289 insertions(+), 128 deletions(-)
 create mode 100644 arch/x86/include/asm/pmem.h

-- 
2.1.0


[PATCH v4 7/7] pmem, dax: have direct_access use __pmem annotation

2015-08-18 Thread Ross Zwisler
Update the annotation for the kaddr pointer returned by direct_access()
so that it is a __pmem pointer.  This is consistent with the PMEM driver
and with how this direct_access() pointer is used in the DAX code.

Signed-off-by: Ross Zwisler ross.zwis...@linux.intel.com
Reviewed-by: Christoph Hellwig h...@lst.de
---
 Documentation/filesystems/Locking |  3 ++-
 arch/powerpc/sysdev/axonram.c |  7 ---
 drivers/block/brd.c   |  4 ++--
 drivers/nvdimm/pmem.c |  4 ++--
 drivers/s390/block/dcssblk.c  | 10 ++
 fs/block_dev.c|  2 +-
 fs/dax.c  | 42 ---
 include/linux/blkdev.h|  8 
 8 files changed, 43 insertions(+), 37 deletions(-)

diff --git a/Documentation/filesystems/Locking 
b/Documentation/filesystems/Locking
index 6a34a0f..06d4434 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -397,7 +397,8 @@ prototypes:
int (*release) (struct gendisk *, fmode_t);
int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
int (*compat_ioctl) (struct block_device *, fmode_t, unsigned, unsigned 
long);
-   int (*direct_access) (struct block_device *, sector_t, void **, 
unsigned long *);
+   int (*direct_access) (struct block_device *, sector_t, void __pmem **,
+   unsigned long *);
int (*media_changed) (struct gendisk *);
void (*unlock_native_capacity) (struct gendisk *);
int (*revalidate_disk) (struct gendisk *);
diff --git a/arch/powerpc/sysdev/axonram.c b/arch/powerpc/sysdev/axonram.c
index ee90db1..a2be2a6 100644
--- a/arch/powerpc/sysdev/axonram.c
+++ b/arch/powerpc/sysdev/axonram.c
@@ -141,13 +141,14 @@ axon_ram_make_request(struct request_queue *queue, struct 
bio *bio)
  */
 static long
 axon_ram_direct_access(struct block_device *device, sector_t sector,
-  void **kaddr, unsigned long *pfn, long size)
+  void __pmem **kaddr, unsigned long *pfn, long size)
 {
 struct axon_ram_bank *bank = device->bd_disk->private_data;
 loff_t offset = (loff_t)sector << AXON_RAM_SECTOR_SHIFT;
+   void *addr = (void *)(bank->ph_addr + offset);
 
-   *kaddr = (void *)(bank->ph_addr + offset);
-   *pfn = virt_to_phys(*kaddr) >> PAGE_SHIFT;
+   *kaddr = (void __pmem *)addr;
+   *pfn = virt_to_phys(addr) >> PAGE_SHIFT;
 
 return bank->size - offset;
 }
diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 5750b39..2691bb6 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -371,7 +371,7 @@ static int brd_rw_page(struct block_device *bdev, sector_t 
sector,
 
 #ifdef CONFIG_BLK_DEV_RAM_DAX
 static long brd_direct_access(struct block_device *bdev, sector_t sector,
-   void **kaddr, unsigned long *pfn, long size)
+   void __pmem **kaddr, unsigned long *pfn, long size)
 {
 struct brd_device *brd = bdev->bd_disk->private_data;
struct page *page;
@@ -381,7 +381,7 @@ static long brd_direct_access(struct block_device *bdev, 
sector_t sector,
page = brd_insert_page(brd, sector);
if (!page)
return -ENOSPC;
-   *kaddr = page_address(page);
+   *kaddr = (void __pmem *)page_address(page);
*pfn = page_to_pfn(page);
 
/*
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index ade9eb9..68f6a6a 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -92,7 +92,7 @@ static int pmem_rw_page(struct block_device *bdev, sector_t 
sector,
 }
 
 static long pmem_direct_access(struct block_device *bdev, sector_t sector,
- void **kaddr, unsigned long *pfn, long size)
+ void __pmem **kaddr, unsigned long *pfn, long size)
 {
 struct pmem_device *pmem = bdev->bd_disk->private_data;
 size_t offset = sector << 9;
@@ -101,7 +101,7 @@ static long pmem_direct_access(struct block_device *bdev, 
sector_t sector,
return -ENODEV;
 
/* FIXME convert DAX to comprehend that this mapping has a lifetime */
-   *kaddr = (void __force *) pmem->virt_addr + offset;
+   *kaddr = pmem->virt_addr + offset;
 *pfn = (pmem->phys_addr + offset) >> PAGE_SHIFT;
 
 return pmem->size - offset;
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index da21281..2c5a397 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -29,7 +29,7 @@ static int dcssblk_open(struct block_device *bdev, fmode_t 
mode);
 static void dcssblk_release(struct gendisk *disk, fmode_t mode);
 static void dcssblk_make_request(struct request_queue *q, struct bio *bio);
 static long dcssblk_direct_access(struct block_device *bdev, sector_t secnum,
-void **kaddr, unsigned long *pfn, long size);
+void __pmem **kaddr, unsigned long *pfn, long 

[PATCH 2/2] powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF case

2015-08-18 Thread Guilherme G. Piccoli
Since commit 1851617cd2 (PCI/MSI: Disable MSI at enumeration even if
kernel doesn't support MSI), MSI/MSI-X interrupts aren't being disabled
at PCI probe time, as the logic responsible for this was moved by that
commit from pci_device_add() to pci_setup_device(). The latter function
is not reachable on the PowerPC pSeries platform during Open Firmware
PCI probing.

This patch calls pci_msi_setup_pci_dev() explicitly to disable MSI/MSI-X
during PCI probe time on pSeries platform.

Signed-off-by: Guilherme G. Piccoli gpicc...@linux.vnet.ibm.com
---
 arch/powerpc/kernel/pci_of_scan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/pci_of_scan.c 
b/arch/powerpc/kernel/pci_of_scan.c
index 42e02a2..0e920f3 100644
--- a/arch/powerpc/kernel/pci_of_scan.c
+++ b/arch/powerpc/kernel/pci_of_scan.c
@@ -191,6 +191,9 @@ struct pci_dev *of_create_pci_dev(struct device_node *node,
 
pci_device_add(dev, bus);
 
+   /* Disable MSI/MSI-X here to avoid bogus interrupts */
+   pci_msi_setup_pci_dev(dev);
+
return dev;
 }
 EXPORT_SYMBOL(of_create_pci_dev);
-- 
2.1.0


[PATCH V4 3/6] powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR

2015-08-18 Thread Wei Yang
In the current implementation, when a VF BAR is bigger than 64MB, 4 M64
BARs in Single PE mode are used to cover the number of VFs required to be
enabled. By doing so, several VFs would be in one VF Group, which leads to
interference between VFs in the same group.

Also in this patch, m64_wins is renamed to m64_map, which means the index
number of the M64 BAR used to map the VF BAR, based on Gavin's comments.

This patch changes the design by using one M64 BAR in Single PE mode for
one VF BAR. This gives absolute isolation for VFs.

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/pci-bridge.h |5 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |  178 -
 2 files changed, 74 insertions(+), 109 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h 
b/arch/powerpc/include/asm/pci-bridge.h
index 712add5..8aeba4c 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -214,10 +214,9 @@ struct pci_dn {
u16 vfs_expanded;   /* number of VFs IOV BAR expanded */
u16 num_vfs;/* number of VFs enabled*/
int offset; /* PE# for the first VF PE */
-#define M64_PER_IOV 4
-   int m64_per_iov;
+   bool m64_single_mode;    /* Use M64 BAR in Single Mode */
 #define IODA_INVALID_M64(-1)
-   int m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
+   int (*m64_map)[PCI_SRIOV_NUM_BARS];
 #endif /* CONFIG_PCI_IOV */
 #endif
struct list_head child_list;
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index e3e0acb..de7db1d 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1148,29 +1148,36 @@ static void pnv_pci_ioda_setup_PEs(void)
 }
 
 #ifdef CONFIG_PCI_IOV
-static int pnv_pci_vf_release_m64(struct pci_dev *pdev)
+static int pnv_pci_vf_release_m64(struct pci_dev *pdev, u16 num_vfs)
 {
 struct pci_bus *bus;
 struct pci_controller *hose;
 struct pnv_phb *phb;
 struct pci_dn *pdn;
 int i, j;
+   int m64_bars;
 
 bus = pdev->bus;
 hose = pci_bus_to_host(bus);
 phb = hose->private_data;
 pdn = pci_get_pdn(pdev);
 
+   if (pdn->m64_single_mode)
+   m64_bars = num_vfs;
+   else
+   m64_bars = 1;
+
 for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
-   for (j = 0; j < M64_PER_IOV; j++) {
-   if (pdn->m64_wins[i][j] == IODA_INVALID_M64)
+   for (j = 0; j < m64_bars; j++) {
+   if (pdn->m64_map[j][i] == IODA_INVALID_M64)
 continue;
 opal_pci_phb_mmio_enable(phb->opal_id,
-   OPAL_M64_WINDOW_TYPE, pdn->m64_wins[i][j], 0);
-   clear_bit(pdn->m64_wins[i][j],
phb->ioda.m64_bar_alloc);
-   pdn->m64_wins[i][j] = IODA_INVALID_M64;
+   OPAL_M64_WINDOW_TYPE, pdn->m64_map[j][i], 0);
+   clear_bit(pdn->m64_map[j][i], phb->ioda.m64_bar_alloc);
+   pdn->m64_map[j][i] = IODA_INVALID_M64;
 }
 
+   kfree(pdn->m64_map);
return 0;
 }
 
@@ -1187,8 +1194,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
u16 num_vfs)
 int total_vfs;
 resource_size_t size, start;
 int pe_num;
-   int vf_groups;
-   int vf_per_group;
+   int m64_bars;
 
 bus = pdev->bus;
hose = pci_bus_to_host(bus);
@@ -1196,26 +1202,26 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
u16 num_vfs)
pdn = pci_get_pdn(pdev);
total_vfs = pci_sriov_get_totalvfs(pdev);
 
-   /* Initialize the m64_wins to IODA_INVALID_M64 */
-   for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
-   for (j = 0; j < M64_PER_IOV; j++)
-   pdn->m64_wins[i][j] = IODA_INVALID_M64;
+   if (pdn->m64_single_mode)
+   m64_bars = num_vfs;
+   else
+   m64_bars = 1;
+
+   pdn->m64_map = kmalloc(sizeof(*pdn->m64_map) * m64_bars, GFP_KERNEL);
+   if (!pdn->m64_map)
+   return -ENOMEM;
+   /* Initialize the m64_map to IODA_INVALID_M64 */
+   for (i = 0; i < m64_bars; i++)
+   for (j = 0; j < PCI_SRIOV_NUM_BARS; j++)
+   pdn->m64_map[i][j] = IODA_INVALID_M64;
 
-   if (pdn->m64_per_iov == M64_PER_IOV) {
-   vf_groups = (num_vfs <= M64_PER_IOV) ? num_vfs : M64_PER_IOV;
-   vf_per_group = (num_vfs <= M64_PER_IOV) ? 1 :
-   roundup_pow_of_two(num_vfs) / pdn->m64_per_iov;
-   } else {
-   vf_groups = 1;
-   vf_per_group = 1;
-   }
 
   

[PATCH V4 1/6] powerpc/powernv: don't enable SRIOV when VF BAR has non 64bit-prefetchable BAR

2015-08-18 Thread Wei Yang
On PHB_IODA2, we enable SRIOV devices by mapping the IOV BAR with M64 BARs.
If an SRIOV device's IOV BAR is not 64bit-prefetchable, it is not assigned
from the 64bit-prefetchable window, which means an M64 BAR can't map it.

This patch makes this explicit.

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com
Reviewed-by: Gavin Shan gws...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/powernv/pci-ioda.c |   25 +
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index 85cbc96..8c031b5 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -908,9 +908,6 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, 
int offset)
 if (!res->flags || !res->parent)
 continue;
 
-   if (!pnv_pci_is_mem_pref_64(res->flags))
-   continue;
-
/*
 * The actual IOV BAR range is determined by the start address
 * and the actual size for num_vfs VFs BAR.  This check is to
@@ -939,9 +936,6 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, 
int offset)
 if (!res->flags || !res->parent)
 continue;
 
-   if (!pnv_pci_is_mem_pref_64(res->flags))
-   continue;
-
size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES);
res2 = *res;
res-start += size * offset;
@@ -1221,9 +1215,6 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, 
u16 num_vfs)
 if (!res->flags || !res->parent)
 continue;
 
-   if (!pnv_pci_is_mem_pref_64(res->flags))
-   continue;
-
 for (j = 0; j < vf_groups; j++) {
 do {
 win =
 find_next_zero_bit(phb->ioda.m64_bar_alloc,
@@ -1510,6 +1501,12 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 
num_vfs)
pdn = pci_get_pdn(pdev);
 
 if (phb->type == PNV_PHB_IODA2) {
+   if (!pdn->vfs_expanded) {
+   dev_info(&pdev->dev, "don't support this SRIOV device"
+   " with non 64bit-prefetchable IOV BAR\n");
+   return -ENOSPC;
+   }
+
 /* Calculate available PE for required VFs */
 mutex_lock(&phb->ioda.pe_alloc_mutex);
 pdn->offset = bitmap_find_next_zero_area(
@@ -2775,9 +2772,10 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
 if (!res->flags || res->parent)
 continue;
 if (!pnv_pci_is_mem_pref_64(res->flags)) {
-   dev_warn(&pdev->dev, " non M64 VF BAR%d: %pR\n",
+   dev_warn(&pdev->dev, "Don't support SR-IOV with"
+   " non M64 VF BAR%d: %pR. \n",
 i, res);
-   continue;
+   return;
}
 
size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
@@ -2796,11 +2794,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
 res = pdev->resource[i + PCI_IOV_RESOURCES];
 if (!res->flags || res->parent)
 continue;
-   if (!pnv_pci_is_mem_pref_64(res->flags)) {
-   dev_warn(&pdev->dev, "Skipping expanding VF BAR%d: "
"%pR\n",
-   i, res);
-   continue;
-   }
 
 dev_dbg(&pdev->dev, " Fixing VF BAR%d: %pR to\n", i, res);
size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
-- 
1.7.9.5


Re: [PATCH V4 3/6] powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR

2015-08-18 Thread Gavin Shan
On Wed, Aug 19, 2015 at 10:01:41AM +0800, Wei Yang wrote:
In the current implementation, when a VF BAR is bigger than 64MB, 4 M64
BARs in Single PE mode are used to cover the number of VFs required to be
enabled. By doing so, several VFs would be in one VF Group, which leads to
interference between VFs in the same group.

Also in this patch, m64_wins is renamed to m64_map, which means the index
number of the M64 BAR used to map the VF BAR, based on Gavin's comments.

This patch changes the design by using one M64 BAR in Single PE mode for
one VF BAR. This gives absolute isolation for VFs.

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com

Reviewed-by: Gavin Shan gws...@linux.vnet.ibm.com


Re: [PATCH 2/2] cxl: add set/get private data to context struct

2015-08-18 Thread Michael Ellerman
On Wed, 2015-08-19 at 14:19 +1000, Ian Munsie wrote:
 From: Michael Neuling mi...@neuling.org
 
 This provides AFU drivers a means to associate private data with a cxl
 context. This is particularly intended to make the new callbacks for
 driver-specific events easier for AFU drivers to use, as they can easily
 get back to any private data structures they may use.
 
 Signed-off-by: Michael Neuling mi...@neuling.org
 Signed-off-by: Ian Munsie imun...@au1.ibm.com
 ---
  drivers/misc/cxl/api.c | 21 +
  drivers/misc/cxl/cxl.h |  3 +++
  include/misc/cxl.h |  7 +++
  3 files changed, 31 insertions(+)
 
 diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
 index e0f0c78..5f0b22e 100644
 --- a/drivers/misc/cxl/api.c
 +++ b/drivers/misc/cxl/api.c
 @@ -70,6 +70,27 @@ int cxl_release_context(struct cxl_context *ctx)
  }
  EXPORT_SYMBOL_GPL(cxl_release_context);
  
 +
 +int cxl_set_priv(struct cxl_context *ctx, void *priv)
 +{
 + if (!ctx)
 + return -EINVAL;
 +
 + ctx->priv = priv;
 +
 + return 0;
 +}
 +EXPORT_SYMBOL_GPL(cxl_set_priv);
 +
 +void *cxl_get_priv(struct cxl_context *ctx)
 +{
 + if (!ctx)
 + return ERR_PTR(-EINVAL);
 +
 + return ctx->priv;
 +}
 +EXPORT_SYMBOL_GPL(cxl_get_priv);


Do we really need the accessors? They don't buy anything I can see over just
using ctx->priv directly.

cheers



[PATCH V4 0/6] Redesign SR-IOV on PowerNV

2015-08-18 Thread Wei Yang
In the original design, VFs are grouped so that more VFs can be enabled in
the system when a VF BAR is bigger than 64MB. This design has a flaw in
which an error on one VF will interfere with other VFs in the same group.

This patch series changes the design by using an M64 BAR in Single PE mode
to cover only one VF BAR. By doing so, it gives absolute isolation between VFs.

v4:
   * rebase the code on top of v4.2-rc7
   * switch back to use the dynamic version of pe_num_map and m64_map
   * split the memory allocation and PE assignment of pe_num_map to make it
 easier to read
   * check the pe_num_map value before freeing the PE.
   * add the rename reason for pe_num_map and m64_map in change log
v3:
   * return -ENOSPC when a VF has non-64bit prefetchable BAR
   * rename offset to pe_num_map and define it staticly
   * change commit log based on comments
   * define m64_map staticly
v2:
   * clean up iov bar alignment calculation
   * change m64s to m64_bars
   * add a field to represent M64 Single PE mode will be used
   * change m64_wins to m64_map
   * calculate the gate instead of hard coded
   * dynamically allocate m64_map
   * dynamically allocate PE#
   * add a case to calculate iov bar alignment when M64 Single PE is used
   * when M64 Single PE is used, compare num_vfs with M64 BAR available number 
 in system at first



Wei Yang (6):
  powerpc/powernv: don't enable SRIOV when VF BAR has non
64bit-prefetchable BAR
  powerpc/powernv: simplify the calculation of iov resource alignment
  powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR
  powerpc/powernv: replace the hard coded boundary with gate
  powerpc/powernv: boundary the total VF BAR size instead of the
individual one
  powerpc/powernv: allocate sparse PE# when using M64 BAR in Single PE
mode

 arch/powerpc/include/asm/pci-bridge.h |7 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |  328 +++--
 2 files changed, 175 insertions(+), 160 deletions(-)

-- 
1.7.9.5


[PATCH V4 5/6] powerpc/powernv: boundary the total VF BAR size instead of the individual one

2015-08-18 Thread Wei Yang
Each VF could have 6 BARs at most. When the total BAR size exceeds the
gate, the expanded size will also exhaust the M64 window.

This patch limits the boundary by checking the total VF BAR size instead of
the individual BAR.

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com
Reviewed-by: Gavin Shan gws...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/powernv/pci-ioda.c |   14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index b8bc51f..4bc83b8 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2701,7 +2701,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
 const resource_size_t gate = phb->ioda.m64_segsize >> 2;
struct resource *res;
int i;
-   resource_size_t size;
+   resource_size_t size, total_vf_bar_sz;
struct pci_dn *pdn;
int mul, total_vfs;
 
@@ -2714,6 +2714,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
 
total_vfs = pci_sriov_get_totalvfs(pdev);
 mul = phb->ioda.total_pe;
+   total_vf_bar_sz = 0;
 
for (i = 0; i  PCI_SRIOV_NUM_BARS; i++) {
 res = pdev->resource[i + PCI_IOV_RESOURCES];
@@ -2726,7 +2727,8 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
return;
}
 
-   size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
+   total_vf_bar_sz += pci_iov_resource_size(pdev,
+   i + PCI_IOV_RESOURCES);
 
/*
 * If bigger than quarter of M64 segment size, just round up
@@ -2740,11 +2742,11 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
 * limit the system flexibility.  This is a design decision to
 * set the boundary to quarter of the M64 segment size.
 */
-   if (size > gate) {
-   dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size "
-   "is bigger than %lld, roundup power2\n",
-   i, res, gate);
+   if (total_vf_bar_sz > gate) {
+   mul = roundup_pow_of_two(total_vfs);
+   dev_info(&pdev->dev,
+   "VF BAR Total IOV size %llx > %llx, roundup to %d VFs\n",
+   total_vf_bar_sz, gate, mul);
 pdn->m64_single_mode = true;
break;
}
-- 
1.7.9.5


[PATCH 2/2] cxl: add set/get private data to context struct

2015-08-18 Thread Ian Munsie
From: Michael Neuling mi...@neuling.org

This provides AFU drivers a means to associate private data with a cxl
context. This is particularly intended for make the new callbacks for
driver specific events easier for AFU drivers to use, as they can easily
get back to any private data structures they may use.

Signed-off-by: Michael Neuling mi...@neuling.org
Signed-off-by: Ian Munsie imun...@au1.ibm.com
---
 drivers/misc/cxl/api.c | 21 +
 drivers/misc/cxl/cxl.h |  3 +++
 include/misc/cxl.h |  7 +++
 3 files changed, 31 insertions(+)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index e0f0c78..5f0b22e 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -70,6 +70,27 @@ int cxl_release_context(struct cxl_context *ctx)
 }
 EXPORT_SYMBOL_GPL(cxl_release_context);
 
+
+int cxl_set_priv(struct cxl_context *ctx, void *priv)
+{
+   if (!ctx)
+   return -EINVAL;
+
+   ctx->priv = priv;
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(cxl_set_priv);
+
+void *cxl_get_priv(struct cxl_context *ctx)
+{
+   if (!ctx)
+   return ERR_PTR(-EINVAL);
+
+   return ctx->priv;
+}
+EXPORT_SYMBOL_GPL(cxl_get_priv);
+
 int cxl_allocate_afu_irqs(struct cxl_context *ctx, int num)
 {
if (num == 0)
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 30e44a8..93db76a 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -431,6 +431,9 @@ struct cxl_context {
/* Only used in PR mode */
u64 process_token;
 
+   /* driver private data */
+   void *priv;
+
unsigned long *irq_bitmap; /* Accessed from IRQ context */
struct cxl_irq_ranges irqs;
struct list_head irq_names;
diff --git a/include/misc/cxl.h b/include/misc/cxl.h
index 73e03a6..3f5edbe 100644
--- a/include/misc/cxl.h
+++ b/include/misc/cxl.h
@@ -89,6 +89,13 @@ struct cxl_context *cxl_dev_context_init(struct pci_dev 
*dev);
 int cxl_release_context(struct cxl_context *ctx);
 
 /*
+ * Set and get private data associated with a context. Allows drivers to have a
+ * back pointer to some useful structure.
+ */
+int cxl_set_priv(struct cxl_context *ctx, void *priv);
+void *cxl_get_priv(struct cxl_context *ctx);
+
+/*
  * Allocate AFU interrupts for this context. num=0 will allocate the default
  * for this AFU as given in the AFU descriptor. This number doesn't include the
  * interrupt 0 (CAIA defines AFU IRQ 0 for page faults). Each interrupt to be
-- 
2.1.4


[PATCH V4 4/6] powerpc/powernv: replace the hard coded boundary with gate

2015-08-18 Thread Wei Yang
At the moment the 64bit-prefetchable window can be at most 64GB, which is
currently read from the device tree. This means that in shared mode the
maximum supported VF BAR size is 64GB/256 = 256MB, and a VF BAR of that size
could exhaust the whole 64bit-prefetchable window. So it is a design decision
to set the boundary of the VF BAR size to 64MB: since a 64MB VF BAR would
occupy a quarter of the 64bit-prefetchable window, this is affordable.

This patch replaces the magic limit of 64MB with gate, which is 1/4 of the
M64 segment size (m64_segsize >> 2), and adds a comment to explain the
reason for it.

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com
Reviewed-by: Gavin Shan gws...@linux.vent.ibm.com
---
 arch/powerpc/platforms/powernv/pci-ioda.c |   28 +++-
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c 
b/arch/powerpc/platforms/powernv/pci-ioda.c
index de7db1d..b8bc51f 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2696,8 +2696,9 @@ static void pnv_pci_init_ioda_msis(struct pnv_phb *phb) { 
}
 #ifdef CONFIG_PCI_IOV
 static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 {
-   struct pci_controller *hose;
-   struct pnv_phb *phb;
+   struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+   struct pnv_phb *phb = hose->private_data;
+   const resource_size_t gate = phb->ioda.m64_segsize >> 2;
struct resource *res;
int i;
resource_size_t size;
@@ -2707,9 +2708,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
 if (!pdev->is_physfn || pdev->is_added)
return;
 
-   hose = pci_bus_to_host(pdev->bus);
-   phb = hose->private_data;
-
pdn = pci_get_pdn(pdev);
 pdn->vfs_expanded = 0;
 pdn->m64_single_mode = false;
@@ -2730,10 +2728,22 @@ static void pnv_pci_ioda_fixup_iov_resources(struct 
pci_dev *pdev)
 
size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
 
-   /* bigger than 64M */
-   if (size > (1 << 26)) {
-   dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size "
"is bigger than 64M, roundup power2\n",
-i, res);
+   /*
+* If bigger than quarter of M64 segment size, just round up
+* power of two.
+*
+* Generally, one M64 BAR maps one IOV BAR. To avoid conflict
+* with other devices, IOV BAR size is expanded to be
+* (total_pe * VF_BAR_size).  When VF_BAR_size is half of M64
+* segment size , the expanded size would equal to half of the
+* whole M64 space size, which will exhaust the M64 space and
+* limit the system flexibility.  This is a design decision to
+* set the boundary to quarter of the M64 segment size.
+*/
+   if (size > gate) {
+   dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size "
+   "is bigger than %lld, roundup power2\n",
+   i, res, gate);
 mul = roundup_pow_of_two(total_vfs);
 pdn->m64_single_mode = true;
break;
-- 
1.7.9.5


[PATCH V4 2/6] powerpc/powernv: simplify the calculation of iov resource alignment

2015-08-18 Thread Wei Yang
The alignment of the IOV BAR on the PowerNV platform is the total size of
the IOV BAR. No matter whether the IOV BAR is extended by
roundup_pow_of_two(total_vfs) or by the max PE number (256), the total
size can be calculated as (vfs_expanded * VF_BAR_size).

This patch simplifies pnv_pci_iov_resource_alignment() by removing the
first case.

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com
Reviewed-by: Gavin Shan gws...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/powernv/pci-ioda.c |   14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 8c031b5..e3e0acb 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2988,12 +2988,16 @@ static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
 						      int resno)
 {
 	struct pci_dn *pdn = pci_get_pdn(pdev);
-	resource_size_t align, iov_align;
-
-	iov_align = resource_size(&pdev->resource[resno]);
-	if (iov_align)
-		return iov_align;
+	resource_size_t align;
 
+	/*
+	 * On PowerNV platform, IOV BAR is mapped by M64 BAR to enable the
+	 * SR-IOV. While from hardware perspective, the range mapped by M64
+	 * BAR should be size aligned.
+	 *
+	 * This function returns the total IOV BAR size if expanded or just the
+	 * individual size if not.
+	 */
 	align = pci_iov_resource_size(pdev, resno);
 	if (pdn->vfs_expanded)
 		return pdn->vfs_expanded * align;
-- 
1.7.9.5


Re: [PATCH V4 6/6] powerpc/powernv: allocate sparse PE# when using M64 BAR in Single PE mode

2015-08-18 Thread Gavin Shan
On Wed, Aug 19, 2015 at 10:01:44AM +0800, Wei Yang wrote:
When M64 BAR is set to Single PE mode, the PE# assigned to VF could be
sparse.

This patch restructures the code to allocate sparse PE#s for VFs when the M64
BAR is set to Single PE mode. It also renames "offset" to "pe_num_map" to
reflect that its content is the PE number.

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com

Reviewed-by: Gavin Shan gws...@linux.vnet.ibm.com

---
 arch/powerpc/include/asm/pci-bridge.h |2 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |   79 ++---
 2 files changed, 61 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 8aeba4c..b3a226b 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -213,7 +213,7 @@ struct pci_dn {
 #ifdef CONFIG_PCI_IOV
 	u16     vfs_expanded;		/* number of VFs IOV BAR expanded */
 	u16     num_vfs;		/* number of VFs enabled*/
-	int     offset;			/* PE# for the first VF PE */
+	int     *pe_num_map;		/* PE# for the first VF PE or array */
 	bool	m64_single_mode;	/* Use M64 BAR in Single Mode */
 #define IODA_INVALID_M64	(-1)
 	int     (*m64_map)[PCI_SRIOV_NUM_BARS];
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 4bc83b8..779f52a 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1243,7 +1243,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 
 			/* Map the M64 here */
 			if (pdn->m64_single_mode) {
-				pe_num = pdn->offset + j;
+				pe_num = pdn->pe_num_map[j];
 				rc = opal_pci_map_pe_mmio_window(phb->opal_id,
 						pe_num, OPAL_M64_WINDOW_TYPE,
 						pdn->m64_map[j][i], 0);
@@ -1347,7 +1347,7 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev)
 	struct pnv_phb        *phb;
 	struct pci_dn         *pdn;
 	struct pci_sriov      *iov;
-	u16                    num_vfs;
+	u16                    num_vfs, i;
 
 	bus = pdev->bus;
 	hose = pci_bus_to_host(bus);
@@ -1361,14 +1361,21 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev)
 
 	if (phb->type == PNV_PHB_IODA2) {
 		if (!pdn->m64_single_mode)
-			pnv_pci_vf_resource_shift(pdev, -pdn->offset);
+			pnv_pci_vf_resource_shift(pdev, -*pdn->pe_num_map);
 
 		/* Release M64 windows */
 		pnv_pci_vf_release_m64(pdev, num_vfs);
 
 		/* Release PE numbers */
-		bitmap_clear(phb->ioda.pe_alloc, pdn->offset, num_vfs);
-		pdn->offset = 0;
+		if (pdn->m64_single_mode) {
+			for (i = 0; i < num_vfs; i++) {
+				if (pdn->pe_num_map[i] != IODA_INVALID_PE)
+					pnv_ioda_free_pe(phb,
+						pdn->pe_num_map[i]);
+			}
+		} else
+			bitmap_clear(phb->ioda.pe_alloc, *pdn->pe_num_map,
+				num_vfs);
+		/* Releasing pe_num_map */
+		kfree(pdn->pe_num_map);
 	}
 }
 
@@ -1394,7 +1401,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 
 	/* Reserve PE for each VF */
 	for (vf_index = 0; vf_index < num_vfs; vf_index++) {
-		pe_num = pdn->offset + vf_index;
+		if (pdn->m64_single_mode)
+			pe_num = pdn->pe_num_map[vf_index];
+		else
+			pe_num = *pdn->pe_num_map + vf_index;
 
 		pe = &phb->ioda.pe_array[pe_num];
 		pe->pe_number = pe_num;
@@ -1436,6 +1446,7 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 	struct pnv_phb        *phb;
 	struct pci_dn         *pdn;
 	int                    ret;
+	u16                    i;
 
 	bus = pdev->bus;
 	hose = pci_bus_to_host(bus);
@@ -1458,20 +1469,42 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 		return -EBUSY;
 	}
 
+	/* Allocating pe_num_map */
+	if (pdn->m64_single_mode)
+		pdn->pe_num_map = kmalloc(sizeof(*pdn->pe_num_map) * num_vfs,
+				GFP_KERNEL);
+	else
+		pdn->pe_num_map = kmalloc(sizeof(*pdn->pe_num_map), GFP_KERNEL);
+
+	if (!pdn->pe_num_map)
+		return -ENOMEM;
+
 	/* Calculate available PE for required VFs */
-	mutex_lock(&phb->ioda.pe_alloc_mutex);
-	pdn->offset = bitmap_find_next_zero_area(
-		phb->ioda.pe_alloc, phb->ioda.total_pe,
-		0, num_vfs, 0);
-

[PATCH 1/2] cxl: Add mechanism for delivering AFU driver specific events

2015-08-18 Thread Ian Munsie
From: Ian Munsie imun...@au1.ibm.com

This adds an afu_driver_ops structure with event_pending and
deliver_event callbacks. An AFU driver can fill these out and associate
it with a context to enable passing custom AFU specific events to
userspace.

The cxl driver will call event_pending() during poll, select, read, etc.
calls to check if an AFU driver specific event is pending, and will call
deliver_event() to deliver that event. This way, the cxl driver takes
care of all the usual locking semantics around these calls and handles
all the generic cxl events, so that the AFU driver only needs to worry
about its own events.

The deliver_event() call is passed a struct cxl_event buffer to fill in.
The header will already be filled in for an AFU driver event, and the
AFU driver is expected to expand the header.size as necessary (up to
max_size, defined by struct cxl_event_afu_driver_reserved) and fill out
its own information.

Conflicts between AFU specific events are not expected, due to the fact
that each AFU specific driver has its own mechanism to deliver an AFU
file descriptor to userspace.
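
For illustration only, here is roughly how an AFU driver could hook into
this (the exact callback signatures are assumptions based on the
description above, and all my_* names are hypothetical):

	static bool my_event_pending(struct cxl_context *ctx)
	{
		/* Return true when the AFU driver has something to report;
		 * my_pending_count() is a hypothetical helper. */
		return my_pending_count(ctx) != 0;
	}

	static void my_deliver_event(struct cxl_context *ctx,
				     struct cxl_event *event, size_t max_size)
	{
		/* The cxl core has already filled in the header; grow
		 * header.size (up to max_size) and append AFU data. */
		event->header.size += sizeof(u64);
		/* ... copy the AFU specific payload after the header ... */
	}

	static struct cxl_afu_driver_ops my_afu_ops = {
		.event_pending = my_event_pending,
		.deliver_event = my_deliver_event,
	};

	/* Then, once the AFU driver has obtained a context:
	 *	cxl_set_driver_ops(ctx, &my_afu_ops);
	 */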

Signed-off-by: Ian Munsie imun...@au1.ibm.com
---
 drivers/misc/cxl/Kconfig |  5 +
 drivers/misc/cxl/api.c   |  7 +++
 drivers/misc/cxl/cxl.h   |  6 +-
 drivers/misc/cxl/file.c  | 37 +++--
 include/misc/cxl.h   | 29 +
 include/uapi/misc/cxl.h  | 13 +
 6 files changed, 86 insertions(+), 11 deletions(-)

diff --git a/drivers/misc/cxl/Kconfig b/drivers/misc/cxl/Kconfig
index 8756d06..560412c 100644
--- a/drivers/misc/cxl/Kconfig
+++ b/drivers/misc/cxl/Kconfig
@@ -15,12 +15,17 @@ config CXL_EEH
 	bool
 	default n
 
+config CXL_AFU_DRIVER_OPS
+	bool
+	default n
+
 config CXL
 	tristate "Support for IBM Coherent Accelerators (CXL)"
 	depends on PPC_POWERNV && PCI_MSI && EEH
 	select CXL_BASE
 	select CXL_KERNEL_API
 	select CXL_EEH
+	select CXL_AFU_DRIVER_OPS
 	default m
 	help
 	  Select this option to enable driver support for IBM Coherent
diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 6a768a9..e0f0c78 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -267,6 +267,13 @@ struct cxl_context *cxl_fops_get_context(struct file *file)
 }
 EXPORT_SYMBOL_GPL(cxl_fops_get_context);
 
+void cxl_set_driver_ops(struct cxl_context *ctx,
+			struct cxl_afu_driver_ops *ops)
+{
+	ctx->afu_driver_ops = ops;
+}
+EXPORT_SYMBOL_GPL(cxl_set_driver_ops);
+
 int cxl_start_work(struct cxl_context *ctx,
 		   struct cxl_ioctl_start_work *work)
 {
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 6f53866..30e44a8 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -24,6 +24,7 @@
 #include <asm/reg.h>
 #include <misc/cxl-base.h>
 
+#include <misc/cxl.h>
 #include <uapi/misc/cxl.h>
 
 extern uint cxl_verbose;
@@ -34,7 +35,7 @@ extern uint cxl_verbose;
  * Bump version each time a user API change is made, whether it is
  * backwards compatible ot not.
  */
-#define CXL_API_VERSION 1
+#define CXL_API_VERSION 2
 #define CXL_API_VERSION_COMPATIBLE 1
 
 /*
@@ -462,6 +463,9 @@ struct cxl_context {
 	bool pending_fault;
 	bool pending_afu_err;
 
+	/* Used by AFU drivers for driver specific event delivery */
+	struct cxl_afu_driver_ops *afu_driver_ops;
+
 	struct rcu_head rcu;
 };
 
diff --git a/drivers/misc/cxl/file.c b/drivers/misc/cxl/file.c
index 57bdb47..2ebaca3 100644
--- a/drivers/misc/cxl/file.c
+++ b/drivers/misc/cxl/file.c
@@ -279,6 +279,22 @@ int afu_mmap(struct file *file, struct vm_area_struct *vm)
 	return cxl_context_iomap(ctx, vm);
 }
 
+static inline int _ctx_event_pending(struct cxl_context *ctx)
+{
+	bool afu_driver_event_pending = false;
+
+	if (ctx->afu_driver_ops && ctx->afu_driver_ops->event_pending)
+		afu_driver_event_pending = ctx->afu_driver_ops->event_pending(ctx);
+
+	return (ctx->pending_irq || ctx->pending_fault ||
+		ctx->pending_afu_err || afu_driver_event_pending);
+}
+
+static inline int ctx_event_pending(struct cxl_context *ctx)
+{
+	return _ctx_event_pending(ctx) || (ctx->status == CLOSED);
+}
+
 unsigned int afu_poll(struct file *file, struct poll_table_struct *poll)
 {
 	struct cxl_context *ctx = file->private_data;
@@ -291,8 +307,7 @@ unsigned int afu_poll(struct file *file, struct poll_table_struct *poll)
 	pr_devel("afu_poll wait done pe: %i\n", ctx->pe);
 
 	spin_lock_irqsave(&ctx->lock, flags);
-	if (ctx->pending_irq || ctx->pending_fault ||
-	    ctx->pending_afu_err)
+	if (_ctx_event_pending(ctx))
 		mask |= POLLIN | POLLRDNORM;
 	else if (ctx->status == CLOSED)
 		/* Only error on closed when there are no futher events pending
@@ -305,12 +320,6 @@ unsigned int afu_poll(struct file *file, struct poll_table_struct *poll)
 

[PATCH V4 6/6] powerpc/powernv: allocate sparse PE# when using M64 BAR in Single PE mode

2015-08-18 Thread Wei Yang
When M64 BAR is set to Single PE mode, the PE# assigned to VF could be
sparse.

This patch restructures the code to allocate sparse PE#s for VFs when the M64
BAR is set to Single PE mode. It also renames "offset" to "pe_num_map" to
reflect that its content is the PE number.
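
A rough sketch of the resulting allocation shape (not the actual hunk,
which is truncated below; alloc_one_pe() and alloc_contiguous_pes() are
hypothetical stand-ins for the real allocators):

	/* Single PE mode: each VF gets its own independently allocated PE
	 * number, so pe_num_map[] may be sparse.  Shared mode keeps a single
	 * contiguous range and only stores the starting PE number. */
	if (pdn->m64_single_mode) {
		for (i = 0; i < num_vfs; i++)
			pdn->pe_num_map[i] = alloc_one_pe(phb);		/* hypothetical */
	} else {
		*pdn->pe_num_map = alloc_contiguous_pes(phb, num_vfs);	/* hypothetical */
	}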

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/pci-bridge.h |2 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |   79 ++---
 2 files changed, 61 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 8aeba4c..b3a226b 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -213,7 +213,7 @@ struct pci_dn {
 #ifdef CONFIG_PCI_IOV
 	u16     vfs_expanded;		/* number of VFs IOV BAR expanded */
 	u16     num_vfs;		/* number of VFs enabled*/
-	int     offset;			/* PE# for the first VF PE */
+	int     *pe_num_map;		/* PE# for the first VF PE or array */
 	bool	m64_single_mode;	/* Use M64 BAR in Single Mode */
 #define IODA_INVALID_M64	(-1)
 	int     (*m64_map)[PCI_SRIOV_NUM_BARS];
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 4bc83b8..779f52a 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1243,7 +1243,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 
 			/* Map the M64 here */
 			if (pdn->m64_single_mode) {
-				pe_num = pdn->offset + j;
+				pe_num = pdn->pe_num_map[j];
 				rc = opal_pci_map_pe_mmio_window(phb->opal_id,
 						pe_num, OPAL_M64_WINDOW_TYPE,
 						pdn->m64_map[j][i], 0);
@@ -1347,7 +1347,7 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev)
 	struct pnv_phb        *phb;
 	struct pci_dn         *pdn;
 	struct pci_sriov      *iov;
-	u16                    num_vfs;
+	u16                    num_vfs, i;
 
 	bus = pdev->bus;
 	hose = pci_bus_to_host(bus);
@@ -1361,14 +1361,21 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev)
 
 	if (phb->type == PNV_PHB_IODA2) {
 		if (!pdn->m64_single_mode)
-			pnv_pci_vf_resource_shift(pdev, -pdn->offset);
+			pnv_pci_vf_resource_shift(pdev, -*pdn->pe_num_map);
 
 		/* Release M64 windows */
 		pnv_pci_vf_release_m64(pdev, num_vfs);
 
 		/* Release PE numbers */
-		bitmap_clear(phb->ioda.pe_alloc, pdn->offset, num_vfs);
-		pdn->offset = 0;
+		if (pdn->m64_single_mode) {
+			for (i = 0; i < num_vfs; i++) {
+				if (pdn->pe_num_map[i] != IODA_INVALID_PE)
+					pnv_ioda_free_pe(phb,
+						pdn->pe_num_map[i]);
+			}
+		} else
+			bitmap_clear(phb->ioda.pe_alloc, *pdn->pe_num_map,
+				num_vfs);
+		/* Releasing pe_num_map */
+		kfree(pdn->pe_num_map);
 	}
 }
 
@@ -1394,7 +1401,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs)
 
 	/* Reserve PE for each VF */
 	for (vf_index = 0; vf_index < num_vfs; vf_index++) {
-		pe_num = pdn->offset + vf_index;
+		if (pdn->m64_single_mode)
+			pe_num = pdn->pe_num_map[vf_index];
+		else
+			pe_num = *pdn->pe_num_map + vf_index;
 
 		pe = &phb->ioda.pe_array[pe_num];
 		pe->pe_number = pe_num;
@@ -1436,6 +1446,7 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 	struct pnv_phb        *phb;
 	struct pci_dn         *pdn;
 	int                    ret;
+	u16                    i;
 
 	bus = pdev->bus;
 	hose = pci_bus_to_host(bus);
@@ -1458,20 +1469,42 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 		return -EBUSY;
 	}
 
+	/* Allocating pe_num_map */
+	if (pdn->m64_single_mode)
+		pdn->pe_num_map = kmalloc(sizeof(*pdn->pe_num_map) * num_vfs,
+				GFP_KERNEL);
+	else
+		pdn->pe_num_map = kmalloc(sizeof(*pdn->pe_num_map), GFP_KERNEL);
+
+	if (!pdn->pe_num_map)
+		return -ENOMEM;
+
 	/* Calculate available PE for required VFs */
-	mutex_lock(&phb->ioda.pe_alloc_mutex);
-	pdn->offset = bitmap_find_next_zero_area(
-		phb->ioda.pe_alloc, phb->ioda.total_pe,
-		0, num_vfs, 0);
-	if (pdn->offset >= 

[PATCH v5 7/7] pmem, dax: have direct_access use __pmem annotation

2015-08-18 Thread Ross Zwisler
Update the annotation for the kaddr pointer returned by direct_access()
so that it is a __pmem pointer.  This is consistent with the PMEM driver
and with how this direct_access() pointer is used in the DAX code.

Signed-off-by: Ross Zwisler ross.zwis...@linux.intel.com
Reviewed-by: Christoph Hellwig h...@lst.de
---
 Documentation/filesystems/Locking |  3 ++-
 arch/powerpc/sysdev/axonram.c |  7 ---
 drivers/block/brd.c   |  4 ++--
 drivers/nvdimm/pmem.c |  4 ++--
 drivers/s390/block/dcssblk.c  | 10 ++
 fs/block_dev.c|  2 +-
 fs/dax.c  | 37 -
 include/linux/blkdev.h|  8 
 8 files changed, 41 insertions(+), 34 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 6a34a0f..06d4434 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -397,7 +397,8 @@ prototypes:
 	int (*release) (struct gendisk *, fmode_t);
 	int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
 	int (*compat_ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
-	int (*direct_access) (struct block_device *, sector_t, void **, unsigned long *);
+	int (*direct_access) (struct block_device *, sector_t, void __pmem **,
+				unsigned long *);
 	int (*media_changed) (struct gendisk *);
 	void (*unlock_native_capacity) (struct gendisk *);
 	int (*revalidate_disk) (struct gendisk *);
diff --git a/arch/powerpc/sysdev/axonram.c b/arch/powerpc/sysdev/axonram.c
index ee90db1..a2be2a6 100644
--- a/arch/powerpc/sysdev/axonram.c
+++ b/arch/powerpc/sysdev/axonram.c
@@ -141,13 +141,14 @@ axon_ram_make_request(struct request_queue *queue, struct bio *bio)
  */
 static long
 axon_ram_direct_access(struct block_device *device, sector_t sector,
-		       void **kaddr, unsigned long *pfn, long size)
+		       void __pmem **kaddr, unsigned long *pfn, long size)
 {
 	struct axon_ram_bank *bank = device->bd_disk->private_data;
 	loff_t offset = (loff_t)sector << AXON_RAM_SECTOR_SHIFT;
+	void *addr = (void *)(bank->ph_addr + offset);
 
-	*kaddr = (void *)(bank->ph_addr + offset);
-	*pfn = virt_to_phys(*kaddr) >> PAGE_SHIFT;
+	*kaddr = (void __pmem *)addr;
+	*pfn = virt_to_phys(addr) >> PAGE_SHIFT;
 
 	return bank->size - offset;
 }
diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 5750b39..2691bb6 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -371,7 +371,7 @@ static int brd_rw_page(struct block_device *bdev, sector_t sector,
 
 #ifdef CONFIG_BLK_DEV_RAM_DAX
 static long brd_direct_access(struct block_device *bdev, sector_t sector,
-			void **kaddr, unsigned long *pfn, long size)
+			void __pmem **kaddr, unsigned long *pfn, long size)
 {
 	struct brd_device *brd = bdev->bd_disk->private_data;
 	struct page *page;
@@ -381,7 +381,7 @@ static long brd_direct_access(struct block_device *bdev, sector_t sector,
 	page = brd_insert_page(brd, sector);
 	if (!page)
 		return -ENOSPC;
-	*kaddr = page_address(page);
+	*kaddr = (void __pmem *)page_address(page);
 	*pfn = page_to_pfn(page);
 
 	/*
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index eb7552d..f3b6297 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -92,7 +92,7 @@ static int pmem_rw_page(struct block_device *bdev, sector_t sector,
 }
 
 static long pmem_direct_access(struct block_device *bdev, sector_t sector,
-		      void **kaddr, unsigned long *pfn, long size)
+		      void __pmem **kaddr, unsigned long *pfn, long size)
 {
 	struct pmem_device *pmem = bdev->bd_disk->private_data;
 	size_t offset = sector << 9;
@@ -101,7 +101,7 @@ static long pmem_direct_access(struct block_device *bdev, sector_t sector,
 		return -ENODEV;
 
 	/* FIXME convert DAX to comprehend that this mapping has a lifetime */
-	*kaddr = (void __force *) pmem->virt_addr + offset;
+	*kaddr = pmem->virt_addr + offset;
 	*pfn = (pmem->phys_addr + offset) >> PAGE_SHIFT;
 
 	return pmem->size - offset;
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index da21281..2c5a397 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -29,7 +29,7 @@ static int dcssblk_open(struct block_device *bdev, fmode_t mode);
 static void dcssblk_release(struct gendisk *disk, fmode_t mode);
 static void dcssblk_make_request(struct request_queue *q, struct bio *bio);
 static long dcssblk_direct_access(struct block_device *bdev, sector_t secnum,
-			 void **kaddr, unsigned long *pfn, long size);
+			 void __pmem **kaddr, unsigned long *pfn, long 

[PATCH 0/2] Disable MSI/MSI-X interrupts manually at PCI probe time in PowerPC architecture

2015-08-18 Thread Guilherme G. Piccoli
These two patches correct bogus behaviour introduced by commit 1851617cd2
("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI").
That commit moved the logic responsible for disabling MSI/MSI-X interrupts
at PCI probe time into a new function, pci_msi_setup_pci_dev(), which is
not reachable in the code path of the PowerPC pSeries platform.

Since then, devices haven't been able to activate the MSI/MSI-X capability,
even after boot. The first patch makes pci_msi_setup_pci_dev() non-static.
The second patch inserts a call to it in powerpc code, so that MSI/MSI-X
interrupts are explicitly disabled at PCI probe time.
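
A minimal sketch of the powerpc side (the function name and its exact
placement are assumptions; the real change is in patch 2/2):

	/* Called from the OF-based PCI scan on pSeries, where
	 * pci_setup_device() is never reached. */
	static void pseries_of_disable_msi(struct pci_dev *dev)
	{
		/* Mirror what pci_setup_device() does elsewhere: mask
		 * MSI/MSI-X so the device cannot raise them until a driver
		 * explicitly re-enables them. */
		pci_msi_setup_pci_dev(dev);
	}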

Guilherme G. Piccoli (2):
  PCI: Make pci_msi_setup_pci_dev() non-static for use by arch code
  powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF case

 arch/powerpc/kernel/pci_of_scan.c | 3 +++
 drivers/pci/probe.c   | 2 +-
 include/linux/pci.h   | 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

-- 
2.1.0


[PATCH V2] cxl: Allow release of contexts which have been OPENED but not STARTED

2015-08-18 Thread Andrew Donnellan
If we open a context but do not start it (either because we do not attempt
to start it, or because it fails to start for some reason), we are left
with a context in state OPENED. Previously, cxl_release_context() only
allowed releasing contexts in state CLOSED, so attempting to release an
OPENED context would fail.

In particular, this bug causes available contexts to run out after some EEH
failures, where drivers attempt to release contexts that have failed to
start.

Allow releasing contexts in any state with a value lower than STARTED, i.e.
OPENED or CLOSED (we can't release a STARTED context as it's currently
using the hardware, and we assume that contexts in any new states which may
be added in future with a value higher than STARTED are also unsafe to
release).
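
As an illustration of the error path this addresses (sketch only; the cxl
kernel API calls are believed to match this series, but the surrounding
code and names are hypothetical):

	static int my_afu_attach(struct pci_dev *pdev, u64 wed)	/* hypothetical */
	{
		struct cxl_context *ctx;
		int rc;

		ctx = cxl_dev_context_init(pdev);	/* context is now OPENED */
		if (IS_ERR(ctx))
			return PTR_ERR(ctx);

		rc = cxl_start_context(ctx, wed, NULL);	/* may fail, e.g. after EEH */
		if (rc) {
			/* The context never reached STARTED; before this fix
			 * the release below returned -EBUSY and the context
			 * leaked. */
			cxl_release_context(ctx);
			return rc;
		}
		return 0;
	}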

Cc: sta...@vger.kernel.org
Fixes: 6f7f0b3df6d4 (cxl: Add AFU virtual PHB and kernel API)
Signed-off-by: Andrew Donnellan andrew.donnel...@au1.ibm.com
Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/api.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 6a768a9..f49e3e5 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -59,7 +59,7 @@ EXPORT_SYMBOL_GPL(cxl_get_phys_dev);
 
 int cxl_release_context(struct cxl_context *ctx)
 {
-	if (ctx->status != CLOSED)
+	if (ctx->status >= STARTED)
 		return -EBUSY;
 
 	put_device(&ctx->afu->dev);
-- 
Andrew Donnellan  Software Engineer, OzLabs
andrew.donnel...@au1.ibm.com  Australia Development Lab, Canberra
+61 2 6201 8874 (work)IBM Australia Limited


Re: [PATCH 1/2] PCI: Make pci_msi_setup_pci_dev() non-static for use by arch code

2015-08-18 Thread Michael Ellerman
Hi Guilherme,

Thanks for the patches.

On Tue, 2015-08-18 at 18:13 -0300, Guilherme G. Piccoli wrote:
 Commit 1851617cd2 (PCI/MSI: Disable MSI at enumeration even if kernel
 doesn't support MSI) changed the location of the code that disables
 MSI/MSI-X interrupts at PCI probe time in devices that have this flag set.
 It moved the code from pci_msi_init_pci_dev() to a new function named
 pci_msi_setup_pci_dev(), called by pci_setup_device().

OK.

 Since then, the pSeries platform of the powerpc architecture needs to
 disable MSI at PCI probe time manually, as the code flow doesn't
 reach pci_setup_device(). 

 For doing so, it wants to call
 pci_msi_setup_pci_dev(). This patch makes the required function
 non-static, so that it will be called on PCI probe path on powerpc pSeries
 platform in next patch.

I didn't follow that entirely, I think you mean something like:

  The pseries PCI probing code does not call pci_setup_device(), so since
  commit 1851617cd2 pci_msi_setup_pci_dev() is not called and MSIs are left
  enabled, which is a bug.

  To fix this the pseries PCI probe should manually call
  pci_msi_setup_pci_dev(), so make it non-static.


Does that look OK?

Also you haven't CC'ed the original author of the commit, or the PCI
maintainer, or the relevant lists.

That would be:

Michael S. Tsirkin m...@redhat.com
Bjorn Helgaas bhelg...@google.com
linux-...@vger.kernel.org
linux-ker...@vger.kernel.org


And finally both patches should have a fixes line, such as:

Fixes: 1851617cd2da (PCI/MSI: Disable MSI at enumeration even if kernel 
doesn't support MSI)

cheers




Re: [PATCH] cxl: Allow release of contexts which have been OPENED but not STARTED

2015-08-18 Thread Andrew Donnellan

On 19/08/15 02:23, Michael Neuling wrote:

So this doesn't break when you add a new state, is it worth writing it as:

if (ctx->status >= STARTED)
	return -EBUSY;

?


Yeah I think that would be more future proof, although it won't make a
difference with the current code.


Sounds reasonable, I'll submit a V2.


Andrew

--
Andrew Donnellan  Software Engineer, OzLabs
andrew.donnel...@au1.ibm.com  Australia Development Lab, Canberra
+61 2 6201 8874 (work)IBM Australia Limited


Re: [PATCH V2] cxl: Allow release of contexts which have been OPENED but not STARTED

2015-08-18 Thread Ian Munsie
Acked-by: Ian Munsie imun...@au1.ibm.com


Re: [05/27] macintosh: therm_windtunnel: Export I2C module alias information

2015-08-18 Thread Michael Ellerman
On Tue, 2015-08-18 at 12:35 +0200, Javier Martinez Canillas wrote:
 Hello Michael,
 
 On 08/18/2015 12:24 PM, Michael Ellerman wrote:
  On Thu, 2015-30-07 at 16:18:30 UTC, Javier Martinez Canillas wrote:
  The I2C core always reports the MODALIAS uevent as i2c:client name
  regardless if the driver was matched using the I2C id_table or the
  of_match_table. So the driver needs to export the I2C table and this
  be built into the module or udev won't have the necessary information
  to auto load the correct module when the device is added.
 
  Signed-off-by: Javier Martinez Canillas jav...@osg.samsung.com
  ---
 
   drivers/macintosh/therm_windtunnel.c | 1 +
   1 file changed, 1 insertion(+)
  
  Who are you expecting to merge this?
  
 
 I was expecting Benjamin Herrenschmidt since he is listed in MAINTAINERS
 for drivers/macintosh. I cc'ed him in the patch but now in your answer I
 don't see him in the cc list, strange.

That's the mailing list dropping him from CC because he's subscribed.

 But I'll be happy to re-post if there is another person who is handling
 the patches for this driver now.
 
 BTW there is another patch [0] for the same driver to export the OF id
 table information, that was not picked either.

Yep, I'll grab them both.

cheers



Re: [PATCH 2/2] cxl: add set/get private data to context struct

2015-08-18 Thread Michael Ellerman
On Wed, 2015-08-19 at 15:12 +1000, Ian Munsie wrote:
 Excerpts from Michael Ellerman's message of 2015-08-19 14:49:30 +1000:
  Do we really need the accessors? They don't buy anything I can see over just
  using ctx-priv directly.
 
 The reasoning there is because we don't currently expose the contents of
 stuct cxl_context to afu drivers, rather they just treat it as an opaque
 type.
 
 We could potentially change this to expose the details, but there's a
 lot of junk in there that's just internal details of the cxl driver that
 isn't of interest to an afu driver that I'd rather not expose.
 
 We also already have another accessor function (cxl_process_element) in
 the api, so it's not out of place.
 
 FWIW I'm not opposed to changing how this api works if it ultimately
 makes things better, but I want to wait until the cxlflash superpipe
 support is merged so any patches that change the api can change it at
 the same time.

OK. I saw struct cxl_context in cxl.h and figured it was public, but it's in
drivers/misc/cxl/cxl.h, so yes other drivers have no business poking in there,
even though they *could*.

So that's fine.

cheers



Re: [PATCH 2/2] cxl: add set/get private data to context struct

2015-08-18 Thread Ian Munsie
Excerpts from Michael Ellerman's message of 2015-08-19 14:49:30 +1000:
 Do we really need the accessors? They don't buy anything I can see over just
 using ctx-priv directly.

The reasoning there is because we don't currently expose the contents of
struct cxl_context to afu drivers, rather they just treat it as an opaque
type.

We could potentially change this to expose the details, but there's a
lot of junk in there that's just internal details of the cxl driver that
isn't of interest to an afu driver that I'd rather not expose.

We also already have another accessor function (cxl_process_element) in
the api, so it's not out of place.

FWIW I'm not opposed to changing how this api works if it ultimately
makes things better, but I want to wait until the cxlflash superpipe
support is merged so any patches that change the api can change it at
the same time.

Cheers,
-Ian


Re: [PATCH 1/2] cxl: Add mechanism for delivering AFU driver specific events

2015-08-18 Thread Michael Ellerman
On Wed, 2015-08-19 at 14:19 +1000, Ian Munsie wrote:
 From: Ian Munsie imun...@au1.ibm.com
 
 This adds an afu_driver_ops structure with event_pending and
 deliver_event callbacks. An AFU driver can fill these out and associate
 it with a context to enable passing custom AFU specific events to
 userspace.

What's an AFU driver? Give me an example.

 The cxl driver will call event_pending() during poll, select, read, etc.
 calls to check if an AFU driver specific event is pending, and will call
 deliver_event() to deliver that event. This way, the cxl driver takes
 care of all the usual locking semantics around these calls and handles
 all the generic cxl events, so that the AFU driver only needs to worry
 about it's own events.
 
 The deliver_event() call is passed a struct cxl_event buffer to fill in.
 The header will already be filled in for an AFU driver event, and the
 AFU driver is expected to expand the header.size as necessary (up to
 max_size, defined by struct cxl_event_afu_driver_reserved) and fill out
 it's own information.
 
 Conflicts between AFU specific events are not expected, due to the fact
 that each AFU specific driver has it's own mechanism to deliver an AFU
 file descriptor to userspace.

I don't grok this bit.

 Signed-off-by: Ian Munsie imun...@au1.ibm.com
 ---
  drivers/misc/cxl/Kconfig |  5 +
  drivers/misc/cxl/api.c   |  7 +++
  drivers/misc/cxl/cxl.h   |  6 +-
  drivers/misc/cxl/file.c  | 37 +++--
  include/misc/cxl.h   | 29 +
  include/uapi/misc/cxl.h  | 13 +
  6 files changed, 86 insertions(+), 11 deletions(-)
 
 diff --git a/drivers/misc/cxl/Kconfig b/drivers/misc/cxl/Kconfig
 index 8756d06..560412c 100644
 --- a/drivers/misc/cxl/Kconfig
 +++ b/drivers/misc/cxl/Kconfig
 @@ -15,12 +15,17 @@ config CXL_EEH
   bool
   default n
  
 +config CXL_AFU_DRIVER_OPS
 + bool
 + default n
 +
  config CXL
   tristate "Support for IBM Coherent Accelerators (CXL)"
   depends on PPC_POWERNV && PCI_MSI && EEH
   select CXL_BASE
   select CXL_KERNEL_API
   select CXL_EEH
 + select CXL_AFU_DRIVER_OPS
   default m
   help
 Select this option to enable driver support for IBM Coherent
 diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
 index 6a768a9..e0f0c78 100644
 --- a/drivers/misc/cxl/api.c
 +++ b/drivers/misc/cxl/api.c
 @@ -267,6 +267,13 @@ struct cxl_context *cxl_fops_get_context(struct file 
 *file)
  }
  EXPORT_SYMBOL_GPL(cxl_fops_get_context);
  
 +void cxl_set_driver_ops(struct cxl_context *ctx,
 + struct cxl_afu_driver_ops *ops)
 +{
 + ctx->afu_driver_ops = ops;
 +}
 +EXPORT_SYMBOL_GPL(cxl_set_driver_ops);

This is pointless.

BUT, it wouldn't be if you actually checked the ops. Which you should do,
because then later you can avoid checking them on every event.

IIUI you should never have one op set but not the other, so you check in here
that both are set and error out otherwise.

Then in afu_read() you can change this:

 + if (ctx->afu_driver_ops
 +     && ctx->afu_driver_ops->event_pending
 +     && ctx->afu_driver_ops->deliver_event
 +     && ctx->afu_driver_ops->event_pending(ctx)) {

to:

 + if (ctx->afu_driver_ops && ctx->afu_driver_ops->event_pending(ctx)) {


 diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
 index 6f53866..30e44a8 100644
 --- a/drivers/misc/cxl/cxl.h
 +++ b/drivers/misc/cxl/cxl.h
 @@ -24,6 +24,7 @@
  #include <asm/reg.h>
  #include <misc/cxl-base.h>
  
 +#include <misc/cxl.h>
  #include <uapi/misc/cxl.h>
  
  extern uint cxl_verbose;
 @@ -34,7 +35,7 @@ extern uint cxl_verbose;
   * Bump version each time a user API change is made, whether it is
   * backwards compatible ot not.
   */
 -#define CXL_API_VERSION 1
 +#define CXL_API_VERSION 2

I'm not clear on why we're bumping the API version?

Isn't this purely about in-kernel drivers?

I see below you're touching the uapi header, so I guess it's that simple. But
if you can explain it better that would be great.

  #define CXL_API_VERSION_COMPATIBLE 1
  
  /*
 @@ -462,6 +463,9 @@ struct cxl_context {
   bool pending_fault;
   bool pending_afu_err;
  
 + /* Used by AFU drivers for driver specific event delivery */
 + struct cxl_afu_driver_ops *afu_driver_ops;
 +
   struct rcu_head rcu;
  };
  
 diff --git a/drivers/misc/cxl/file.c b/drivers/misc/cxl/file.c
 index 57bdb47..2ebaca3 100644
 --- a/drivers/misc/cxl/file.c
 +++ b/drivers/misc/cxl/file.c
 @@ -279,6 +279,22 @@ int afu_mmap(struct file *file, struct vm_area_struct 
 *vm)
   return cxl_context_iomap(ctx, vm);
  }
  
 +static inline int _ctx_event_pending(struct cxl_context *ctx)

Why isn't this returning bool?

 +{
 + bool afu_driver_event_pending = false;
 +
 + if (ctx->afu_driver_ops && ctx->afu_driver_ops->event_pending)
 + 	afu_driver_event_pending = ctx->afu_driver_ops->event_pending(ctx);

You can drop 

[PATCH v5 0/7] dax: I/O path enhancements

2015-08-18 Thread Ross Zwisler
The goal of this series is to enhance the DAX I/O path so that all operations
that store data (I/O writes, zeroing blocks, punching holes, etc.) properly
synchronize the stores to media using the PMEM API.  This ensures that the
data DAX is writing is durable on media before the operation completes.
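
The store-then-fence pattern the series applies on the DAX write path,
sketched with the helpers named in the patch list (signatures assumed from
the patch descriptions, not quoted from the patches):

	/* All stores to persistent memory go through the pmem-aware copy
	 * helper, and are only treated as durable after the fence. */
	static void dax_store_sketch(void __pmem *addr, struct iov_iter *iter,
				     size_t len)
	{
		copy_from_iter_pmem(addr, len, iter);
		wmb_pmem();
	}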

Patches 1-4 are a few random cleanups.

Changes from v4:
 - rebased to libnvdimm-for-next branch:
https://git.kernel.org/cgit/linux/kernel/git/nvdimm/nvdimm.git/commit/?h=libnvdimm-for-next

The nvdimm repository doesn't have the DAX PMD changes that are in the -mm
tree.  I expect the merge will basically be these two hunks:

@@ -514,7 +528,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	unsigned long pmd_addr = address & PMD_MASK;
 	bool write = flags & FAULT_FLAG_WRITE;
 	long length;
-	void *kaddr;
+	void __pmem *kaddr;
 	pgoff_t size, pgoff;
 	sector_t block, sector;
 	unsigned long pfn;
@@ -608,7 +622,8 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address,
 	if (buffer_unwritten(bh) || buffer_new(bh)) {
 		int i;
 		for (i = 0; i < PTRS_PER_PMD; i++)
-			clear_page(kaddr + i * PAGE_SIZE);
+			clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE);
+		wmb_pmem();
 		count_vm_event(PGMAJFAULT);
 		mem_cgroup_count_vm_event(vma->vm_mm, PGMAJFAULT);
 		result |= VM_FAULT_MAJOR;

Ross Zwisler (7):
  brd: make rd_size static
  pmem, x86: move x86 PMEM API to new pmem.h header
  pmem: remove layer when calling arch_has_wmb_pmem()
  pmem, x86: clean up conditional pmem includes
  pmem: add copy_from_iter_pmem() and clear_pmem()
  dax: update I/O path to do proper PMEM flushing
  pmem, dax: have direct_access use __pmem annotation

 Documentation/filesystems/Locking |   3 +-
 MAINTAINERS   |   1 +
 arch/powerpc/sysdev/axonram.c |   7 +-
 arch/x86/include/asm/cacheflush.h |  71 -
 arch/x86/include/asm/pmem.h   | 158 ++
 drivers/block/brd.c   |   6 +-
 drivers/nvdimm/pmem.c |   4 +-
 drivers/s390/block/dcssblk.c  |  10 ++-
 fs/block_dev.c|   2 +-
 fs/dax.c  |  63 +--
 include/linux/blkdev.h|   8 +-
 include/linux/pmem.h  |  77 ---
 12 files changed, 285 insertions(+), 125 deletions(-)
 create mode 100644 arch/x86/include/asm/pmem.h

-- 
2.1.0


[PATCH 1/2] PCI: Make pci_msi_setup_pci_dev() non-static for use by arch code

2015-08-18 Thread Guilherme G. Piccoli
Commit 1851617cd2 (PCI/MSI: Disable MSI at enumeration even if kernel
doesn't support MSI) changed the location of the code that disables
MSI/MSI-X interrupts at PCI probe time in devices that have this flag set.
It moved the code from pci_msi_init_pci_dev() to a new function named
pci_msi_setup_pci_dev(), called by pci_setup_device().

Since then, the pSeries platform of the powerpc architecture needs to
disable MSI at PCI probe time manually, as the code flow doesn't reach
pci_setup_device(). To do so, it wants to call pci_msi_setup_pci_dev().
This patch makes the required function non-static, so that it can be
called from the PCI probe path on the powerpc pSeries platform in the
next patch.

Signed-off-by: Guilherme G. Piccoli gpicc...@linux.vnet.ibm.com
---
 drivers/pci/probe.c | 2 +-
 include/linux/pci.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index cefd636..520c5b6 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1103,7 +1103,7 @@ int pci_cfg_space_size(struct pci_dev *dev)
 
 #define LEGACY_IO_RESOURCE (IORESOURCE_IO | IORESOURCE_PCI_FIXED)
 
-static void pci_msi_setup_pci_dev(struct pci_dev *dev)
+void pci_msi_setup_pci_dev(struct pci_dev *dev)
 {
/*
 * Disable the MSI hardware to avoid screaming interrupts
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8a0321a..860c751 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1202,6 +1202,7 @@ struct msix_entry {
u16 entry;  /* driver uses to specify entry, OS writes */
 };
 
+void pci_msi_setup_pci_dev(struct pci_dev *dev);
 
 #ifdef CONFIG_PCI_MSI
 int pci_msi_vec_count(struct pci_dev *dev);
-- 
2.1.0


Re: [PATCH] cxl: Allow release of contexts which have been OPENED but not STARTED

2015-08-18 Thread Michael Neuling
On Tue, 2015-08-18 at 19:19 +1000, Michael Ellerman wrote:
 On Tue, 2015-08-18 at 16:30 +1000, Andrew Donnellan wrote:
  If we open a context but do not start it (either because we do not attempt
  to start it, or because it fails to start for some reason), we are left
  with a context in state OPENED. Previously, cxl_release_context() only
  allowed releasing contexts in state CLOSED, so attempting to release an
  OPENED context would fail.
  
  In particular, this bug causes available contexts to run out after some EEH
  failures, where drivers attempt to release contexts that have failed to
  start.
  
  Allow releasing contexts in any state other than STARTED, i.e. OPENED or
  CLOSED (we can't release a STARTED context as it's currently using the
  hardware).
  
  Cc: sta...@vger.kernel.org
  Fixes: 6f7f0b3df6d4 (cxl: Add AFU virtual PHB and kernel API)
  Signed-off-by: Andrew Donnellan andrew.donnel...@au1.ibm.com
  Signed-off-by: Daniel Axtens d...@axtens.net
  ---
   drivers/misc/cxl/api.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
  
  diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
  index 6a768a9..1c520b8 100644
  --- a/drivers/misc/cxl/api.c
  +++ b/drivers/misc/cxl/api.c
  @@ -59,7 +59,7 @@ EXPORT_SYMBOL_GPL(cxl_get_phys_dev);
   
   int cxl_release_context(struct cxl_context *ctx)
   {
  -   if (ctx->status != CLOSED)
  +   if (ctx->status == STARTED)
  return -EBUSY;
 
 So this doesn't break when you add a new state, is it worth writing it as:
 
   if (ctx->status >= STARTED)
   return -EBUSY;
 
 ?

Yeah I think that would be more future proof, although it won't make a
difference with the current code.

FWIW, looks good to me.

Mikey


Re: [PATCH V2] powerpc/85xx: Remove unused pci fixup hooks on c293pcie

2015-08-18 Thread Scott Wood
On Tue, 2015-08-18 at 04:26 -0500, Hou Zhiqiang-B48286 wrote:
 Hi Scott,
 
 Removed both pcibios_fixup_phb and pcibios_fixup_bus.
 Could you please help to apply it?

I applied it and sent a pull request yesterday.

-Scott

