Re: [RFC PATCH V1 0/8] KASAN ppc64 support
Andrey Ryabinin <ryabinin@gmail.com> writes:
> 2015-08-18 8:42 GMT+03:00 Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>:
>> Andrey Ryabinin <ryabinin@gmail.com> writes:
>> But that is introducing conditionals in core code for no real benefit.
>> This also will break when we eventually end up tracking vmalloc?
>
> Ok, that's a very good reason to not do this.
>
> I see one potential problem in the way you use kasan_zero_page, though.
> memset/memcpy of large portions of memory (> 8 * PAGE_SIZE) will end up
> overflowing kasan_zero_page when we check shadow in memory_is_poisoned_n()

Any suggestion on how to fix that? I guess we definitely don't want to check
for addr and size in memset/memcpy. The other option is to do zero page
mapping as is done for other architectures, that is, we map a zero page via
page tables. But we still have the issue of the memory needed to map the
entire vmalloc range (page table memory). I was hoping to avoid all those
complexities.

-aneesh

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
RE: [PATCH V2] powerpc/85xx: Remove unused pci fixup hooks on c293pcie
Hi Scott,

Removed both pcibios_fixup_phb and pcibios_fixup_bus. Could you please help
to apply it?

-----Original Message-----
From: Zhiqiang Hou [mailto:b48...@freescale.com]
Sent: 10 August 2015 17:40
To: ga...@kernel.crashing.org; linuxppc-dev@lists.ozlabs.org; Wood Scott-B07421
Cc: Hu Mingkai-B21284; Wang Dongsheng-B40534; Hou Zhiqiang-B48286
Subject: [PATCH V2] powerpc/85xx: Remove unused pci fixup hooks on c293pcie

From: Hou Zhiqiang <b48...@freescale.com>

The c293pcie board is an endpoint device and it doesn't need PM, so remove
the pcibios_fixup_phb and pcibios_fixup_bus hooks.

Signed-off-by: Hou Zhiqiang <b48...@freescale.com>
---
Tested on the c293pcie board.
V2:
 - Rename the title of this patch.
 - Remove pcibios_fixup_bus, which isn't used in EP mode.

 arch/powerpc/platforms/85xx/c293pcie.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/powerpc/platforms/85xx/c293pcie.c b/arch/powerpc/platforms/85xx/c293pcie.c
index 84476b6..61bc851 100644
--- a/arch/powerpc/platforms/85xx/c293pcie.c
+++ b/arch/powerpc/platforms/85xx/c293pcie.c
@@ -66,10 +66,6 @@ define_machine(c293_pcie) {
 	.probe			= c293_pcie_probe,
 	.setup_arch		= c293_pcie_setup_arch,
 	.init_IRQ		= c293_pcie_pic_init,
-#ifdef CONFIG_PCI
-	.pcibios_fixup_bus	= fsl_pcibios_fixup_bus,
-	.pcibios_fixup_phb	= fsl_pcibios_fixup_phb,
-#endif
 	.get_irq		= mpic_get_irq,
 	.restart		= fsl_rstcr_restart,
 	.calibrate_decr		= generic_calibrate_decr,
--
2.1.0.27.g96db324

Thanks,
Zhiqiang
Re: [05/27] macintosh: therm_windtunnel: Export I2C module alias information
Hello Michael,

On 08/18/2015 12:24 PM, Michael Ellerman wrote:
> On Thu, 2015-30-07 at 16:18:30 UTC, Javier Martinez Canillas wrote:
>> The I2C core always reports the MODALIAS uevent as "i2c:<client name>"
>> regardless of whether the driver was matched using the I2C id_table or
>> the of_match_table. So the driver needs to export the I2C table, and this
>> needs to be built into the module or udev won't have the necessary
>> information to auto load the correct module when the device is added.
>>
>> Signed-off-by: Javier Martinez Canillas <jav...@osg.samsung.com>
>> ---
>>  drivers/macintosh/therm_windtunnel.c | 1 +
>>  1 file changed, 1 insertion(+)
>
> Who are you expecting to merge this?

I was expecting Benjamin Herrenschmidt, since he is listed in MAINTAINERS for
drivers/macintosh. I cc'ed him on the patch but now in your answer I don't
see him in the cc list, strange. But I'll be happy to re-post if there is
another person who is handling the patches for this driver now.

BTW, there is another patch [0] for the same driver to export the OF id
table information, which was not picked up either.

> cheers

[0]: https://lkml.org/lkml/2015/7/30/503

Best regards,
--
Javier Martinez Canillas
Open Source Group
Samsung Research America
Re: [PATCH] cxl: Allow release of contexts which have been OPENED but not STARTED
On Tue, 2015-08-18 at 16:30 +1000, Andrew Donnellan wrote:
> If we open a context but do not start it (either because we do not attempt
> to start it, or because it fails to start for some reason), we are left
> with a context in state OPENED. Previously, cxl_release_context() only
> allowed releasing contexts in state CLOSED, so attempting to release an
> OPENED context would fail.
>
> In particular, this bug causes available contexts to run out after some
> EEH failures, where drivers attempt to release contexts that have failed
> to start.
>
> Allow releasing contexts in any state other than STARTED, i.e. OPENED or
> CLOSED (we can't release a STARTED context as it's currently using the
> hardware).
>
> Cc: sta...@vger.kernel.org
> Fixes: 6f7f0b3df6d4 ("cxl: Add AFU virtual PHB and kernel API")
> Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
> Signed-off-by: Daniel Axtens <d...@axtens.net>
> ---
>  drivers/misc/cxl/api.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
> index 6a768a9..1c520b8 100644
> --- a/drivers/misc/cxl/api.c
> +++ b/drivers/misc/cxl/api.c
> @@ -59,7 +59,7 @@ EXPORT_SYMBOL_GPL(cxl_get_phys_dev);
>  int cxl_release_context(struct cxl_context *ctx)
>  {
> -	if (ctx->status != CLOSED)
> +	if (ctx->status == STARTED)
>  		return -EBUSY;

So this doesn't break when you add a new state, is it worth writing it as:

	if (ctx->status >= STARTED)
		return -EBUSY;

?

cheers
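The ordering trick the review suggests can be sketched in a few lines of userspace C. The enum ordering below (CLOSED < OPENED < STARTED) is an assumption mirroring the cxl context lifecycle discussed in the thread, and the functions are stand-ins for the before/after behaviour of cxl_release_context(), not the driver's actual code:

```c
#include <assert.h>

/* Assumed to mirror the cxl context lifecycle from the thread:
 * a context is CLOSED, then OPENED, then (maybe) STARTED. */
enum ctx_status { CLOSED, OPENED, STARTED };

#define EBUSY 16

/* Before the fix: only CLOSED contexts could be released, so a context
 * that was opened but never started could not be freed. */
static int release_old(enum ctx_status status)
{
	return (status != CLOSED) ? -EBUSY : 0;
}

/* After the fix, using the reviewer's ">=" form: anything that has
 * reached STARTED (or a hypothetical later state added in the future)
 * is busy; OPENED and CLOSED contexts can be released. */
static int release_new(enum ctx_status status)
{
	return (status >= STARTED) ? -EBUSY : 0;
}
```

The `>=` comparison is what makes the check robust to a new state being appended after STARTED, which is the reviewer's point.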
[PATCH] cxl: Allow release of contexts which have been OPENED but not STARTED
If we open a context but do not start it (either because we do not attempt to
start it, or because it fails to start for some reason), we are left with a
context in state OPENED. Previously, cxl_release_context() only allowed
releasing contexts in state CLOSED, so attempting to release an OPENED
context would fail.

In particular, this bug causes available contexts to run out after some EEH
failures, where drivers attempt to release contexts that have failed to
start.

Allow releasing contexts in any state other than STARTED, i.e. OPENED or
CLOSED (we can't release a STARTED context as it's currently using the
hardware).

Cc: sta...@vger.kernel.org
Fixes: 6f7f0b3df6d4 ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com>
Signed-off-by: Daniel Axtens <d...@axtens.net>
---
 drivers/misc/cxl/api.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 6a768a9..1c520b8 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -59,7 +59,7 @@ EXPORT_SYMBOL_GPL(cxl_get_phys_dev);
 int cxl_release_context(struct cxl_context *ctx)
 {
-	if (ctx->status != CLOSED)
+	if (ctx->status == STARTED)
 		return -EBUSY;
 
 	put_device(&ctx->afu->dev);

--
Andrew Donnellan              Software Engineer, OzLabs
andrew.donnel...@au1.ibm.com  Australia Development Lab, Canberra
+61 2 6201 8874 (work)        IBM Australia Limited
Re: [RFC PATCH V1 7/8] powerpc/mm: kasan: Add kasan support for ppc64
2015-08-18 8:36 GMT+03:00 Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>:
> Andrey Ryabinin <ryabinin@gmail.com> writes:
>> 2015-08-17 15:13 GMT+03:00 Andrey Ryabinin <ryabinin@gmail.com>:
>> Did you disable stack instrumentation (in scripts/Makefile.kasan), or does
>> your version of gcc not support it (e.g. like 4.9.x on x86)? Because this
>> can't work with stack instrumentation, as you don't have shadow for the
>> stack in early code. But this should be doable, I think. All you need is
>> to set up shadow for the init task's stack before executing any
>> instrumented function. And you also need to define
>> CONFIG_KASAN_SHADOW_OFFSET, so it will be passed to GCC via the
>> -fasan-shadow-offset= option.
>
> I am using the KASAN minimal config, hence this was not needed. Do we need
> to pass that option for outline instrumentation? If not, it would be a
> good idea to split that out and make it depend on KASAN_INLINE.

We need to pass this for stack instrumentation too.

> -aneesh
Re: [RFC PATCH V1 0/8] KASAN ppc64 support
2015-08-18 12:21 GMT+03:00 Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>:
> Andrey Ryabinin <ryabinin@gmail.com> writes:
>> 2015-08-18 8:42 GMT+03:00 Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>:
>>> But that is introducing conditionals in core code for no real benefit.
>>> This also will break when we eventually end up tracking vmalloc?
>>
>> Ok, that's a very good reason to not do this.
>>
>> I see one potential problem in the way you use kasan_zero_page, though.
>> memset/memcpy of large portions of memory (> 8 * PAGE_SIZE) will end up
>> overflowing kasan_zero_page when we check shadow in memory_is_poisoned_n()
>
> Any suggestion on how to fix that? I guess we definitely don't want to

Wait, I was wrong, we should be fine. In memory_is_poisoned_n():

	ret = memory_is_zero(kasan_mem_to_shadow((void *)addr),
			kasan_mem_to_shadow((void *)addr + size - 1) + 1);

So this will be:

	memory_is_zero(kasan_zero_page, (char *)kasan_zero_page + 1);

Which means that we will access only 1 byte of kasan_zero_page.

> check for addr and size in memset/memcpy. The other option is to do zero
> page mapping as is done for other architectures. That is, we map a zero
> page via page tables. But we still have the issue of the memory needed to
> map the entire vmalloc range (page table memory). I was hoping to avoid
> all those complexities.
>
> -aneesh
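The arithmetic behind this exchange can be sketched in plain C. The scale shift matches generic KASAN (one shadow byte covers eight bytes of memory); the offset and addresses below are illustrative stand-ins, not ppc64's real values:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Generic KASAN mapping: shadow = (addr >> 3) + offset.
 * KASAN_SHADOW_OFFSET here is an assumed value for illustration only. */
#define KASAN_SHADOW_SCALE_SHIFT 3
#define KASAN_SHADOW_OFFSET      0x100000UL

static uintptr_t mem_to_shadow(uintptr_t addr)
{
	return (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET;
}

/* memory_is_poisoned_n() checks the shadow range
 * [mem_to_shadow(addr), mem_to_shadow(addr + size - 1) + 1);
 * this helper returns how many shadow bytes that range covers. */
static size_t shadow_span(uintptr_t addr, size_t size)
{
	return mem_to_shadow(addr + size - 1) + 1 - mem_to_shadow(addr);
}

/* If the arch instead returns the same kasan_zero_page address for both
 * endpoints (as the ppc64 RFC does for non-linear regions), the checked
 * range collapses to a single byte, so no overflow past the zero page. */
static size_t zero_page_span(void)
{
	uintptr_t zero_shadow = 0x2000;	/* stand-in for kasan_zero_page */
	return zero_shadow + 1 - zero_shadow;
}
```

With the real mapping, a 32KB (8 * 4096-byte) memset covers 4096 shadow bytes, which is why the overflow worry was plausible; with a constant zero-page mapping, the span is always one byte, which is Andrey's correction.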
Re: [v8,3/3] leds/powernv: Add driver for PowerNV platform
On Sat, 2015-25-07 at 05:21:10 UTC, Vasant Hegde wrote:
> This patch implements an LED driver for the PowerNV platform using the
> existing generic LED class framework.
>
> The PowerNV platform has the below types of LEDs:
>   - System attention
>     Indicates there is a problem with the system that needs attention.
>   - Identify
>     Helps the user locate/identify a particular FRU or resource in the
>     system.
>   - Fault
>     Indicates there is a problem with the FRU or resource at the location
>     with which the indicator is associated.

Hi Vasant,

I'm waiting for a respin of this based on the discussion between you and
Jacek. If I don't see it soon it will miss v4.3.

cheers
Re: [v2, 1/2] Move the pt_regs_offset struct definition from arch to common include file
On Mon, 2015-27-07 at 04:39:33 UTC, David A. Long wrote:
> From: David A. Long <dave.l...@linaro.org>
>
> The pt_regs_offset structure is used for the
> HAVE_REGS_AND_STACK_ACCESS_API feature and has identical definitions in
> four different arch ptrace.h include files. It seems unlikely that the
> definition would ever need to be changed regardless of architecture, so
> let's move it into include/linux/ptrace.h, along with the macros commonly
> used to access it.

Thanks for cleaning this up. Tested successfully on powerpc.

Acked-by: Michael Ellerman <m...@ellerman.id.au>

cheers
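For context, the structure being consolidated and its usual helper macros look like the sketch below. `struct pt_regs` is stubbed with a few fields so the example is self-contained; the lookup function follows the pattern the arches share, not any one arch's exact copy:

```c
#include <stddef.h>
#include <string.h>

/* Stubbed pt_regs: real layouts are per-architecture. */
struct pt_regs {
	unsigned long nip;
	unsigned long msr;
	unsigned long gpr0;
};

/* The structure moved to include/linux/ptrace.h, plus the macros
 * commonly used to build the per-arch name->offset table. */
struct pt_regs_offset {
	const char *name;
	int offset;
};

#define REG_OFFSET_NAME(r) { .name = #r, .offset = offsetof(struct pt_regs, r) }
#define REG_OFFSET_END     { .name = NULL, .offset = 0 }

static const struct pt_regs_offset regoffset_table[] = {
	REG_OFFSET_NAME(nip),
	REG_OFFSET_NAME(msr),
	REG_OFFSET_NAME(gpr0),
	REG_OFFSET_END,
};

/* How HAVE_REGS_AND_STACK_ACCESS_API consumers resolve a register name
 * to its byte offset inside pt_regs (-EINVAL in the kernel; -1 here). */
static int regs_query_register_offset(const char *name)
{
	const struct pt_regs_offset *roff;

	for (roff = regoffset_table; roff->name != NULL; roff++)
		if (strcmp(roff->name, name) == 0)
			return roff->offset;
	return -1;
}
```

Since only the table contents differ per architecture, moving the struct and macros to include/linux/ptrace.h removes four identical definitions.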
Re: [RFC PATCH V1 4/8] kasan: Don't use kasan shadow pointer in generic functions
2015-08-18 8:29 GMT+03:00 Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>:
> Andrey Ryabinin <ryabinin@gmail.com> writes:
>> On 08/17/2015 09:36 AM, Aneesh Kumar K.V wrote:
>>> We can't use generic functions like print_hex_dump to access the kasan
>>> shadow region. This would require us to set up another kasan shadow
>>> region for the address passed (the kasan shadow address). Most
>>> architectures won't be able to do that. Hence remove dumping the kasan
>>> shadow region. If we really want to do this we will have to have a
>>> kasan-internal implementation of print_hex_dump for which we will
>>> disable address sanitizer operation.
>>
>> I didn't understand that. Yes, you don't have shadow for shadow. But, for
>> shadow addresses you return (void *)kasan_zero_page in
>> kasan_mem_to_shadow(), so we should be fine to access shadow in generic
>> code. But in general, IMHO, it is not correct to pass a shadow address to
>> generic functions, because that requires the arch to set up shadow for
>> the shadow.
>
> Yes, we have this shadow-for-shadow on x86_64/arm64. With one of the
> initial implementations of ppc64 support, I had page table entries set up
> for the vmalloc and vmemmap shadow, and that is when I hit the issue. We
> cannot expect the arch to set up shadow regions like what is expected
> here. If we really need to print the shadow memory content, we could
> possibly make a copy of print_hex_dump in kasan_init.c. Let me know
> whether you think printing the shadow area content is needed.

It was quite useful sometimes, so I think we should keep it. But I agree
with you that it would be better to avoid accesses to shadow memory in
generic code. Another way to deal with this would be to copy the shadow
content into a buffer, and then print_hex_dump() it.

> -aneesh
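Andrey's closing suggestion can be sketched as follows. This is a userspace illustration under the assumption that `shadow` stands in for a real shadow address and `printf` stands in for print_hex_dump(); the point is that generic code touches only the local copy:

```c
#include <stdio.h>
#include <string.h>

/* Rather than handing print_hex_dump() a shadow address (which would in
 * turn need its own shadow), copy the shadow bytes into an ordinary
 * buffer once, then dump the copy. Returns how many bytes were dumped. */
static size_t dump_shadow_copy(const unsigned char *shadow, size_t len)
{
	unsigned char buf[64];
	size_t n = len < sizeof(buf) ? len : sizeof(buf);
	size_t i;

	memcpy(buf, shadow, n);		/* the single, bounded shadow access */

	/* Poor man's print_hex_dump() over the local copy. */
	for (i = 0; i < n; i++)
		printf("%02x%c", buf[i], (i % 16 == 15 || i == n - 1) ? '\n' : ' ');
	return n;
}
```

In the kernel the copy would live on the stack of the report function, so the generic hex-dump path never dereferences a shadow pointer and no shadow-for-shadow mapping is needed.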
Re: [05/27] macintosh: therm_windtunnel: Export I2C module alias information
On Thu, 2015-30-07 at 16:18:30 UTC, Javier Martinez Canillas wrote:
> The I2C core always reports the MODALIAS uevent as "i2c:<client name>"
> regardless of whether the driver was matched using the I2C id_table or
> the of_match_table. So the driver needs to export the I2C table, and this
> needs to be built into the module or udev won't have the necessary
> information to auto load the correct module when the device is added.
>
> Signed-off-by: Javier Martinez Canillas <jav...@osg.samsung.com>
> ---
>  drivers/macintosh/therm_windtunnel.c | 1 +
>  1 file changed, 1 insertion(+)

Who are you expecting to merge this?

cheers
Re: provide more common DMA API functions V2
Hi Andrew,

On Tue, 18 Aug 2015 07:53:15 +0200 Christoph Hellwig <h...@lst.de> wrote:
> On Mon, Aug 17, 2015 at 10:45:52PM -0700, Andrew Morton wrote:
>>>> I'll merge these 5 patches for 4.3. That means I'll release them into
>>>> linux-next after 4.2 is released.
>>>
>>> So you only add for-4.3 code to -next after 4.2 is odd? Isn't that the
>>> wrong way around?
>>
>> Linus will be releasing 4.2 in 1-2 weeks and until then, linux-next is
>> supposed to contain only 4.2 material. Once 4.2 is released, linux-next
>> is open for 4.3 material.
>
> Hmm, I'm pretty sure there's tons of 4.3 material in linux-next at the
> moment, at least I got merge warning messages from Stephen about some
> yesterday.

Yeah, we are at v4.2-rc7, so linux-next is full of stuff to be merged by
Linus for v4.3. Nothing for v4.4 should be in linux-next until after
v4.3-rc1 is released in 3-4 weeks, i.e. after the next merge window closes.

--
Cheers,
Stephen Rothwell    s...@canb.auug.org.au
Re: [PATCH] cxl: Allow release of contexts which have been OPENED but not STARTED
Acked-by: Ian Munsie <imun...@au1.ibm.com>
[PATCH v2] powerpc/e6500: hw tablewalk: make sure we invalidate and write to the same tlb entry
In order to work around Erratum A-008139, we have to invalidate the tlb entry
with tlbilx before overwriting it. For performance reasons, we don't add any
memory barrier when acquiring/releasing the tcd lock. This means the two load
instructions for esel_next do have the possibility of returning different
values. That is definitely not acceptable due to Erratum A-008139. We have
two options to fix this issue:
  a) Add a memory barrier when acquiring/releasing the tcd lock to order the
     load/store to esel_next.
  b) Just make sure to invalidate and write to the same tlb entry, and
     tolerate the race that we may get the wrong value and overwrite the tlb
     entry just updated by the other thread.

We observe better performance using option b. So reserve an additional
register to save the value of esel_next.

Signed-off-by: Kevin Hao <haoke...@gmail.com>
---
v2: Use an additional register for saving the value of esel_next instead of
lwsync.

 arch/powerpc/include/asm/exception-64e.h | 11 ++-
 arch/powerpc/mm/tlb_low_64e.S            | 26 ++
 2 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/exception-64e.h b/arch/powerpc/include/asm/exception-64e.h
index a8b52b61043f..d53575becbed 100644
--- a/arch/powerpc/include/asm/exception-64e.h
+++ b/arch/powerpc/include/asm/exception-64e.h
@@ -69,13 +69,14 @@
 #define EX_TLB_ESR	( 9 * 8) /* Level 0 and 2 only */
 #define EX_TLB_SRR0	(10 * 8)
 #define EX_TLB_SRR1	(11 * 8)
+#define EX_TLB_R7	(12 * 8)
 #ifdef CONFIG_BOOK3E_MMU_TLB_STATS
-#define EX_TLB_R8	(12 * 8)
-#define EX_TLB_R9	(13 * 8)
-#define EX_TLB_LR	(14 * 8)
-#define EX_TLB_SIZE	(15 * 8)
+#define EX_TLB_R8	(13 * 8)
+#define EX_TLB_R9	(14 * 8)
+#define EX_TLB_LR	(15 * 8)
+#define EX_TLB_SIZE	(16 * 8)
 #else
-#define EX_TLB_SIZE	(12 * 8)
+#define EX_TLB_SIZE	(13 * 8)
 #endif
 
 #define	START_EXCEPTION(label)	\
diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
index e4185581c5a7..3a5b89dfb5a1 100644
--- a/arch/powerpc/mm/tlb_low_64e.S
+++ b/arch/powerpc/mm/tlb_low_64e.S
@@ -68,11 +68,21 @@ END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
 	ld	r14,PACAPGD(r13)
 	std	r15,EX_TLB_R15(r12)
 	std	r10,EX_TLB_CR(r12)
+#ifdef CONFIG_PPC_FSL_BOOK3E
+BEGIN_FTR_SECTION
+	std	r7,EX_TLB_R7(r12)
+END_FTR_SECTION_IFSET(CPU_FTR_SMT)
+#endif
 	TLB_MISS_PROLOG_STATS
 .endm
 
 .macro tlb_epilog_bolted
 	ld	r14,EX_TLB_CR(r12)
+#ifdef CONFIG_PPC_FSL_BOOK3E
+BEGIN_FTR_SECTION
+	ld	r7,EX_TLB_R7(r12)
+END_FTR_SECTION_IFSET(CPU_FTR_SMT)
+#endif
 	ld	r10,EX_TLB_R10(r12)
 	ld	r11,EX_TLB_R11(r12)
 	ld	r13,EX_TLB_R13(r12)
@@ -297,6 +307,7 @@ itlb_miss_fault_bolted:
  * r13 = PACA
  * r11 = tlb_per_core ptr
  * r10 = crap (free to use)
+ * r7  = esel_next
  */
 tlb_miss_common_e6500:
 	crmove	cr2*4+2,cr0*4+2	/* cr2.eq != 0 if kernel address */
@@ -334,8 +345,8 @@ BEGIN_FTR_SECTION	/* CPU_FTR_SMT */
 	 * with tlbilx before overwriting.
 	 */
-	lbz	r15,TCD_ESEL_NEXT(r11)
-	rlwinm	r10,r15,16,0xff
+	lbz	r7,TCD_ESEL_NEXT(r11)
+	rlwinm	r10,r7,16,0xff
 	oris	r10,r10,MAS0_TLBSEL(1)@h
 	mtspr	SPRN_MAS0,r10
 	isync
@@ -429,15 +440,14 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_SMT)
 	mtspr	SPRN_MAS2,r15
 
 tlb_miss_huge_done_e6500:
-	lbz	r15,TCD_ESEL_NEXT(r11)
 	lbz	r16,TCD_ESEL_MAX(r11)
 	lbz	r14,TCD_ESEL_FIRST(r11)
-	rlwimi	r10,r15,16,0x00ff	/* insert esel_next into MAS0 */
-	addi	r15,r15,1		/* increment esel_next */
+	rlwimi	r10,r7,16,0x00ff	/* insert esel_next into MAS0 */
+	addi	r7,r7,1			/* increment esel_next */
 	mtspr	SPRN_MAS0,r10
-	cmpw	r15,r16
-	iseleq	r15,r14,r15		/* if next == last use first */
-	stb	r15,TCD_ESEL_NEXT(r11)
+	cmpw	r7,r16
+	iseleq	r7,r14,r7		/* if next == last use first */
+	stb	r7,TCD_ESEL_NEXT(r11)
 	tlbwe
--
2.1.0
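The victim-selection logic at the tail of this patch is a small round-robin, which can be sketched in C. Field names follow the tcd structure referenced by the patch; this is an illustration of the selection policy, not the kernel's code:

```c
/* Round-robin TLB entry selection matching the asm above: use the
 * current esel_next for this tlbwe, then advance it, wrapping from
 * esel_max back to esel_first (the cmpw/iseleq pair). Keeping the
 * in-flight value in a register (r7 in the patch) is what guarantees
 * the tlbilx invalidate and the later tlbwe hit the same entry. */
struct tcd {
	unsigned char esel_next;	/* entry to use for the next write */
	unsigned char esel_max;		/* one past the last usable entry */
	unsigned char esel_first;	/* first usable entry */
};

static unsigned char tcd_pick_entry(struct tcd *tcd)
{
	unsigned char esel = tcd->esel_next;	/* entry used now */
	unsigned char next = esel + 1;		/* addi r7,r7,1 */

	/* cmpw r7,r16 ; iseleq r7,r14,r7 : wrap when next hits max */
	tcd->esel_next = (next == tcd->esel_max) ? tcd->esel_first : next;
	return esel;
}
```

Under option a) of the commit message, two unordered loads of `esel_next` could pick different entries for the invalidate and the write; holding the value in one place for the whole miss handler removes that window without adding barriers.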
Re: [RFC PATCH V1 0/8] KASAN ppc64 support
2015-08-18 8:42 GMT+03:00 Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>:
> Andrey Ryabinin <ryabinin@gmail.com> writes:
>> 2015-08-17 12:50 GMT+03:00 Aneesh Kumar K.V <aneesh.ku...@linux.vnet.ibm.com>:
>>> Because of the above I concluded that we may not be able to do inline
>>> instrumentation. Now if we are not doing inline instrumentation, we can
>>> simplify kasan support by not creating a shadow mapping at all for the
>>> vmalloc and vmemmap regions. Hence the idea of returning the address of
>>> a zero page for anything other than the kernel linear map region.
>>
>> Yes, mapping a zero page is needed only for inline instrumentation. You
>> simply don't need to check shadow for vmalloc/vmemmap. So, instead of
>> redefining kasan_mem_to_shadow() I'd suggest to add one more arch hook.
>> Something like:
>>
>> 	bool kasan_tracks_vaddr(unsigned long addr)
>> 	{
>> 		return REGION_ID(addr) == KERNEL_REGION_ID;
>> 	}
>>
>> And in check_memory_region():
>>
>> 	if (!(kasan_enabled() && kasan_tracks_vaddr(addr)))
>> 		return;
>
> But that is introducing conditionals in core code for no real benefit.
> This also will break when we eventually end up tracking vmalloc?

Ok, that's a very good reason to not do this.

I see one potential problem in the way you use kasan_zero_page, though.
memset/memcpy of large portions of memory (> 8 * PAGE_SIZE) will end up
overflowing kasan_zero_page when we check shadow in memory_is_poisoned_n()

> In that case our mem_to_shadow will essentially be a switch statement
> returning different offsets for the kernel region and the vmalloc region.
> As far as core kernel code is concerned, it just needs to ask the arch for
> the shadow address of a memory address, and instead of adding conditionals
> in core, my suggestion is we handle this in an arch function.
>
> -aneesh
Re: provide more common DMA API functions V2
* Andrew Morton <a...@linux-foundation.org> wrote:
> On Tue, 18 Aug 2015 07:38:25 +0200 Christoph Hellwig <h...@lst.de> wrote:
>> On Mon, Aug 17, 2015 at 02:24:29PM -0700, Andrew Morton wrote:
>>> 110254 bytes saved, shrinking the kernel by a whopping 0.17%. Thoughts?
>>
>> Sounds fine to me.
>
> OK, I'll clean it up a bit, check that each uninlining actually makes
> sense and then I'll see how it goes.
>
>>> I'll merge these 5 patches for 4.3. That means I'll release them into
>>> linux-next after 4.2 is released.
>>
>> So you only add for-4.3 code to -next after 4.2 is odd? Isn't that the
>> wrong way around?
>
> Linus will be releasing 4.2 in 1-2 weeks and until then, linux-next is
> supposed to contain only 4.2 material. Once 4.2 is released, linux-next
> is open for 4.3 material.

Isn't that off by one? I.e. shouldn't this be:

  "I'll merge these 5 patches for 4.4. That means I'll release them into
   linux-next after 4.2 is released. [...] Linus will be releasing 4.2 in
   1-2 weeks and until then, linux-next is supposed to contain only 4.3
   material. Once 4.2 is released and the 4.3 merge window opens,
   linux-next is open for 4.4 material."

?

Thanks,

	Ingo
Re: [1/1] powerpc/xmon: Paged output for paca display
On Fri, 2015-14-08 at 02:55:14 UTC, Sam Bobroff wrote:
> The paca display is already more than 24 lines, which can be problematic
> if you have an old school 80x24 terminal, or more likely you are on a
> virtual terminal which does not scroll for whatever reason.
>
> This adds an optional letter to the dp and dpa xmon commands (dpp and
> dppa), which will enable a per-page display (with 16 line pages): the
> first page will be displayed and if there was data that didn't fit, it
> will display a message indicating that the user can use enter to display
> the next page. The intent is that this feels similar to the way the
> memory display functions work.
>
> This is implemented by running over the entire output both for the
> initial command and for each subsequent page: the visible part is clipped
> out by checking line numbers. Handling the empty command as "more" is
> done by writing a special command into a static buffer that indicates
> where to move the sliding visibility window. This is similar to the
> approach used for the memory dump commands, except that the state data is
> encoded into the last_cmd string rather than a set of static variables.
> The memory dump commands could probably be rewritten to make use of the
> same buffer and remove their other static variables.
> Sample output:
>
> 0:mon> dpp1
> paca for cpu 0x1 @ cfdc0480:
>  possible         = yes
>  present          = yes
>  online           = yes
>  lock_token       = 0x8000           (0x8)
>  paca_index       = 0x1              (0xa)
>  kernel_toc       = 0xc0eb2400       (0x10)
>  kernelbase       = 0xc000           (0x18)
>  kernel_msr       = 0xb0001032       (0x20)
>  emergency_sp     = 0xc0003ffe8000   (0x28)
>  mc_emergency_sp  = 0xc0003ffe4000   (0x2e0)
>  in_mce           = 0x0              (0x2e8)
>  data_offset      = 0x7f17           (0x30)
>  hw_cpu_id        = 0x8              (0x38)
>  cpu_start        = 0x1              (0x3a)
>  kexec_state      = 0x0              (0x3b)
> [Enter for next page]
> 0:mon>
>  __current        = 0xc0007e696620   (0x290)
>  kstack           = 0xc0007e6ebe30   (0x298)
>  stab_rr          = 0xb              (0x2a0)
>  saved_r1         = 0xc0007ef37860   (0x2a8)
>  trap_save        = 0x0              (0x2b8)
>  soft_enabled     = 0x0              (0x2ba)
>  irq_happened     = 0x1              (0x2bb)
>  io_sync          = 0x0              (0x2bc)
>  irq_work_pending = 0x0              (0x2bd)
>  nap_state_lost   = 0x0              (0x2be)
> 0:mon>
>
> (Based on a similar patch by Michael Ellerman <m...@ellerman.id.au>,
> "[v2] powerpc/xmon: Allow limiting the size of the paca display". This
> patch is an alternative and cannot coexist with the original.)

So this is nice, but ... the diff is twice the size of my version, plus 128
bytes of BSS, so I'm not sure the added benefit is sufficient to justify the
added code complexity. But you can convince me otherwise if you feel
strongly about it.

cheers
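The clipping scheme the patch describes (regenerate the whole dump every time, print only the lines inside the current window) can be sketched as follows. The page size matches the 16-line pages in the patch; the function names are illustrative, not xmon's:

```c
/* Sketch of the sliding-window pager: the full paca dump is produced on
 * every command, and only lines in [window_start, window_start + 16)
 * are emitted. An empty command advances window_start by one page. */
enum { XMON_PAGE_LINES = 16 };

static int lines_shown(int total_lines, int window_start)
{
	int shown = 0;
	int line;

	for (line = 0; line < total_lines; line++)
		if (line >= window_start && line < window_start + XMON_PAGE_LINES)
			shown++;	/* xmon would print the real field here */
	return shown;
}
```

The trade-off Michael raises is visible here: the approach is simple and stateless per page (the window position can live in the replayed command string), at the cost of re-walking the full output for every page.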
[PATCH v4 0/7] dax: I/O path enhancements
The goal of this series is to enhance the DAX I/O path so that all operations
that store data (I/O writes, zeroing blocks, punching holes, etc.) properly
synchronize the stores to media using the PMEM API. This ensures that the
data DAX is writing is durable on media before the operation completes.

Patches 1-4 are a few random cleanups.

Changes from v3 (all in patch 5):
 - moved the linux/uio.h include from x86 pmem.h to linux/pmem.h (Christoph)
 - made some local void* variables where appropriate to cut down on __force
   casts from __pmem (Christoph)
 - made a __iter_needs_pmem_wb() helper and added a TODO to move to
   non-temporal stores (Christoph)

Ross Zwisler (7):
  brd: make rd_size static
  pmem, x86: move x86 PMEM API to new pmem.h header
  pmem: remove layer when calling arch_has_wmb_pmem()
  pmem, x86: clean up conditional pmem includes
  pmem: add copy_from_iter_pmem() and clear_pmem()
  dax: update I/O path to do proper PMEM flushing
  pmem, dax: have direct_access use __pmem annotation

 Documentation/filesystems/Locking |   3 +-
 MAINTAINERS                       |   1 +
 arch/powerpc/sysdev/axonram.c     |   7 +-
 arch/x86/include/asm/cacheflush.h |  71 -
 arch/x86/include/asm/pmem.h       | 158 ++
 drivers/block/brd.c               |   6 +-
 drivers/nvdimm/pmem.c             |   4 +-
 drivers/s390/block/dcssblk.c      |  10 ++-
 fs/block_dev.c                    |   2 +-
 fs/dax.c                          |  68 +---
 include/linux/blkdev.h            |   8 +-
 include/linux/pmem.h              |  79 +++
 12 files changed, 289 insertions(+), 128 deletions(-)
 create mode 100644 arch/x86/include/asm/pmem.h

--
2.1.0
[PATCH v4 7/7] pmem, dax: have direct_access use __pmem annotation
Update the annotation for the kaddr pointer returned by direct_access() so
that it is a __pmem pointer. This is consistent with the PMEM driver and with
how this direct_access() pointer is used in the DAX code.

Signed-off-by: Ross Zwisler <ross.zwis...@linux.intel.com>
Reviewed-by: Christoph Hellwig <h...@lst.de>
---
 Documentation/filesystems/Locking |  3 ++-
 arch/powerpc/sysdev/axonram.c     |  7 ---
 drivers/block/brd.c               |  4 ++--
 drivers/nvdimm/pmem.c             |  4 ++--
 drivers/s390/block/dcssblk.c      | 10 ++
 fs/block_dev.c                    |  2 +-
 fs/dax.c                          | 42 ---
 include/linux/blkdev.h            |  8
 8 files changed, 43 insertions(+), 37 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 6a34a0f..06d4434 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -397,7 +397,8 @@ prototypes:
 	int (*release) (struct gendisk *, fmode_t);
 	int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
 	int (*compat_ioctl) (struct block_device *, fmode_t, unsigned, unsigned long);
-	int (*direct_access) (struct block_device *, sector_t, void **, unsigned long *);
+	int (*direct_access) (struct block_device *, sector_t, void __pmem **,
+				unsigned long *);
 	int (*media_changed) (struct gendisk *);
 	void (*unlock_native_capacity) (struct gendisk *);
 	int (*revalidate_disk) (struct gendisk *);
diff --git a/arch/powerpc/sysdev/axonram.c b/arch/powerpc/sysdev/axonram.c
index ee90db1..a2be2a6 100644
--- a/arch/powerpc/sysdev/axonram.c
+++ b/arch/powerpc/sysdev/axonram.c
@@ -141,13 +141,14 @@ axon_ram_make_request(struct request_queue *queue, struct bio *bio)
  */
 static long
 axon_ram_direct_access(struct block_device *device, sector_t sector,
-		       void **kaddr, unsigned long *pfn, long size)
+		       void __pmem **kaddr, unsigned long *pfn, long size)
 {
 	struct axon_ram_bank *bank = device->bd_disk->private_data;
 	loff_t offset = (loff_t)sector << AXON_RAM_SECTOR_SHIFT;
+	void *addr = (void *)(bank->ph_addr + offset);
 
-	*kaddr = (void *)(bank->ph_addr + offset);
-	*pfn = virt_to_phys(*kaddr) >> PAGE_SHIFT;
+	*kaddr = (void __pmem *)addr;
+	*pfn = virt_to_phys(addr) >> PAGE_SHIFT;
 
 	return bank->size - offset;
 }
diff --git a/drivers/block/brd.c b/drivers/block/brd.c
index 5750b39..2691bb6 100644
--- a/drivers/block/brd.c
+++ b/drivers/block/brd.c
@@ -371,7 +371,7 @@ static int brd_rw_page(struct block_device *bdev, sector_t sector,
 
 #ifdef CONFIG_BLK_DEV_RAM_DAX
 static long brd_direct_access(struct block_device *bdev, sector_t sector,
-			void **kaddr, unsigned long *pfn, long size)
+			void __pmem **kaddr, unsigned long *pfn, long size)
 {
 	struct brd_device *brd = bdev->bd_disk->private_data;
 	struct page *page;
@@ -381,7 +381,7 @@ static long brd_direct_access(struct block_device *bdev, sector_t sector,
 	page = brd_insert_page(brd, sector);
 	if (!page)
 		return -ENOSPC;
-	*kaddr = page_address(page);
+	*kaddr = (void __pmem *)page_address(page);
 	*pfn = page_to_pfn(page);
 
 	/*
diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index ade9eb9..68f6a6a 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -92,7 +92,7 @@ static int pmem_rw_page(struct block_device *bdev, sector_t sector,
 }
 
 static long pmem_direct_access(struct block_device *bdev, sector_t sector,
-		      void **kaddr, unsigned long *pfn, long size)
+		      void __pmem **kaddr, unsigned long *pfn, long size)
 {
 	struct pmem_device *pmem = bdev->bd_disk->private_data;
 	size_t offset = sector << 9;
@@ -101,7 +101,7 @@ static long pmem_direct_access(struct block_device *bdev, sector_t sector,
 		return -ENODEV;
 
 	/* FIXME convert DAX to comprehend that this mapping has a lifetime */
-	*kaddr = (void __force *) pmem->virt_addr + offset;
+	*kaddr = pmem->virt_addr + offset;
 	*pfn = (pmem->phys_addr + offset) >> PAGE_SHIFT;
 
 	return pmem->size - offset;
diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index da21281..2c5a397 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -29,7 +29,7 @@ static int dcssblk_open(struct block_device *bdev, fmode_t mode);
 static void dcssblk_release(struct gendisk *disk, fmode_t mode);
 static void dcssblk_make_request(struct request_queue *q, struct bio *bio);
 static long dcssblk_direct_access(struct block_device *bdev, sector_t secnum,
-				 void **kaddr, unsigned long *pfn, long size);
+				 void __pmem **kaddr, unsigned long *pfn, long
[PATCH 2/2] powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF case
Since commit 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel
doesn't support MSI"), MSI/MSI-X interrupts aren't being disabled at PCI
probe time, as the logic responsible for this was moved in the aforementioned
commit from pci_device_add() to pci_setup_device(). The latter function is
not reached on the PowerPC pSeries platform during Open Firmware PCI probing.

This patch calls pci_msi_setup_pci_dev() explicitly to disable MSI/MSI-X
during PCI probe time on the pSeries platform.

Signed-off-by: Guilherme G. Piccoli <gpicc...@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/pci_of_scan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/kernel/pci_of_scan.c b/arch/powerpc/kernel/pci_of_scan.c
index 42e02a2..0e920f3 100644
--- a/arch/powerpc/kernel/pci_of_scan.c
+++ b/arch/powerpc/kernel/pci_of_scan.c
@@ -191,6 +191,9 @@ struct pci_dev *of_create_pci_dev(struct device_node *node,
 
 	pci_device_add(dev, bus);
 
+	/* Disable MSI/MSI-X here to avoid bogus interrupts */
+	pci_msi_setup_pci_dev(dev);
+
 	return dev;
 }
 EXPORT_SYMBOL(of_create_pci_dev);
--
2.1.0
[PATCH V4 3/6] powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR
In the current implementation, when a VF BAR is bigger than 64MB, we use 4 M64
BARs in Single PE mode to cover the number of VFs required to be enabled. By
doing so, several VFs end up in one VF group, which leads to interference
between VFs in the same group.

Based on Gavin's comments, this patch changes the design to use one M64 BAR in
Single PE mode for each VF BAR. This gives absolute isolation between VFs. As
part of this change, m64_wins is renamed to m64_map, since it now holds the
index of the M64 BAR used to map each VF BAR.

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com
---
 arch/powerpc/include/asm/pci-bridge.h     |    5 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |  178 ++++++++++++-----------------
 2 files changed, 74 insertions(+), 109 deletions(-)

diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h
index 712add5..8aeba4c 100644
--- a/arch/powerpc/include/asm/pci-bridge.h
+++ b/arch/powerpc/include/asm/pci-bridge.h
@@ -214,10 +214,9 @@ struct pci_dn {
 	u16     vfs_expanded;		/* number of VFs IOV BAR expanded */
 	u16     num_vfs;		/* number of VFs enabled*/
 	int     offset;			/* PE# for the first VF PE */
-#define M64_PER_IOV 4
-	int     m64_per_iov;
+	bool    m64_single_mode;	/* Use M64 BAR in Single Mode */
 #define IODA_INVALID_M64        (-1)
-	int     m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV];
+	int     (*m64_map)[PCI_SRIOV_NUM_BARS];
 #endif /* CONFIG_PCI_IOV */
 #endif
 	struct list_head child_list;
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index e3e0acb..de7db1d 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -1148,29 +1148,36 @@ static void pnv_pci_ioda_setup_PEs(void)
 }
 
 #ifdef CONFIG_PCI_IOV
-static int pnv_pci_vf_release_m64(struct pci_dev *pdev)
+static int pnv_pci_vf_release_m64(struct pci_dev *pdev, u16 num_vfs)
 {
 	struct pci_bus        *bus;
 	struct pci_controller *hose;
 	struct pnv_phb        *phb;
 	struct pci_dn         *pdn;
 	int                    i, j;
+	int                    m64_bars;
 
 	bus = pdev->bus;
 	hose = pci_bus_to_host(bus);
 	phb = hose->private_data;
 	pdn = pci_get_pdn(pdev);
 
+	if (pdn->m64_single_mode)
+		m64_bars = num_vfs;
+	else
+		m64_bars = 1;
+
 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
-		for (j = 0; j < M64_PER_IOV; j++) {
-			if (pdn->m64_wins[i][j] == IODA_INVALID_M64)
+		for (j = 0; j < m64_bars; j++) {
+			if (pdn->m64_map[j][i] == IODA_INVALID_M64)
 				continue;
 			opal_pci_phb_mmio_enable(phb->opal_id,
-				OPAL_M64_WINDOW_TYPE, pdn->m64_wins[i][j], 0);
-			clear_bit(pdn->m64_wins[i][j], &phb->ioda.m64_bar_alloc);
-			pdn->m64_wins[i][j] = IODA_INVALID_M64;
+				OPAL_M64_WINDOW_TYPE, pdn->m64_map[j][i], 0);
			clear_bit(pdn->m64_map[j][i], &phb->ioda.m64_bar_alloc);
+			pdn->m64_map[j][i] = IODA_INVALID_M64;
 		}
 
+	kfree(pdn->m64_map);
 	return 0;
 }
 
@@ -1187,8 +1194,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 	int                    total_vfs;
 	resource_size_t        size, start;
 	int                    pe_num;
-	int                    vf_groups;
-	int                    vf_per_group;
+	int                    m64_bars;
 
 	bus = pdev->bus;
 	hose = pci_bus_to_host(bus);
@@ -1196,26 +1202,26 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 	pdn = pci_get_pdn(pdev);
 	total_vfs = pci_sriov_get_totalvfs(pdev);
 
-	/* Initialize the m64_wins to IODA_INVALID_M64 */
-	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++)
-		for (j = 0; j < M64_PER_IOV; j++)
-			pdn->m64_wins[i][j] = IODA_INVALID_M64;
+	if (pdn->m64_single_mode)
+		m64_bars = num_vfs;
+	else
+		m64_bars = 1;
+
+	pdn->m64_map = kmalloc(sizeof(*pdn->m64_map) * m64_bars, GFP_KERNEL);
+	if (!pdn->m64_map)
+		return -ENOMEM;
+	/* Initialize the m64_map to IODA_INVALID_M64 */
+	for (i = 0; i < m64_bars ; i++)
+		for (j = 0; j < PCI_SRIOV_NUM_BARS; j++)
+			pdn->m64_map[i][j] = IODA_INVALID_M64;
 
-	if (pdn->m64_per_iov == M64_PER_IOV) {
-		vf_groups = (num_vfs <= M64_PER_IOV) ? num_vfs : M64_PER_IOV;
-		vf_per_group = (num_vfs <= M64_PER_IOV) ? 1 :
-			roundup_pow_of_two(num_vfs) / pdn->m64_per_iov;
-	} else {
-		vf_groups = 1;
-		vf_per_group = 1;
-	}
[PATCH V4 1/6] powerpc/powernv: don't enable SRIOV when VF BAR has non 64bit-prefetchable BAR
On PHB_IODA2, we enable SRIOV devices by mapping the IOV BAR with M64 BARs. If
a SRIOV device's IOV BAR is not 64bit-prefetchable, it is not assigned from the
64bit-prefetchable window, which means an M64 BAR can't map it. This patch
makes that restriction explicit and refuses to enable SRIOV in that case.

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com
Reviewed-by: Gavin Shan gws...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/powernv/pci-ioda.c |   25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 85cbc96..8c031b5 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -908,9 +908,6 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
 		if (!res->flags || !res->parent)
 			continue;
 
-		if (!pnv_pci_is_mem_pref_64(res->flags))
-			continue;
-
 		/*
 		 * The actual IOV BAR range is determined by the start address
 		 * and the actual size for num_vfs VFs BAR. This check is to
@@ -939,9 +936,6 @@ static int pnv_pci_vf_resource_shift(struct pci_dev *dev, int offset)
 		if (!res->flags || !res->parent)
 			continue;
 
-		if (!pnv_pci_is_mem_pref_64(res->flags))
-			continue;
-
 		size = pci_iov_resource_size(dev, i + PCI_IOV_RESOURCES);
 		res2 = *res;
 		res->start += size * offset;
@@ -1221,9 +1215,6 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs)
 		if (!res->flags || !res->parent)
 			continue;
 
-		if (!pnv_pci_is_mem_pref_64(res->flags))
-			continue;
-
 		for (j = 0; j < vf_groups; j++) {
 			do {
 				win = find_next_zero_bit(&phb->ioda.m64_bar_alloc,
@@ -1510,6 +1501,12 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs)
 	pdn = pci_get_pdn(pdev);
 
 	if (phb->type == PNV_PHB_IODA2) {
+		if (!pdn->vfs_expanded) {
+			dev_info(&pdev->dev, "don't support this SRIOV device"
+				" with non 64bit-prefetchable IOV BAR\n");
+			return -ENOSPC;
+		}
+
 		/* Calculate available PE for required VFs */
 		mutex_lock(&phb->ioda.pe_alloc_mutex);
 		pdn->offset = bitmap_find_next_zero_area(
@@ -2775,9 +2772,10 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 		if (!res->flags || res->parent)
 			continue;
 		if (!pnv_pci_is_mem_pref_64(res->flags)) {
-			dev_warn(&pdev->dev, "non M64 VF BAR%d: %pR\n",
+			dev_warn(&pdev->dev, "Don't support SR-IOV with"
+					" non M64 VF BAR%d: %pR. \n",
 				 i, res);
-			continue;
+			return;
 		}
 
 		size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
@@ -2796,11 +2794,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 		res = &pdev->resource[i + PCI_IOV_RESOURCES];
 		if (!res->flags || res->parent)
 			continue;
-		if (!pnv_pci_is_mem_pref_64(res->flags)) {
-			dev_warn(&pdev->dev, "Skipping expanding VF BAR%d: %pR\n",
-				 i, res);
-			continue;
-		}
 
 		dev_dbg(&pdev->dev, " Fixing VF BAR%d: %pR to\n", i, res);
 		size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
-- 
1.7.9.5
Re: [PATCH V4 3/6] powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR
On Wed, Aug 19, 2015 at 10:01:41AM +0800, Wei Yang wrote: In current implementation, when VF BAR is bigger than 64MB, it uses 4 M64 BARs in Single PE mode to cover the number of VFs required to be enabled. By doing so, several VFs would be in one VF Group and leads to interference between VFs in the same group. And in this patch, m64_wins is renamed to m64_map, which means index number of the M64 BAR used to map the VF BAR. Based on Gavin's comments. This patch changes the design by using one M64 BAR in Single PE mode for one VF BAR. This gives absolute isolation for VFs. Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com Reviewed-by: Gavin Shan gws...@linux.vnet.ibm.com --- arch/powerpc/include/asm/pci-bridge.h |5 +- arch/powerpc/platforms/powernv/pci-ioda.c | 178 - 2 files changed, 74 insertions(+), 109 deletions(-) diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h index 712add5..8aeba4c 100644 --- a/arch/powerpc/include/asm/pci-bridge.h +++ b/arch/powerpc/include/asm/pci-bridge.h @@ -214,10 +214,9 @@ struct pci_dn { u16 vfs_expanded; /* number of VFs IOV BAR expanded */ u16 num_vfs;/* number of VFs enabled*/ int offset; /* PE# for the first VF PE */ -#define M64_PER_IOV 4 - int m64_per_iov; + boolm64_single_mode;/* Use M64 BAR in Single Mode */ #define IODA_INVALID_M64(-1) - int m64_wins[PCI_SRIOV_NUM_BARS][M64_PER_IOV]; + int (*m64_map)[PCI_SRIOV_NUM_BARS]; #endif /* CONFIG_PCI_IOV */ #endif struct list_head child_list; diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c index e3e0acb..de7db1d 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -1148,29 +1148,36 @@ static void pnv_pci_ioda_setup_PEs(void) } #ifdef CONFIG_PCI_IOV -static int pnv_pci_vf_release_m64(struct pci_dev *pdev) +static int pnv_pci_vf_release_m64(struct pci_dev *pdev, u16 num_vfs) { struct pci_bus*bus; struct pci_controller *hose; struct 
pnv_phb*phb; struct pci_dn *pdn; inti, j; + intm64_bars; bus = pdev-bus; hose = pci_bus_to_host(bus); phb = hose-private_data; pdn = pci_get_pdn(pdev); + if (pdn-m64_single_mode) + m64_bars = num_vfs; + else + m64_bars = 1; + for (i = 0; i PCI_SRIOV_NUM_BARS; i++) - for (j = 0; j M64_PER_IOV; j++) { - if (pdn-m64_wins[i][j] == IODA_INVALID_M64) + for (j = 0; j m64_bars; j++) { + if (pdn-m64_map[j][i] == IODA_INVALID_M64) continue; opal_pci_phb_mmio_enable(phb-opal_id, - OPAL_M64_WINDOW_TYPE, pdn-m64_wins[i][j], 0); - clear_bit(pdn-m64_wins[i][j], phb-ioda.m64_bar_alloc); - pdn-m64_wins[i][j] = IODA_INVALID_M64; + OPAL_M64_WINDOW_TYPE, pdn-m64_map[j][i], 0); + clear_bit(pdn-m64_map[j][i], phb-ioda.m64_bar_alloc); + pdn-m64_map[j][i] = IODA_INVALID_M64; } + kfree(pdn-m64_map); return 0; } @@ -1187,8 +1194,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs) inttotal_vfs; resource_size_tsize, start; intpe_num; - intvf_groups; - intvf_per_group; + intm64_bars; bus = pdev-bus; hose = pci_bus_to_host(bus); @@ -1196,26 +1202,26 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs) pdn = pci_get_pdn(pdev); total_vfs = pci_sriov_get_totalvfs(pdev); - /* Initialize the m64_wins to IODA_INVALID_M64 */ - for (i = 0; i PCI_SRIOV_NUM_BARS; i++) - for (j = 0; j M64_PER_IOV; j++) - pdn-m64_wins[i][j] = IODA_INVALID_M64; + if (pdn-m64_single_mode) + m64_bars = num_vfs; + else + m64_bars = 1; + + pdn-m64_map = kmalloc(sizeof(*pdn-m64_map) * m64_bars, GFP_KERNEL); + if (!pdn-m64_map) + return -ENOMEM; + /* Initialize the m64_map to IODA_INVALID_M64 */ + for (i = 0; i m64_bars ; i++) + for (j = 0; j PCI_SRIOV_NUM_BARS; j++) + pdn-m64_map[i][j] = IODA_INVALID_M64; - if (pdn-m64_per_iov == M64_PER_IOV) { - vf_groups = (num_vfs = M64_PER_IOV) ? num_vfs: M64_PER_IOV; - vf_per_group = (num_vfs = M64_PER_IOV)? 1: - roundup_pow_of_two(num_vfs) / pdn-m64_per_iov; - } else { - vf_groups = 1; -
Re: [PATCH 2/2] cxl: add set/get private data to context struct
On Wed, 2015-08-19 at 14:19 +1000, Ian Munsie wrote:
> From: Michael Neuling mi...@neuling.org
>
> This provides AFU drivers a means to associate private data with a cxl
> context. This is particularly intended to make the new callbacks for
> driver specific events easier for AFU drivers to use, as they can
> easily get back to any private data structures they may use.
>
> Signed-off-by: Michael Neuling mi...@neuling.org
> Signed-off-by: Ian Munsie imun...@au1.ibm.com
> ---
>  drivers/misc/cxl/api.c | 21 +++++++++++++++++++++
>  drivers/misc/cxl/cxl.h |  3 +++
>  include/misc/cxl.h     |  7 +++++++
>  3 files changed, 31 insertions(+)
>
> diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
> index e0f0c78..5f0b22e 100644
> --- a/drivers/misc/cxl/api.c
> +++ b/drivers/misc/cxl/api.c
> @@ -70,6 +70,27 @@ int cxl_release_context(struct cxl_context *ctx)
>  }
>  EXPORT_SYMBOL_GPL(cxl_release_context);
>
> +
> +int cxl_set_priv(struct cxl_context *ctx, void *priv)
> +{
> +	if (!ctx)
> +		return -EINVAL;
> +
> +	ctx->priv = priv;
> +
> +	return 0;
> +}
> +EXPORT_SYMBOL_GPL(cxl_set_priv);
> +
> +void *cxl_get_priv(struct cxl_context *ctx)
> +{
> +	if (!ctx)
> +		return ERR_PTR(-EINVAL);
> +
> +	return ctx->priv;
> +}
> +EXPORT_SYMBOL_GPL(cxl_get_priv);

Do we really need the accessors? They don't buy anything I can see over just
using ctx->priv directly.

cheers

_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V4 0/6] Redesign SR-IOV on PowerNV
The original design groups VFs in order to enable more VFs in the system when
a VF BAR is bigger than 64MB. This design has a flaw: an error on one VF
interferes with the other VFs in the same group.

This patch series changes the design to use an M64 BAR in Single PE mode to
cover only one VF BAR. By doing so, it gives absolute isolation between VFs.

v4:
   * rebase the code on top of v4.2-rc7
   * switch back to use the dynamic version of pe_num_map and m64_map
   * split the memory allocation and PE assignment of pe_num_map to make it
     easier to read
   * check the pe_num_map value before freeing the PE
   * add the rename reason for pe_num_map and m64_map in the change log

v3:
   * return -ENOSPC when a VF has a non-64bit prefetchable BAR
   * rename offset to pe_num_map and define it statically
   * change commit logs based on comments
   * define m64_map statically

v2:
   * clean up iov bar alignment calculation
   * change m64s to m64_bars
   * add a field to represent whether M64 Single PE mode will be used
   * change m64_wins to m64_map
   * calculate the gate instead of hard coding it
   * dynamically allocate m64_map
   * dynamically allocate PE#
   * add a case to calculate iov bar alignment when M64 Single PE is used
   * when M64 Single PE is used, compare num_vfs with the number of M64 BARs
     available in the system first

Wei Yang (6):
  powerpc/powernv: don't enable SRIOV when VF BAR has non
    64bit-prefetchable BAR
  powerpc/powernv: simplify the calculation of iov resource alignment
  powerpc/powernv: use one M64 BAR in Single PE mode for one VF BAR
  powerpc/powernv: replace the hard coded boundary with gate
  powerpc/powernv: boundary the total VF BAR size instead of the
    individual one
  powerpc/powernv: allocate sparse PE# when using M64 BAR in Single PE
    mode

 arch/powerpc/include/asm/pci-bridge.h     |    7 +-
 arch/powerpc/platforms/powernv/pci-ioda.c |  328 ++++++++++++++++-------------
 2 files changed, 175 insertions(+), 160 deletions(-)

-- 
1.7.9.5
[PATCH V4 5/6] powerpc/powernv: boundary the total VF BAR size instead of the individual one
Each VF can have at most 6 BARs. When the total size of those BARs exceeds
the gate, expanding any individual BAR would still exhaust the M64 window
once all of them are expanded. This patch therefore applies the limit to the
total VF BAR size instead of to each individual BAR.

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com
Reviewed-by: Gavin Shan gws...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/powernv/pci-ioda.c |   14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index b8bc51f..4bc83b8 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2701,7 +2701,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 	const resource_size_t gate = phb->ioda.m64_segsize >> 2;
 	struct resource *res;
 	int i;
-	resource_size_t size;
+	resource_size_t size, total_vf_bar_sz;
 	struct pci_dn *pdn;
 	int mul, total_vfs;
 
@@ -2714,6 +2714,7 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 
 	total_vfs = pci_sriov_get_totalvfs(pdev);
 	mul = phb->ioda.total_pe;
+	total_vf_bar_sz = 0;
 
 	for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) {
 		res = &pdev->resource[i + PCI_IOV_RESOURCES];
@@ -2726,7 +2727,8 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 			return;
 		}
 
-		size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
+		total_vf_bar_sz += pci_iov_resource_size(pdev,
+				i + PCI_IOV_RESOURCES);
 
 		/*
 		 * If bigger than quarter of M64 segment size, just round up
@@ -2740,11 +2742,11 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 		 * limit the system flexibility. This is a design decision to
 		 * set the boundary to quarter of the M64 segment size.
 		 */
-		if (size > gate) {
-			dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size "
-				"is bigger than %lld, roundup power2\n",
-				 i, res, gate);
+		if (total_vf_bar_sz > gate) {
 			mul = roundup_pow_of_two(total_vfs);
+			dev_info(&pdev->dev,
+				"VF BAR Total IOV size %llx > %llx, roundup to %d VFs\n",
+				total_vf_bar_sz, gate, mul);
 			pdn->m64_single_mode = true;
 			break;
 		}
-- 
1.7.9.5
[PATCH 2/2] cxl: add set/get private data to context struct
From: Michael Neuling mi...@neuling.org

This provides AFU drivers a means to associate private data with a cxl
context. This is particularly intended to make the new callbacks for driver
specific events easier for AFU drivers to use, as they can easily get back to
any private data structures they may use.

Signed-off-by: Michael Neuling mi...@neuling.org
Signed-off-by: Ian Munsie imun...@au1.ibm.com
---
 drivers/misc/cxl/api.c | 21 +++++++++++++++++++++
 drivers/misc/cxl/cxl.h |  3 +++
 include/misc/cxl.h     |  7 +++++++
 3 files changed, 31 insertions(+)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index e0f0c78..5f0b22e 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -70,6 +70,27 @@ int cxl_release_context(struct cxl_context *ctx)
 }
 EXPORT_SYMBOL_GPL(cxl_release_context);
 
+
+int cxl_set_priv(struct cxl_context *ctx, void *priv)
+{
+	if (!ctx)
+		return -EINVAL;
+
+	ctx->priv = priv;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(cxl_set_priv);
+
+void *cxl_get_priv(struct cxl_context *ctx)
+{
+	if (!ctx)
+		return ERR_PTR(-EINVAL);
+
+	return ctx->priv;
+}
+EXPORT_SYMBOL_GPL(cxl_get_priv);
+
 int cxl_allocate_afu_irqs(struct cxl_context *ctx, int num)
 {
 	if (num == 0)
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 30e44a8..93db76a 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -431,6 +431,9 @@ struct cxl_context {
 	/* Only used in PR mode */
 	u64 process_token;
 
+	/* driver private data */
+	void *priv;
+
 	unsigned long *irq_bitmap; /* Accessed from IRQ context */
 	struct cxl_irq_ranges irqs;
 	struct list_head irq_names;
diff --git a/include/misc/cxl.h b/include/misc/cxl.h
index 73e03a6..3f5edbe 100644
--- a/include/misc/cxl.h
+++ b/include/misc/cxl.h
@@ -89,6 +89,13 @@ struct cxl_context *cxl_dev_context_init(struct pci_dev *dev);
 int cxl_release_context(struct cxl_context *ctx);
 
 /*
+ * Set and get private data associated with a context. Allows drivers to have a
+ * back pointer to some useful structure.
+ */
+int cxl_set_priv(struct cxl_context *ctx, void *priv);
+void *cxl_get_priv(struct cxl_context *ctx);
+
+/*
  * Allocate AFU interrupts for this context. num=0 will allocate the default
  * for this AFU as given in the AFU descriptor. This number doesn't include the
  * interrupt 0 (CAIA defines AFU IRQ 0 for page faults). Each interrupt to be
-- 
2.1.4
[PATCH V4 4/6] powerpc/powernv: replace the hard coded boundary with gate
At the moment the 64bit-prefetchable window can be at most 64GB, a value read
from the device tree. This means that in shared mode the maximum supported VF
BAR size is 64GB/256 = 256MB, and a VF BAR of that size would exhaust the whole
64bit-prefetchable window once expanded. The boundary is therefore set, as a
design decision, to a VF BAR size of 64MB: a 64MB VF BAR expanded 256 times
occupies a quarter of the 64bit-prefetchable window, which is affordable.

This patch replaces the magic 64MB limit with "gate", defined as a quarter of
the M64 segment size (m64_segsize >> 2), and adds a comment to explain the
reasoning.

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com
Reviewed-by: Gavin Shan gws...@linux.vent.ibm.com
---
 arch/powerpc/platforms/powernv/pci-ioda.c |   28 +++++++++++++++++-----------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index de7db1d..b8bc51f 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2696,8 +2696,9 @@ static void pnv_pci_init_ioda_msis(struct pnv_phb *phb) { }
 #ifdef CONFIG_PCI_IOV
 static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 {
-	struct pci_controller *hose;
-	struct pnv_phb *phb;
+	struct pci_controller *hose = pci_bus_to_host(pdev->bus);
+	struct pnv_phb *phb = hose->private_data;
+	const resource_size_t gate = phb->ioda.m64_segsize >> 2;
 	struct resource *res;
 	int i;
 	resource_size_t size;
@@ -2707,9 +2708,6 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 	if (!pdev->is_physfn || pdev->is_added)
 		return;
 
-	hose = pci_bus_to_host(pdev->bus);
-	phb = hose->private_data;
-
 	pdn = pci_get_pdn(pdev);
 	pdn->vfs_expanded = 0;
 	pdn->m64_single_mode = false;
@@ -2730,10 +2728,22 @@ static void pnv_pci_ioda_fixup_iov_resources(struct pci_dev *pdev)
 
 		size = pci_iov_resource_size(pdev, i + PCI_IOV_RESOURCES);
 
-		/* bigger than 64M */
-		if (size > (1 << 26)) {
-			dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size is bigger than 64M, roundup power2\n",
-				 i, res);
+		/*
+		 * If bigger than quarter of M64 segment size, just round up
+		 * power of two.
+		 *
+		 * Generally, one M64 BAR maps one IOV BAR. To avoid conflict
+		 * with other devices, IOV BAR size is expanded to be
+		 * (total_pe * VF_BAR_size). When VF_BAR_size is half of M64
+		 * segment size , the expanded size would equal to half of the
+		 * whole M64 space size, which will exhaust the M64 space and
+		 * limit the system flexibility. This is a design decision to
+		 * set the boundary to quarter of the M64 segment size.
+		 */
+		if (size > gate) {
+			dev_info(&pdev->dev, "PowerNV: VF BAR%d: %pR IOV size "
+				"is bigger than %lld, roundup power2\n",
+				 i, res, gate);
 			mul = roundup_pow_of_two(total_vfs);
 			pdn->m64_single_mode = true;
 			break;
-- 
1.7.9.5
[PATCH V4 2/6] powerpc/powernv: simplify the calculation of iov resource alignment
The alignment of an IOV BAR on the PowerNV platform is the total size of the
IOV BAR. Whether the IOV BAR is expanded by roundup_pow_of_two(total_vfs) or by
the max PE number (256), the total size can be calculated as
(vfs_expanded * VF_BAR_size).

This patch simplifies pnv_pci_iov_resource_alignment() by removing the first
case.

Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com
Reviewed-by: Gavin Shan gws...@linux.vnet.ibm.com
---
 arch/powerpc/platforms/powernv/pci-ioda.c |   14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 8c031b5..e3e0acb 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -2988,12 +2988,16 @@ static resource_size_t pnv_pci_iov_resource_alignment(struct pci_dev *pdev,
 						      int resno)
 {
 	struct pci_dn *pdn = pci_get_pdn(pdev);
-	resource_size_t align, iov_align;
-
-	iov_align = resource_size(&pdev->resource[resno]);
-	if (iov_align)
-		return iov_align;
+	resource_size_t align;
 
+	/*
+	 * On PowerNV platform, IOV BAR is mapped by M64 BAR to enable the
+	 * SR-IOV. While from hardware perspective, the range mapped by M64
+	 * BAR should be size aligned.
+	 *
+	 * This function returns the total IOV BAR size if expanded or just the
+	 * individual size if not.
+	 */
 	align = pci_iov_resource_size(pdev, resno);
 	if (pdn->vfs_expanded)
 		return pdn->vfs_expanded * align;
-- 
1.7.9.5
Re: [PATCH V4 6/6] powerpc/powernv: allocate sparse PE# when using M64 BAR in Single PE mode
On Wed, Aug 19, 2015 at 10:01:44AM +0800, Wei Yang wrote: When M64 BAR is set to Single PE mode, the PE# assigned to VF could be sparse. This patch restructures the patch to allocate sparse PE# for VFs when M64 BAR is set to Single PE mode. Also it rename the offset to pe_num_map to reflect the content is the PE number. Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com Reviewed-by: Gavin Shan gws...@linux.vnet.ibm.com --- arch/powerpc/include/asm/pci-bridge.h |2 +- arch/powerpc/platforms/powernv/pci-ioda.c | 79 ++--- 2 files changed, 61 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h index 8aeba4c..b3a226b 100644 --- a/arch/powerpc/include/asm/pci-bridge.h +++ b/arch/powerpc/include/asm/pci-bridge.h @@ -213,7 +213,7 @@ struct pci_dn { #ifdef CONFIG_PCI_IOV u16 vfs_expanded; /* number of VFs IOV BAR expanded */ u16 num_vfs;/* number of VFs enabled*/ - int offset; /* PE# for the first VF PE */ + int *pe_num_map;/* PE# for the first VF PE or array */ boolm64_single_mode;/* Use M64 BAR in Single Mode */ #define IODA_INVALID_M64(-1) int (*m64_map)[PCI_SRIOV_NUM_BARS]; diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c index 4bc83b8..779f52a 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -1243,7 +1243,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs) /* Map the M64 here */ if (pdn-m64_single_mode) { - pe_num = pdn-offset + j; + pe_num = pdn-pe_num_map[j]; rc = opal_pci_map_pe_mmio_window(phb-opal_id, pe_num, OPAL_M64_WINDOW_TYPE, pdn-m64_map[j][i], 0); @@ -1347,7 +1347,7 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev) struct pnv_phb*phb; struct pci_dn *pdn; struct pci_sriov *iov; - u16 num_vfs; + u16num_vfs, i; bus = pdev-bus; hose = pci_bus_to_host(bus); @@ -1361,14 +1361,21 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev) if (phb-type == PNV_PHB_IODA2) { if 
(!pdn-m64_single_mode) - pnv_pci_vf_resource_shift(pdev, -pdn-offset); + pnv_pci_vf_resource_shift(pdev, -*pdn-pe_num_map); /* Release M64 windows */ pnv_pci_vf_release_m64(pdev, num_vfs); /* Release PE numbers */ - bitmap_clear(phb-ioda.pe_alloc, pdn-offset, num_vfs); - pdn-offset = 0; + if (pdn-m64_single_mode) { + for (i = 0; i num_vfs; i++) { + if (pdn-pe_num_map[i] != IODA_INVALID_PE) + pnv_ioda_free_pe(phb, pdn-pe_num_map[i]); + } + } else + bitmap_clear(phb-ioda.pe_alloc, *pdn-pe_num_map, num_vfs); + /* Releasing pe_num_map */ + kfree(pdn-pe_num_map); } } @@ -1394,7 +1401,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs) /* Reserve PE for each VF */ for (vf_index = 0; vf_index num_vfs; vf_index++) { - pe_num = pdn-offset + vf_index; + if (pdn-m64_single_mode) + pe_num = pdn-pe_num_map[vf_index]; + else + pe_num = *pdn-pe_num_map + vf_index; pe = phb-ioda.pe_array[pe_num]; pe-pe_number = pe_num; @@ -1436,6 +1446,7 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs) struct pnv_phb*phb; struct pci_dn *pdn; intret; + u16i; bus = pdev-bus; hose = pci_bus_to_host(bus); @@ -1458,20 +1469,42 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs) return -EBUSY; } + /* Allocating pe_num_map */ + if (pdn-m64_single_mode) + pdn-pe_num_map = kmalloc(sizeof(*pdn-pe_num_map) * num_vfs, + GFP_KERNEL); + else + pdn-pe_num_map = kmalloc(sizeof(*pdn-pe_num_map), GFP_KERNEL); + + if (!pdn-pe_num_map) + return -ENOMEM; + /* Calculate available PE for required VFs */ - mutex_lock(phb-ioda.pe_alloc_mutex); - pdn-offset = bitmap_find_next_zero_area( - phb-ioda.pe_alloc, phb-ioda.total_pe, - 0, num_vfs, 0); -
[PATCH 1/2] cxl: Add mechanism for delivering AFU driver specific events
From: Ian Munsie imun...@au1.ibm.com

This adds an afu_driver_ops structure with event_pending and deliver_event
callbacks. An AFU driver can fill these out and associate it with a context to
enable passing custom AFU specific events to userspace.

The cxl driver will call event_pending() during poll, select, read, etc. calls
to check if an AFU driver specific event is pending, and will call
deliver_event() to deliver that event. This way, the cxl driver takes care of
all the usual locking semantics around these calls and handles all the generic
cxl events, so that the AFU driver only needs to worry about its own events.

deliver_event() is passed a struct cxl_event buffer to fill in. The header will
already be filled in for an AFU driver event, and the AFU driver is expected to
expand header.size as necessary (up to max_size, defined by struct
cxl_event_afu_driver_reserved) and fill in its own information.

Conflicts between AFU specific events are not expected, since each AFU specific
driver has its own mechanism to deliver an AFU file descriptor to userspace.

Signed-off-by: Ian Munsie imun...@au1.ibm.com
---
 drivers/misc/cxl/Kconfig |  5 +++++
 drivers/misc/cxl/api.c   |  7 +++++++
 drivers/misc/cxl/cxl.h   |  6 +++++-
 drivers/misc/cxl/file.c  | 37 +++++++++++++++++++++++++------------
 include/misc/cxl.h       | 29 +++++++++++++++++++++++++++++
 include/uapi/misc/cxl.h  | 13 +++++++++++++
 6 files changed, 86 insertions(+), 11 deletions(-)

diff --git a/drivers/misc/cxl/Kconfig b/drivers/misc/cxl/Kconfig
index 8756d06..560412c 100644
--- a/drivers/misc/cxl/Kconfig
+++ b/drivers/misc/cxl/Kconfig
@@ -15,12 +15,17 @@ config CXL_EEH
 	bool
 	default n
 
+config CXL_AFU_DRIVER_OPS
+	bool
+	default n
+
 config CXL
 	tristate "Support for IBM Coherent Accelerators (CXL)"
 	depends on PPC_POWERNV && PCI_MSI && EEH
 	select CXL_BASE
 	select CXL_KERNEL_API
 	select CXL_EEH
+	select CXL_AFU_DRIVER_OPS
 	default m
 	help
 	  Select this option to enable driver support for IBM Coherent
diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 6a768a9..e0f0c78 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -267,6 +267,13 @@ struct cxl_context *cxl_fops_get_context(struct file *file)
 }
 EXPORT_SYMBOL_GPL(cxl_fops_get_context);
 
+void cxl_set_driver_ops(struct cxl_context *ctx,
+			struct cxl_afu_driver_ops *ops)
+{
+	ctx->afu_driver_ops = ops;
+}
+EXPORT_SYMBOL_GPL(cxl_set_driver_ops);
+
 int cxl_start_work(struct cxl_context *ctx,
 		   struct cxl_ioctl_start_work *work)
 {
diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
index 6f53866..30e44a8 100644
--- a/drivers/misc/cxl/cxl.h
+++ b/drivers/misc/cxl/cxl.h
@@ -24,6 +24,7 @@
 #include <asm/reg.h>
 
 #include <misc/cxl-base.h>
+#include <misc/cxl.h>
 #include <uapi/misc/cxl.h>
 
 extern uint cxl_verbose;
@@ -34,7 +35,7 @@ extern uint cxl_verbose;
  * Bump version each time a user API change is made, whether it is
  * backwards compatible or not.
  */
-#define CXL_API_VERSION 1
+#define CXL_API_VERSION 2
 #define CXL_API_VERSION_COMPATIBLE 1
 
 /*
@@ -462,6 +463,9 @@ struct cxl_context {
 	bool pending_fault;
 	bool pending_afu_err;
 
+	/* Used by AFU drivers for driver specific event delivery */
+	struct cxl_afu_driver_ops *afu_driver_ops;
+
 	struct rcu_head rcu;
 };
 
diff --git a/drivers/misc/cxl/file.c b/drivers/misc/cxl/file.c
index 57bdb47..2ebaca3 100644
--- a/drivers/misc/cxl/file.c
+++ b/drivers/misc/cxl/file.c
@@ -279,6 +279,22 @@ int afu_mmap(struct file *file, struct vm_area_struct *vm)
 	return cxl_context_iomap(ctx, vm);
 }
 
+static inline int _ctx_event_pending(struct cxl_context *ctx)
+{
+	bool afu_driver_event_pending = false;
+
+	if (ctx->afu_driver_ops && ctx->afu_driver_ops->event_pending)
+		afu_driver_event_pending = ctx->afu_driver_ops->event_pending(ctx);
+
+	return (ctx->pending_irq || ctx->pending_fault ||
+		ctx->pending_afu_err || afu_driver_event_pending);
+}
+
+static inline int ctx_event_pending(struct cxl_context *ctx)
+{
+	return _ctx_event_pending(ctx) || (ctx->status == CLOSED);
+}
+
 unsigned int afu_poll(struct file *file, struct poll_table_struct *poll)
 {
 	struct cxl_context *ctx = file->private_data;
@@ -291,8 +307,7 @@ unsigned int afu_poll(struct file *file, struct poll_table_struct *poll)
 	pr_devel("afu_poll wait done pe: %i\n", ctx->pe);
 
 	spin_lock_irqsave(&ctx->lock, flags);
-	if (ctx->pending_irq || ctx->pending_fault ||
-	    ctx->pending_afu_err)
+	if (_ctx_event_pending(ctx))
 		mask |= POLLIN | POLLRDNORM;
 	else if (ctx->status == CLOSED)
 		/* Only error on closed when there are no futher events pending
@@ -305,12 +320,6 @@ unsigned int afu_poll(struct file *file, struct poll_table_struct *poll)
[PATCH V4 6/6] powerpc/powernv: allocate sparse PE# when using M64 BAR in Single PE mode
When M64 BAR is set to Single PE mode, the PE# assigned to VF could be sparse. This patch restructures the patch to allocate sparse PE# for VFs when M64 BAR is set to Single PE mode. Also it rename the offset to pe_num_map to reflect the content is the PE number. Signed-off-by: Wei Yang weiy...@linux.vnet.ibm.com --- arch/powerpc/include/asm/pci-bridge.h |2 +- arch/powerpc/platforms/powernv/pci-ioda.c | 79 ++--- 2 files changed, 61 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/include/asm/pci-bridge.h b/arch/powerpc/include/asm/pci-bridge.h index 8aeba4c..b3a226b 100644 --- a/arch/powerpc/include/asm/pci-bridge.h +++ b/arch/powerpc/include/asm/pci-bridge.h @@ -213,7 +213,7 @@ struct pci_dn { #ifdef CONFIG_PCI_IOV u16 vfs_expanded; /* number of VFs IOV BAR expanded */ u16 num_vfs;/* number of VFs enabled*/ - int offset; /* PE# for the first VF PE */ + int *pe_num_map;/* PE# for the first VF PE or array */ boolm64_single_mode;/* Use M64 BAR in Single Mode */ #define IODA_INVALID_M64(-1) int (*m64_map)[PCI_SRIOV_NUM_BARS]; diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c index 4bc83b8..779f52a 100644 --- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -1243,7 +1243,7 @@ static int pnv_pci_vf_assign_m64(struct pci_dev *pdev, u16 num_vfs) /* Map the M64 here */ if (pdn-m64_single_mode) { - pe_num = pdn-offset + j; + pe_num = pdn-pe_num_map[j]; rc = opal_pci_map_pe_mmio_window(phb-opal_id, pe_num, OPAL_M64_WINDOW_TYPE, pdn-m64_map[j][i], 0); @@ -1347,7 +1347,7 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev) struct pnv_phb*phb; struct pci_dn *pdn; struct pci_sriov *iov; - u16 num_vfs; + u16num_vfs, i; bus = pdev-bus; hose = pci_bus_to_host(bus); @@ -1361,14 +1361,21 @@ void pnv_pci_sriov_disable(struct pci_dev *pdev) if (phb-type == PNV_PHB_IODA2) { if (!pdn-m64_single_mode) - pnv_pci_vf_resource_shift(pdev, -pdn-offset); + pnv_pci_vf_resource_shift(pdev, 
-*pdn-pe_num_map); /* Release M64 windows */ pnv_pci_vf_release_m64(pdev, num_vfs); /* Release PE numbers */ - bitmap_clear(phb-ioda.pe_alloc, pdn-offset, num_vfs); - pdn-offset = 0; + if (pdn-m64_single_mode) { + for (i = 0; i num_vfs; i++) { + if (pdn-pe_num_map[i] != IODA_INVALID_PE) + pnv_ioda_free_pe(phb, pdn-pe_num_map[i]); + } + } else + bitmap_clear(phb-ioda.pe_alloc, *pdn-pe_num_map, num_vfs); + /* Releasing pe_num_map */ + kfree(pdn-pe_num_map); } } @@ -1394,7 +1401,10 @@ static void pnv_ioda_setup_vf_PE(struct pci_dev *pdev, u16 num_vfs) /* Reserve PE for each VF */ for (vf_index = 0; vf_index num_vfs; vf_index++) { - pe_num = pdn-offset + vf_index; + if (pdn-m64_single_mode) + pe_num = pdn-pe_num_map[vf_index]; + else + pe_num = *pdn-pe_num_map + vf_index; pe = phb-ioda.pe_array[pe_num]; pe-pe_number = pe_num; @@ -1436,6 +1446,7 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs) struct pnv_phb*phb; struct pci_dn *pdn; intret; + u16i; bus = pdev-bus; hose = pci_bus_to_host(bus); @@ -1458,20 +1469,42 @@ int pnv_pci_sriov_enable(struct pci_dev *pdev, u16 num_vfs) return -EBUSY; } + /* Allocating pe_num_map */ + if (pdn-m64_single_mode) + pdn-pe_num_map = kmalloc(sizeof(*pdn-pe_num_map) * num_vfs, + GFP_KERNEL); + else + pdn-pe_num_map = kmalloc(sizeof(*pdn-pe_num_map), GFP_KERNEL); + + if (!pdn-pe_num_map) + return -ENOMEM; + /* Calculate available PE for required VFs */ - mutex_lock(phb-ioda.pe_alloc_mutex); - pdn-offset = bitmap_find_next_zero_area( - phb-ioda.pe_alloc, phb-ioda.total_pe, - 0, num_vfs, 0); - if (pdn-offset =
[PATCH v5 7/7] pmem, dax: have direct_access use __pmem annotation
Update the annotation for the kaddr pointer returned by direct_access() so that it is a __pmem pointer. This is consistent with the PMEM driver and with how this direct_access() pointer is used in the DAX code. Signed-off-by: Ross Zwisler ross.zwis...@linux.intel.com Reviewed-by: Christoph Hellwig h...@lst.de --- Documentation/filesystems/Locking | 3 ++- arch/powerpc/sysdev/axonram.c | 7 --- drivers/block/brd.c | 4 ++-- drivers/nvdimm/pmem.c | 4 ++-- drivers/s390/block/dcssblk.c | 10 ++ fs/block_dev.c| 2 +- fs/dax.c | 37 - include/linux/blkdev.h| 8 8 files changed, 41 insertions(+), 34 deletions(-) diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking index 6a34a0f..06d4434 100644 --- a/Documentation/filesystems/Locking +++ b/Documentation/filesystems/Locking @@ -397,7 +397,8 @@ prototypes: int (*release) (struct gendisk *, fmode_t); int (*ioctl) (struct block_device *, fmode_t, unsigned, unsigned long); int (*compat_ioctl) (struct block_device *, fmode_t, unsigned, unsigned long); - int (*direct_access) (struct block_device *, sector_t, void **, unsigned long *); + int (*direct_access) (struct block_device *, sector_t, void __pmem **, + unsigned long *); int (*media_changed) (struct gendisk *); void (*unlock_native_capacity) (struct gendisk *); int (*revalidate_disk) (struct gendisk *); diff --git a/arch/powerpc/sysdev/axonram.c b/arch/powerpc/sysdev/axonram.c index ee90db1..a2be2a6 100644 --- a/arch/powerpc/sysdev/axonram.c +++ b/arch/powerpc/sysdev/axonram.c @@ -141,13 +141,14 @@ axon_ram_make_request(struct request_queue *queue, struct bio *bio) */ static long axon_ram_direct_access(struct block_device *device, sector_t sector, - void **kaddr, unsigned long *pfn, long size) + void __pmem **kaddr, unsigned long *pfn, long size) { struct axon_ram_bank *bank = device-bd_disk-private_data; loff_t offset = (loff_t)sector AXON_RAM_SECTOR_SHIFT; + void *addr = (void *)(bank-ph_addr + offset); - *kaddr = (void *)(bank-ph_addr + offset); - 
*pfn = virt_to_phys(*kaddr) PAGE_SHIFT; + *kaddr = (void __pmem *)addr; + *pfn = virt_to_phys(addr) PAGE_SHIFT; return bank-size - offset; } diff --git a/drivers/block/brd.c b/drivers/block/brd.c index 5750b39..2691bb6 100644 --- a/drivers/block/brd.c +++ b/drivers/block/brd.c @@ -371,7 +371,7 @@ static int brd_rw_page(struct block_device *bdev, sector_t sector, #ifdef CONFIG_BLK_DEV_RAM_DAX static long brd_direct_access(struct block_device *bdev, sector_t sector, - void **kaddr, unsigned long *pfn, long size) + void __pmem **kaddr, unsigned long *pfn, long size) { struct brd_device *brd = bdev-bd_disk-private_data; struct page *page; @@ -381,7 +381,7 @@ static long brd_direct_access(struct block_device *bdev, sector_t sector, page = brd_insert_page(brd, sector); if (!page) return -ENOSPC; - *kaddr = page_address(page); + *kaddr = (void __pmem *)page_address(page); *pfn = page_to_pfn(page); /* diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c index eb7552d..f3b6297 100644 --- a/drivers/nvdimm/pmem.c +++ b/drivers/nvdimm/pmem.c @@ -92,7 +92,7 @@ static int pmem_rw_page(struct block_device *bdev, sector_t sector, } static long pmem_direct_access(struct block_device *bdev, sector_t sector, - void **kaddr, unsigned long *pfn, long size) + void __pmem **kaddr, unsigned long *pfn, long size) { struct pmem_device *pmem = bdev-bd_disk-private_data; size_t offset = sector 9; @@ -101,7 +101,7 @@ static long pmem_direct_access(struct block_device *bdev, sector_t sector, return -ENODEV; /* FIXME convert DAX to comprehend that this mapping has a lifetime */ - *kaddr = (void __force *) pmem-virt_addr + offset; + *kaddr = pmem-virt_addr + offset; *pfn = (pmem-phys_addr + offset) PAGE_SHIFT; return pmem-size - offset; diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c index da21281..2c5a397 100644 --- a/drivers/s390/block/dcssblk.c +++ b/drivers/s390/block/dcssblk.c @@ -29,7 +29,7 @@ static int dcssblk_open(struct block_device *bdev, fmode_t mode); 
static void dcssblk_release(struct gendisk *disk, fmode_t mode); static void dcssblk_make_request(struct request_queue *q, struct bio *bio); static long dcssblk_direct_access(struct block_device *bdev, sector_t secnum, -void **kaddr, unsigned long *pfn, long size); +void __pmem **kaddr, unsigned long *pfn, long
[PATCH 0/2] Disable MSI/MSI-X interrupts manually at PCI probe time in PowerPC architecture
These 2 patches correct a bogus behaviour introduced by commit 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI"). That commit moved the logic responsible for disabling MSI/MSI-X interrupts at PCI probe time into a new function, named pci_msi_setup_pci_dev(), which is not reachable in the code path of the PowerPC pSeries platform. Since then, devices haven't been able to activate the MSI/MSI-X capability, even after boot.

The first patch makes the function pci_msi_setup_pci_dev() non-static. The second patch inserts a call to the function in powerpc code, so it explicitly disables MSI/MSI-X interrupts at PCI probe time.

Guilherme G. Piccoli (2):
  PCI: Make pci_msi_setup_pci_dev() non-static for use by arch code
  powerpc/PCI: Disable MSI/MSI-X interrupts at PCI probe time in OF case

 arch/powerpc/kernel/pci_of_scan.c | 3 +++
 drivers/pci/probe.c               | 2 +-
 include/linux/pci.h               | 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

--
2.1.0
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
[PATCH V2] cxl: Allow release of contexts which have been OPENED but not STARTED
If we open a context but do not start it (either because we do not attempt to start it, or because it fails to start for some reason), we are left with a context in state OPENED. Previously, cxl_release_context() only allowed releasing contexts in state CLOSED, so attempting to release an OPENED context would fail.

In particular, this bug causes available contexts to run out after some EEH failures, where drivers attempt to release contexts that have failed to start.

Allow releasing contexts in any state with a value lower than STARTED, i.e. OPENED or CLOSED. (We can't release a STARTED context as it's currently using the hardware, and we assume that contexts in any new states which may be added in future with a value higher than STARTED are also unsafe to release.)

Cc: sta...@vger.kernel.org
Fixes: 6f7f0b3df6d4 ("cxl: Add AFU virtual PHB and kernel API")
Signed-off-by: Andrew Donnellan andrew.donnel...@au1.ibm.com
Signed-off-by: Daniel Axtens d...@axtens.net
---
 drivers/misc/cxl/api.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
index 6a768a9..f49e3e5 100644
--- a/drivers/misc/cxl/api.c
+++ b/drivers/misc/cxl/api.c
@@ -59,7 +59,7 @@ EXPORT_SYMBOL_GPL(cxl_get_phys_dev);
 
 int cxl_release_context(struct cxl_context *ctx)
 {
-	if (ctx->status != CLOSED)
+	if (ctx->status >= STARTED)
 		return -EBUSY;
 
 	put_device(&ctx->afu->dev);
--
Andrew Donnellan              Software Engineer, OzLabs
andrew.donnel...@au1.ibm.com  Australia Development Lab, Canberra
+61 2 6201 8874 (work)        IBM Australia Limited
Re: [PATCH 1/2] PCI: Make pci_msi_setup_pci_dev() non-static for use by arch code
Hi Guilherme,

Thanks for the patches.

On Tue, 2015-08-18 at 18:13 -0300, Guilherme G. Piccoli wrote:
> Commit 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel
> doesn't support MSI") changed the location of the code that disables
> MSI/MSI-X interrupts at PCI probe time in devices that have this flag set.
> It moved the code from pci_msi_init_pci_dev() to a new function named
> pci_msi_setup_pci_dev(), called by pci_setup_device().

OK.

> Since then, the pSeries platform of the powerpc architecture needs to
> disable MSI at PCI probe time manually, as the code flow doesn't reach
> pci_setup_device(). For doing so, it wants to call pci_msi_setup_pci_dev().
> This patch makes the required function non-static, so that it will be
> called on PCI probe path on powerpc pSeries platform in next patch.

I didn't follow that entirely, I think you mean something like:

  The pseries PCI probing code does not call pci_setup_device(), so since
  commit 1851617cd2 pci_msi_setup_pci_dev() is not called and MSIs are
  left enabled, which is a bug. To fix this the pseries PCI probe should
  manually call pci_msi_setup_pci_dev(), so make it non-static.

Does that look OK?

Also you haven't CC'ed the original author of the commit, or the PCI maintainer, or the relevant lists. That would be:

  Michael S. Tsirkin m...@redhat.com
  Bjorn Helgaas bhelg...@google.com
  linux-...@vger.kernel.org
  linux-ker...@vger.kernel.org

And finally both patches should have a fixes line, such as:

  Fixes: 1851617cd2da ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI")

cheers
Re: [PATCH] cxl: Allow release of contexts which have been OPENED but not STARTED
On 19/08/15 02:23, Michael Neuling wrote:
>> So this doesn't break when you add a new state, is it worth writing it as:
>>
>>	if (ctx->status >= STARTED)
>>		return -EBUSY;
>>
>> ?
>
> Yeah I think that would be more future proof, although it won't make a
> difference with the current code.

Sounds reasonable, I'll submit a V2.

Andrew

--
Andrew Donnellan              Software Engineer, OzLabs
andrew.donnel...@au1.ibm.com  Australia Development Lab, Canberra
+61 2 6201 8874 (work)        IBM Australia Limited
Re: [PATCH V2] cxl: Allow release of contexts which have been OPENED but not STARTED
Acked-by: Ian Munsie imun...@au1.ibm.com
Re: [05/27] macintosh: therm_windtunnel: Export I2C module alias information
On Tue, 2015-08-18 at 12:35 +0200, Javier Martinez Canillas wrote:
> Hello Michael,
>
> On 08/18/2015 12:24 PM, Michael Ellerman wrote:
>> On Thu, 2015-30-07 at 16:18:30 UTC, Javier Martinez Canillas wrote:
>>> The I2C core always reports the MODALIAS uevent as "i2c:<client name>"
>>> regardless of whether the driver was matched using the I2C id_table or
>>> the of_match_table. So the driver needs to export the I2C table and this
>>> table needs to be built into the module, or udev won't have the
>>> necessary information to auto load the correct module when the device
>>> is added.
>>>
>>> Signed-off-by: Javier Martinez Canillas jav...@osg.samsung.com
>>> ---
>>>  drivers/macintosh/therm_windtunnel.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>
>> Who are you expecting to merge this?
>
> I was expecting Benjamin Herrenschmidt since he is listed in MAINTAINERS
> for drivers/macintosh. I cc'ed him in the patch but now in your answer I
> don't see him in the cc list, strange.

That's the mailing list dropping him from CC because he's subscribed.

> But I'll be happy to re-post if there is another person who is handling
> the patches for this driver now.
>
> BTW there is another patch [0] for the same driver to export the OF id
> table information, that was not picked either.

Yep, I'll grab them both.

cheers
Re: [PATCH 2/2] cxl: add set/get private data to context struct
On Wed, 2015-08-19 at 15:12 +1000, Ian Munsie wrote:
> Excerpts from Michael Ellerman's message of 2015-08-19 14:49:30 +1000:
>> Do we really need the accessors? They don't buy anything I can see over
>> just using ctx->priv directly.
>
> The reasoning there is because we don't currently expose the contents of
> struct cxl_context to afu drivers, rather they just treat it as an opaque
> type.
>
> We could potentially change this to expose the details, but there's a lot
> of junk in there that's just internal details of the cxl driver that
> isn't of interest to an afu driver that I'd rather not expose. We also
> already have another accessor function (cxl_process_element) in the api,
> so it's not out of place.
>
> FWIW I'm not opposed to changing how this api works if it ultimately
> makes things better, but I want to wait until the cxlflash superpipe
> support is merged so any patches that change the api can change it at
> the same time.

OK. I saw struct cxl_context in cxl.h and figured it was public, but it's in drivers/misc/cxl/cxl.h, so yes other drivers have no business poking in there, even though they *could*. So that's fine.

cheers
Re: [PATCH 2/2] cxl: add set/get private data to context struct
Excerpts from Michael Ellerman's message of 2015-08-19 14:49:30 +1000:
> Do we really need the accessors? They don't buy anything I can see over
> just using ctx->priv directly.

The reasoning there is because we don't currently expose the contents of struct cxl_context to afu drivers, rather they just treat it as an opaque type.

We could potentially change this to expose the details, but there's a lot of junk in there that's just internal details of the cxl driver that isn't of interest to an afu driver that I'd rather not expose. We also already have another accessor function (cxl_process_element) in the api, so it's not out of place.

FWIW I'm not opposed to changing how this api works if it ultimately makes things better, but I want to wait until the cxlflash superpipe support is merged so any patches that change the api can change it at the same time.

Cheers,
-Ian
Re: [PATCH 1/2] cxl: Add mechanism for delivering AFU driver specific events
On Wed, 2015-08-19 at 14:19 +1000, Ian Munsie wrote:
> From: Ian Munsie imun...@au1.ibm.com
>
> This adds an afu_driver_ops structure with event_pending and
> deliver_event callbacks. An AFU driver can fill these out and associate
> it with a context to enable passing custom AFU specific events to
> userspace.

What's an AFU driver? Give me an example.

> The cxl driver will call event_pending() during poll, select, read, etc.
> calls to check if an AFU driver specific event is pending, and will call
> deliver_event() to deliver that event. This way, the cxl driver takes
> care of all the usual locking semantics around these calls and handles
> all the generic cxl events, so that the AFU driver only needs to worry
> about its own events.
>
> The deliver_event() call is passed a struct cxl_event buffer to fill in.
> The header will already be filled in for an AFU driver event, and the
> AFU driver is expected to expand header.size as necessary (up to
> max_size, defined by struct cxl_event_afu_driver_reserved) and fill out
> its own information.
>
> Conflicts between AFU specific events are not expected, due to the fact
> that each AFU specific driver has its own mechanism to deliver an AFU
> file descriptor to userspace.

I don't grok this bit.
> Signed-off-by: Ian Munsie imun...@au1.ibm.com
> ---
>  drivers/misc/cxl/Kconfig |  5 +
>  drivers/misc/cxl/api.c   |  7 +++
>  drivers/misc/cxl/cxl.h   |  6 +-
>  drivers/misc/cxl/file.c  | 37 +++--
>  include/misc/cxl.h       | 29 +
>  include/uapi/misc/cxl.h  | 13 +
>  6 files changed, 86 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/misc/cxl/Kconfig b/drivers/misc/cxl/Kconfig
> index 8756d06..560412c 100644
> --- a/drivers/misc/cxl/Kconfig
> +++ b/drivers/misc/cxl/Kconfig
> @@ -15,12 +15,17 @@ config CXL_EEH
>  	bool
>  	default n
>
> +config CXL_AFU_DRIVER_OPS
> +	bool
> +	default n
> +
>  config CXL
>  	tristate "Support for IBM Coherent Accelerators (CXL)"
>  	depends on PPC_POWERNV && PCI_MSI && EEH
>  	select CXL_BASE
>  	select CXL_KERNEL_API
>  	select CXL_EEH
> +	select CXL_AFU_DRIVER_OPS
>  	default m
>  	help
>  	  Select this option to enable driver support for IBM Coherent
>
> diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
> index 6a768a9..e0f0c78 100644
> --- a/drivers/misc/cxl/api.c
> +++ b/drivers/misc/cxl/api.c
> @@ -267,6 +267,13 @@ struct cxl_context *cxl_fops_get_context(struct file *file)
>  }
>  EXPORT_SYMBOL_GPL(cxl_fops_get_context);
>
> +void cxl_set_driver_ops(struct cxl_context *ctx,
> +			struct cxl_afu_driver_ops *ops)
> +{
> +	ctx->afu_driver_ops = ops;
> +}
> +EXPORT_SYMBOL_GPL(cxl_set_driver_ops);

This is pointless. BUT, it wouldn't be if you actually checked the ops. Which you should do, because then later you can avoid checking them on every event.

IIUI you should never have one op set but not the other, so you check in here that both are set and error out otherwise.
Then in afu_read() you can change this:

	if (ctx->afu_driver_ops &&
	    ctx->afu_driver_ops->event_pending &&
	    ctx->afu_driver_ops->deliver_event &&
	    ctx->afu_driver_ops->event_pending(ctx)) {

to:

	if (ctx->afu_driver_ops && ctx->afu_driver_ops->event_pending(ctx)) {

> diff --git a/drivers/misc/cxl/cxl.h b/drivers/misc/cxl/cxl.h
> index 6f53866..30e44a8 100644
> --- a/drivers/misc/cxl/cxl.h
> +++ b/drivers/misc/cxl/cxl.h
> @@ -24,6 +24,7 @@
>
>  #include <asm/reg.h>
>  #include <misc/cxl-base.h>
> +#include <misc/cxl.h>
>  #include <uapi/misc/cxl.h>
>
>  extern uint cxl_verbose;
> @@ -34,7 +35,7 @@ extern uint cxl_verbose;
>   * Bump version each time a user API change is made, whether it is
>   * backwards compatible ot not.
>   */
> -#define CXL_API_VERSION 1
> +#define CXL_API_VERSION 2

I'm not clear on why we're bumping the API version? Isn't this purely about in-kernel drivers?

I see below you're touching the uapi header, so I guess it's that simple. But if you can explain it better that would be great.

>  #define CXL_API_VERSION_COMPATIBLE 1
>
>  /*
> @@ -462,6 +463,9 @@ struct cxl_context {
>  	bool pending_fault;
>  	bool pending_afu_err;
>
> +	/* Used by AFU drivers for driver specific event delivery */
> +	struct cxl_afu_driver_ops *afu_driver_ops;
> +
>  	struct rcu_head rcu;
>  };
>
> diff --git a/drivers/misc/cxl/file.c b/drivers/misc/cxl/file.c
> index 57bdb47..2ebaca3 100644
> --- a/drivers/misc/cxl/file.c
> +++ b/drivers/misc/cxl/file.c
> @@ -279,6 +279,22 @@ int afu_mmap(struct file *file, struct vm_area_struct *vm)
>  	return cxl_context_iomap(ctx, vm);
>  }
>
> +static inline int _ctx_event_pending(struct cxl_context *ctx)

Why isn't this returning bool?

> +{
> +	bool afu_driver_event_pending = false;
> +
> +	if (ctx->afu_driver_ops && ctx->afu_driver_ops->event_pending)
> +		afu_driver_event_pending = ctx->afu_driver_ops->event_pending(ctx);

You can drop
[PATCH v5 0/7] dax: I/O path enhancements
The goal of this series is to enhance the DAX I/O path so that all operations that store data (I/O writes, zeroing blocks, punching holes, etc.) properly synchronize the stores to media using the PMEM API. This ensures that the data DAX is writing is durable on media before the operation completes. Patches 1-4 are a few random cleanups. Changes from v4: - rebased to libnvdimm-for-next branch: https://git.kernel.org/cgit/linux/kernel/git/nvdimm/nvdimm.git/commit/?h=libnvdimm-for-next The nvdimm repository doesn't have the DAX PMD changes that are in the -mm tree. I expect the merge will basically be these two hunks: @@ -514,7 +528,7 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, unsigned long pmd_addr = address PMD_MASK; bool write = flags FAULT_FLAG_WRITE; long length; - void *kaddr; + void __pmem *kaddr; pgoff_t size, pgoff; sector_t block, sector; unsigned long pfn; @@ -608,7 +622,8 @@ int __dax_pmd_fault(struct vm_area_struct *vma, unsigned long address, if (buffer_unwritten(bh) || buffer_new(bh)) { int i; for (i = 0; i PTRS_PER_PMD; i++) - clear_page(kaddr + i * PAGE_SIZE); + clear_pmem(kaddr + i * PAGE_SIZE, PAGE_SIZE); + wmb_pmem(); count_vm_event(PGMAJFAULT); mem_cgroup_count_vm_event(vma-vm_mm, PGMAJFAULT); result |= VM_FAULT_MAJOR; Ross Zwisler (7): brd: make rd_size static pmem, x86: move x86 PMEM API to new pmem.h header pmem: remove layer when calling arch_has_wmb_pmem() pmem, x86: clean up conditional pmem includes pmem: add copy_from_iter_pmem() and clear_pmem() dax: update I/O path to do proper PMEM flushing pmem, dax: have direct_access use __pmem annotation Documentation/filesystems/Locking | 3 +- MAINTAINERS | 1 + arch/powerpc/sysdev/axonram.c | 7 +- arch/x86/include/asm/cacheflush.h | 71 - arch/x86/include/asm/pmem.h | 158 ++ drivers/block/brd.c | 6 +- drivers/nvdimm/pmem.c | 4 +- drivers/s390/block/dcssblk.c | 10 ++- fs/block_dev.c| 2 +- fs/dax.c | 63 +-- include/linux/blkdev.h| 8 +- include/linux/pmem.h | 77 --- 12 
files changed, 285 insertions(+), 125 deletions(-)
 create mode 100644 arch/x86/include/asm/pmem.h

--
2.1.0
[PATCH 1/2] PCI: Make pci_msi_setup_pci_dev() non-static for use by arch code
Commit 1851617cd2 ("PCI/MSI: Disable MSI at enumeration even if kernel doesn't support MSI") changed the location of the code that disables MSI/MSI-X interrupts at PCI probe time in devices that have this flag set. It moved the code from pci_msi_init_pci_dev() to a new function named pci_msi_setup_pci_dev(), called by pci_setup_device().

Since then, the pSeries platform of the powerpc architecture needs to disable MSI at PCI probe time manually, as the code flow doesn't reach pci_setup_device(). For doing so, it wants to call pci_msi_setup_pci_dev().

This patch makes the required function non-static, so that it can be called on the PCI probe path on the powerpc pSeries platform in the next patch.

Signed-off-by: Guilherme G. Piccoli gpicc...@linux.vnet.ibm.com
---
 drivers/pci/probe.c | 2 +-
 include/linux/pci.h | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index cefd636..520c5b6 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1103,7 +1103,7 @@ int pci_cfg_space_size(struct pci_dev *dev)
 
 #define LEGACY_IO_RESOURCE	(IORESOURCE_IO | IORESOURCE_PCI_FIXED)
 
-static void pci_msi_setup_pci_dev(struct pci_dev *dev)
+void pci_msi_setup_pci_dev(struct pci_dev *dev)
 {
 	/*
 	 * Disable the MSI hardware to avoid screaming interrupts
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 8a0321a..860c751 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1202,6 +1202,7 @@ struct msix_entry {
 	u16	entry;	/* driver uses to specify entry, OS writes */
 };
 
+void pci_msi_setup_pci_dev(struct pci_dev *dev);
 #ifdef CONFIG_PCI_MSI
 int pci_msi_vec_count(struct pci_dev *dev);
--
2.1.0
Re: [PATCH] cxl: Allow release of contexts which have been OPENED but not STARTED
On Tue, 2015-08-18 at 19:19 +1000, Michael Ellerman wrote:
> On Tue, 2015-08-18 at 16:30 +1000, Andrew Donnellan wrote:
>> If we open a context but do not start it (either because we do not
>> attempt to start it, or because it fails to start for some reason), we
>> are left with a context in state OPENED. Previously,
>> cxl_release_context() only allowed releasing contexts in state CLOSED,
>> so attempting to release an OPENED context would fail.
>>
>> In particular, this bug causes available contexts to run out after some
>> EEH failures, where drivers attempt to release contexts that have
>> failed to start.
>>
>> Allow releasing contexts in any state other than STARTED, i.e. OPENED
>> or CLOSED (we can't release a STARTED context as it's currently using
>> the hardware).
>>
>> Cc: sta...@vger.kernel.org
>> Fixes: 6f7f0b3df6d4 ("cxl: Add AFU virtual PHB and kernel API")
>> Signed-off-by: Andrew Donnellan andrew.donnel...@au1.ibm.com
>> Signed-off-by: Daniel Axtens d...@axtens.net
>> ---
>>  drivers/misc/cxl/api.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/misc/cxl/api.c b/drivers/misc/cxl/api.c
>> index 6a768a9..1c520b8 100644
>> --- a/drivers/misc/cxl/api.c
>> +++ b/drivers/misc/cxl/api.c
>> @@ -59,7 +59,7 @@ EXPORT_SYMBOL_GPL(cxl_get_phys_dev);
>>
>>  int cxl_release_context(struct cxl_context *ctx)
>>  {
>> -	if (ctx->status != CLOSED)
>> +	if (ctx->status == STARTED)
>>  		return -EBUSY;
>
> So this doesn't break when you add a new state, is it worth writing it as:
>
>	if (ctx->status >= STARTED)
>		return -EBUSY;
>
> ?

Yeah I think that would be more future proof, although it won't make a difference with the current code.

FWIW, looks good to me.

Mikey
Re: [PATCH V2] powerpc/85xx: Remove unused pci fixup hooks on c293pcie
On Tue, 2015-08-18 at 04:26 -0500, Hou Zhiqiang-B48286 wrote:
> Hi Scott,
>
> Removed both pcibios_fixup_phb and pcibios_fixup_bus. Could you please
> help to apply it?

I applied it and sent a pull request yesterday.

-Scott