Re: [PATCH 1/3] powerpc: mm: make _PAGE_NUMA take effect
On Thu, 2013-12-05 at 16:23 +0530, Aneesh Kumar K.V wrote:
> Liu Ping Fan <kernelf...@gmail.com> writes:
>> To enable the do_numa_page(), we should not fix _PAGE_NUMA in
>> hash_page(), so bail out for the case of pte_numa().

For some reason I don't have 2/3 and 3/3 in my mbox (though I do have
them on patchwork) so I'll reply to this one.

Overall, your statement that this is a faster path needs to be backed
up with numbers. The code is complicated enough as it is; such
additional mess in the low-level hashing code requires a good
justification, and also a demonstration that it doesn't add overhead
to the normal hash path.

Cheers,
Ben.

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev
Re: [PATCH V4 00/10] perf: New conditional branch filter
On Fri, 2013-12-06 at 10:18 -0300, Arnaldo Carvalho de Melo wrote:
> Em Thu, Dec 05, 2013 at 03:47:54PM +1100, Michael Ellerman escreveu:
>> On Wed, 2013-12-04 at 16:02 +0530, Anshuman Khandual wrote:
>>> This patchset is the re-spin of the original branch stack sampling
>>> patchset which introduced new PERF_SAMPLE_BRANCH_COND branch filter.
>>> This patchset also enables SW based branch filtering support for
>>> book3s powerpc platforms which have PMU HW backed branch stack
>>> sampling support. Summary of code changes in this patchset:
>>>
>>> (1) Introduces a new PERF_SAMPLE_BRANCH_COND branch filter
>>> (2) Add the cond branch filter options in the perf record tool
>>> (3) Enable PERF_SAMPLE_BRANCH_COND in X86 platforms
>>> (4) Enable PERF_SAMPLE_BRANCH_COND in POWER8 platform
>>> (5) Update the documentation regarding perf record tool
>>
>> Hi Arnaldo,
>>
>> Can you please take just patches 1-5 into the perf tree? And do you
>> mind putting them in a topic branch so Benh can merge that.
>
> This is mostly kernel code, I process the userspace ones, so I think
> either Ingo or PeterZ should pick these. Ingo, Peter?

Urgh, sorry. MAINTAINERS just lists all of you in a block. Added PeterZ
to CC.

Peter/Ingo can you please take just patches 1-5 into the perf tree? And
do you mind putting them in a topic branch so Benh can merge that.

The generic x86 changes have a Reviewed-by from Stephane, and the change
to tools/perf has an ack-of-sorts from Arnaldo:

> Only:
>
>   Subject: [PATCH V4 03/10] perf, tool: Conditional branch filter
>   'cond' added to perf record
>
> Which is a one liner, touches tools/perf/, and I'm ok with it.

cheers
[PATCH] drivers/tty: ehv_bytechan fails to build as a module
ehv_bytechan is marked tristate but fails to build as a module:

drivers/tty/ehv_bytechan.c:363:1: error: type defaults to ‘int’ in declaration of ‘console_initcall’ [-Werror=implicit-int]

It doesn't make much sense for a console driver to be built as a module,
so change it to a bool.

Signed-off-by: Anton Blanchard <an...@samba.org>
---

Index: b/drivers/tty/Kconfig
===================================================================
--- a/drivers/tty/Kconfig
+++ b/drivers/tty/Kconfig
@@ -366,7 +366,7 @@ config TRACE_SINK
 	  Trace data router for MIPI P1149.7 cJTAG standard.
 
 config PPC_EPAPR_HV_BYTECHAN
-	tristate "ePAPR hypervisor byte channel driver"
+	bool "ePAPR hypervisor byte channel driver"
 	depends on PPC
 	select EPAPR_PARAVIRT
 	help
[PATCH][RESEND] powerpc: remove unused REDBOOT Kconfig parameter
This removes the REDBOOT Kconfig parameter, which was no longer used
anywhere in the source code and Makefiles.

Signed-off-by: Michael Opdenacker <michael.opdenac...@free-electrons.com>
---
 arch/powerpc/Kconfig                | 3 ---
 arch/powerpc/platforms/83xx/Kconfig | 1 -
 arch/powerpc/platforms/8xx/Kconfig  | 1 -
 3 files changed, 5 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b44b52c0a8f0..70dc283050b5 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -209,9 +209,6 @@ config DEFAULT_UIMAGE
 	  Used to allow a board to specify it wants a uImage built by default
 	default n
 
-config REDBOOT
-	bool
-
 config ARCH_HIBERNATION_POSSIBLE
 	bool
 	default y
diff --git a/arch/powerpc/platforms/83xx/Kconfig b/arch/powerpc/platforms/83xx/Kconfig
index 670a033264c0..2bdc8c862c46 100644
--- a/arch/powerpc/platforms/83xx/Kconfig
+++ b/arch/powerpc/platforms/83xx/Kconfig
@@ -99,7 +99,6 @@ config SBC834x
 config ASP834x
 	bool "Analogue Micro ASP 834x"
 	select PPC_MPC834x
-	select REDBOOT
 	help
 	  This enables support for the Analogue Micro ASP 83xx board.
diff --git a/arch/powerpc/platforms/8xx/Kconfig b/arch/powerpc/platforms/8xx/Kconfig
index 8dec3c0911ad..bd6f1a1cf922 100644
--- a/arch/powerpc/platforms/8xx/Kconfig
+++ b/arch/powerpc/platforms/8xx/Kconfig
@@ -45,7 +45,6 @@ config PPC_EP88XC
 config PPC_ADDER875
 	bool "Analogue Micro Adder 875"
 	select CPM1
-	select REDBOOT
 	help
 	  This enables support for the Analogue Micro Adder 875 board.
-- 
1.8.3.2
Re: [PATCH][RESEND] powerpc: remove unused REDBOOT Kconfig parameter
On Mon, 2013-12-09 at 06:27 +0100, Michael Opdenacker wrote:
> This removes the REDBOOT Kconfig parameter, which was no longer used
> anywhere in the source code and Makefiles.

It hasn't been lost :-) It's still in patchwork and it's even in my
queue.

Cheers,
Ben.

> Signed-off-by: Michael Opdenacker <michael.opdenac...@free-electrons.com>
Re: [PATCH 1/3] powerpc: mm: make _PAGE_NUMA take effect
On Mon, Dec 9, 2013 at 8:31 AM, Benjamin Herrenschmidt
<b...@kernel.crashing.org> wrote:
> On Thu, 2013-12-05 at 16:23 +0530, Aneesh Kumar K.V wrote:
>> Liu Ping Fan <kernelf...@gmail.com> writes:
>>> To enable the do_numa_page(), we should not fix _PAGE_NUMA in
>>> hash_page(), so bail out for the case of pte_numa().
>
> For some reason I don't have 2/3 and 3/3 in my mbox (though I do have
> them on patchwork) so I'll reply to this one.
>
> Overall, your statement that this is a faster path needs to be backed
> up with numbers. The code is complicated enough as it is; such
> additional mess in the low-level hashing code requires a good
> justification, and also a demonstration that it doesn't add overhead
> to the normal hash path.

For the test, is it ok to have a user application copy pages where all
pages are PG_mlocked?

Thanks and regards,
Pingfan
Re: [PATCH 1/3] powerpc: mm: make _PAGE_NUMA take effect
On Mon, 2013-12-09 at 14:17 +0800, Liu ping fan wrote:
> On Mon, Dec 9, 2013 at 8:31 AM, Benjamin Herrenschmidt
> <b...@kernel.crashing.org> wrote:
>> Overall, your statement that this is a faster path needs to be backed
>> up with numbers. The code is complicated enough as it is; such
>> additional mess in the low-level hashing code requires a good
>> justification, and also a demonstration that it doesn't add overhead
>> to the normal hash path.
>
> For the test, is it ok to have a user application copy pages where all
> pages are PG_mlocked?

If that specific scenario is relevant in practice, then yes, though
also demonstrate the lack of regression with some more normal path such
as a kernel compile.

Cheers,
Ben.
Re: [PATCH V4 08/10] powerpc, perf: Enable SW filtering in branch stack sampling framework
On Wed, 2013-04-12 at 10:32:40 UTC, Anshuman Khandual wrote:
> This patch enables SW based post processing of BHRB captured branches
> to be able to meet more user defined branch filtration criteria in the
> perf branch stack sampling framework. These changes increase the number
> of branch filters and their valid combinations on any powerpc64 server
> platform with BHRB support. Find the summary of code changes here.
>
> (1) struct cpu_hw_events
>
> 	Introduced two new variables to track various filter values and masks
> 	(a) bhrb_sw_filter	Tracks SW implemented branch filter flags
> 	(b) filter_mask		Tracks both (SW and HW) branch filter flags

The name 'filter_mask' doesn't mean much to me. I'd rather it was
'bhrb_filter'.

> (2) Event creation
>
> 	Kernel will figure out supported BHRB branch filters through a PMU
> 	call back 'bhrb_filter_map'. This function will find out how many of
> 	the requested branch filters can be supported in the PMU HW. It will
> 	not try to invalidate any branch filter combinations. Event creation
> 	will not error out because of lack of HW based branch filters.
> 	Meanwhile it will track the overall supported branch filters in the
> 	'filter_mask' variable. Once the PMU call back returns, the kernel
> 	will process the user branch filter request against the available SW
> 	filters while looking at the 'filter_mask'. During this phase all the
> 	branch filters which are still pending from the user requested list
> 	will have to be supported in SW, failing which the event creation
> 	will error out.
>
> (3) SW branch filter
>
> 	During the BHRB data capture inside the PMU interrupt context, each
> 	of the captured 'perf_branch_entry.from' will be checked for
> 	compliance with the applicable SW branch filters. If the entry does
> 	not conform to the filter requirements, it will be discarded from
> 	the final perf branch stack buffer.
>
> (4) Supported SW based branch filters
>
> 	(a) PERF_SAMPLE_BRANCH_ANY_RETURN
> 	(b) PERF_SAMPLE_BRANCH_IND_CALL
> 	(c) PERF_SAMPLE_BRANCH_ANY_CALL
> 	(d) PERF_SAMPLE_BRANCH_COND
>
> 	Please refer to the patch to understand the classification of
> 	instructions into these branch filter categories.
>
> (5) Multiple branch filter semantics
>
> 	Book3s server implementation follows the same OR semantics (as
> 	implemented in x86) while dealing with multiple branch filters at
> 	any point of time. SW branch filter analysis is carried out on the
> 	data set captured in the PMU HW. So the resulting set of data (after
> 	applying the SW filters) will inherently be an AND with the HW
> 	captured set. Hence any combination of HW and SW branch filters will
> 	be invalid. HW based branch filters are more efficient and faster
> 	compared to SW implemented branch filters. So at first the PMU
> 	should decide whether it can support all the requested branch
> 	filters itself or not. In case it can support all the branch filters
> 	in an OR manner, we don't apply any SW branch filter on top of the
> 	HW captured set (which is the final set). This preserves the OR
> 	semantics of multiple branch filters as required. But in the case
> 	where the PMU cannot support all the requested branch filters in an
> 	OR manner, it should not apply any of its filters and leave it up to
> 	the SW to handle them all. It's the PMU code's responsibility to
> 	uphold this protocol to be able to conform to the overall OR
> 	semantics of the perf branch stack sampling framework.

I'd prefer this level of commentary was in a block comment in the code.
It's much more likely to be seen by a future hacker than here in the
commit log.

> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
> index 2de7d48..54d39a5 100644
> --- a/arch/powerpc/perf/core-book3s.c
> +++ b/arch/powerpc/perf/core-book3s.c
> @@ -48,6 +48,8 @@ struct cpu_hw_events {
>  	/* BHRB bits */
>  	u64	bhrb_hw_filter;	/* BHRB HW branch filter */
> +	u64	bhrb_sw_filter;	/* BHRB SW branch filter */
> +	u64	filter_mask;	/* Branch filter mask */
>  	int	bhrb_users;
>  	void	*bhrb_context;
>  	struct	perf_branch_stack bhrb_stack;
> @@ -400,6 +402,228 @@ static __u64 power_pmu_bhrb_to(u64 addr)
>  	return target - (unsigned long)instr + addr;
>  }
>
> +/*
> + * Instruction opcode analysis
> + *
> + * Analyse instruction opcodes and classify them
> + * into various branch filter options available.
> + * This follows the standard semantics of OR which
> + * means that instructions which conform to `any`
> + * of the requested branch filters get picked up.
> + */
> +static bool validate_instruction(unsigned int *addr, u64 bhrb_sw_filter)
> +{

"validate" is not a good name here. That implies that this
Re: [PATCH V4 07/10] powerpc, lib: Add new branch instruction analysis support functions
On Wed, 2013-04-12 at 10:32:39 UTC, Anshuman Khandual wrote:
> Generic powerpc branch instruction analysis support added in the code
> patching library which will help the subsequent patch on SW based
> filtering of branch records in perf. This patch also converts and
> exports some of the existing local static functions through the header
> file to be used elsewhere.
>
> diff --git a/arch/powerpc/include/asm/code-patching.h b/arch/powerpc/include/asm/code-patching.h
> index a6f8c7a..8bab417 100644
> --- a/arch/powerpc/include/asm/code-patching.h
> +++ b/arch/powerpc/include/asm/code-patching.h
> @@ -22,6 +22,36 @@
>  #define BRANCH_SET_LINK	0x1
>  #define BRANCH_ABSOLUTE	0x2
>
> +#define XL_FORM_LR	0x4C000020
> +#define XL_FORM_CTR	0x4C000420
> +#define XL_FORM_TAR	0x4C000460
> +
> +#define BO_ALWAYS	0x02800000
> +#define BO_CTR		0x02000000
> +#define BO_CRBI_OFF	0x00800000
> +#define BO_CRBI_ON	0x01800000
> +#define BO_CRBI_HINT	0x00400000
> +
> +/* Forms of branch instruction */
> +int instr_is_branch_iform(unsigned int instr);
> +int instr_is_branch_bform(unsigned int instr);
> +int instr_is_branch_xlform(unsigned int instr);
> +
> +/* Classification of XL-form instruction */
> +int is_xlform_lr(unsigned int instr);
> +int is_xlform_ctr(unsigned int instr);
> +int is_xlform_tar(unsigned int instr);
> +
> +/* Branch instruction is a call */
> +int is_branch_link_set(unsigned int instr);
> +
> +/* BO field analysis (B-form or XL-form) */
> +int is_bo_always(unsigned int instr);
> +int is_bo_ctr(unsigned int instr);
> +int is_bo_crbi_off(unsigned int instr);
> +int is_bo_crbi_on(unsigned int instr);
> +int is_bo_crbi_hint(unsigned int instr);

I think this is the wrong API. We end up with all these micro checks,
which don't actually encapsulate much, and don't implement the logic
perf needs. If we had another user for this level of detail then it
might make sense, but for a single user I think we're better off just
implementing the semantics it wants.

So that would be something more like:

  bool instr_is_return_branch(unsigned int instr);
  bool instr_is_conditional_branch(unsigned int instr);
  bool instr_is_func_call(unsigned int instr);
  bool instr_is_indirect_func_call(unsigned int instr);

These would then encapsulate something like the logic in your 8/10
patch. You can hopefully also optimise the checking logic in each
routine because you know the exact semantics you're implementing.

cheers
Re: [PATCH V4 09/10] power8, perf: Change BHRB branch filter configuration
On Wed, 2013-04-12 at 10:32:41 UTC, Anshuman Khandual wrote:
> The powerpc kernel now supports SW based branch filters for book3s
> systems, with some specific requirements while dealing with HW
> supported branch filters, in order to achieve the overall OR semantics
> prevailing in the perf branch stack sampling framework. This patch
> adapts the BHRB branch filter configuration to meet those protocols.
> The POWER8 PMU does support 3 branch filters (out of which two are
> getting used in perf branch stack) which are mutually exclusive and
> cannot be ORed with each other. This implies that the PMU can only
> handle one HW based branch filter request at any point of time. For
> all other combinations the PMU will pass it on to the SW. Also the
> combination of PERF_SAMPLE_BRANCH_ANY_CALL and PERF_SAMPLE_BRANCH_COND
> can now be handled in SW, hence we don't error them out anymore.
>
> diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
> index 03c5b8d..6021349 100644
> --- a/arch/powerpc/perf/power8-pmu.c
> +++ b/arch/powerpc/perf/power8-pmu.c
> @@ -561,7 +561,56 @@ static int power8_generic_events[] = {
>  static u64 power8_bhrb_filter_map(u64 branch_sample_type, u64 *filter_mask)
>  {
> -	u64 pmu_bhrb_filter = 0;
> +	u64 x, tmp, pmu_bhrb_filter = 0;
> +	*filter_mask = 0;
> +
> +	/* No branch filter requested */
> +	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY) {
> +		*filter_mask = PERF_SAMPLE_BRANCH_ANY;
> +		return pmu_bhrb_filter;
> +	}
> +
> +	/*
> +	 * P8 does not support oring of PMU HW branch filters. Hence
> +	 * if multiple branch filters are requested which includes filters
> +	 * supported in PMU, still go ahead and clear the PMU based HW branch
> +	 * filter component as in this case all the filters will be processed
> +	 * in SW.

Leading space there.

> +	 */
> +	tmp = branch_sample_type;
> +
> +	/* Remove privilege filters before comparison */
> +	tmp &= ~PERF_SAMPLE_BRANCH_USER;
> +	tmp &= ~PERF_SAMPLE_BRANCH_KERNEL;
> +	tmp &= ~PERF_SAMPLE_BRANCH_HV;
> +
> +	for_each_branch_sample_type(x) {
> +		/* Ignore privilege requests */
> +		if ((x == PERF_SAMPLE_BRANCH_USER) ||
> +		    (x == PERF_SAMPLE_BRANCH_KERNEL) ||
> +		    (x == PERF_SAMPLE_BRANCH_HV))
> +			continue;
> +
> +		if (!(tmp & x))
> +			continue;
> +
> +		/* Supported HW PMU filters */
> +		if (tmp & PERF_SAMPLE_BRANCH_ANY_CALL) {
> +			tmp &= ~PERF_SAMPLE_BRANCH_ANY_CALL;
> +			if (tmp) {
> +				pmu_bhrb_filter = 0;
> +				*filter_mask = 0;
> +				return pmu_bhrb_filter;
> +			}
> +		}
> +
> +		if (tmp & PERF_SAMPLE_BRANCH_COND) {
> +			tmp &= ~PERF_SAMPLE_BRANCH_COND;
> +			if (tmp) {
> +				pmu_bhrb_filter = 0;
> +				*filter_mask = 0;
> +				return pmu_bhrb_filter;
> +			}
> +		}
> +	}
>
>  	/* BHRB and regular PMU events share the same privilege state
>  	 * filter configuration. BHRB is always recorded along with a
> @@ -570,34 +619,20 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type, u64 *filter_mask)
>  	 * PMU event, we ignore any separate BHRB specific request.
>  	 */
>
> -	/* No branch filter requested */
> -	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY)
> -		return pmu_bhrb_filter;
> -
> -	/* Invalid branch filter options - HW does not support */
> -	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_RETURN)
> -		return -1;
> -
> -	if (branch_sample_type & PERF_SAMPLE_BRANCH_IND_CALL)
> -		return -1;
> -
> +	/* Supported individual branch filters */
>  	if (branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) {
>  		pmu_bhrb_filter |= POWER8_MMCRA_IFM1;
> +		*filter_mask	|= PERF_SAMPLE_BRANCH_ANY_CALL;
>  		return pmu_bhrb_filter;
>  	}
>
>  	if (branch_sample_type & PERF_SAMPLE_BRANCH_COND) {
>  		pmu_bhrb_filter |= POWER8_MMCRA_IFM3;
> +		*filter_mask	|= PERF_SAMPLE_BRANCH_COND;
>  		return pmu_bhrb_filter;
>  	}
>
> -	/* PMU does not support ANY combination of HW BHRB filters */
> -	if ((branch_sample_type & PERF_SAMPLE_BRANCH_ANY_CALL) &&
> -	    (branch_sample_type & PERF_SAMPLE_BRANCH_COND))
> -		return -1;
> -
> -	/* Every thing else is unsupported */
> -	return -1;
> +	return pmu_bhrb_filter;
>  }

As I said in my comments on version 3, which you ignored:

  I think it would be clearer if we actually checked for the
  possibilities we allow and let everything else fall through, eg:

	/* Ignore user/kernel/hv bits */
	branch_sample_type &= ~PERF_SAMPLE_BRANCH_PLM_ALL;

	if
Re: [PATCH V4 10/10] powerpc, perf: Cleanup SW branch filter list look up
On Wed, 2013-04-12 at 10:32:42 UTC, Anshuman Khandual wrote:
> This patch adds enumeration for all available SW branch filters in
> powerpc book3s code and also streamlines the look-up of the SW branch
> filter entries while trying to figure out which branch filters can be
> supported in SW.

This appears to patch code that was only added in 8/10? Was there any
reason not to do it the right way from the beginning?

cheers
linux-next: build failure after merge of the final tree (powerpc tree related)
Hi all,

After merging the final tree, today's linux-next build (powerpc
allyesconfig) failed like this:

arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
arch/powerpc/kernel/exceptions-64s.S:958: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:959: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:983: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:984: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1003: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1013: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1014: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1015: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1016: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1017: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1018: Error: attempt to move .org backwards

Caused by commit 1e9b4507ed98 ("powerpc/book3s: handle machine check in
Linux host").

I have reverted these commits (possibly some of these reverts are
unnecessary):

b63a0ffe35de powerpc/powernv: Machine check exception handling
28446de2ce99 powerpc/powernv: Remove machine check handling in OPAL
b5ff4211a829 powerpc/book3s: Queue up and process delayed MCE events
36df96f8acaf powerpc/book3s: Decode and save machine check event
ae744f3432d3 powerpc/book3s: Flush SLB/TLBs if we get SLB/TLB machine check errors on power8
e22a22740c1a powerpc/book3s: Flush SLB/TLBs if we get SLB/TLB machine check errors on power7
0440705049b0 powerpc/book3s: Add flush_tlb operation in cpu_spec
4c703416efc0 powerpc/book3s: Introduce a early machine check hook in cpu_spec
1c51089f777b powerpc/book3s: Return from interrupt if coming from evil context
1e9b4507ed98 powerpc/book3s: handle machine check in Linux host

-- 
Cheers,
Stephen Rothwell <s...@canb.auug.org.au>
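[Editorial note: for readers unfamiliar with this failure mode — `.org` can only move the assembler's location counter forward. The 64s exception code uses `.org` to reserve fixed-size vector slots, so "attempt to move .org backwards" means the handler code has grown past the end of its slot. A minimal illustration (the labels and offsets are hypothetical, not the kernel's actual layout):]

```
	.org	exc_base + 0x200	/* start of a fixed vector slot */
machine_check:
	/* handler body: must fit before exc_base + 0x300 */

	.org	exc_base + 0x300	/* assembler reports "attempt to
					   move .org backwards" once the
					   body above grows past 0x300 */
```

The fix in such cases is to shrink the in-line code, typically by branching out to a common handler outside the vector area.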
[PATCH V2 0/3] powerpc iommu: Remove hardcoded page sizes
The series doesn't actually change the iommu page size, as each platform
continues to initialise the iommu page size to a hardcoded value of 4K.
At this stage testing has only been carried out on a pSeries machine;
other platforms including cell have yet to be tested.

Changes from V1:
 * Rebased on Ben's next tree
 * Updated constants in 1/3 that were not present in V1 (thanks Alexey!)
 * Added initialisation for the pasemi platform that was missed in V1
[PATCH V2 1/3] powerpc iommu: Update constant names to reflect their hardcoded page size
The powerpc iommu uses a hardcoded page size of 4K. This patch changes
the name of the IOMMU_PAGE_* macros to reflect the hardcoded values. A
future patch will use the existing names to support dynamic page sizes.

Signed-off-by: Alistair Popple <alist...@popple.id.au>
Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
---
 arch/powerpc/include/asm/iommu.h       | 10 ++--
 arch/powerpc/kernel/dma-iommu.c        |  4 +-
 arch/powerpc/kernel/iommu.c            | 78 ++++----
 arch/powerpc/kernel/vio.c              | 19 ++++----
 arch/powerpc/platforms/cell/iommu.c    | 12 ++---
 arch/powerpc/platforms/powernv/pci.c   |  4 +-
 arch/powerpc/platforms/pseries/iommu.c |  8 ++--
 arch/powerpc/platforms/pseries/setup.c |  4 +-
 arch/powerpc/platforms/wsp/wsp_pci.c   | 10 ++--
 drivers/net/ethernet/ibm/ibmveth.c     |  9 ++--
 drivers/vfio/vfio_iommu_spapr_tce.c    | 28 ++--
 11 files changed, 94 insertions(+), 92 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 774fa27..0869c7e 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -30,10 +30,10 @@
 #include <asm/machdep.h>
 #include <asm/types.h>
 
-#define IOMMU_PAGE_SHIFT      12
-#define IOMMU_PAGE_SIZE       (ASM_CONST(1) << IOMMU_PAGE_SHIFT)
-#define IOMMU_PAGE_MASK       (~((1 << IOMMU_PAGE_SHIFT) - 1))
-#define IOMMU_PAGE_ALIGN(addr) _ALIGN_UP(addr, IOMMU_PAGE_SIZE)
+#define IOMMU_PAGE_SHIFT_4K   12
+#define IOMMU_PAGE_SIZE_4K    (ASM_CONST(1) << IOMMU_PAGE_SHIFT_4K)
+#define IOMMU_PAGE_MASK_4K    (~((1 << IOMMU_PAGE_SHIFT_4K) - 1))
+#define IOMMU_PAGE_ALIGN_4K(addr) _ALIGN_UP(addr, IOMMU_PAGE_SIZE_4K)
 
 /* Boot time flags */
 extern int iommu_is_off;
@@ -42,7 +42,7 @@ extern int iommu_force_on;
 /* Pure 2^n version of get_order */
 static __inline__ __attribute_const__ int get_iommu_order(unsigned long size)
 {
-	return __ilog2((size - 1) >> IOMMU_PAGE_SHIFT) + 1;
+	return __ilog2((size - 1) >> IOMMU_PAGE_SHIFT_4K) + 1;
 }
 
diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
index e489752..5cfe3db 100644
--- a/arch/powerpc/kernel/dma-iommu.c
+++ b/arch/powerpc/kernel/dma-iommu.c
@@ -83,10 +83,10 @@ static int dma_iommu_dma_supported(struct device *dev, u64 mask)
 		return 0;
 	}
 
-	if (tbl->it_offset > (mask >> IOMMU_PAGE_SHIFT)) {
+	if (tbl->it_offset > (mask >> IOMMU_PAGE_SHIFT_4K)) {
 		dev_info(dev, "Warning: IOMMU offset too big for device mask\n");
 		dev_info(dev, "mask: 0x%08llx, table offset: 0x%08lx\n",
-				mask, tbl->it_offset << IOMMU_PAGE_SHIFT);
+				mask, tbl->it_offset << IOMMU_PAGE_SHIFT_4K);
 		return 0;
 	} else
 		return 1;
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index d22abe0..df4a7f1 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -251,14 +251,14 @@ again:
 
 	if (dev)
 		boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
-				      1 << IOMMU_PAGE_SHIFT);
+				      1 << IOMMU_PAGE_SHIFT_4K);
 	else
-		boundary_size = ALIGN(1UL << 32, 1 << IOMMU_PAGE_SHIFT);
+		boundary_size = ALIGN(1UL << 32, 1 << IOMMU_PAGE_SHIFT_4K);
 	/* 4GB boundary for iseries_hv_alloc and iseries_hv_map */
 
 	n = iommu_area_alloc(tbl->it_map, limit, start, npages,
-			     tbl->it_offset, boundary_size >> IOMMU_PAGE_SHIFT,
-			     align_mask);
+			     tbl->it_offset, boundary_size >> IOMMU_PAGE_SHIFT_4K,
+			     align_mask);
 	if (n == -1) {
 		if (likely(pass == 0)) {
 			/* First try the pool from the start */
@@ -320,12 +320,12 @@ static dma_addr_t iommu_alloc(struct device *dev, struct iommu_table *tbl,
 		return DMA_ERROR_CODE;
 
 	entry += tbl->it_offset;	/* Offset into real TCE table */
-	ret = entry << IOMMU_PAGE_SHIFT;	/* Set the return dma address */
+	ret = entry << IOMMU_PAGE_SHIFT_4K;	/* Set the return dma address */
 
 	/* Put the TCEs in the HW table */
 	build_fail = ppc_md.tce_build(tbl, entry, npages,
-				      (unsigned long)page & IOMMU_PAGE_MASK,
-				      direction, attrs);
+				      (unsigned long)page & IOMMU_PAGE_MASK_4K,
+				      direction, attrs);
 
 	/* ppc_md.tce_build() only returns non-zero for transient errors.
 	 * Clean up the table bitmap in this case and return
@@ -352,7 +352,7 @@ static bool iommu_free_check(struct iommu_table *tbl, dma_addr_t dma_addr,
 {
 	unsigned long entry, free_entry;
 
-	entry = dma_addr >> IOMMU_PAGE_SHIFT;
+	entry = dma_addr >> IOMMU_PAGE_SHIFT_4K;
[PATCH V2 2/3] powerpc iommu: Add it_page_shift field to determine iommu page size
This patch adds an it_page_shift field to struct iommu_table and
initialises it to 4K for all platforms.

Signed-off-by: Alistair Popple <alist...@popple.id.au>
---
 arch/powerpc/include/asm/iommu.h       |  1 +
 arch/powerpc/kernel/vio.c              |  5 +++--
 arch/powerpc/platforms/cell/iommu.c    |  8 +---
 arch/powerpc/platforms/pasemi/iommu.c  |  5 -
 arch/powerpc/platforms/powernv/pci.c   |  3 ++-
 arch/powerpc/platforms/pseries/iommu.c | 10 ++
 arch/powerpc/platforms/wsp/wsp_pci.c   |  5 +++--
 7 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 0869c7e..7c92834 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -76,6 +76,7 @@ struct iommu_table {
 	struct iommu_pool large_pool;
 	struct iommu_pool pools[IOMMU_NR_POOLS];
 	unsigned long *it_map;		/* A simple allocation bitmap for now */
+	unsigned long  it_page_shift;	/* table iommu page size */
 #ifdef CONFIG_IOMMU_API
 	struct iommu_group *it_group;
 #endif
diff --git a/arch/powerpc/kernel/vio.c b/arch/powerpc/kernel/vio.c
index 2e89fa3..170ac24 100644
--- a/arch/powerpc/kernel/vio.c
+++ b/arch/powerpc/kernel/vio.c
@@ -1177,9 +1177,10 @@ static struct iommu_table *vio_build_iommu_table(struct vio_dev *dev)
 			  &tbl->it_index, &offset, &size);
 
 	/* TCE table size - measured in tce entries */
-	tbl->it_size = size >> IOMMU_PAGE_SHIFT_4K;
+	tbl->it_page_shift = IOMMU_PAGE_SHIFT_4K;
+	tbl->it_size = size >> tbl->it_page_shift;
 	/* offset for VIO should always be 0 */
-	tbl->it_offset = offset >> IOMMU_PAGE_SHIFT_4K;
+	tbl->it_offset = offset >> tbl->it_page_shift;
 	tbl->it_busno = 0;
 	tbl->it_type = TCE_VB;
 	tbl->it_blocksize = 16;
diff --git a/arch/powerpc/platforms/cell/iommu.c b/arch/powerpc/platforms/cell/iommu.c
index fc61b90..2b90ff8 100644
--- a/arch/powerpc/platforms/cell/iommu.c
+++ b/arch/powerpc/platforms/cell/iommu.c
@@ -197,7 +197,7 @@ static int tce_build_cell(struct iommu_table *tbl, long index, long npages,
 
 	io_pte = (unsigned long *)tbl->it_base + (index - tbl->it_offset);
 
-	for (i = 0; i < npages; i++, uaddr += IOMMU_PAGE_SIZE_4K)
+	for (i = 0; i < npages; i++, uaddr += tbl->it_page_shift)
 		io_pte[i] = base_pte | (__pa(uaddr) & CBE_IOPTE_RPN_Mask);
 
 	mb();
@@ -487,8 +487,10 @@ cell_iommu_setup_window(struct cbe_iommu *iommu, struct device_node *np,
 	window->table.it_blocksize = 16;
 	window->table.it_base = (unsigned long)iommu->ptab;
 	window->table.it_index = iommu->nid;
-	window->table.it_offset = (offset >> IOMMU_PAGE_SHIFT_4K) + pte_offset;
-	window->table.it_size = size >> IOMMU_PAGE_SHIFT_4K;
+	window->table.it_page_shift = IOMMU_PAGE_SHIFT_4K;
+	window->table.it_offset =
+		(offset >> window->table.it_page_shift) + pte_offset;
+	window->table.it_size = size >> window->table.it_page_shift;
 
 	iommu_init_table(&window->table, iommu->nid);
diff --git a/arch/powerpc/platforms/pasemi/iommu.c b/arch/powerpc/platforms/pasemi/iommu.c
index 7d2d036..2e576f2 100644
--- a/arch/powerpc/platforms/pasemi/iommu.c
+++ b/arch/powerpc/platforms/pasemi/iommu.c
@@ -138,8 +138,11 @@ static void iommu_table_iobmap_setup(void)
 	pr_debug(" -> %s\n", __func__);
 	iommu_table_iobmap.it_busno = 0;
 	iommu_table_iobmap.it_offset = 0;
+	iommu_table_iobmap.it_page_shift = IOBMAP_PAGE_SHIFT;
+
 	/* it_size is in number of entries */
-	iommu_table_iobmap.it_size = 0x80000000 >> IOBMAP_PAGE_SHIFT;
+	iommu_table_iobmap.it_size =
+		0x80000000 >> iommu_table_iobmap.it_page_shift;
 
 	/* Initialize the common IOMMU code */
 	iommu_table_iobmap.it_base = (unsigned long)iob_l2_base;
diff --git a/arch/powerpc/platforms/powernv/pci.c b/arch/powerpc/platforms/powernv/pci.c
index 7f4d857..569b464 100644
--- a/arch/powerpc/platforms/powernv/pci.c
+++ b/arch/powerpc/platforms/powernv/pci.c
@@ -564,7 +564,8 @@ void pnv_pci_setup_iommu_table(struct iommu_table *tbl,
 {
 	tbl->it_blocksize = 16;
 	tbl->it_base = (unsigned long)tce_mem;
-	tbl->it_offset = dma_offset >> IOMMU_PAGE_SHIFT_4K;
+	tbl->it_page_shift = IOMMU_PAGE_SHIFT_4K;
+	tbl->it_offset = dma_offset >> tbl->it_page_shift;
 	tbl->it_index = 0;
 	tbl->it_size = tce_size >> 3;
 	tbl->it_busno = 0;
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 1b7531c..e029918 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -486,9 +486,10 @@ static void iommu_table_setparms(struct pci_controller *phb,
 	memset((void *)tbl->it_base, 0, *sizep);
 
 	tbl->it_busno = phb->bus->number;
+	tbl->it_page_shift = IOMMU_PAGE_SHIFT_4K;
 
 	/* Units of tce entries */
-
[PATCH V2 3/3] powerpc iommu: Update the generic code to use dynamic iommu page sizes
This patch updates the generic iommu backend code to use the it_page_shift
field to determine the iommu page size instead of using hardcoded values.

Signed-off-by: Alistair Popple <alist...@popple.id.au>
---
 arch/powerpc/include/asm/iommu.h     |   19 +---
 arch/powerpc/kernel/dma-iommu.c      |    4 +-
 arch/powerpc/kernel/iommu.c          |   88 ++
 arch/powerpc/kernel/vio.c            |   25 +++---
 arch/powerpc/platforms/powernv/pci.c |    2 -
 drivers/net/ethernet/ibm/ibmveth.c   |   15 +++---
 6 files changed, 88 insertions(+), 65 deletions(-)

diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 7c92834..f7a8036 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -35,17 +35,14 @@
 #define IOMMU_PAGE_MASK_4K	(~((1 << IOMMU_PAGE_SHIFT_4K) - 1))
 #define IOMMU_PAGE_ALIGN_4K(addr)	_ALIGN_UP(addr, IOMMU_PAGE_SIZE_4K)

+#define IOMMU_PAGE_SIZE(tblptr) (ASM_CONST(1) << (tblptr)->it_page_shift)
+#define IOMMU_PAGE_MASK(tblptr) (~((1 << (tblptr)->it_page_shift) - 1))
+#define IOMMU_PAGE_ALIGN(addr, tblptr) _ALIGN_UP(addr, IOMMU_PAGE_SIZE(tblptr))
+
 /* Boot time flags */
 extern int iommu_is_off;
 extern int iommu_force_on;

-/* Pure 2^n version of get_order */
-static __inline__ __attribute_const__ int get_iommu_order(unsigned long size)
-{
-	return __ilog2((size - 1) >> IOMMU_PAGE_SHIFT_4K) + 1;
-}
-
-
 /*
  * IOMAP_MAX_ORDER defines the largest contiguous block
  * of dma space we can get. IOMAP_MAX_ORDER = 13
@@ -82,6 +79,14 @@ struct iommu_table {
 #endif
 };

+/* Pure 2^n version of get_order */
+static inline __attribute_const__
+int get_iommu_order(unsigned long size, struct iommu_table *tbl)
+{
+	return __ilog2((size - 1) >> tbl->it_page_shift) + 1;
+}
+
+
 struct scatterlist;

 static inline void set_iommu_table_base(struct device *dev, void *base)
diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
index 5cfe3db..54d0116 100644
--- a/arch/powerpc/kernel/dma-iommu.c
+++ b/arch/powerpc/kernel/dma-iommu.c
@@ -83,10 +83,10 @@ static int dma_iommu_dma_supported(struct device *dev, u64 mask)
 		return 0;
 	}

-	if (tbl->it_offset > (mask >> IOMMU_PAGE_SHIFT_4K)) {
+	if (tbl->it_offset > (mask >> tbl->it_page_shift)) {
 		dev_info(dev, "Warning: IOMMU offset too big for device mask\n");
 		dev_info(dev, "mask: 0x%08llx, table offset: 0x%08lx\n",
-				mask, tbl->it_offset << IOMMU_PAGE_SHIFT_4K);
+				mask, tbl->it_offset << tbl->it_page_shift);
 		return 0;
 	} else
 		return 1;
diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index df4a7f1..f58d813 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -251,14 +251,13 @@ again:

 	if (dev)
 		boundary_size = ALIGN(dma_get_seg_boundary(dev) + 1,
-				      1 << IOMMU_PAGE_SHIFT_4K);
+				      1 << tbl->it_page_shift);
 	else
-		boundary_size = ALIGN(1UL << 32, 1 << IOMMU_PAGE_SHIFT_4K);
+		boundary_size = ALIGN(1UL << 32, 1 << tbl->it_page_shift);
 	/* 4GB boundary for iseries_hv_alloc and iseries_hv_map */

-	n = iommu_area_alloc(tbl->it_map, limit, start, npages,
-			     tbl->it_offset, boundary_size >> IOMMU_PAGE_SHIFT_4K,
-			     align_mask);
+	n = iommu_area_alloc(tbl->it_map, limit, start, npages, tbl->it_offset,
+			     boundary_size >> tbl->it_page_shift, align_mask);

 	if (n == -1) {
 		if (likely(pass == 0)) {
 			/* First try the pool from the start */
@@ -320,12 +319,12 @@ static dma_addr_t iommu_alloc(struct device *dev, struct iommu_table *tbl,
 		return DMA_ERROR_CODE;

 	entry += tbl->it_offset;	/* Offset into real TCE table */
-	ret = entry << IOMMU_PAGE_SHIFT_4K;	/* Set the return dma address */
+	ret = entry << tbl->it_page_shift;	/* Set the return dma address */

 	/* Put the TCEs in the HW table */
 	build_fail = ppc_md.tce_build(tbl, entry, npages,
-				      (unsigned long)page & IOMMU_PAGE_MASK_4K,
-				      direction, attrs);
+				      (unsigned long)page &
+				      IOMMU_PAGE_MASK(tbl), direction, attrs);

 	/* ppc_md.tce_build() only returns non-zero for transient errors.
	 * Clean up the table bitmap in this case and return
@@ -352,7 +351,7 @@ static bool iommu_free_check(struct iommu_table *tbl, dma_addr_t dma_addr,
 {
 	unsigned long entry, free_entry;

-	entry = dma_addr >> IOMMU_PAGE_SHIFT_4K;
+	entry = dma_addr >> tbl->it_page_shift;
 	free_entry = entry - tbl->it_offset;

 	if (((free_entry + npages) > tbl->it_size) ||
@@