Re: [PATCH v2 2/5] perf/x86/intel: Record branch type

2017-04-07 Thread Andi Kleen
> > It's a somewhat common situation with partially JITed code, if you > > don't have an agent. You can still do a lot of useful things. > > Like what? How can you say anything about code you don't have? For example if you combine the PMU topdown measurement, and see if it's frontend bound, and

Re: [PATCH v2 1/2] fadump: reduce memory consumption for capture kernel

2017-04-07 Thread Hari Bathini
Hi Michael, On Friday 07 April 2017 07:16 PM, Michael Ellerman wrote: Hari Bathini writes: On Friday 07 April 2017 07:24 AM, Michael Ellerman wrote: My preference would be that the fadump kernel "just works". If it's using too much memory then the fadump kernel

Re: [PATCH v2 2/5] perf/x86/intel: Record branch type

2017-04-07 Thread Peter Zijlstra
On Fri, Apr 07, 2017 at 09:48:34AM -0700, Andi Kleen wrote: > On Fri, Apr 07, 2017 at 05:20:31PM +0200, Peter Zijlstra wrote: > > On Fri, Apr 07, 2017 at 06:47:43PM +0800, Jin Yao wrote: > > > Perf already has support for disassembling the branch instruction > > > and using the branch type for

Re: [PATCH v2 2/5] perf/x86/intel: Record branch type

2017-04-07 Thread Andi Kleen
On Fri, Apr 07, 2017 at 05:20:31PM +0200, Peter Zijlstra wrote: > On Fri, Apr 07, 2017 at 06:47:43PM +0800, Jin Yao wrote: > > Perf already has support for disassembling the branch instruction > > and using the branch type for filtering. The patch just records > > the branch type in

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-07 Thread Will Deacon
On Fri, Apr 07, 2017 at 01:30:11AM +1000, Nicholas Piggin wrote: > On Thu, 6 Apr 2017 15:13:53 +0100 > Will Deacon wrote: > > On Thu, Apr 06, 2017 at 10:59:58AM +1000, Nicholas Piggin wrote: > > > Thanks for taking a look. The default spin primitives should just > > >

Re: [PATCH V4] powerpc/hugetlb: Add ABI defines for supported HugeTLB page sizes

2017-04-07 Thread Paul Clarke
nits... take 'em or leave 'em... On 04/07/2017 08:01 AM, Michael Ellerman wrote: Anshuman Khandual writes: And I reworded the comment to make it clearer (I think) that most users shouldn't need to use these, and should just use the default size: /* * When

Re: [PATCH v2 2/5] perf/x86/intel: Record branch type

2017-04-07 Thread Peter Zijlstra
On Fri, Apr 07, 2017 at 06:47:43PM +0800, Jin Yao wrote: > Perf already has support for disassembling the branch instruction > and using the branch type for filtering. The patch just records > the branch type in perf_branch_entry. > > Before recording, the patch converts the x86 branch

Re: [PATCH V4] powerpc/hugetlb: Add ABI defines for supported HugeTLB page sizes

2017-04-07 Thread Anshuman Khandual
On 04/07/2017 06:31 PM, Michael Ellerman wrote: > Anshuman Khandual writes: > >> This just adds user space exported ABI definitions for 2MB, 16MB, 1GB, >> 16GB non default huge page sizes to be used with mmap() system call. > > I updated this for you to include all
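
For readers unfamiliar with how a huge page size is passed to mmap(), here is a minimal sketch of the flag encoding, assuming the generic Linux convention of storing log2 of the page size in the flag bits starting at MAP_HUGE_SHIFT (26) with a 6-bit mask. The helper name `map_huge_flag` is hypothetical, and the exact macro names added by the patch are not visible in the snippet above, so none are claimed here.

```python
# Sketch of the mmap() huge page size encoding (assumed generic Linux
# convention): log2(page size) is stored in the flag bits starting at
# MAP_HUGE_SHIFT. The helper below is illustrative, not from the patch.
MAP_HUGE_SHIFT = 26
MAP_HUGE_MASK = 0x3F


def map_huge_flag(size_bytes: int) -> int:
    """Return the flag bits that select a huge page of size_bytes."""
    if size_bytes <= 0 or size_bytes & (size_bytes - 1):
        raise ValueError("huge page size must be a power of two")
    log2_size = size_bytes.bit_length() - 1
    if log2_size > MAP_HUGE_MASK:
        raise ValueError("huge page size does not fit in the 6-bit field")
    return log2_size << MAP_HUGE_SHIFT


# The four non-default sizes named in the patch description.
SIZES = {
    "2MB": 2 << 20,
    "16MB": 16 << 20,
    "1GB": 1 << 30,
    "16GB": 16 << 30,
}

if __name__ == "__main__":
    for name, size in SIZES.items():
        print(f"{name:>5}: log2={size.bit_length() - 1} flag={map_huge_flag(size):#x}")
```

In a real program the resulting value would be OR-ed with MAP_HUGETLB in the mmap() flags argument; as the thread notes, most users should not need this and can rely on the default huge page size.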

[PATCH V4 6/7] cxl: Isolate few psl8 specific calls

2017-04-07 Thread Christophe Lombard
Point out the specific Coherent Accelerator Interface Architecture, level 1, registers. Code and functions specific to PSL8 (CAIA1) must be framed. Signed-off-by: Christophe Lombard --- drivers/misc/cxl/context.c | 28 +++- drivers/misc/cxl/cxl.h

[PATCH V4 4/7] cxl: Update implementation service layer

2017-04-07 Thread Christophe Lombard
The service layer API (in cxl.h) lists some low-level functions whose implementation is different on PSL8, PSL9 and XSL: - Init implementation for the adapter and the afu. - Invalidate TLB/SLB. - Attach process for dedicated/directed models. - Handle psl interrupts. - Debug registers for the

[PATCH V4 7/7] cxl: Add psl9 specific code

2017-04-07 Thread Christophe Lombard
The new Coherent Accelerator Interface Architecture, level 2, for the IBM POWER9 brings new content and features: - POWER9 Service Layer - Registers - Radix mode - Process element entry - Dedicated-Shared Process Programming Model - Translation Fault Handling - CAPP - Memory Context ID If a

[PATCH V4 5/7] cxl: Rename some psl8 specific functions

2017-04-07 Thread Christophe Lombard
Rename a few functions, changing the '_psl' suffix to '_psl8', to make clear that the implementation is psl8 specific. Those functions will have an equivalent implementation for the psl9 in a later patch. Signed-off-by: Christophe Lombard --- drivers/misc/cxl/cxl.h

[PATCH V4 1/7] cxl: Read vsec perst load image

2017-04-07 Thread Christophe Lombard
This bit is used to cause a flash image load for programmable CAIA-compliant implementation. If this bit is set to ‘0’, a power cycle of the adapter is required to load a programmable CAIA-compliant implementation from flash. This field will be used by the following patches. Signed-off-by:

[PATCH V4 2/7] cxl: Remove unused values in bare-metal environment.

2017-04-07 Thread Christophe Lombard
The two fields pid and tid, located in the structure cxl_irq_info, are only used in the guest environment. To avoid confusion, it's not necessary to fill the fields in the bare-metal environment. Pid_tid is now renamed to 'reserved' to avoid undefined behavior on bare-metal. The PSL

[PATCH V4 3/7] cxl: Keep track of mm struct associated with a context

2017-04-07 Thread Christophe Lombard
The mm_struct corresponding to the current task is acquired each time an interrupt is raised. So to simplify the code, we only get the mm_struct when attaching an AFU context to the process. The mm_count reference is increased to ensure that the mm_struct can't be freed. The mm_struct will be

[PATCH V4 0/7] cxl: Add support for Coherent Accelerator Interface Architecture 2.0

2017-04-07 Thread Christophe Lombard
This series adds support for a cxl card which supports the Coherent Accelerator Interface Architecture 2.0. It requires an IBM Power9 system and the Power Service Layer, version 9. The PSL provides the address translation and system memory cache for CAIA compliant Accelerators. The PSL attaches to

RE: [7/7] crypto: caam/qi - add ablkcipher and authenc algorithms

2017-04-07 Thread Laurentiu Tudor
-Original Message- From: Michael Ellerman [mailto:m...@ellerman.id.au] Sent: Friday, April 07, 2017 4:22 PM To: Laurentiu Tudor ; Horia Geantă ; Herbert Xu ; Scott Wood ; Roy Pledge

Re: [PATCH v2 1/2] fadump: reduce memory consumption for capture kernel

2017-04-07 Thread Michael Ellerman
Hari Bathini writes: > On Friday 07 April 2017 07:24 AM, Michael Ellerman wrote: >> My preference would be that the fadump kernel "just works". If it's >> using too much memory then the fadump kernel should do whatever it needs >> to use less memory, eg. shrinking

Re: [7/7] crypto: caam/qi - add ablkcipher and authenc algorithms

2017-04-07 Thread Michael Ellerman
Laurentiu Tudor writes: > On 04/05/2017 01:06 PM, Michael Ellerman wrote: >> Laurentiu Tudor writes: >> >>> Hi Michael, >>> >>> Just a couple of basic things to check: >>>- was the dtb updated to the newest? >> >> Possibly not, it's an

Re: [PATCH V4] powerpc/hugetlb: Add ABI defines for supported HugeTLB page sizes

2017-04-07 Thread Michael Ellerman
Anshuman Khandual writes: > This just adds user space exported ABI definitions for 2MB, 16MB, 1GB, > 16GB non default huge page sizes to be used with mmap() system call. I updated this for you to include all the sizes. > diff --git

[PATCH 5/5] powerpc/powernv: POWER9 support for msgsnd/doorbell IPI

2017-04-07 Thread Nicholas Piggin
POWER9 requires msgsync for receiver-side synchronization, and a DD1 workaround that uses the darn instruction. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/dbell.h | 8 arch/powerpc/include/asm/feature-fixups.h | 20

[PATCH 4/5] powerpc/64s: avoid branch for ppc_msgsnd

2017-04-07 Thread Nicholas Piggin
Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/dbell.h | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/dbell.h b/arch/powerpc/include/asm/dbell.h index 4db4cfdd829c..8ad66ccb7180 100644 ---

[PATCH 3/5] powerpc: Introduce msgsnd/doorbell barrier primitives

2017-04-07 Thread Nicholas Piggin
POWER9 changes requirements and adds new instructions for synchronization. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/dbell.h | 22 ++ arch/powerpc/include/asm/smp.h | 1 + arch/powerpc/kernel/dbell.c | 8 +---

[PATCH 2/5] powerpc: change the doorbell IPI calling convention

2017-04-07 Thread Nicholas Piggin
Change the doorbell callers to know about their msgsnd addressing, rather than have them set a per-cpu target data tag at boot that gets sent to the cause_ipi functions. The data is only used for doorbell IPI functions, no other IPI types, so it makes sense to keep that detail local to doorbell.

[PATCH 1/5] powerpc/pseries: do not use msgsndp doorbells on POWER9 guests

2017-04-07 Thread Nicholas Piggin
POWER9 hypervisors will not necessarily run guest threads together on the same core at the same time, so msgsndp should not be used. Signed-off-by: Nicholas Piggin --- arch/powerpc/platforms/pseries/smp.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git

[PATCH 0/5] doorbell patches for POWER9

2017-04-07 Thread Nicholas Piggin
This is what I'd like to do for POWER9 doorbells, which reworks the existing code a bit. I guess it won't work on DD1 with OPAL until darn is fixed (only tested on POWER9 using mambo). Nicholas Piggin (5): powerpc/pseries: do not use msgsndp doorbells on POWER9 guests powerpc: change the

Re: [PATCH] powerpc/mm: Remove reduntant initmem information from log

2017-04-07 Thread Michael Ellerman
Anshuman Khandual writes: > Generic core VM already prints this information in the log > buffer, hence there is no need for a second print. This just > removes the second print from arch powerpc NUMA init path. > > Before the patch: > > $dmesg | grep "Initmem" > >

Re: kselftest:lost_exception_test failure with 4.11.0-rc5

2017-04-07 Thread Michael Ellerman
Sachin Sant writes: > I have run into few instances where the lost_exception_test from > powerpc kselftest fails with SIGABRT. Following o/p is against > 4.11.0-rc5. The failure is intermittent. What hardware are you on? How long does it take to run when it fails?

Re: [RFC PATCH 7/7] powerpc/hugetlb: Enable hugetlb migration for ppc64

2017-04-07 Thread Anshuman Khandual
On 04/04/2017 07:34 PM, Aneesh Kumar K.V wrote: > Signed-off-by: Aneesh Kumar K.V > --- > arch/powerpc/platforms/Kconfig.cputype | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/arch/powerpc/platforms/Kconfig.cputype >

Re: [RFC PATCH 5/7] mm/follow_page_mask: Add support for hugetlb pgd entries.

2017-04-07 Thread Anshuman Khandual
On 04/04/2017 07:34 PM, Aneesh Kumar K.V wrote: > ppc64 supports pgd hugetlb entries. Add code to handle hugetlb pgd entries to > follow_page_mask so that ppc64 can switch to it to handle hugetlb entries. > > Signed-off-by: Aneesh Kumar K.V This was exactly

Re: [RFC PATCH 4/7] mm/follow_page_mask: Add support for hugepage directory entry

2017-04-07 Thread Anshuman Khandual
On 04/04/2017 07:34 PM, Aneesh Kumar K.V wrote: > The default implementation prints a warning and returns NULL. We will add ppc64 > support in later patches. The description is not sufficient. The patch makes the entire follow page mask function aware of hugepd based implementation at PGD, PUD and

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-07 Thread Nicholas Piggin
On Fri, 7 Apr 2017 11:43:49 +0200 Peter Zijlstra wrote: > On Thu, Apr 06, 2017 at 10:31:46AM -0700, Linus Torvalds wrote: > > But maybe "monitor" is really cheap. I suspect it's microcoded, > > though, which implies "no". > > On my IVB-EP (will also try on something

Re: [RFC PATCH 3/7] mm/hugetlb: export hugetlb_entry_migration helper

2017-04-07 Thread Anshuman Khandual
On 04/04/2017 07:34 PM, Aneesh Kumar K.V wrote: > We will be using this later from the ppc64 code. Change the return type to > bool. How were all other architectures able to detect hugetlb migration entries without using this helper function before?

Re: [RFC PATCH 2/7] mm/follow_page_mask: Split follow_page_mask to smaller functions.

2017-04-07 Thread Anshuman Khandual
On 04/04/2017 07:34 PM, Aneesh Kumar K.V wrote: > Makes code reading easy. No functional changes in this patch. The description should mention how the follow function is broken down into PGD follow, PUD follow and PMD follow on a 4-level page table system. Needs to be a bit more verbose.

Re: [RFC PATCH 1/7] mm/hugetlb/migration: Use set_huge_pte_at instead of set_pte_at

2017-04-07 Thread Anshuman Khandual
On 04/04/2017 07:34 PM, Aneesh Kumar K.V wrote: > The right interface to use to set a hugetlb pte entry is set_huge_pte_at. Use > that instead of set_pte_at. > Though set_huge_pte_at() calls set_pte_at() on powerpc, changing this in the generic code makes sense.

[PATCH] ibmveth: Support to enable LSO/CSO for Trunk VEA.

2017-04-07 Thread Sivakumar Krishnasamy
Enable largesend and checksum offload for ibmveth configured in trunk mode. Added support to SKB frag_list in TX path by skb_linearize'ing such SKBs. Signed-off-by: Sivakumar Krishnasamy --- drivers/net/ethernet/ibm/ibmveth.c | 102 ++---

Re: [RFC][PATCH] spin loop arch primitives for busy waiting

2017-04-07 Thread Peter Zijlstra
On Thu, Apr 06, 2017 at 10:31:46AM -0700, Linus Torvalds wrote: > But maybe "monitor" is really cheap. I suspect it's microcoded, > though, which implies "no". On my IVB-EP (will also try on something newer): MONITOR ~332 cycles MWAIT ~224 cycles (C0, explicitly invalidated MONITOR) So yes,

Re: [PATCH v2 1/2] fadump: reduce memory consumption for capture kernel

2017-04-07 Thread Hari Bathini
On Friday 07 April 2017 12:54 PM, Hari Bathini wrote: Hi Michael, On Friday 07 April 2017 07:24 AM, Michael Ellerman wrote: Hari Bathini writes: In case of fadump, capture (fadump) kernel boots like a normal kernel. While this has its advantages, the capture

kselftest:lost_exception_test failure with 4.11.0-rc5

2017-04-07 Thread Sachin Sant
I have run into few instances where the lost_exception_test from powerpc kselftest fails with SIGABRT. Following o/p is against 4.11.0-rc5. The failure is intermittent. When the test fails it is killed due to SIGABRT. # ./lost_exception_test test: lost_exception tags: git_version:unknown

Re: [PATCH v2 1/2] fadump: reduce memory consumption for capture kernel

2017-04-07 Thread Hari Bathini
Hi Michael, On Friday 07 April 2017 07:24 AM, Michael Ellerman wrote: Hari Bathini writes: In case of fadump, capture (fadump) kernel boots like a normal kernel. While this has its advantages, the capture kernel would initialize all the components like normal

[PATCH] powerpc/mm: Remove reduntant initmem information from log

2017-04-07 Thread Anshuman Khandual
Generic core VM already prints this information in the log buffer, hence there is no need for a second print. This just removes the second print from arch powerpc NUMA init path. Before the patch: $dmesg | grep "Initmem" numa: Initmem setup node 0 [mem 0x-0x] numa: Initmem

Re: [PATCH V3 7/7] cxl: Add psl9 specific code

2017-04-07 Thread christophe lombard
On 03/04/2017 at 15:05, Frederic Barrat wrote: On 28/03/2017 at 17:14, Christophe Lombard wrote: The new Coherent Accelerator Interface Architecture, level 2, for the IBM POWER9 brings new content and features: - POWER9 Service Layer - Registers - Radix mode - Process element entry -

Re: [PATCH 2/3] of/fdt: introduce of_scan_flat_dt_subnodes and of_get_flat_dt_phandle

2017-04-07 Thread Michael Ellerman
Rob Herring writes: > On Wed, Apr 5, 2017 at 9:32 AM, Nicholas Piggin wrote: >> On Wed, 5 Apr 2017 08:35:06 -0500 >> Rob Herring wrote: >> >>> On Wed, Apr 5, 2017 at 7:37 AM, Nicholas Piggin wrote: >>> > Introduce

Re: [PATCH v8 2/3] powerpc/book3s: EXPORT_SYMBOL machine_check_print_event_info

2017-04-07 Thread Michael Ellerman
Mahesh J Salgaonkar writes: > From: Mahesh Salgaonkar > > It will be used in arch/powerpc/kvm/book3s_hv.c KVM module. > > Signed-off-by: Mahesh Salgaonkar > --- > arch/powerpc/kernel/mce.c |1 + > 1 file

Re: [v2 5/5] mm: teach platforms not to zero struct pages memory

2017-04-07 Thread Aneesh Kumar K.V
Heiko Carstens writes: > On Fri, Mar 24, 2017 at 03:19:52PM -0400, Pavel Tatashin wrote: >> If we are using deferred struct page initialization feature, most of >> "struct page"es are getting initialized after other CPUs are started, and >> hence we are benefiting from