Re: [PATCH for-4.8 03/12] powerpc/mm: use _raw variant of page table accessors

2016-07-16 Thread Anton Blanchard via Linuxppc-dev
Hi David, > > This switches a few of the page table accessors to use the __raw variant > > and does the cpu to big endian conversion of constants. This helps > > in generating better code. > > It might be better to say that checks for a value being 0 don't depend > on the endianness. > > In which

Re: [patch V2 30/67] powerpc/numa: Convert to hotplug state machine

2016-07-15 Thread Anton Blanchard via Linuxppc-dev
> > I noticed tip started failing in my CI environment which tests on > > QEMU. The failure bisected to commit > > 425209e0abaf2c6e3a90ce4fedb935c10652bf80 > > That's very useful, thanks Anton! > > I have removed this commit from the series for the time being, > refactored the followup

Re: [patch V2 30/67] powerpc/numa: Convert to hotplug state machine

2016-07-14 Thread Anton Blanchard via Linuxppc-dev
Hi Anna-Maria, > >> Install the callbacks via the state machine and let the core invoke > >> the callbacks on the already online CPUs. > > > > This is causing an oops on ppc64le QEMU, looks like a NULL > > pointer: > > Did you test it against tip WIP.hotplug? I noticed tip started failing

Re: [patch V2 30/67] powerpc/numa: Convert to hotplug state machine

2016-07-14 Thread Anton Blanchard via Linuxppc-dev
Hi, > From: Sebastian Andrzej Siewior > > Install the callbacks via the state machine and let the core invoke > the callbacks on the already online CPUs. This is causing an oops on ppc64le QEMU, looks like a NULL pointer: percpu: Embedded 3 pages/cpu @c0001fe0

Re: [PATCH] powerpc/configs: Enable VMX crypto

2016-07-11 Thread Anton Blanchard via Linuxppc-dev
Hi Steven, > Not in ppc64_defconfig? Good point. The recent addition of powernv_defconfig made me forget about ppc64_defconfig. We could do with some rationalisation here. pseries isn't really pseries, perhaps we should call it ibm_defconfig, or maybe server_defconfig. ppc64_defconfig continues

cpuidle broken on mainline

2016-06-22 Thread Anton Blanchard via Linuxppc-dev
Hi, I was noticing some pretty big run to run variations on single threaded benchmarks, and I've isolated it to cpuidle issues. If I look at the cpuidle tracepoint, I notice we only go into the snooze state. Do we have any known bugs in cpuidle at the moment? While looking around, I also noticed

Re: [3/3] powerpc: Avoid load hit store when using find_linux_pte_or_hugepte()

2016-05-31 Thread Anton Blanchard via Linuxppc-dev
Hi Michael, > I'd really rather __find_linux_pte_or_hugepte() was an internal > detail, rather than the standard API. > > We do already have quite a few uses, but adding more just further > spreads the details about how the implementation works. > > So I'm going to drop this in favor of

Re: [PATCH 1/3] powerpc: Avoid load hit store in __giveup_fpu() and __giveup_altivec()

2016-05-31 Thread Anton Blanchard via Linuxppc-dev
> Huh? Make it an unsigned long please, which is the type of the msr > field in struct pt_regs to work on both 32 and 64 bit processors. Thanks, not sure what I was thinking there. Will respin. Anton

Re: [PATCH] powerpc: inline current_stack_pointer()

2016-05-31 Thread Anton Blanchard via Linuxppc-dev
Hi, > current_stack_pointer() is a single instruction function. > It is not worth breaking the execution flow with a bl/blr for a > single instruction Check out bfe9a2cfe91a ("powerpc: Reimplement __get_SP() as a function not a define") to see why we made it a function. Anton

Re: [PATCH V2 04/68] powerpc/mm: Use big endian page table for book3s 64

2016-05-30 Thread Anton Blanchard via Linuxppc-dev
Hi, > I see the same issue in unmap_page_range(), __hash_page_64K(), > handle_mm_fault(). This looks to be about 10% slower on POWER8: #include #include #include #define ITERATIONS 1000 #define MEMSIZE (128 * 1024 * 1024) int main(void) { unsigned long i = ITERATIONS;

Re: [PATCH V2 04/68] powerpc/mm: Use big endian page table for book3s 64

2016-05-29 Thread Anton Blanchard via Linuxppc-dev
Hi Ben, > That is surprising, do we have any idea what specifically increases > the overhead so significantly ? Does gcc know about ldbrx/stdbrx ? I > notice in our io.h for example we still do manual ld/std + swap > because old processors didn't know these, we should fix that for > CONFIG_POWER8

Re: [PATCH 2/3] powerpc: Avoid load hit store in setup_sigcontext()

2016-05-29 Thread Anton Blanchard via Linuxppc-dev
Hi, > On Sun, 2016-05-29 at 22:03 +1000, Anton Blanchard wrote: > > From: Anton Blanchard > > > > In setup_sigcontext(), we set current->thread.vrsave then use it > > straight after. Since current is hidden from the compiler via inline > > assembly, it cannot optimise this and

Re: [PATCH V2 04/68] powerpc/mm: Use big endian page table for book3s 64

2016-05-29 Thread Anton Blanchard via Linuxppc-dev
Hi, > This enables us to share the same page table code for > both radix and hash. Radix uses a hardware defined big endian > page table This is measurably worse (a little over 2% on POWER8) on a futex microbenchmark: #define _GNU_SOURCE #include #include #include #define ITERATIONS 1000

[PATCH 2/2] powerpc: Align hot loops of some string functions

2016-05-25 Thread Anton Blanchard via Linuxppc-dev
Align the hot loops in our assembly implementation of strncpy(), strncmp() and memchr(). Signed-off-by: Anton Blanchard --- Index: linux.junk/arch/powerpc/lib/string.S === ---

[PATCH 1/2] powerpc: Remove assembly versions of strcpy, strcat, strlen and strcmp

2016-05-25 Thread Anton Blanchard via Linuxppc-dev
A number of our assembly implementations of string functions do not align their hot loops. I was going to align them manually, but I realised that they are almost instruction for instruction identical to what gcc produces, with the advantage that gcc does align them. In light of that, let's

[PATCH] powerpc: Improve comment explaining why we modify VRSAVE

2016-05-19 Thread Anton Blanchard via Linuxppc-dev
The comment explaining why we modify VRSAVE is misleading: glibc does rely on the behaviour. Update the comment. Signed-off-by: Anton Blanchard --- diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S index 1c2e7a3..3907fcf 100644 ---

[PATCH v2] spapr: Don't set the TM ibm,pa-features bit in PR KVM mode

2016-04-29 Thread Anton Blanchard via Linuxppc-dev
We don't support transactional memory in PR KVM, so don't tell the OS that we do. Signed-off-by: Anton Blanchard --- v2: Fix build with CONFIG_KVM disabled, noticed by Alex. diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index b69995e..dc3e3c9 100644 --- a/hw/ppc/spapr.c +++

[PATCH] powerpc: create_zero_mask() has bad inline assembly constraint

2016-04-29 Thread Anton Blanchard via Linuxppc-dev
In create_zero_mask() we have: addi %1,%2,-1; andc %1,%1,%2; popcntd %0,%1 using the "r" constraint for %2. r0 is a valid register in the "r" set, but addi X,r0,X turns it into an li: li r7,-1; andc r7,r7,r0; popcntd r4,r7. Fix this by
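
The three-instruction sequence above can be modelled in portable C. This is only an illustrative sketch, not the kernel's code: the function names are made up for this example, and `__builtin_popcountll` assumes GCC or Clang. The second function models the failure described in the message, where addi reads r0 as the literal value 0 and the first instruction therefore behaves like li reg,-1.

```c
#include <stdint.h>

/*
 * Intended computation: addi (bits - 1), andc (& ~bits), popcntd.
 * (bits - 1) & ~bits keeps only the bits below the lowest set bit,
 * so the population count is the index of that lowest set bit.
 * Hypothetical helper name, for illustration only.
 */
static inline unsigned int zero_mask_intended(uint64_t bits)
{
    uint64_t below = (bits - 1) & ~bits;
    return (unsigned int)__builtin_popcountll(below);
}

/*
 * Model of the miscompiled sequence when %2 is allocated to r0:
 * "addi X,r0,-1" does not read r0, it loads -1 (li X,-1), so the
 * result becomes popcount(~bits) instead.
 */
static inline unsigned int zero_mask_r0_hazard(uint64_t bits)
{
    uint64_t below = (uint64_t)-1 & ~bits;
    return (unsigned int)__builtin_popcountll(below);
}
```

For an input of 0x8 the intended version returns 3, while the r0 model returns 63, which shows why the register choice for %2 matters.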

[PATCH 3/3] powerpc: Update TM user feature bits in scan_features()

2016-04-14 Thread Anton Blanchard via Linuxppc-dev
We need to update the user TM feature bits (PPC_FEATURE2_HTM and PPC_FEATURE2_HTM_NOSC) to mirror what we do with the kernel TM feature bit. At the moment, if firmware reports TM is not available we turn off the kernel TM feature bit but leave the userspace ones on. Userspace thinks it can execute TM

[PATCH 2/3] powerpc: Update cpu_user_features2 in scan_features()

2016-04-14 Thread Anton Blanchard via Linuxppc-dev
scan_features() updates cpu_user_features but not cpu_user_features2. Amongst other things, cpu_user_features2 contains the user TM feature bits which we must keep in sync with the kernel TM feature bit. Signed-off-by: Anton Blanchard Cc: sta...@vger.kernel.org ---

[PATCH 1/3] powerpc: scan_features() updates incorrect bits

2016-04-14 Thread Anton Blanchard via Linuxppc-dev
The real LE feature entry in the ibm_pa_feature struct has the wrong number of elements. Instead of checking for byte 5, bit 0, we check for byte 0, bit 0, and we also incorrectly update cpu user feature bit 5. Fixes: 44ae3ab3358e ("powerpc: Free up some CPU feature bits by moving out

Re: [PATCH 1/3] powerpc: Complete FSCR context switch

2016-04-13 Thread Anton Blanchard via Linuxppc-dev
Hi Jack, > Previously we just saved the FSCR, but only restored it in some > settings, and never copied it thread to thread. This patch always > restores the FSCR and formalizes new threads inheriting its setting so > that later we can manipulate FSCR bits in start_thread. Will this break the

[PATCH] sched/cpuacct: Check for NULL when using task_pt_regs()

2016-04-13 Thread Anton Blanchard via Linuxppc-dev
task_pt_regs() can return NULL for kernel threads, so add a check. This fixes an oops at boot on ppc64. Fixes: d740037fac70 ("sched/cpuacct: Split usage accounting into user_usage and sys_usage") Signed-off-by: Anton Blanchard Reported-and-Tested-by: Srikar Dronamraju

[PATCH] sched/cpuacct: Check for NULL when using task_pt_regs()

2016-04-06 Thread Anton Blanchard via Linuxppc-dev
Hi Peter, > Ah, so something like: > > struct pt_regs *regs = task_pt_regs(); > int index = CPUACCT_USAGE_SYSTEM; > > if (regs && user_mode(regs)) > index = CPUACCT_USAGE_USER; > > should work, right? Looks good, and the patch below does fix the oops for me.
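
The NULL check quoted above can be exercised outside the kernel with stand-in types. The sketch below is a user-space mock, not kernel code: `struct pt_regs`, `user_mode()` and the enum values are simplified placeholders invented for this example, and only the shape of the check follows the quoted snippet.

```c
#include <stddef.h>

/* Placeholder index values; the real ones live in the scheduler code. */
enum cpuacct_index {
    CPUACCT_USAGE_USER,
    CPUACCT_USAGE_SYSTEM
};

/* Stand-in for the architecture's register frame. */
struct pt_regs {
    int from_user;
};

/* Stand-in for the kernel's user_mode(); dereferences regs, so regs must not be NULL. */
static int user_mode(const struct pt_regs *regs)
{
    return regs->from_user;
}

/*
 * Default to system time, and only switch to user time when a register
 * frame exists and says the task was in user mode. A NULL frame (the
 * kernel thread case from the message) never reaches user_mode().
 */
static enum cpuacct_index usage_index(const struct pt_regs *regs)
{
    enum cpuacct_index index = CPUACCT_USAGE_SYSTEM;

    if (regs && user_mode(regs))
        index = CPUACCT_USAGE_USER;

    return index;
}
```

Because `&&` short-circuits, a NULL frame is accounted as system time without being dereferenced.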

[PATCH] powerpc: Clear user CPU feature bits if TM is disabled at runtime

2016-04-04 Thread Anton Blanchard via Linuxppc-dev
In check_cpu_pa_features() we check a number of bits in the ibm,pa-features array and set and clear CPU features based on what we find. One of these bits is CPU_FTR_TM, the transactional memory feature bit. If this does disable TM at runtime, then we need to tell userspace about it by clearing

[PATCH] spapr: Don't set the TM ibm,pa-features bit in PR KVM mode

2016-04-04 Thread Anton Blanchard via Linuxppc-dev
We don't support transactional memory in PR KVM, so don't tell the OS that we do. Signed-off-by: Anton Blanchard --- diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index e7be21e..538bd87 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -696,6 +696,12 @@ static void

Re: PR KVM and TM issues

2016-04-04 Thread Anton Blanchard via Linuxppc-dev
Hi Alexey, > > I can't get an Ubuntu Wily guest to boot on an Ubuntu Wily host in > > PR KVM mode. The kernel in both cases is 4.2. To reproduce: > > > > wget -N > > https://cloud-images.ubuntu.com/wily/current/wily-server-cloudimg-ppc64el-disk1.img > > > > qemu-system-ppc64 -cpu POWER8

PR KVM and TM issues

2016-04-04 Thread Anton Blanchard via Linuxppc-dev
Hi, I can't get an Ubuntu Wily guest to boot on an Ubuntu Wily host in PR KVM mode. The kernel in both cases is 4.2. To reproduce: wget -N https://cloud-images.ubuntu.com/wily/current/wily-server-cloudimg-ppc64el-disk1.img qemu-system-ppc64 -cpu POWER8 -enable-kvm -machine pseries,kvm-type=PR

[PATCH] perf jit: genelf makes assumptions about endian

2016-03-29 Thread Anton Blanchard via Linuxppc-dev
Commit 9b07e27f88b9 ("perf inject: Add jitdump mmap injection support") incorrectly assumed that PowerPC is big endian only. Simplify things by consolidating the define of GEN_ELF_ENDIAN and checking for __BYTE_ORDER == __BIG_ENDIAN. The PowerPC checks were also incorrect, they do not match what
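
A consolidated define of the kind described above might look like the following sketch. It assumes a glibc-style `<endian.h>` providing `__BYTE_ORDER` and `__BIG_ENDIAN`, and `<elf.h>` for the `ELFDATA2MSB` / `ELFDATA2LSB` constants; `host_is_big_endian()` is a hypothetical helper added here only to cross-check the compile-time choice at run time.

```c
#include <elf.h>
#include <endian.h>
#include <stdint.h>

/* Pick the ELF data encoding from the compiler-visible byte order,
 * rather than from per-architecture guesses. */
#if __BYTE_ORDER == __BIG_ENDIAN
#define GEN_ELF_ENDIAN ELFDATA2MSB
#else
#define GEN_ELF_ENDIAN ELFDATA2LSB
#endif

/* Run-time probe: on a big endian host the first byte of 0x0102 is 0x01. */
static int host_is_big_endian(void)
{
    union {
        uint16_t value;
        uint8_t bytes[2];
    } probe = { 0x0102 };

    return probe.bytes[0] == 0x01;
}
```

Keying off the byte order macro means a little endian PowerPC build gets ELFDATA2LSB without any architecture-specific case.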

Re: [PATCH] powerpc/process: fix altivec SPR not being saved

2016-03-06 Thread Anton Blanchard via Linuxppc-dev
Hi Oliver, > save_sprs() in process.c contains the following test: > > if (cpu_has_feature(cpu_has_feature(CPU_FTR_ALTIVEC))) > t->vrsave = mfspr(SPRN_VRSAVE); > > CPU feature with the mask 0x1 is CPU_FTR_COHERENT_ICACHE so the test > is equivalent to: > > if
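
The effect of the doubled call can be shown with a small user-space mock. Everything here is a stand-in for illustration: `cpu_features` is a plain variable, and the `CPU_FTR_ALTIVEC` value is an assumed placeholder. Only the 0x1 mask for `CPU_FTR_COHERENT_ICACHE` is taken from the message above.

```c
#include <stdbool.h>

#define CPU_FTR_COHERENT_ICACHE 0x1UL  /* mask value stated in the message */
#define CPU_FTR_ALTIVEC         0x8UL  /* assumed value, illustration only */

/* Stand-in for the CPU's feature word. */
static unsigned long cpu_features;

static bool cpu_has_feature(unsigned long feature)
{
    return (cpu_features & feature) != 0;
}

/*
 * The inner call yields true or false, which converts to 1 or 0, so the
 * outer call ends up testing mask 0x1 (CPU_FTR_COHERENT_ICACHE) or mask 0.
 */
static bool altivec_test_nested(void)
{
    return cpu_has_feature(cpu_has_feature(CPU_FTR_ALTIVEC));
}

/* The test as presumably intended: a single lookup of the Altivec bit. */
static bool altivec_test_single(void)
{
    return cpu_has_feature(CPU_FTR_ALTIVEC);
}
```

With only the Altivec bit set, the nested form returns false while the single form returns true, so in this mock the guarded VRSAVE save would be skipped on a CPU that has Altivec but lacks the 0x1 feature.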

[no subject]

2016-02-06 Thread Anton Blanchard via Linuxppc-dev
> Since binutils 2.26 BFD is doing suffix merging on STRTAB sections. > But dedotify modifies the symbol names in place, which can also modify > unrelated symbols with a name that matches a suffix of a dotted > name. To remove the leading dot of a symbol name we can just >