Re: perf TUI fails with "failed to process type: 64"

2016-10-10 Thread Anton Blanchard
Hi Michael, > > 14c00-14c00 g exc_virt_0x4c00_system_call >^ >What's this? The address? If so it's wrong? Offset into the binary I think, there's one 64kB page of ELF gunk at the start. > Seems likely. But I can't see why. > > AFAICS we have never emitted a size for those symbols: >

Re: CPU hotplug hits oops in select_idle_sibling()

2016-10-08 Thread Anton Blanchard
Hi Peter, > > I updated to mainline as of today and tried CPU hotplug via the > > ppc64_cpu tool: > > http://lkml.kernel.org/r/1475922278-3306-1-git-send-email-wanpeng...@hotmail.com Thanks, this fixes the issue for me. Anton

sysrq-b fails miserably to reboot PowerNV box

2016-10-08 Thread Anton Blanchard
Hi, Unfortunately sysrq-b seems to tie us up in knots, instead of rebooting the box. This is mainline from today. Anton -- Trying to free IRQ 17 from IRQ context! [ cut here ] WARNING: CPU: 32 PID: 0 at kernel/irq/manage.c:1460 __free_irq+0x298/0x380 Modules linked in:

CPU hotplug hits oops in select_idle_sibling()

2016-10-08 Thread Anton Blanchard
Hi, I updated to mainline as of today and tried CPU hotplug via the ppc64_cpu tool: # ppc64_cpu --smt=off Segmentation fault Looks like a NULL pointer in select_idle_sibling(): Unable to handle kernel paging request for data at address 0x0078 Faulting instruction address:

[PATCH] powerpc: During context switch, check before setting mm_cpumask

2016-10-03 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> During context switch, switch_mm() sets our current CPU in mm_cpumask. We can avoid this atomic sequence in most cases by checking before setting the bit. Testing on a POWER8 using our context switch microbenchmark: tools/testing/selftests/p
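The check-before-set idea described in this patch can be sketched in portable C11. This is only an illustration, not the kernel code: `cpumask_word` and `mark_cpu_if_needed()` are hypothetical names, and a single 64-bit word stands in for mm_cpumask.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for mm_cpumask: one bit per CPU, at most 64 CPUs. */
typedef _Atomic uint64_t cpumask_word;

/*
 * Set the bit for `cpu` only if it is not already set.
 * Returns true when the atomic read-modify-write was actually issued.
 */
static bool mark_cpu_if_needed(cpumask_word *mask, unsigned int cpu)
{
    assert(mask != NULL);
    assert(cpu < 64);

    uint64_t bit = UINT64_C(1) << cpu;

    /* Common case: the bit is already set, so a plain load is enough. */
    if (atomic_load(mask) & bit)
        return false;

    /* Uncommon case: pay for the atomic OR. */
    atomic_fetch_or(mask, bit);
    return true;
}
```

The first call for a given CPU performs the atomic update; later calls for the same CPU return after the load alone.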

[PATCH] powerpc: Remove static branch prediction in atomic{, 64}_add_unless

2016-10-03 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> I see quite a lot of static branch mispredictions on a simple web serving workload. The issue is in __atomic_add_unless(), called from _atomic_dec_and_lock(). There is no obvious common case, so it is better to let the hardware predict the branch.

[PATCH] powerpc/eeh: Quieten EEH message when no adapters are found

2016-10-01 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> No real need for this to be pr_warn(), reduce it to pr_info(). Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/kernel/eeh.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/eeh.c b/

[PATCH] powerpc/pseries: Use H_CLEAR_HPT to clear MMU hash table during kexec

2016-10-01 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> An hcall was recently added that does exactly what we need during kexec - it clears the entire MMU hash table, ignoring any VRMA mappings. Try it and fall back to the old method if we get a failure. On a POWER8 box with 5TB of memory, this r

[PATCH 4/4] powerpc/configs: Enable Intel i40e on 64 bit configs

2016-09-30 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> We are starting to see i40e adapters in recent machines, so enable it in our configs. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/configs/powernv_defconfig | 1 + arch/powerpc/configs/ppc64_defconfig | 1 + arch/pow

[PATCH 3/4] powerpc/configs: Change a few things from built in to modules

2016-09-30 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> Change a few devices and filesystems that are seldom used any more from built in to modules. This reduces our vmlinux about 500kB. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/configs/powernv_defconfig | 14 +++-

[PATCH 2/4] powerpc/configs: Bump kernel ring buffer size on 64 bit configs

2016-09-30 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> When we issue a system reset, every CPU in the box prints an Oops, including a backtrace. Each of these can be quite large (over 4kB) and we may end up wrapping the ring buffer and losing important information. Bump the base size from 128kB to
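The arithmetic behind this concern can be sketched as follows, using only the 128kB and 4kB figures quoted above and assuming, for simplicity, that the buffer holds nothing but Oops text. `reports_before_wrap()` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of fixed-size reports that fit in a ring buffer before the
 * oldest one starts being overwritten.
 */
static size_t reports_before_wrap(size_t buffer_bytes, size_t report_bytes)
{
    assert(report_bytes > 0);
    return buffer_bytes / report_bytes;
}
```

With a 128kB buffer and 4kB per report, at most 32 reports fit, so a box with more CPUs than that would wrap the buffer on a system reset.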

[PATCH 1/4] powerpc/configs: Enable VMX crypto

2016-09-30 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> We see big improvements with the VMX crypto functions (often 10x or more), so enable it as a module. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/configs/powernv_defconfig | 2 ++ arch/powerpc/configs/ppc64_defconfig

Re: [PATCH 0/3] Fix crypto/vmx/p8_ghash memory corruption

2016-09-28 Thread Anton Blanchard
Hi Marcelo > This series fixes the memory corruption found by Jan Stancek in > 4.8-rc7. The problem however also affects previous versions of the > driver. If it affects previous versions, please add the lines in the sign off to get it into the stable kernels. Anton

Re: [PATCH 1/2] powerpc: Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian

2016-09-25 Thread Anton Blanchard
Hi Ben, > Hrm.. this should really be a runtime switch... I wonder if anyone is still running POWER7 LE, perhaps we could drop it entirely. Anton

[PATCH 2/2] powerpc: Set default CPU type to POWER8 for little endian builds

2016-09-25 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> We supported POWER7 CPUs for bootstrapping little endian, but the target was always POWER8. Now that POWER7 specific issues are impacting performance, change the default target to POWER8. Signed-off-by: Anton Blanchard <an...@samba.org> ---

[PATCH 1/2] powerpc: Only disable HAVE_EFFICIENT_UNALIGNED_ACCESS on POWER7 little endian

2016-09-25 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> POWER8 handles unaligned accesses in little endian mode, but commit 0b5e6661ac69 ("powerpc: Don't set HAVE_EFFICIENT_UNALIGNED_ACCESS on little endian builds") disabled it for all. The issue with unaligned little endian accesses is specif

Re: [PATCH] powerpc: Align hot loops of memset() and backwards_memcpy()

2016-09-25 Thread Anton Blanchard
Hi Nick, > Hmm. If we execute this loop once, we'll only fetch additional nops. > Twice, and we make up for them by not fetching unused instructions. > More than twice and we may start winning. > > For large sizes it probably helps, but I'd like to see what sizes > memset sees. I noticed this

[PATCH] powerpc/vdso64: Use double word compare on pointers

2016-09-25 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> __kernel_get_syscall_map and __kernel_clock_getres use cmpli to check if the passed in pointer is non zero. cmpli maps to a 32 bit compare on binutils, so we ignore the top 32 bits. A simple test case can be created by passing in a bogus p
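The effect of a 32 bit compare on a 64 bit pointer can be modelled in plain C. This is a sketch of the symptom only, not the vdso assembly; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Models a 32 bit compare against zero: only the low 32 bits are examined. */
static bool nonzero_word_compare(uint64_t ptr)
{
    return (uint32_t)ptr != 0;
}

/* Models a double word compare against zero: all 64 bits are examined. */
static bool nonzero_doubleword_compare(uint64_t ptr)
{
    return ptr != 0;
}

/* A bogus pointer with only upper bits set exposes the difference. */
static void check_bogus_pointer(void)
{
    uint64_t bogus = UINT64_C(0x100000000);

    assert(!nonzero_word_compare(bogus));      /* wrongly treated as NULL */
    assert(nonzero_doubleword_compare(bogus)); /* correctly seen as non-zero */
}
```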

Re: [PATCH] Work around for enabling CONFIG_CMDLINE on ppc64le

2016-09-22 Thread Anton Blanchard
Hi, > But I can't merge that patch. > > Our options are one or both of: > - get GCC fixed and backport the fix to the compilers we care about. > - blacklist the broken compiler versions. > > Is there a GCC bug filed for this? Likely: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71709 We

Re: [PATCH] powerpc: Ensure .mem(init|exit).text are within _stext/_etext

2016-09-14 Thread Anton Blanchard
ly > ftrace/kprobes etc.). > > Fix it by adding MEM_KEEP() directives, mirroring what TEXT_TEXT does. > > This isn't a problem when CONFIG_MEMORY_HOTPLUG=n, because we use the > standard INIT_TEXT_SECTION() and EXIT_TEXT macros from vmlinux.lds.h. Thanks Michael, looks good: Tested

Re: [PATCH v3 19/21] powerpc: Add option to use jump label for mmu_has_feature()

2016-08-08 Thread Anton Blanchard
Hi, > This patch causes an oops when building with the gold linker: Found the problem. On binutils .meminit.text is within _stext/_etext: [Nr] Name Type Address Off Size ES Flg Lk Inf Al [ 3] .meminit.text PROGBITS c0989d14 999d14

Re: [PATCH v3 19/21] powerpc: Add option to use jump label for mmu_has_feature()

2016-08-07 Thread Anton Blanchard
Hi, > From: Kevin Hao > > As we just did for CPU features. This patch causes an oops when building with the gold linker: Unable to handle kernel paging request for data at address 0xf000 Faulting instruction address: 0xc0971544 Oops: Kernel access of

Re: [PATCH] powerpc: Align hot loops of memset() and backwards_memcpy()

2016-08-05 Thread Anton Blanchard
Hi Nick, > Hmm. If we execute this loop once, we'll only fetch additional nops. > Twice, and we make up for them by not fetching unused instructions. > More than twice and we may start winning. > > For large sizes it probably helps, but I'd like to see what sizes > memset sees. I found this in

Re: [PATCH] powerpc: Align hot loops of memset() and backwards_memcpy()

2016-08-04 Thread Anton Blanchard
Hi Christophe, > > Align the hot loops in our assembly implementation of memset() > > and backwards_memcpy(). > > > > backwards_memcpy() is called from tcp_v4_rcv(), so we might > > want to optimise this a little more. > > > > Signed-off-by: Anton Bl

Re: [PATCH] crypto: crc32c-vpmsum - Convert to CPU feature based module autoloading

2016-08-04 Thread Anton Blanchard
Hi Michael, > Is VEC_CRYPTO the right feature? > > That's new power8 crypto stuff. The vpmsum* instructions are part of the same pipeline as the vcipher* instructions, introduced in POWER8. > I thought this only used VMX? (but I haven't looked closely) Yes, vcipher* and vpmsum* are VMX

[PATCH] powerpc: Align hot loops of memset() and backwards_memcpy()

2016-08-04 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> Align the hot loops in our assembly implementation of memset() and backwards_memcpy(). backwards_memcpy() is called from tcp_v4_rcv(), so we might want to optimise this a little more. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch

[PATCH] crypto: crc32c-vpmsum - Convert to CPU feature based module autoloading

2016-08-04 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> This patch utilises the GENERIC_CPU_AUTOPROBE infrastructure to automatically load the crc32c-vpmsum module if the CPU supports it. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/crypto/crc32c-vpmsum_glue.c | 3 ++- 1 fil

Re: [PATCH] crypto: powerpc - CRYPT_CRC32C_VPMSUM should depend on ALTIVEC

2016-08-03 Thread Anton Blanchard
Hi Michael, > The optimised crc32c implementation depends on VMX (aka. Altivec) > instructions, so the kernel must be built with Altivec support in > order for the crc32c code to build. Thanks for that, looks good. Acked-by: Anton Blanchard <an...@samba.org> > Fixes: 6dd

Re: [PATCH for-4.8 03/12] powerpc/mm: use _raw variant of page table accessors

2016-07-16 Thread Anton Blanchard via Linuxppc-dev
Hi David, > > This switch few of the page table accessor to use the __raw variant > > and does the cpu to big endian conversion of constants. This helps > > in generating better code. > > It might be better to say that checks for a value being 0 don't depend > on the endianness. > > In which

Re: [patch V2 30/67] powerpc/numa: Convert to hotplug state machine

2016-07-15 Thread Anton Blanchard via Linuxppc-dev
> > I noticed tip started failing in my CI environment which tests on > > QEMU. The failure bisected to commit > > 425209e0abaf2c6e3a90ce4fedb935c10652bf80 > > That's very useful, thanks Anton! > > I have removed this commit from the series for the time being, > refactored the followup

Re: [patch V2 30/67] powerpc/numa: Convert to hotplug state machine

2016-07-14 Thread Anton Blanchard via Linuxppc-dev
Hi Anna-Maria, > >> Install the callbacks via the state machine and let the core invoke > >> the callbacks on the already online CPUs. > > > > This is causing an oops on ppc64le QEMU, looks like a NULL > > pointer: > > Did you tested it against tip WIP.hotplug? I noticed tip started failing

Re: [patch V2 30/67] powerpc/numa: Convert to hotplug state machine

2016-07-14 Thread Anton Blanchard via Linuxppc-dev
Hi, > From: Sebastian Andrzej Siewior > > Install the callbacks via the state machine and let the core invoke > the callbacks on the already online CPUs. This is causing an oops on ppc64le QEMU, looks like a NULL pointer: percpu: Embedded 3 pages/cpu @c0001fe0

[PATCH] powerpc/configs: Enable VMX crypto

2016-07-11 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> We see big improvements with the VMX crypto functions (often 10x or more), so enable it as a module. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/configs/powernv_defconfig | 2 ++ arch/powerpc/configs/ppc64_defconfig

Re: [PATCH] powerpc/configs: Enable VMX crypto

2016-07-11 Thread Anton Blanchard via Linuxppc-dev
Hi Steven, > Not in ppc64_defconfig? Good point. The recent addition of powernv_defconfig made me forget about ppc64_defconfig. We could do with some rationalisation here. pseries isn't really pseries, perhaps we should call it ibm_defconfig, or maybe server_defconfig. ppc64_defconfig continues

[PATCH] powerpc/configs: Enable VMX crypto

2016-07-11 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> We see big improvements with the VMX crypto functions (often 10x or more), so enable it as a module. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/configs/powernv_defconfig | 2 ++ arch/powerpc/configs/pseries_defconfig | 2

[PATCH 2/2] crypto: powerpc: Add POWER8 optimised crc32c

2016-06-30 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> Use the vector polynomial multiply-sum instructions in POWER8 to speed up crc32c. This is just over 41x faster than the slice-by-8 method that it replaces. Measurements on a 4.1 GHz POWER8 show it sustaining 52 GiB/sec. A simple btrfs write perfo
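For reference, the checksum being accelerated is CRC-32C (Castagnoli polynomial). The sketch below is a slow bit-at-a-time implementation, included only to show what is computed. It is neither the slice-by-8 code nor the vpmsum code from the patch, and `crc32c_bitwise()` is a hypothetical name using the conventional all-ones initial value and final inversion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of the Castagnoli polynomial 0x1EDC6F41. */
#define CRC32C_POLY_REFLECTED UINT32_C(0x82F63B78)

/* Bitwise CRC-32C over `len` bytes: init 0xFFFFFFFF, final XOR 0xFFFFFFFF. */
static uint32_t crc32c_bitwise(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t crc = UINT32_C(0xFFFFFFFF);

    assert(data != NULL || len == 0);

    while (len--) {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++) {
            /* Shift one bit out; fold in the polynomial if that bit was set. */
            uint32_t mask = UINT32_C(0) - (crc & 1u);
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & mask);
        }
    }
    return crc ^ UINT32_C(0xFFFFFFFF);
}
```

The standard check value for the ASCII string "123456789" is 0xE3069283, which makes a convenient sanity test for any faster implementation.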

[PATCH 1/2] powerpc: define FUNC_START/FUNC_END

2016-06-30 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> gcc provides FUNC_START/FUNC_END macros to help with creating assembly functions. Mirror these in the kernel so we can more easily share code between userspace and the kernel. FUNC_END is just a stub since we don't currently annotate the end of

cpuidle broken on mainline

2016-06-22 Thread Anton Blanchard via Linuxppc-dev
Hi, I was noticing some pretty big run to run variations on single threaded benchmarks, and I've isolated it to cpuidle issues. If I look at the cpuidle tracepoint, I notice we only go into the snooze state. Do we have any known bugs in cpuidle at the moment? While looking around, I also noticed

[PATCH 2/2] crypto: vmx: Increase priority of aes-cbc cipher

2016-06-10 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> All of the VMX AES ciphers (AES, AES-CBC and AES-CTR) are set at priority 1000. Unfortunately this means we never use AES-CBC and AES-CTR, because the base AES-CBC cipher that is implemented on top of AES inherits its priority. To fix this, AES-CBC a

[PATCH 1/2] crypto: vmx: Fix ABI detection

2016-06-10 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> When calling ppc-xlate.pl, we pass it either linux-ppc64 or linux-ppc64le. The script however was expecting linux64le, a result of its OpenSSL origins. This means we aren't obeying the ppc64le ABIv2 rules. Fix this by checking for linux-ppc64le.

[PATCH 2/2] spapr: Better handling of ibm,pa-features TM bit

2016-06-07 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> There are a few issues with our handling of the ibm,pa-features TM bit: - We don't support transactional memory in PR KVM, so don't tell the OS that we do. - In full emulation we have a minimal implementation of TM that always fai

[PATCH 1/2] Add PowerPC AT_HWCAP2 definitions

2016-06-07 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> We need the PPC_FEATURE2_HAS_HTM bit in a subsequent patch, so add the PowerPC AT_HWCAP2 definitions. Signed-off-by: Anton Blanchard <an...@samba.org> --- diff --git a/include/elf.h b/include/elf.h index 28d448b..8533b2a 100644 --- a/include

Re: [3/3] powerpc: Avoid load hit store when using find_linux_pte_or_hugepte()

2016-05-31 Thread Anton Blanchard via Linuxppc-dev
Hi Michael, > I'd really rather __find_linux_pte_or_hugepte() was an internal > detail, rather than the standard API. > > We do already have quite a few uses, but adding more just further > spreads the details about how the implementation works. > > So I'm going to drop this in favor of

Re: [PATCH 1/3] powerpc: Avoid load hit store in __giveup_fpu() and __giveup_altivec()

2016-05-31 Thread Anton Blanchard via Linuxppc-dev
> Huh? Make it an unsigned long please, which is the type of the msr > field in struct pt_regs to work on both 32 and 64 bit processors. Thanks, not sure what I was thinking there. Will respin. Anton

Re: [PATCH] powerpc: inline current_stack_pointer()

2016-05-31 Thread Anton Blanchard via Linuxppc-dev
Hi, > current_stack_pointer() is a single instruction function. > It is not worth breaking the execution flow with a bl/blr for a > single instruction Check out bfe9a2cfe91a ("powerpc: Reimplement __get_SP() as a function not a define") to see why we made it a function. Anton

Re: [PATCH V2 04/68] powerpc/mm: Use big endian page table for book3s 64

2016-05-30 Thread Anton Blanchard via Linuxppc-dev
Hi, > I see the same issue in unmap_page_range(), __hash_page_64K(), > handle_mm_fault(). This looks to be about 10% slower on POWER8: #include #include #include #define ITERATIONS 1000 #define MEMSIZE (128 * 1024 * 1024) int main(void) { unsigned long i = ITERATIONS;

Re: [PATCH V2 04/68] powerpc/mm: Use big endian page table for book3s 64

2016-05-29 Thread Anton Blanchard via Linuxppc-dev
Hi Ben, > That is surprising, do we have any idea what specifically increases > the overhead so significantly ? Does gcc know about ldbrx/stdbrx ? I > notice in our io.h for example we still do manual ld/std + swap > because old processors didn't know these, we should fix that for > CONFIG_POWER8

Re: [PATCH 2/3] powerpc: Avoid load hit store in setup_sigcontext()

2016-05-29 Thread Anton Blanchard via Linuxppc-dev
Hi, > On Sun, 2016-05-29 at 22:03 +1000, Anton Blanchard wrote: > > From: Anton Blanchard <an...@samba.org> > > > > In setup_sigcontext(), we set current->thread.vrsave then use it > > straight after. Since current is hidden from the compiler via inl

[PATCH 3/3] powerpc: Avoid load hit store when using find_linux_pte_or_hugepte()

2016-05-29 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> In many cases we disable interrupts right before calling find_linux_pte_or_hugepte(). find_linux_pte_or_hugepte() first checks interrupts are disabled before calling __find_linux_pte_or_hugepte(): if (!arch_irqs_di

[PATCH 2/3] powerpc: Avoid load hit store in setup_sigcontext()

2016-05-29 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> In setup_sigcontext(), we set current->thread.vrsave then use it straight after. Since current is hidden from the compiler via inline assembly, it cannot optimise this and we end up with a load hit store. Fix this by using a temporary. S
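The pattern described here can be sketched in portable C. In this illustration a `volatile` pointer stands in for `current` being opaque to the compiler; `struct thread_state` and both functions are hypothetical, not the kernel's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant part of the thread struct. */
struct thread_state {
    unsigned long vrsave;
};

/*
 * Store then immediately reload: because the compiler cannot see through
 * the volatile access, the load must wait on the store just issued.
 */
static unsigned long save_then_reload(volatile struct thread_state *t,
                                      unsigned long value)
{
    assert(t != NULL);
    t->vrsave = value;
    return t->vrsave;   /* load hits the store above */
}

/* Same result using a temporary, so no load follows the store. */
static unsigned long save_with_temporary(volatile struct thread_state *t,
                                         unsigned long value)
{
    assert(t != NULL);
    unsigned long tmp = value;

    t->vrsave = tmp;
    return tmp;         /* value comes from the local, not from memory */
}
```

Both functions leave the same value in the struct and return the same result; only the second avoids reading back what was just written.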

[PATCH 1/3] powerpc: Avoid load hit store in __giveup_fpu() and __giveup_altivec()

2016-05-29 Thread Anton Blanchard
From: Anton Blanchard <an...@samba.org> In both __giveup_fpu() and __giveup_altivec() we make two modifications to tsk->thread.regs->msr. gcc decides to do a read/modify/write of each change, so we end up with a load hit store: ld r9,264(r10) rldicl

Re: [PATCH V2 04/68] powerpc/mm: Use big endian page table for book3s 64

2016-05-29 Thread Anton Blanchard via Linuxppc-dev
Hi, > This enables us to share the same page table code for > both radix and hash. Radix use a hardware defined big endian > page table This is measurably worse (a little over 2% on POWER8) on a futex microbenchmark: #define _GNU_SOURCE #include #include #include #define ITERATIONS 1000

[PATCH 2/2] powerpc: Align hot loops of some string functions

2016-05-25 Thread Anton Blanchard via Linuxppc-dev
Align the hot loops in our assembly implementation of strncpy(), strncmp() and memchr(). Signed-off-by: Anton Blanchard <an...@samba.org> --- Index: linux.junk/arch/powerpc/lib/string.S === --- linux.junk.orig/arch/power

[PATCH 1/2] powerpc: Remove assembly versions of strcpy, strcat, strlen and strcmp

2016-05-25 Thread Anton Blanchard via Linuxppc-dev
just remove the assembly versions. Signed-off-by: Anton Blanchard <an...@samba.org> --- index e40010a..da3cdff 100644 Index: linux.junk/arch/powerpc/include/asm/string.h === --- linux.junk.orig/arch/powerpc/include/asm/st

[PATCH] powerpc: Improve comment explaining why we modify VRSAVE

2016-05-19 Thread Anton Blanchard via Linuxppc-dev
The comment explaining why we modify VRSAVE is misleading, glibc does rely on the behaviour. Update the comment. Signed-off-by: Anton Blanchard <an...@samba.org> --- diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S index 1c2e7a3..3907fcf 100644 --- a/arch/powerpc/

[PATCH v2] spapr: Don't set the TM ibm,pa-features bit in PR KVM mode

2016-04-29 Thread Anton Blanchard via Linuxppc-dev
We don't support transactional memory in PR KVM, so don't tell the OS that we do. Signed-off-by: Anton Blanchard <an...@samba.org> --- v2: Fix build with CONFIG_KVM disabled, noticed by Alex. diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index b69995e..dc3e3c9 100644 --- a/hw/ppc/spapr.c

[PATCH] powerpc: create_zero_mask() has bad inline assembly constraint

2016-04-29 Thread Anton Blanchard via Linuxppc-dev
er for working with me to find this issue. Signed-off-by: Anton Blanchard <an...@samba.org> Cc: <sta...@vger.kernel.org> Fixes: d0cebfa650a0 ("powerpc: word-at-a-time optimization for 64-bit Little Endian") --- diff --git a/arch/powerpc/include/asm/word-at-a-time.h b/arch/powerpc

[PATCH 3/3] powerpc: Update TM user feature bits in scan_features()

2016-04-14 Thread Anton Blanchard via Linuxppc-dev
instructions and it dies trying. This (together with a QEMU patch) fixes PR KVM, which doesn't currently support TM. Signed-off-by: Anton Blanchard <an...@samba.org> Cc: sta...@vger.kernel.org --- arch/powerpc/kernel/prom.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-)

[PATCH 2/3] powerpc: Update cpu_user_features2 in scan_features()

2016-04-14 Thread Anton Blanchard via Linuxppc-dev
scan_features() updates cpu_user_features but not cpu_user_features2. Amongst other things, cpu_user_features2 contains the user TM feature bits which we must keep in sync with the kernel TM feature bit. Signed-off-by: Anton Blanchard <an...@samba.org> Cc: sta...@vger.kernel.org ---

[PATCH 1/3] powerpc: scan_features() updates incorrect bits

2016-04-14 Thread Anton Blanchard via Linuxppc-dev
MMU-related features") Signed-off-by: Anton Blanchard <an...@samba.org> Cc: sta...@vger.kernel.org --- arch/powerpc/kernel/prom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c index 7030b03..9a3a7c6 100644 --- a/arc

Re: [PATCH 1/3] powerpc: Complete FSCR context switch

2016-04-13 Thread Anton Blanchard via Linuxppc-dev
Hi Jack, > Previously we just saved the FSCR, but only restored it in some > settings, and never copied it thread to thread. This patch always > restores the FSCR and formalizes new threads inheriting its setting so > that later we can manipulate FSCR bits in start_thread. Will this break the

[PATCH] sched/cpuacct: Check for NULL when using task_pt_regs()

2016-04-13 Thread Anton Blanchard via Linuxppc-dev
task_pt_regs() can return NULL for kernel threads, so add a check. This fixes an oops at boot on ppc64. Fixes: d740037fac70 ("sched/cpuacct: Split usage accounting into user_usage and sys_usage") Signed-off-by: Anton Blanchard <an...@samba.org> Reported-and-Tested-by: Srikar

[PATCH] sched/cpuacct: Check for NULL when using task_pt_regs()

2016-04-06 Thread Anton Blanchard via Linuxppc-dev
atch below does fix the oops for me. Anton -- task_pt_regs() can return NULL for kernel threads, so add a check. This fixes an oops at boot on ppc64. Signed-off-by: Anton Blanchard <an...@samba.org> --- diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c index df947e0..41f85c4 100644

[PATCH] powerpc: Clear user CPU feature bits if TM is disabled at runtime

2016-04-04 Thread Anton Blanchard via Linuxppc-dev
the user CPU feature bits. Without this patch userspace processes will think they can execute TM instructions and get killed when they try. Signed-off-by: Anton Blanchard <an...@samba.org> Cc: sta...@vger.kernel.org --- Michael I've added stable here because I'm seeing this on a number of d

[PATCH] spapr: Don't set the TM ibm,pa-features bit in PR KVM mode

2016-04-04 Thread Anton Blanchard via Linuxppc-dev
We don't support transactional memory in PR KVM, so don't tell the OS that we do. Signed-off-by: Anton Blanchard <an...@samba.org> --- diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index e7be21e..538bd87 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -696,6 +696,12 @@ stati

Re: PR KVM and TM issues

2016-04-04 Thread Anton Blanchard via Linuxppc-dev
Hi Alexey, > > I can't get an Ubuntu Wily guest to boot on an Ubuntu Wily host in > > PR KVM mode. The kernel in both cases is 4.2. To reproduce: > > > > wget -N > > https://cloud-images.ubuntu.com/wily/current/wily-server-cloudimg-ppc64el-disk1.img > > > > qemu-system-ppc64 -cpu POWER8

PR KVM and TM issues

2016-04-04 Thread Anton Blanchard via Linuxppc-dev
Hi, I can't get an Ubuntu Wily guest to boot on an Ubuntu Wily host in PR KVM mode. The kernel in both cases is 4.2. To reproduce: wget -N https://cloud-images.ubuntu.com/wily/current/wily-server-cloudimg-ppc64el-disk1.img qemu-system-ppc64 -cpu POWER8 -enable-kvm -machine pseries,kvm-type=PR

[PATCH] perf jit: genelf makes assumptions about endian

2016-03-29 Thread Anton Blanchard via Linuxppc-dev
match what gcc emits. We should first look for __powerpc64__, then __powerpc__. Fixes: 9b07e27f88b9 ("perf inject: Add jitdump mmap injection support") Signed-off-by: Anton Blanchard <an...@samba.org> --- diff --git a/tools/perf/util/genelf.h b/tools/perf/util/genelf.h ind

Re: [PATCH] powerpc/process: fix altivec SPR not being saved

2016-03-06 Thread Anton Blanchard via Linuxppc-dev
code doesn't use VRSAVE to determine which registers to > save/restore, but the value of VRSAVE is used to determine if altivec > is being used in several code paths. Nice catch, not sure how I missed that. As Ben suggests, it should definitely go to -stable as well. Feel free to add my sign of

[no subject]

2016-02-06 Thread Anton Blanchard via Linuxppc-dev
--- Begin Message --- > Since binutils 2.26 BFD is doing suffix merging on STRTAB sections. > But dedotify modifies the symbol names in place, which can also modify > unrelated symbols with a name that matches a suffix of a dotted > name. To remove the leading dot of a symbol name we can just >

[PATCH] powerpc: Call check_if_tm_restore_required() in enable_kernel_*()

2015-12-10 Thread Anton Blanchard
hm...@gmail.com> Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/kernel/process.c | 13 ++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c index fffc2d2..0cffb2c 100644 --- a/arch/pow

[PATCH v2] powerpc: Call restore_sprs() before _switch()

2015-12-10 Thread Anton Blanchard
rn through ret_from_fork() or ret_from_kernel_thread(). This means restore_sprs() is not getting called for new tasks. Fix this by moving restore_sprs() before _switch(). Signed-off-by: Anton Blanchard <an...@samba.org> Fixes: 152d523e6307 ("powerpc: Create context switch helpers save_spr

[PATCH] powerpc: Fix DSCR inheritance over fork()

2015-12-09 Thread Anton Blanchard
not be working around this in the testcase, it is a kernel bug. Fix it by copying the current DSCR to the child, instead of what we had in the thread struct at last context switch. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/kernel/process.c | 2 +-

Re: [PATCH 2/2] powerpc: Copy only required pieces of the mm_context_t to the paca

2015-12-09 Thread Anton Blanchard
> Currently we copy the whole mm_context_t to the paca but only access a > few bits of it. This is wasteful of space paca and also takes quite > some time in the hot path of context switching. > > This patch pulls in only the required bits from the mm_context_t to > the paca and on context

[PATCH] powerpc: Call restore_sprs() on initial context switch

2015-12-08 Thread Anton Blanchard
This means restore_sprs() is not getting called. Add a call to it in ret_from_fork() and ret_from_kernel_thread(). Signed-off-by: Anton Blanchard <an...@samba.org> Fixes: 152d523e6307 ("powerpc: Create context switch helpers save_sprs() and restore_sprs()") --- arch/powe

[PATCH] powerpc/vdso: Remove sys_ni_syscall and sys_call_table prototypes

2015-12-03 Thread Anton Blanchard
Prototypes for sys_ni_syscall and sys_call_table are available in header files, so remove the prototypes in c code. This was noticed when building with -flto, because the prototype for sys_ni_syscall doesn't match the function and we get a compiler error. Signed-off-by: Anton Blanchard

Re: [PATCH] cxl: Fix build failure due to -Wunused-variable behaviour change

2015-11-25 Thread Anton Blanchard
Hi Torsten, > > -ccflags-y := -Werror > > +ccflags-y := -Werror -Wno-unused-const-variable > > JFYI, my gcc-4.3 does not like this switch. > What's the minimum compiler version to build this code? -Werror is such a moving target. I'm also seeing issues when building with clang, eg:

[PATCH] powerpc: Avoid -maltivec when using clang integrated assembler

2015-11-25 Thread Anton Blanchard
Check the assembler supports -maltivec by wrapping it with call as-option. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile index 96efd82..76a315d

Re: [PATCH 1/5] powerpc: Print MSR TM bits in oops message

2015-11-16 Thread Anton Blanchard
Hi, > > +{ > > + if (val & (MSR_TM | MSR_TS_S | MSR_TS_T)) { > > + printk(",TM["); > > + printbits(val, msr_tm_bits, ""); > > + printk("]"); > > I suspect all these individual printks are going to behave badly if > we have multiple cpus crashing simultaneously.

Re: [PATCH 3/9] powerpc32: checksum_wrappers_64 becomes checksum_wrappers

2015-10-28 Thread Anton Blanchard
Hi Scott, > I wonder why it was 64-bit specific in the first place. I think it was part of a series where I added my 64bit assembly checksum routines, and I didn't step back and think that the wrapper code would be useful on 32 bit. Anton

[PATCH 00/19] Context switch improvements

2015-10-28 Thread Anton Blanchard
their sister functions. Scott: There are changes to the SPE code here which I have only been able to compile test. Anton -- Anton Blanchard (19): powerpc: Don't disable kernel FP/VMX/VSX MSR bits on context switch powerpc: Don't disable MSR bits in do_load_up_transact_*() functions powerpc: Create

[PATCH 01/19] powerpc: Don't disable kernel FP/VMX/VSX MSR bits on context switch

2015-10-28 Thread Anton Blanchard
microbenchmark using yield(): http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield --fp 0 0 shows an improvement of almost 3% on POWER8. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/kernel/entry_64.S | 15 +-- 1 file changed, 1 ins

[PATCH 03/19] powerpc: Create context switch helpers save_sprs() and restore_sprs()

2015-10-28 Thread Anton Blanchard
can do. - SPR writes are slow, so check that the value is changing before writing it. A context switch microbenchmark using yield(): http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield 0 0 shows an improvement of almost 10% on POWER8. Signed-off-by: Anton Blanchard

[PATCH 02/19] powerpc: Don't disable MSR bits in do_load_up_transact_*() functions

2015-10-28 Thread Anton Blanchard
Similar to the non TM load_up_*() functions, don't disable the MSR bits on the way out. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/kernel/fpu.S| 4 arch/powerpc/kernel/vector.S | 4 2 files changed, 8 deletions(-) diff --git a/arch/powerpc/kernel/f

[PATCH 05/19] powerpc: Remove UP only lazy floating point and vector optimisations

2015-10-28 Thread Anton Blanchard
UP and SMP, but in preparation for that remove these UP only optimisations. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/include/asm/processor.h | 6 -- arch/powerpc/include/asm/switch_to.h | 8 --- arch/powerpc/kernel/fpu.S| 35 --- arch/powerpc/

[PATCH 04/19] powerpc: Remove redundant mflr in _switch

2015-10-28 Thread Anton Blanchard
No need to execute mflr twice. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/kernel/entry_64.S | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index e84e5bc..c8b4225 100644 ---

[PATCH 06/19] powerpc: Simplify TM restore checks

2015-10-28 Thread Anton Blanchard
Instead of having multiple giveup_*_maybe_transactional() functions, separate out the TM check into a new function called check_if_tm_restore_required(). This will make it easier to optimise the giveup_*() functions in a subsequent patch. Signed-off-by: Anton Blanchard <an...@samba.

[PATCH 07/19] powerpc: Create mtmsrd_isync()

2015-10-28 Thread Anton Blanchard
mtmsrd_isync() will do an mtmsrd followed by an isync on older processors. On newer processors we avoid the isync via a feature fixup. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/include/asm/reg.h | 8 arch/powerpc/kernel/process.c

[PATCH 09/19] powerpc: Move part of giveup_fpu,altivec,spe into c

2015-10-28 Thread Anton Blanchard
Move the MSR modification into new c functions. Removing it from the low level functions will allow us to avoid costly MSR writes by batching them up. Move the check_if_tm_restore_required() check into these new functions. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/p
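The batching idea mentioned above can be illustrated with a small userspace model. This is a sketch under stated assumptions, not the kernel implementation: the `FAKE_MSR_*` bit values, `struct fake_msr` and both helper functions are hypothetical, and the write counter stands in for the cost of a real MSR update.

```c
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define FAKE_MSR_FP  (1ULL << 0)
#define FAKE_MSR_VEC (1ULL << 1)

/* Stand-in for the MSR, counting how many writes were issued. */
struct fake_msr {
	uint64_t value;
	unsigned long writes;
};

/* Models the costly operation. */
static void fake_msr_write(struct fake_msr *msr, uint64_t val)
{
	msr->value = val;
	msr->writes++;
}

/* Unbatched: every facility triggers its own write. */
static void enable_each(struct fake_msr *msr, const uint64_t *bits, int n)
{
	for (int i = 0; i < n; i++)
		fake_msr_write(msr, msr->value | bits[i]);
}

/* Batched: accumulate all requested bits, then write at most once. */
static void enable_batched(struct fake_msr *msr, const uint64_t *bits, int n)
{
	uint64_t newval = msr->value;

	for (int i = 0; i < n; i++)
		newval |= bits[i];
	if (newval != msr->value)
		fake_msr_write(msr, newval);
}
```

In this model both helpers leave the same final register value, but the batched variant issues a single write regardless of how many facilities are requested.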

[PATCH 08/19] powerpc: Remove NULL task struct pointer checks in FP and vector code

2015-10-28 Thread Anton Blanchard
We used to allow giveup_*() to be called with a NULL task struct pointer. Now those cases are handled in the caller we can remove the checks. We can also remove giveup_altivec_notask() which is also unused. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/include/asm/switc

[PATCH 16/19] powerpc: create giveup_all()

2015-10-28 Thread Anton Blanchard
an improvement of 3% on POWER8. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/include/asm/switch_to.h | 1 + arch/powerpc/kernel/process.c| 75 arch/powerpc/kvm/book3s_pr.c | 17 +--- 3 files changed, 63 insertions(

[PATCH 18/19] powerpc: Rearrange __switch_to()

2015-10-28 Thread Anton Blanchard
Most of __switch_to() is housekeeping, TLB batching, timekeeping etc. Move these away from the more complex and critical context switching code. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/kernel/process.c | 52 +-- 1 file chang

[PATCH 13/19] powerpc: Create disable_kernel_{fp,altivec,vsx,spe}()

2015-10-28 Thread Anton Blanchard
for a debug boot option that does this and catches bad uses in other areas of the kernel. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/crypto/aes-spe-glue.c | 1 + arch/powerpc/crypto/sha1-spe-glue.c | 1 + arch/powerpc/crypto/sha256-spe-glue.c| 1 + arch/p

[PATCH 11/19] crypto: vmx: Only call enable_kernel_vsx()

2015-10-28 Thread Anton Blanchard
With the recent change to enable_kernel_vsx(), we no longer need to call enable_kernel_fp() and enable_kernel_altivec(). Signed-off-by: Anton Blanchard <an...@samba.org> --- drivers/crypto/vmx/aes.c | 3 --- drivers/crypto/vmx/aes_cbc.c | 3 --- drivers/crypto/vmx/aes_ctr.c | 3 --- d

[PATCH 14/19] powerpc: Add ppc_strict_facility_enable boot option

2015-10-28 Thread Anton Blanchard
Add a boot option that strictly manages the MSR unavailable bits. This catches kernel uses of FP/Altivec/SPE that would otherwise corrupt user state. Signed-off-by: Anton Blanchard <an...@samba.org> --- Documentation/kernel-parameters.txt | 6 ++ arch/powerpc/include/asm/reg.h

[PATCH 19/19] powerpc: clean up asm/switch_to.h

2015-10-28 Thread Anton Blanchard
Remove a bunch of unnecessary fallback functions and group things in a more logical way. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/include/asm/switch_to.h | 35 ++- arch/powerpc/kernel/process.c| 2 +- 2 files chang

[PATCH 04/19] powerpc: Remove redundant mflr in _switch

2015-10-27 Thread Anton Blanchard
No need to execute mflr twice. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/kernel/entry_64.S | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index e84e5bc..c8b4225 100644 ---

[PATCH 03/19] powerpc: Create context switch helpers save_sprs() and restore_sprs()

2015-10-27 Thread Anton Blanchard
can do. - SPR writes are slow, so check that the value is changing before writing it. A context switch microbenchmark using yield(): http://ozlabs.org/~anton/junkcode/context_switch2.c ./context_switch2 --test=yield 0 0 shows an improvement of almost 10% on POWER8. Signed-off-by: Anton Blanchard

[PATCH 11/19] crypto: vmx: Only call enable_kernel_vsx()

2015-10-27 Thread Anton Blanchard
With the recent change to enable_kernel_vsx(), we no longer need to call enable_kernel_fp() and enable_kernel_altivec(). Signed-off-by: Anton Blanchard <an...@samba.org> --- drivers/crypto/vmx/aes.c | 3 --- drivers/crypto/vmx/aes_cbc.c | 3 --- drivers/crypto/vmx/aes_ctr.c | 3 --- d

[PATCH 10/19] powerpc: Move part of giveup_vsx into c

2015-10-27 Thread Anton Blanchard
functions, and allows us to use flush_vsx_to_thread() in the signal code. Move the check_if_tm_restore_required() check in. Signed-off-by: Anton Blanchard <an...@samba.org> --- arch/powerpc/kernel/process.c | 28 +++- arch/powerpc/kernel/signal_32.c | 4 ++-- arch/p
