Hi Michael,
> > 14c00-14c00 g exc_virt_0x4c00_system_call
>^
>What's this? The address? If so it's wrong?
Offset into the binary I think, there's one 64kB page of ELF gunk at
the start.
> Seems likely. But I can't see why.
>
> AFAICS we have never emitted a size for those symbols:
>
Hi Peter,
> > I updated to mainline as of today and tried CPU hotplug via the
> > ppc64_cpu tool:
>
> http://lkml.kernel.org/r/1475922278-3306-1-git-send-email-wanpeng...@hotmail.com
Thanks, this fixes the issue for me.
Anton
Hi,
Unfortunately sysrq-b seems to tie us up in knots, instead of rebooting
the box. This is mainline from today.
Anton
--
Trying to free IRQ 17 from IRQ context!
------------[ cut here ]------------
WARNING: CPU: 32 PID: 0 at kernel/irq/manage.c:1460 __free_irq+0x298/0x380
Modules linked in:
Hi,
I updated to mainline as of today and tried CPU hotplug via the ppc64_cpu
tool:
# ppc64_cpu --smt=off
Segmentation fault
Looks like a NULL pointer in select_idle_sibling():
Unable to handle kernel paging request for data at address 0x0078
Faulting instruction address:
From: Anton Blanchard <an...@samba.org>
During context switch, switch_mm() sets our current CPU in mm_cpumask.
We can avoid this atomic sequence in most cases by checking before
setting the bit.
Testing on a POWER8 using our context switch microbenchmark:
tools/testing/selftests/p
From: Anton Blanchard <an...@samba.org>
I see quite a lot of static branch mispredictions on a simple
web serving workload. The issue is in __atomic_add_unless(), called
from _atomic_dec_and_lock(). There is no obvious common case, so it
is better to let the hardware predict the branch.
From: Anton Blanchard <an...@samba.org>
No real need for this to be pr_warn(), reduce it to pr_info().
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/kernel/eeh.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/eeh.c b/
From: Anton Blanchard <an...@samba.org>
An hcall was recently added that does exactly what we need
during kexec - it clears the entire MMU hash table, ignoring any
VRMA mappings.
Try it and fall back to the old method if we get a failure.
On a POWER8 box with 5TB of memory, this r
From: Anton Blanchard <an...@samba.org>
We are starting to see i40e adapters in recent machines, so enable
it in our configs.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/configs/powernv_defconfig | 1 +
arch/powerpc/configs/ppc64_defconfig | 1 +
arch/pow
From: Anton Blanchard <an...@samba.org>
Change a few devices and filesystems that are seldom used any more
from built in to modules. This reduces our vmlinux about 500kB.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/configs/powernv_defconfig | 14 +++-
From: Anton Blanchard <an...@samba.org>
When we issue a system reset, every CPU in the box prints an Oops,
including a backtrace. Each of these can be quite large (over 4kB)
and we may end up wrapping the ring buffer and losing important
information.
Bump the base size from 128kB to
From: Anton Blanchard <an...@samba.org>
We see big improvements with the VMX crypto functions (often 10x or more),
so enable it as a module.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/configs/powernv_defconfig | 2 ++
arch/powerpc/configs/ppc64_defconfig
Hi Marcelo
> This series fixes the memory corruption found by Jan Stancek in
> 4.8-rc7. The problem however also affects previous versions of the
> driver.
If it affects previous versions, please add the lines in the sign off to
get it into the stable kernels.
Anton
Hi Ben,
> Hrm.. this should really be a runtime switch...
I wonder if anyone is still running POWER7 LE, perhaps we could drop it
entirely.
Anton
From: Anton Blanchard <an...@samba.org>
We supported POWER7 CPUs for bootstrapping little endian, but the
target was always POWER8. Now that POWER7 specific issues are
impacting performance, change the default target to POWER8.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
From: Anton Blanchard <an...@samba.org>
POWER8 handles unaligned accesses in little endian mode, but commit
0b5e6661ac69 ("powerpc: Don't set HAVE_EFFICIENT_UNALIGNED_ACCESS on
little endian builds") disabled it for all.
The issue with unaligned little endian accesses is specif
Hi Nick,
> Hmm. If we execute this loop once, we'll only fetch additional nops.
> Twice, and we make up for them by not fetching unused instructions.
> More than twice and we may start winning.
>
> For large sizes it probably helps, but I'd like to see what sizes
> memset sees.
I noticed this
From: Anton Blanchard <an...@samba.org>
__kernel_get_syscall_map and __kernel_clock_getres use cmpli to
check if the passed in pointer is non zero. cmpli maps to a 32 bit
compare on binutils, so we ignore the top 32 bits.
A simple test case can be created by passing in a bogus p
Hi,
> But I can't merge that patch.
>
> Our options are one or both of:
> - get GCC fixed and backport the fix to the compilers we care about.
> - blacklist the broken compiler versions.
>
> Is there a GCC bug filed for this?
Likely: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71709
We
ly
> ftrace/kprobes etc.).
>
> Fix it by adding MEM_KEEP() directives, mirroring what TEXT_TEXT does.
>
> This isn't a problem when CONFIG_MEMORY_HOTPLUG=n, because we use the
> standard INIT_TEXT_SECTION() and EXIT_TEXT macros from vmlinux.lds.h.
Thanks Michael, looks good:
Tested
Hi,
> This patch causes an oops when building with the gold linker:
Found the problem. On binutils .meminit.text is within _stext/_etext:
[Nr] Name           Type       Address   Off     Size  ES Flg Lk Inf Al
[ 3] .meminit.text  PROGBITS   c0989d14  999d14
Hi,
> From: Kevin Hao
>
> As we just did for CPU features.
This patch causes an oops when building with the gold linker:
Unable to handle kernel paging request for data at address 0xf000
Faulting instruction address: 0xc0971544
Oops: Kernel access of
Hi Nick,
> Hmm. If we execute this loop once, we'll only fetch additional nops.
> Twice, and we make up for them by not fetching unused instructions.
> More than twice and we may start winning.
>
> For large sizes it probably helps, but I'd like to see what sizes
> memset sees.
I found this in
Hi Christophe,
> > Align the hot loops in our assembly implementation of memset()
> > and backwards_memcpy().
> >
> > backwards_memcpy() is called from tcp_v4_rcv(), so we might
> > want to optimise this a little more.
> >
> > Signed-off-by: Anton Bl
Hi Michael,
> Is VEC_CRYPTO the right feature?
>
> That's new power8 crypto stuff.
The vpmsum* instructions are part of the same pipeline as the vcipher*
instructions, introduced in POWER8.
> I thought this only used VMX? (but I haven't looked closely)
Yes, vcipher* and vpmsum* are VMX
From: Anton Blanchard <an...@samba.org>
Align the hot loops in our assembly implementation of memset()
and backwards_memcpy().
backwards_memcpy() is called from tcp_v4_rcv(), so we might
want to optimise this a little more.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch
From: Anton Blanchard <an...@samba.org>
This patch utilises the GENERIC_CPU_AUTOPROBE infrastructure
to automatically load the crc32c-vpmsum module if the CPU supports
it.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/crypto/crc32c-vpmsum_glue.c | 3 ++-
1 fil
Hi Michael,
> The optimised crc32c implementation depends on VMX (aka. Altivec)
> instructions, so the kernel must be built with Altivec support in
> order for the crc32c code to build.
Thanks for that, looks good.
Acked-by: Anton Blanchard <an...@samba.org>
> Fixes: 6dd
Hi David,
> > This switch few of the page table accessor to use the __raw variant
> > and does the cpu to big endian conversion of constants. This helps
> > in generating better code.
>
> It might be better to say that checks for a value being 0 don't depend
> on the endianness.
>
> In which
> > I noticed tip started failing in my CI environment which tests on
> > QEMU. The failure bisected to commit
> > 425209e0abaf2c6e3a90ce4fedb935c10652bf80
>
> That's very useful, thanks Anton!
>
> I have removed this commit from the series for the time being,
> refactored the followup
Hi Anna-Maria,
> >> Install the callbacks via the state machine and let the core invoke
> >> the callbacks on the already online CPUs.
> >
> > This is causing an oops on ppc64le QEMU, looks like a NULL
> > pointer:
>
> Did you tested it against tip WIP.hotplug?
I noticed tip started failing
Hi,
> From: Sebastian Andrzej Siewior
>
> Install the callbacks via the state machine and let the core invoke
> the callbacks on the already online CPUs.
This is causing an oops on ppc64le QEMU, looks like a NULL pointer:
percpu: Embedded 3 pages/cpu @c0001fe0
Hi Steven,
> Not in ppc64_defconfig?
Good point. The recent addition of powernv_defconfig made me forget
about ppc64_defconfig. We could do with some rationalisation here.
pseries isn't really pseries, perhaps we should call it ibm_defconfig,
or maybe server_defconfig. ppc64_defconfig continues
From: Anton Blanchard <an...@samba.org>
We see big improvements with the VMX crypto functions (often 10x or more),
so enable it as a module.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/configs/powernv_defconfig | 2 ++
arch/powerpc/configs/pseries_defconfig | 2
From: Anton Blanchard <an...@samba.org>
Use the vector polynomial multiply-sum instructions in POWER8 to
speed up crc32c.
This is just over 41x faster than the slice-by-8 method that it
replaces. Measurements on a 4.1 GHz POWER8 show it sustaining
52 GiB/sec.
A simple btrfs write perfo
From: Anton Blanchard <an...@samba.org>
gcc provides FUNC_START/FUNC_END macros to help with creating
assembly functions. Mirror these in the kernel so we can more easily
share code between userspace and the kernel. FUNC_END is just a
stub since we don't currently annotate the end of
Hi,
I was noticing some pretty big run to run variations on single threaded
benchmarks, and I've isolated it cpuidle issues. If I look at
the cpuidle tracepoint, I notice we only go into the snooze state.
Do we have any known bugs in cpuidle at the moment?
While looking around, I also noticed
From: Anton Blanchard <an...@samba.org>
All of the VMX AES ciphers (AES, AES-CBC and AES-CTR) are set at
priority 1000. Unfortunately this means we never use AES-CBC and
AES-CTR, because the base AES-CBC cipher that is implemented on
top of AES inherits its priority.
To fix this, AES-CBC a
From: Anton Blanchard <an...@samba.org>
When calling ppc-xlate.pl, we pass it either linux-ppc64 or
linux-ppc64le. The script however was expecting linux64le, a result
of its OpenSSL origins. This means we aren't obeying the ppc64le
ABIv2 rules.
Fix this by checking for linux-ppc64le.
From: Anton Blanchard <an...@samba.org>
There are a few issues with our handling of the ibm,pa-features
TM bit:
- We don't support transactional memory in PR KVM, so don't tell
the OS that we do.
- In full emulation we have a minimal implementation of TM that always
fai
From: Anton Blanchard <an...@samba.org>
We need the PPC_FEATURE2_HAS_HTM bit in a subsequent patch, so
add the PowerPC AT_HWCAP2 definitions.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
diff --git a/include/elf.h b/include/elf.h
index 28d448b..8533b2a 100644
--- a/include
Hi Michael,
> I'd really rather __find_linux_pte_or_hugepte() was an internal
> detail, rather than the standard API.
>
> We do already have quite a few uses, but adding more just further
> spreads the details about how the implementation works.
>
> So I'm going to drop this in favor of
> Huh? Make it an unsigned long please, which is the type of the msr
> field in struct pt_regs to work on both 32 and 64 bit processors.
Thanks, not sure what I was thinking there. Will respin.
Anton
___
Linuxppc-dev mailing list
Hi,
> current_stack_pointer() is a single instruction function. It is
> not worth breaking the execution flow with a bl/blr for a
> single instruction
Check out bfe9a2cfe91a ("powerpc: Reimplement __get_SP() as a function
not a define") to see why we made it a function.
Anton
Hi,
> I see the same issue in unmap_page_range(), __hash_page_64K(),
> handle_mm_fault().
This looks to be about 10% slower on POWER8:
#include
#include
#include
#define ITERATIONS 1000
#define MEMSIZE (128 * 1024 * 1024)
int main(void)
{
unsigned long i = ITERATIONS;
Hi Ben,
> That is surprising, do we have any idea what specifically increases
> the overhead so significantly ? Does gcc know about ldbrx/stdbrx ? I
> notice in our io.h for example we still do manual ld/std + swap
> because old processors didn't know these, we should fix that for
> CONFIG_POWER8
Hi,
> On Sun, 2016-05-29 at 22:03 +1000, Anton Blanchard wrote:
> > From: Anton Blanchard <an...@samba.org>
> >
> > In setup_sigcontext(), we set current->thread.vrsave then use it
> > straight after. Since current is hidden from the compiler via inl
From: Anton Blanchard <an...@samba.org>
In many cases we disable interrupts right before calling
find_linux_pte_or_hugepte().
find_linux_pte_or_hugepte() first checks interrupts are disabled
before calling __find_linux_pte_or_hugepte():
if (!arch_irqs_di
From: Anton Blanchard <an...@samba.org>
In setup_sigcontext(), we set current->thread.vrsave then use it
straight after. Since current is hidden from the compiler via inline
assembly, it cannot optimise this and we end up with a load hit store.
Fix this by using a temporary.
S
From: Anton Blanchard <an...@samba.org>
In both __giveup_fpu() and __giveup_altivec() we make two modifications
to tsk->thread.regs->msr. gcc decides to do a read/modify/write of
each change, so we end up with a load hit store:
ld r9,264(r10)
rldicl
Hi,
> This enables us to share the same page table code for
> both radix and hash. Radix use a hardware defined big endian
> page table
This is measurably worse (a little over 2% on POWER8) on a futex
microbenchmark:
#define _GNU_SOURCE
#include
#include
#include
#define ITERATIONS 1000
Align the hot loops in our assembly implementation of strncpy(),
strncmp() and memchr().
Signed-off-by: Anton Blanchard <an...@samba.org>
---
Index: linux.junk/arch/powerpc/lib/string.S
===
--- linux.junk.orig/arch/power
just remove the assembly versions.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
index e40010a..da3cdff 100644
Index: linux.junk/arch/powerpc/include/asm/string.h
===
--- linux.junk.orig/arch/powerpc/include/asm/st
The comment explaining why we modify VRSAVE is misleading, glibc
does rely on the behaviour. Update the comment.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
diff --git a/arch/powerpc/kernel/vector.S b/arch/powerpc/kernel/vector.S
index 1c2e7a3..3907fcf 100644
--- a/arch/powerpc/
We don't support transactional memory in PR KVM, so don't tell
the OS that we do.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
v2: Fix build with CONFIG_KVM disabled, noticed by Alex.
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b69995e..dc3e3c9 100644
--- a/hw/ppc/spapr.c
er for working with me to find this issue.
Signed-off-by: Anton Blanchard <an...@samba.org>
Cc: <sta...@vger.kernel.org>
Fixes: d0cebfa650a0 ("powerpc: word-at-a-time optimization for 64-bit Little
Endian")
---
diff --git a/arch/powerpc/include/asm/word-at-a-time.h
b/arch/powerpc
instructions and it dies trying.
This (together with a QEMU patch) fixes PR KVM, which doesn't currently
support TM.
Signed-off-by: Anton Blanchard <an...@samba.org>
Cc: sta...@vger.kernel.org
---
arch/powerpc/kernel/prom.c | 9 +
1 file changed, 5 insertions(+), 4 deletions(-)
scan_features() updates cpu_user_features but not cpu_user_features2.
Amongst other things, cpu_user_features2 contains the user TM feature
bits which we must keep in sync with the kernel TM feature bit.
Signed-off-by: Anton Blanchard <an...@samba.org>
Cc: sta...@vger.kernel.org
---
MMU-related features")
Signed-off-by: Anton Blanchard <an...@samba.org>
Cc: sta...@vger.kernel.org
---
arch/powerpc/kernel/prom.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 7030b03..9a3a7c6 100644
--- a/arc
Hi Jack,
> Previously we just saved the FSCR, but only restored it in some
> settings, and never copied it thread to thread. This patch always
> restores the FSCR and formalizes new threads inheriting its setting so
> that later we can manipulate FSCR bits in start_thread.
Will this break the
task_pt_regs() can return NULL for kernel threads, so add a check.
This fixes an oops at boot on ppc64.
Fixes: d740037fac70 ("sched/cpuacct: Split usage accounting into user_usage and
sys_usage")
Signed-off-by: Anton Blanchard <an...@samba.org>
Reported-and-Tested-by: Srikar
atch below does fix the oops for me.
Anton
--
task_pt_regs() can return NULL for kernel threads, so add a check.
This fixes an oops at boot on ppc64.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index df947e0..41f85c4 100644
the user CPU feature bits.
Without this patch userspace processes will think they can execute
TM instructions and get killed when they try.
Signed-off-by: Anton Blanchard <an...@samba.org>
Cc: sta...@vger.kernel.org
---
Michael I've added stable here because I'm seeing this on a number
of d
We don't support transactional memory in PR KVM, so don't tell
the OS that we do.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index e7be21e..538bd87 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -696,6 +696,12 @@ stati
Hi Alexey,
> > I can't get an Ubuntu Wily guest to boot on an Ubuntu Wily host in
> > PR KVM mode. The kernel in both cases is 4.2. To reproduce:
> >
> > wget -N
> > https://cloud-images.ubuntu.com/wily/current/wily-server-cloudimg-ppc64el-disk1.img
> >
> > qemu-system-ppc64 -cpu POWER8
Hi,
I can't get an Ubuntu Wily guest to boot on an Ubuntu Wily host in PR KVM
mode. The kernel in both cases is 4.2. To reproduce:
wget -N
https://cloud-images.ubuntu.com/wily/current/wily-server-cloudimg-ppc64el-disk1.img
qemu-system-ppc64 -cpu POWER8 -enable-kvm -machine pseries,kvm-type=PR
match what gcc
emits. We should first look for __powerpc64__, then __powerpc__.
Fixes: 9b07e27f88b9 ("perf inject: Add jitdump mmap injection support")
Signed-off-by: Anton Blanchard <an...@samba.org>
---
diff --git a/tools/perf/util/genelf.h b/tools/perf/util/genelf.h
ind
code doesn't use VRSAVE to determine which registers to
> save/restore, but the value of VRSAVE is used to determine if altivec
> is being used in several code paths.
Nice catch, not sure how I missed that. As Ben suggests, it should
definitely go to -stable as well.
Feel free to add my sign of
> Since binutils 2.26 BFD is doing suffix merging on STRTAB sections.
> But dedotify modifies the symbol names in place, which can also modify
> unrelated symbols with a name that matches a suffix of a dotted
> name. To remove the leading dot of a symbol name we can just
>
hm...@gmail.com>
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/kernel/process.c | 13 ++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index fffc2d2..0cffb2c 100644
--- a/arch/pow
rn through
ret_from_fork() or ret_from_kernel_thread(). This means restore_sprs() is
not getting called for new tasks.
Fix this by moving restore_sprs() before _switch().
Signed-off-by: Anton Blanchard <an...@samba.org>
Fixes: 152d523e6307 ("powerpc: Create context switch helpers save_spr
not be working around this in the testcase, it is a kernel bug.
Fix it by copying the current DSCR to the child, instead of what we
had in the thread struct at last context switch.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/kernel/process.c | 2 +-
> Currently we copy the whole mm_context_t to the paca but only access a
> few bits of it. This is wasteful of space paca and also takes quite
> some time in the hot path of context switching.
>
> This patch pulls in only the required bits from the mm_context_t to
> the paca and on context
This means restore_sprs() is not getting called. Add a call to it
in ret_from_fork() and ret_from_kernel_thread().
Signed-off-by: Anton Blanchard <an...@samba.org>
Fixes: 152d523e6307 ("powerpc: Create context switch helpers save_sprs() and
restore_sprs()")
---
arch/powe
Prototypes for sys_ni_syscall and sys_call_table are available
in header files, so remove the prototypes in c code.
This was noticed when building with -flto, because the prototype for
sys_ni_syscall doesn't match the function and we get a compiler error.
Signed-off-by: Anton Blanchard
Hi Torsten,
> > -ccflags-y := -Werror
> > +ccflags-y := -Werror -Wno-unused-const-variable
>
> JFYI, my gcc-4.3 does not like this switch.
> What's the minimum compiler version to build this code?
-Werror is such a moving target. I'm also seeing issues when building
with clang, eg:
Check the assembler supports -maltivec by wrapping it with
call as-option.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
index 96efd82..76a315d
Hi,
> > +{
> > + if (val & (MSR_TM | MSR_TS_S | MSR_TS_T)) {
> > + printk(",TM[");
> > + printbits(val, msr_tm_bits, "");
> > + printk("]");
>
> I suspect all these individual printks are going to behave badly if
> we have multiple cpus crashing simultaneously.
Hi Scott,
> I wonder why it was 64-bit specific in the first place.
I think it was part of a series where I added my 64bit assembly checksum
routines, and I didn't step back and think that the wrapper code would
be useful on 32 bit.
Anton
their sister functions.
Scott: There are changes to the SPE code here which I have only been
able to compile test.
Anton
--
Anton Blanchard (19):
powerpc: Don't disable kernel FP/VMX/VSX MSR bits on context switch
powerpc: Don't disable MSR bits in do_load_up_transact_*() functions
powerpc: Create
microbenchmark using yield():
http://ozlabs.org/~anton/junkcode/context_switch2.c
./context_switch2 --test=yield --fp 0 0
shows an improvement of almost 3% on POWER8.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/kernel/entry_64.S | 15 +--
1 file changed, 1 ins
can
do.
- SPR writes are slow, so check that the value is changing before
writing it.
A context switch microbenchmark using yield():
http://ozlabs.org/~anton/junkcode/context_switch2.c
./context_switch2 --test=yield 0 0
shows an improvement of almost 10% on POWER8.
Signed-off-by: Anton Blanchard
Similar to the non TM load_up_*() functions, don't disable the MSR
bits on the way out.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/kernel/fpu.S    | 4 ----
arch/powerpc/kernel/vector.S | 4 ----
2 files changed, 8 deletions(-)
diff --git a/arch/powerpc/kernel/f
UP and SMP, but
in preparation for that remove these UP only optimisations.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/include/asm/processor.h | 6 --
arch/powerpc/include/asm/switch_to.h | 8 ---
arch/powerpc/kernel/fpu.S| 35 ---
arch/powerpc/
No need to execute mflr twice.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/kernel/entry_64.S | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index e84e5bc..c8b4225 100644
---
Instead of having multiple giveup_*_maybe_transactional() functions,
separate out the TM check into a new function called
check_if_tm_restore_required().
This will make it easier to optimise the giveup_*() functions in a
subsequent patch.
Signed-off-by: Anton Blanchard <an...@samba.
mtmsrd_isync() will do an mtmsrd followed by an isync on older
processors. On newer processors we avoid the isync via a feature fixup.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/include/asm/reg.h | 8
arch/powerpc/kernel/process.c
Move the MSR modification into new c functions. Removing it from
the low level functions will allow us to avoid costly MSR writes
by batching them up.
Move the check_if_tm_restore_required() check into these new functions.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/p
We used to allow giveup_*() to be called with a NULL task struct
pointer. Now those cases are handled in the caller we can remove
the checks. We can also remove giveup_altivec_notask() which is also
unused.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/include/asm/switc
an improvement of 3% on POWER8.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/include/asm/switch_to.h | 1 +
arch/powerpc/kernel/process.c| 75
arch/powerpc/kvm/book3s_pr.c | 17 +---
3 files changed, 63 insertions(
Most of __switch_to() is housekeeping, TLB batching, timekeeping etc.
Move these away from the more complex and critical context switching
code.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/kernel/process.c | 52 +--
1 file chang
for a debug boot option that
does this and catches bad uses in other areas of the kernel.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/crypto/aes-spe-glue.c | 1 +
arch/powerpc/crypto/sha1-spe-glue.c | 1 +
arch/powerpc/crypto/sha256-spe-glue.c| 1 +
arch/p
With the recent change to enable_kernel_vsx(), we no longer need
to call enable_kernel_fp() and enable_kernel_altivec().
Signed-off-by: Anton Blanchard <an...@samba.org>
---
drivers/crypto/vmx/aes.c | 3 ---
drivers/crypto/vmx/aes_cbc.c | 3 ---
drivers/crypto/vmx/aes_ctr.c | 3 ---
d
Add a boot option that strictly manages the MSR unavailable bits.
This catches kernel uses of FP/Altivec/SPE that would otherwise
corrupt user state.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
Documentation/kernel-parameters.txt | 6 ++
arch/powerpc/include/asm/reg.h
Remove a bunch of unnecessary fallback functions and group
things in a more logical way.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/include/asm/switch_to.h | 35 ++-
arch/powerpc/kernel/process.c| 2 +-
2 files chang
functions, and allows us to use
flush_vsx_to_thread() in the signal code.
Move the check_if_tm_restore_required() check in.
Signed-off-by: Anton Blanchard <an...@samba.org>
---
arch/powerpc/kernel/process.c | 28 +++-
arch/powerpc/kernel/signal_32.c | 4 ++--
arch/p