Re: [PATCH 1/3] powerpc: POWER7 optimised copy_page using VMX

2011-06-17 Thread Segher Boessenkool
+ addi r1,r1,STACKFRAMESIZE + + .align 5 Do we know that the blank will be filled with something harmless? Yes. See ppc_handle_align() in gas/config/tc-ppc.c: it fills with nops (ori 0,0,0), and a branch if there are more than four nops, and for POWER6 and POWER7 it puts a

[PATCH 1/3] powerpc: POWER7 optimised copy_page using VMX

2011-06-16 Thread Anton Blanchard
Implement a POWER7 optimised copy_page using VMX. We copy a cacheline at a time using VMX loads and stores. Signed-off-by: Anton Blanchard an...@samba.org --- How do we want to handle per machine optimised functions? I create yet another feature bit, but feature bits might get out of control at

Re: [PATCH 1/3] powerpc: POWER7 optimised copy_page using VMX

2011-06-16 Thread Michael Neuling
Implement a POWER7 optimised copy_page using VMX. We copy a cacheline at a time using VMX loads and stores. Signed-off-by: Anton Blanchard an...@samba.org --- How do we want to handle per machine optimised functions? I create yet another feature bit, but feature bits might get out of

Re: [PATCH 1/3] powerpc: POWER7 optimised copy_page using VMX

2011-06-16 Thread Anton Blanchard
Hi, Yeah, I'm pretty against CPU_FTR_POWER7. Every loon is going to attach anything POWER7 to it. I'm keen to see it set up in __setup_cpu_power7. Either a function pointer or use the patch_instruction infrastructure to avoid indirect function calls on small copies. Instruction
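
As a rough illustration of the function-pointer option raised in this message, here is a minimal user-space C sketch, not kernel code. The names copy_page_fn, copy_page_generic, copy_page_fast and setup_copy_page, and the PAGE_SIZE value, are hypothetical; the thread itself only names __setup_cpu_power7 as the place where such a selection might be made.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096 /* assumed page size for this sketch only */

typedef void (*copy_page_fn)(void *to, const void *from);

/* Generic fallback used when no CPU-specific routine is selected. */
static void copy_page_generic(void *to, const void *from)
{
    memcpy(to, from, PAGE_SIZE);
}

/* Stand-in for a CPU-specific routine (hypothetical). A real one would
 * use wider loads and stores; here it only has to copy the same bytes. */
static void copy_page_fast(void *to, const void *from)
{
    memcpy(to, from, PAGE_SIZE);
}

/* The pointer starts at the generic routine so early callers are safe. */
static copy_page_fn copy_page_impl = copy_page_generic;

/* Called once from CPU setup code (hypothetical hook). */
static void setup_copy_page(int cpu_has_fast_copy)
{
    copy_page_impl = cpu_has_fast_copy ? copy_page_fast : copy_page_generic;
}

/* Every caller goes through one indirect call. */
static void copy_page(void *to, const void *from)
{
    assert(copy_page_impl != NULL);
    copy_page_impl(to, from);
}
```

The indirect call on every copy is the cost the message wants to avoid for small copies; patching the call site once at boot is the alternative it mentions.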

Re: [PATCH 1/3] powerpc: POWER7 optimised copy_page using VMX

2011-06-16 Thread Benjamin Herrenschmidt
On Fri, 2011-06-17 at 14:53 +1000, Anton Blanchard wrote: Implement a POWER7 optimised copy_page using VMX. We copy a cacheline at a time using VMX loads and stores. Signed-off-by: Anton Blanchard an...@samba.org --- How do we want to

Re: [PATCH 1/3] powerpc: POWER7 optimised copy_page using VMX

2011-06-16 Thread Benjamin Herrenschmidt
On Fri, 2011-06-17 at 14:53 +1000, Anton Blanchard wrote: +#include <asm/page.h> +#include <asm/ppc_asm.h> + +#define STACKFRAMESIZE 112 + +_GLOBAL(copypage_power7) + mflr r0 + std r3,48(r1) + std r4,56(r1) + std r0,16(r1) + stdu