On Mon, 2011-12-19 at 14:00 +1100, Benjamin Herrenschmidt wrote:
> On Thu, 2011-12-08 at 17:11 +1100, Anton Blanchard wrote:
> > Implement a POWER7 optimised copy_to_user/copy_from_user using VMX.
> > For large aligned copies this new loop is over 10% faster, and for
> > large unaligned copies it i
On Thu, 2011-12-08 at 17:11 +1100, Anton Blanchard wrote:
> Implement a POWER7 optimised copy_to_user/copy_from_user using VMX.
> For large aligned copies this new loop is over 10% faster, and for
> large unaligned copies it is over 200% faster.
Breaks !CONFIG_ALTIVEC build and pops some WARN's wit
On Thu, 2011-12-08 at 17:04 +1100, Anton Blanchard wrote:
> Hi,
>
> > I hate the idea of having a POWER7 FTR bit. Every loon will (and has
> > tried to in the past) attach every POWER7 related thing to it, rather
> > than thinking about what the feature really is for.
> >
> > What about other
One idea would be to have a structure of function pointers for each
CPU that gets runtime patched into the right places,
similar to how we do some of the MMU fixups.
Sounds good to me :-)
Except the indirect jump/call is almost certainly
never predicted - so will be slow.
What indirect jump?
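
For readers following the exchange above, here is a minimal user-space sketch of the "structure of function pointers for each CPU" idea. It is not taken from the patch or from the kernel: the names cpu_copy_ops, copy_generic, copy_vector, select_copy_ops and do_copy are all hypothetical, and memcpy stands in for the real copy loops. As written, do_copy() calls through a pointer, which is the indirect call the objection is about. The runtime-patching variant proposed in the thread would presumably rewrite each call site into a direct branch at boot, which plain C cannot express, so that part is only described in the comments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-CPU table of optimised routines. The return value
 * mimics the copy_to_user convention: number of bytes NOT copied. */
struct cpu_copy_ops {
    const char *name;
    size_t (*copy)(void *dst, const void *src, size_t n);
};

/* Baseline loop, usable on any CPU. */
static size_t copy_generic(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    return 0;
}

/* Stand-in for a vector (e.g. VMX) loop; here it is just memcpy. */
static size_t copy_vector(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
    return 0;
}

static const struct cpu_copy_ops generic_ops = { "generic", copy_generic };
static const struct cpu_copy_ops vector_ops  = { "vector",  copy_vector };

/* Table selected once at startup, based on what the CPU supports. */
static const struct cpu_copy_ops *cur_ops = &generic_ops;

static void select_copy_ops(int has_vector_unit)
{
    cur_ops = has_vector_unit ? &vector_ops : &generic_ops;
    assert(cur_ops->copy != NULL);
}

/* Unpatched form: an indirect call through the table on every copy.
 * A runtime-patched form would replace this call site with a direct
 * branch to the chosen routine, so no pointer is loaded at run time. */
static size_t do_copy(void *dst, const void *src, size_t n)
{
    return cur_ops->copy(dst, src, n);
}
```

With this layout, adding a routine for another core means adding another table entry rather than another feature bit, which is the concern raised elsewhere in the thread.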
> > One idea would be to have a structure of function pointers for each
> > CPU that gets runtime patched into the right places,
> > similar to how we do some of the MMU fixups.
>
> Sounds good to me :-)
Except the indirect jump/call is almost certainly
never predicted - so will be slow.
You
I hate the idea of having a POWER7 FTR bit. Every loon will (and has
tried to in the past) attach every POWER7 related thing to it, rather
than thinking about what the feature really is for.
What about other processors which could also benefit from this copy
loop? Turning on CPU_FTR_POWER7 for
Hi,
> I hate the idea of having a POWER7 FTR bit. Every loon will (and has
> tried to in the past) attach every POWER7 related thing to it, rather
> than thinking about what the feature really is for.
>
> What about other processors which could also benefit from this copy
> loop? Turning on
> > +#define CPU_FTR_POWER7 LONG_ASM_CONST(0x2000)
> Can we find a means to do the fixup that does NOT require a FTR bit? I
> have the feeling FSL will want to have various optimized copy functions
> for our different cores and I hate to blow features bits just for this.
+1
I hat
On Dec 7, 2011, at 11:02 PM, Anton Blanchard wrote:
> Index: linux-build/arch/powerpc/include/asm/cputable.h
> ===
> --- linux-build.orig/arch/powerpc/include/asm/cputable.h 2011-09-07 15:15:49.096458526 +1000
> +++ linux-bui