Re: [PATCH] ARM: convert all mov.* pc, reg to bx reg for ARMv6+ (part1)

2014-07-04 Thread Tomasz Figa
Hi Russell,

On 01.07.2014 18:19, Russell King wrote:
 ARMv6 and greater introduced a new instruction (bx) which can be used
 to return from function calls.  Recent CPUs perform better when the
 bx lr instruction is used rather than the mov pc, lr instruction,
 and this sequence is strongly recommended to be used by the ARM
 architecture manual (section A.4.1.1).
 
 We provide a new macro ret with all its variants for the condition
 code which will resolve to the appropriate instruction.
 
 Rather than doing this piecemeal, and miss some instances, change all
 the mov pc instances to use the new macro, with the exception of
 the movs instruction and the kprobes code.  This allows us to detect
 the mov pc, lr case and fix it up - and also gives us the possibility
 of deploying this for other registers depending on the CPU selection.
 
 Signed-off-by: Russell King rmk+ker...@arm.linux.org.uk

Build- and boot-tested on mach-s3c64xx.

Tested-by: Tomasz Figa tomasz.f...@gmail.com

Best regards,
Tomasz
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ARM: convert all mov.* pc, reg to bx reg for ARMv6+ (part1)

2014-07-01 Thread Måns Rullgård
Russell King rmk+ker...@arm.linux.org.uk writes:

 ARMv6 and greater introduced a new instruction (bx) which can be used
 to return from function calls.  Recent CPUs perform better when the
 bx lr instruction is used rather than the mov pc, lr instruction,
 and this sequence is strongly recommended to be used by the ARM
 architecture manual (section A.4.1.1).

 We provide a new macro ret with all its variants for the condition
 code which will resolve to the appropriate instruction.

When the source register is not lr the name ret is a misnomer since
only the bx lr instruction is predicted as a function return.  The
bx instruction with other source registers uses the normal prediction
mechanisms, leaving the return stack alone, and should not be used for
function returns.  Any code currently using another register to return
from a function should probably be modified to use lr instead, unless
there are special reasons for doing otherwise.  If code jumping to an
address in a non-lr register is not a return, using the ret name will
make for some rather confusing reading.

I would suggest either using a more neutral name than ret or adding an
alias to be used for non-return jumps so as to make the intent clearer.

-- 
Måns Rullgård
m...@mansr.com
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ARM: convert all mov.* pc, reg to bx reg for ARMv6+ (part1)

2014-07-01 Thread Russell King - ARM Linux
On Tue, Jul 01, 2014 at 05:42:42PM +0100, Måns Rullgård wrote:
 Russell King rmk+ker...@arm.linux.org.uk writes:
 
  ARMv6 and greater introduced a new instruction (bx) which can be used
  to return from function calls.  Recent CPUs perform better when the
  bx lr instruction is used rather than the mov pc, lr instruction,
  and this sequence is strongly recommended to be used by the ARM
  architecture manual (section A.4.1.1).
 
  We provide a new macro ret with all its variants for the condition
  code which will resolve to the appropriate instruction.
 
 When the source register is not lr the name ret is a misnomer since
 only the bx lr instruction is predicted as a function return.  The
 bx instruction with other source registers uses the normal prediction
 mechanisms, leaving the return stack alone, and should not be used for
 function returns.  Any code currently using another register to return
 from a function should probably be modified to use lr instead, unless
 there are special reasons for doing otherwise.  If code jumping to an
 address in a non-lr register is not a return, using the ret name will
 make for some rather confusing reading.
 
 I would suggest either using a more neutral name than ret or adding an
 alias to be used for non-return jumps so as to make the intent clearer.

If you read the patch, the branches which are changed are those which
do indeed return in some way.  There are those which do this having
moved lr to a different register.

As you point out, bx lr /may/ be treated specially (I've actually been
discussing this with Will Deacon over the last couple of days, who has
also been talking to the hardware people in ARM, and Will is happy with
this patch as in its current form.)  This is why I've changed all
mov pc, reg instructions which return in some way to use this macro,
and left others (those which are used to call some function and return
back to the same point) alone.

I have thought about introducing a call macro for those other sites,
but as there are soo few of them, I've left them as-is for the time
being (this patch is already large enough.)

If there are any in the patch which you have specific concerns about,
please specifically point them out those which give you a concern.

Thanks.

-- 
FTTC broadband for 0.8mile line: now at 9.7Mbps down 460kbps up... slowly
improving, and getting towards what was expected from it.
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ARM: convert all mov.* pc, reg to bx reg for ARMv6+ (part1)

2014-07-01 Thread Måns Rullgård
Russell King - ARM Linux li...@arm.linux.org.uk writes:

 On Tue, Jul 01, 2014 at 05:42:42PM +0100, Måns Rullgård wrote:
 Russell King rmk+ker...@arm.linux.org.uk writes:
 
  ARMv6 and greater introduced a new instruction (bx) which can be used
  to return from function calls.  Recent CPUs perform better when the
  bx lr instruction is used rather than the mov pc, lr instruction,
  and this sequence is strongly recommended to be used by the ARM
  architecture manual (section A.4.1.1).
 
  We provide a new macro ret with all its variants for the condition
  code which will resolve to the appropriate instruction.
 
 When the source register is not lr the name ret is a misnomer since
 only the bx lr instruction is predicted as a function return.  The
 bx instruction with other source registers uses the normal prediction
 mechanisms, leaving the return stack alone, and should not be used for
 function returns.  Any code currently using another register to return
 from a function should probably be modified to use lr instead, unless
 there are special reasons for doing otherwise.  If code jumping to an
 address in a non-lr register is not a return, using the ret name will
 make for some rather confusing reading.
 
 I would suggest either using a more neutral name than ret or adding an
 alias to be used for non-return jumps so as to make the intent clearer.

 If you read the patch, the branches which are changed are those which
 do indeed return in some way.  There are those which do this having
 moved lr to a different register.

The patch is huge, and it wasn't obvious (to me) from the diff context
what the non-lr jumps were doing.

 As you point out, bx lr /may/ be treated specially (I've actually been

Most, if not all, Cortex-A cores do this according the public TRMs.
They also do the same thing for mov pc, lr so there will probably be
no performance gain from this change.  It's still a good idea though,
since we don't know what future cores will do.

 discussing this with Will Deacon over the last couple of days, who has
 also been talking to the hardware people in ARM, and Will is happy with
 this patch as in its current form.)  This is why I've changed all
 mov pc, reg instructions which return in some way to use this macro,
 and left others (those which are used to call some function and return
 back to the same point) alone.

In that case the patch should be fine.  Your patch description didn't
make it clear that only actual returns were being changed.

-- 
Måns Rullgård
m...@mansr.com
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ARM: convert all mov.* pc, reg to bx reg for ARMv6+ (part1)

2014-07-01 Thread Will Deacon
Hi Mans,

On Tue, Jul 01, 2014 at 06:24:43PM +0100, Måns Rullgård wrote:
 Russell King - ARM Linux li...@arm.linux.org.uk writes:
  As you point out, bx lr /may/ be treated specially (I've actually been
 
 Most, if not all, Cortex-A cores do this according the public TRMs.
 They also do the same thing for mov pc, lr so there will probably be
 no performance gain from this change.  It's still a good idea though,
 since we don't know what future cores will do.

Funnily enough, that's not actually true (and is more or less what prompted
this patch after discussion with Russell). There are cores out there that
don't predict mov pc, lr at all (let alone do anything with the return
stack).

  discussing this with Will Deacon over the last couple of days, who has
  also been talking to the hardware people in ARM, and Will is happy with
  this patch as in its current form.)  This is why I've changed all
  mov pc, reg instructions which return in some way to use this macro,
  and left others (those which are used to call some function and return
  back to the same point) alone.
 
 In that case the patch should be fine.  Your patch description didn't
 make it clear that only actual returns were being changed.

I'm led to believe that some predictors require lr in order to update the
return stack, whilst others don't. That part is all horribly
micro-architectural, so the current patch is doing the right thing by
sticking to the ARM ARM but enabling us to hook into other registers later
on if we choose.

Cheers,

Will
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ARM: convert all mov.* pc, reg to bx reg for ARMv6+ (part1)

2014-07-01 Thread Stephen Warren
On 07/01/2014 10:19 AM, Russell King wrote:
 ARMv6 and greater introduced a new instruction (bx) which can be used
 to return from function calls.  Recent CPUs perform better when the
 bx lr instruction is used rather than the mov pc, lr instruction,
 and this sequence is strongly recommended to be used by the ARM
 architecture manual (section A.4.1.1).
 
 We provide a new macro ret with all its variants for the condition
 code which will resolve to the appropriate instruction.
 
 Rather than doing this piecemeal, and miss some instances, change all
 the mov pc instances to use the new macro, with the exception of
 the movs instruction and the kprobes code.  This allows us to detect
 the mov pc, lr case and fix it up - and also gives us the possibility
 of deploying this for other registers depending on the CPU selection.

Tested-by: Stephen Warren swar...@nvidia.com

(On an NVIDIA Tegra Jetson TK1 board, both CPU hotplug and system sleep
were tested, which are the use-cases that actually use the edited
assembly files)
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ARM: convert all mov.* pc, reg to bx reg for ARMv6+ (part1)

2014-07-01 Thread Måns Rullgård
Will Deacon will.dea...@arm.com writes:

 Hi Mans,

 On Tue, Jul 01, 2014 at 06:24:43PM +0100, Måns Rullgård wrote:
 Russell King - ARM Linux li...@arm.linux.org.uk writes:
  As you point out, bx lr /may/ be treated specially (I've actually been
 
 Most, if not all, Cortex-A cores do this according the public TRMs.
 They also do the same thing for mov pc, lr so there will probably be
 no performance gain from this change.  It's still a good idea though,
 since we don't know what future cores will do.

 Funnily enough, that's not actually true (and is more or less what prompted
 this patch after discussion with Russell). There are cores out there that
 don't predict mov pc, lr at all (let alone do anything with the return
 stack).

Even ones where the TRM says they do?  I suppose the only way to know
for sure is to measure it.

  discussing this with Will Deacon over the last couple of days, who has
  also been talking to the hardware people in ARM, and Will is happy with
  this patch as in its current form.)  This is why I've changed all
  mov pc, reg instructions which return in some way to use this macro,
  and left others (those which are used to call some function and return
  back to the same point) alone.
 
 In that case the patch should be fine.  Your patch description didn't
 make it clear that only actual returns were being changed.

 I'm led to believe that some predictors require lr in order to update the
 return stack, whilst others don't. That part is all horribly
 micro-architectural, so the current patch is doing the right thing by
 sticking to the ARM ARM but enabling us to hook into other registers later
 on if we choose.

Agreed.

-- 
Måns Rullgård
m...@mansr.com
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ARM: convert all mov.* pc, reg to bx reg for ARMv6+ (part1)

2014-07-01 Thread Robert Jarzmik
Russell King rmk+ker...@arm.linux.org.uk writes:

 ARMv6 and greater introduced a new instruction (bx) which can be used
 to return from function calls.  Recent CPUs perform better when the
 bx lr instruction is used rather than the mov pc, lr instruction,
 and this sequence is strongly recommended to be used by the ARM
 architecture manual (section A.4.1.1).

For ARMv5 - XScale, and more specificaly the mioa701_bootresume.S path (suspend
to RAM) :
Tested-by: Robert Jarzmik robert.jarz...@free.fr

Cheers.

--
Robert
--
To unsubscribe from this list: send the line unsubscribe linux-omap in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] ARM: convert all mov.* pc, reg to bx reg for ARMv6+ (part1)

2014-07-01 Thread Nicolas Pitre
On Tue, 1 Jul 2014, Will Deacon wrote:

 Hi Mans,
 
 On Tue, Jul 01, 2014 at 06:24:43PM +0100, Måns Rullgård wrote:
  Russell King - ARM Linux li...@arm.linux.org.uk writes:
   As you point out, bx lr /may/ be treated specially (I've actually been
  
  Most, if not all, Cortex-A cores do this according the public TRMs.
  They also do the same thing for mov pc, lr so there will probably be
  no performance gain from this change.  It's still a good idea though,
  since we don't know what future cores will do.
 
 Funnily enough, that's not actually true (and is more or less what prompted
 this patch after discussion with Russell). There are cores out there that
 don't predict mov pc, lr at all (let alone do anything with the return
 stack).
 
   discussing this with Will Deacon over the last couple of days, who has
   also been talking to the hardware people in ARM, and Will is happy with
   this patch as in its current form.)  This is why I've changed all
   mov pc, reg instructions which return in some way to use this macro,
   and left others (those which are used to call some function and return
   back to the same point) alone.
  
  In that case the patch should be fine.  Your patch description didn't
  make it clear that only actual returns were being changed.
 
 I'm led to believe that some predictors require lr in order to update the
 return stack, whilst others don't. That part is all horribly
 micro-architectural, so the current patch is doing the right thing by
 sticking to the ARM ARM but enabling us to hook into other registers later
 on if we choose.

May I suggest to have a patch with only the macro definition in it and 
all this discussion in the commit log please?  The usage sites should be 
done in a separate patch to make it clearer.


Nicolas