On 12/10/16 09:59, Christophe Lyon wrote:
Hi Kyrill,

On 7 October 2016 at 17:00, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote:
Hi Christophe,

On 07/09/16 21:05, Christophe Lyon wrote:

The attached patch is a first part to solve PR 67591: it removes
several occurrences of "IT blocks containing 32-bit Thumb
instructions are deprecated in ARMv8" messages in the
gcc/g++/libstdc++/fortran testsuites.

It does not remove them all yet. This patch only modifies the
*cmp_and, *cmp_ior, *ior_scc_scc, *ior_scc_scc_cmp,
*and_scc_scc and *and_scc_scc_cmp patterns.
Additional work is required in sub_shiftsi etc, at least.
I've started looking at these, but I decided I could already
post this self-contained patch to check if this implementation
is OK.

Regarding *cmp_and and *cmp_ior patterns, the addition of the
enabled_for_depr_it attribute is aggressive in the sense that it keeps
only the alternatives with 'l' and 'Py' constraints, while in some
cases the constraints could be relaxed. Indeed, these 2 patterns can
swap their input comparisons, meaning that any of them can be emitted
in the IT-block, and is thus subject to the ARMv8 deprecation.
The generated code is possibly suboptimal in the cases where the
operands are not swapped, since 'r' could be used.

Cross-tested on arm-none-linux-gnueabihf with -mthumb/-march=armv8-a
and --with-cpu=cortex-a57 --with-mode=thumb, showing only improvements:


Bootstrapped OK on armv8l HW.

Is this OK?



  (define_insn_and_split "*ior_scc_scc"
-  [(set (match_operand:SI 0 "s_register_operand" "=Ts")
+  [(set (match_operand:SI 0 "s_register_operand" "=Ts,Ts")
         (ior:SI (match_operator:SI 3 "arm_comparison_operator"
-                [(match_operand:SI 1 "s_register_operand" "r")
-                 (match_operand:SI 2 "arm_add_operand" "rIL")])
+                [(match_operand:SI 1 "s_register_operand" "r,l")
+                 (match_operand:SI 2 "arm_add_operand" "rIL,lPy")])
                 (match_operator:SI 6 "arm_comparison_operator"
-                [(match_operand:SI 4 "s_register_operand" "r")
-                 (match_operand:SI 5 "arm_add_operand" "rIL")])))
+                [(match_operand:SI 4 "s_register_operand" "r,l")
+                 (match_operand:SI 5 "arm_add_operand" "rIL,lPy")])))

Can you please put the more restrictive alternatives (lPy) first?
Same with the other patterns your patch touches.
Ok with that change if a rebased testing run is ok.
Sorry for the delay in reviewing.

OK, I will update my patch accordingly.

However, when I discussed this with Ramana during the Cauldron,
he requested benchmark results. So far, I was able to run spec2006
on an APM machine, and I'm seeing performance changes in the
range 11% improvement (465.tonto) to 7% degradation (433.milc).

Would that be acceptable?

Those sound like quite large swings.
Are you sure the machine was not running anything else at the time
or playing tricks with frequency scaling?
Did all iterations of SPEC show a consistent difference?

If the changes are consistent, could you have a look at the codegen
to see if there are any clues to the differences?

I'd like to get an explanation for these differences before committing
this patch. If they are just an effect of the more restrictive constraints
then there may be not much we can do, but I'd like to make sure there's not
anything else unexpected going on.

The number of warnings (IT blocks containing 32-bit Thumb instructions
are deprecated in ARMv8)
was 712 without my patch and 122 with it. (using the hosts's binutils
I expected some warning, since as I said earlier other patterns need
to be updated.

Understood. That's fine.



Thanks for improving this area!

Reply via email to