Hi Richard,

>         * config/aarch64/aarch64.md (cpymemdi): Remove pattern condition.

> Shouldn't this be a separate patch?  It's not immediately obvious that this 
> is a necessary part of this change.

You mean this?

@@ -1627,7 +1627,7 @@ (define_expand "cpymemdi"
    (match_operand:BLK 1 "memory_operand")
    (match_operand:DI 2 "general_operand")
    (match_operand:DI 3 "immediate_operand")]
-   "!STRICT_ALIGNMENT || TARGET_MOPS"
+   ""

Yes, that's necessary, since that pattern condition is the bug.

> +  unsigned align = INTVAL (operands[3]);
>
> This should read the value with UINTVAL.  Given the useful range of the 
> alignment, it should be OK that we're not using unsigned HWI.

I'll fix that.

> +  if (!CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16))
>      return aarch64_expand_cpymem_mops (operands);
>
> So what about align=4 and copying, for example, 8 or 12 bytes; wouldn't we 
> want a sequence of LDR/STR in that case?  Doesn't this fall back to MOPS too 
> eagerly?

The goal was to fix the issue in a way that is both obvious and easy to
backport.
Further improvements can be made to handle other alignments, but that is
slightly tricky (e.g. align == 4 won't emit LDP/STP directly with the current
code, so generalizing the LDP path would need additional work).
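
To make the current behaviour concrete, here is a minimal standalone sketch of
the condition quoted above. It is not GCC code: the enum, the function name and
the boolean parameters are hypothetical stand-ins for CONST_INT_P (operands[2]),
STRICT_ALIGNMENT and the align value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not part of GCC.  */
enum copy_strategy
{
  COPY_VIA_MOPS_EXPANDER, /* Hand off to aarch64_expand_cpymem_mops.  */
  COPY_INLINE             /* Continue with the inline load/store path.  */
};

/* Models the check from the patch:
   !CONST_INT_P (operands[2]) || (STRICT_ALIGNMENT && align < 16)  */
static enum copy_strategy
choose_copy_strategy (bool size_is_const, bool strict_alignment,
                      unsigned align)
{
  if (!size_is_const || (strict_alignment && align < 16))
    return COPY_VIA_MOPS_EXPANDER;
  return COPY_INLINE;
}

/* A few cases, including the align == 4 one raised in the review.  */
static void
example_checks (void)
{
  assert (choose_copy_strategy (true, true, 4) == COPY_VIA_MOPS_EXPANDER);
  assert (choose_copy_strategy (true, true, 16) == COPY_INLINE);
  assert (choose_copy_strategy (true, false, 4) == COPY_INLINE);
}
```

So with STRICT_ALIGNMENT and align == 4, a constant-size 8 or 12 byte copy does
take the MOPS expander path for now, which is the conservative behaviour
described above.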
  
>> +  unsigned max_mops_size = aarch64_mops_memcpy_size_threshold;
>
> I find this name slightly confusing.  Surely it's min_mops_size (since above 
> that we want to use MOPS rather than inlined loads/stores).  But why not just 
> use aarch64_mops_memcpy_size_threshold directly in the one place it's used?

The reason is that a follow-on patch also checks
aarch64_mops_memcpy_size_threshold, so for now the local variable acts as a
shortcut for the ridiculously long name.

> Are there any additional tests for this?

There are existing tests that check the expansion; they fail if you completely
block expansions with STRICT_ALIGNMENT.

Cheers,
Wilco
