On 5/15/19 9:02 AM, Michael Matz wrote:
> On Wed, 15 May 2019, Aaron Sawdey wrote:
>> Next question would be how do we move from the existing movmem pattern
>> (which Michael Matz tells us should be renamed cpymem anyway) to this
>> new thing. Are you proposing that we still have both movmem and cpymem
>> optab entries underneath to call the patterns but introduce this new
>> memmove_with_hints() to be used by things called by
>> expand_builtin_memmove() and expand_builtin_memcpy()?
>
> I'd say so. There are multiple levels at play:
> a) exposure to the user: probably a new __builtin_memmove, or a new combined
> builtin with a hint param to differentiate (but we can't get rid of
> __builtin_memcpy/mempcpy/strcpy, which all can go through the same
> route in the middleend)
> b) getting it through the gimple pipeline, probably just a new builtin
> code, trivial
> c) expanding the new builtin, with the help of next items
> d) RTL block moves: they are defined as non-overlapping and I don't think
> we should change this (essentially they're the reflection of struct
> copies in C)
> e) how any of the above (builtins and RTL block moves) are implemented:
> currently non-overlapping only, using movmem pattern when possible;
> ultimately all sitting in the emit_block_move_hints() routine.
>
> So, I'd add a new method to emit_block_move_hints indicating possible
> overlap, disabling the use of move_by_pieces. Then in
> emit_block_move_via_movmem (also getting an indication of overlap), do the
> equivalent of:
>
> finished = 0;
> if (overlap_possible) {
> if (optab[movmem])
> finished = emit(movmem);
> } else {
> if (optab[cpymem])
> finished = emit(cpymem);
> if (!finished && optab[movmem]) // can use movmem also for overlap
> finished = emit(movmem);
> }
>
> The overlap_possible method would only ever be used from the builtin
> expansion, and never from the RTL block move expand. Additionally a
> target may optionally only define the movmem pattern if it's just as good
> as the cpymem pattern (e.g. because it only handles fixed small sizes and
> uses a load-all then store-all sequence).

We currently have gimple_fold_builtin_memory_op() figuring out where there
is no overlap and converting __builtin_memmove() to __builtin_memcpy(). Would
you foresee also converting a __builtin_memmove() with overlap into a call to
__builtin_memmove_hint when we can define the overlap precisely enough to
provide the hint? My guess is that this wouldn't be a very common case.
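To make "define the overlap precisely enough" concrete, here is a minimal
sketch of the kind of classification such a hint could carry. It assumes both
operands are known byte offsets from the same base object and that the length
is a known constant. The names overlap_kind and classify_overlap are
hypothetical and are not taken from GCC; this is not how
gimple_fold_builtin_memory_op() is written.

```c
#include <stddef.h>

/* Hypothetical hint values; not part of GCC. */
enum overlap_kind {
    OVERLAP_NONE,            /* ranges are disjoint: cpymem would be safe  */
    OVERLAP_SAME,            /* dst == src: the move is a no-op            */
    OVERLAP_DST_AFTER_SRC,   /* dst starts inside src: copy backward       */
    OVERLAP_DST_BEFORE_SRC   /* src starts inside dst: copy forward        */
};

/* Classify [dst_off, dst_off+len) against [src_off, src_off+len), where
   both offsets are relative to the same base object (an assumption).
   Only differences of offsets are computed, so nothing can wrap around.  */
static enum overlap_kind
classify_overlap(size_t dst_off, size_t src_off, size_t len)
{
    if (len == 0)
        return OVERLAP_NONE;
    if (dst_off == src_off)
        return OVERLAP_SAME;
    if (dst_off > src_off)
        return (dst_off - src_off >= len) ? OVERLAP_NONE
                                          : OVERLAP_DST_AFTER_SRC;
    return (src_off - dst_off >= len) ? OVERLAP_NONE
                                      : OVERLAP_DST_BEFORE_SRC;
}
```

Anything that cannot be classified this way (unknown bases or a variable
length) would simply stay a plain __builtin_memmove.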
My goals for this are:

* memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
* memmove() call becomes __builtin_memmove (or __builtin_memcpy based
  on the gimple analysis) and goes through optab[movmem] or optab[cpymem]

I think what you've described meets these goals and cleans things up.
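For reference, the selection order from the pseudocode quoted above can be
written as a small standalone C sketch. This is not GCC code: the
block_expander type, the block_move_optabs struct, and
emit_block_move_sketch() are hypothetical stand-ins for the optab lookup and
pattern expansion, and the three example expanders only exist so the
dispatch can be exercised outside the compiler.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a target pattern: returns true if it
   handled the move, false if the expansion failed.  */
typedef bool (*block_expander)(void *dst, const void *src, size_t len);

struct block_move_optabs {
    block_expander cpymem;   /* valid only for non-overlapping operands */
    block_expander movmem;   /* valid for overlapping operands as well  */
};

enum block_move_result { BLOCK_MOVE_NONE, BLOCK_MOVE_CPYMEM, BLOCK_MOVE_MOVMEM };

/* Try cpymem only when overlap is impossible; movmem is correct in
   either case, so it is both the overlap path and the fallback.  */
static enum block_move_result
emit_block_move_sketch(const struct block_move_optabs *t, void *dst,
                       const void *src, size_t len, bool overlap_possible)
{
    if (!overlap_possible && t->cpymem && t->cpymem(dst, src, len))
        return BLOCK_MOVE_CPYMEM;
    if (t->movmem && t->movmem(dst, src, len))
        return BLOCK_MOVE_MOVMEM;
    return BLOCK_MOVE_NONE;  /* caller would fall back to a libcall */
}

/* Example expanders, for illustration only.  */
static bool expand_with_memcpy(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return true;
}

static bool expand_with_memmove(void *dst, const void *src, size_t len)
{
    memmove(dst, src, len);
    return true;
}

static bool expand_fail(void *dst, const void *src, size_t len)
{
    (void)dst; (void)src; (void)len;
    return false;
}
```

A target that leaves the cpymem slot empty gets movmem for both cases, which
matches the "optionally only define the movmem pattern" remark above.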
Thanks,
Aaron
--
Aaron Sawdey, Ph.D. [email protected]
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain