Hi Edwin,

Sorry for the slow reply.

> Currently this patch only supports clipping and returning an unsigned
> narrow type. I'm unsure if this is the best way to approach the problem
> as there is a similar optab .SAT_TRUNC which performs a similar
> operation. The main difference between .NARROW_CLIP and .SAT_TRUNC can
> be described in the example above (clipping int32_t to uint8_t)
>
> * .SAT_TRUNC (-1)       => 255
> * .NARROW_CLIP (-1)     => 0
>
> This breaks the intended semantics of the code which is why I thought
> another optab would make sense. If there is a better way to approach
> this which would utilize more of the .SAT_TRUNC optab, please let me
> know.

Thanks for writing this down. I think I now see what you meant during the
call. So we have a signed -> signed saturating truncate and an unsigned
variant, but what we need here is a signed -> unsigned one?

In that case I'd tend to refrain from adding a new IFN because, as you
note, we can achieve the same result with a max (val, 0) followed by an
IFN_SAT_TRUNC on the result. The remaining question is then whether we'd
rather recognize this in the vectorizer patterns or in match.pd.
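
To make the equivalence concrete, here is a minimal scalar C sketch for the
int32_t -> uint8_t case. The function names are made up for illustration and
are not GCC internals; the sketch only models the semantics as I understand
them from your example, not how the IFNs are expanded.

```c
#include <stdint.h>

/* Hypothetical model of an unsigned saturating truncate
   (uint32_t -> uint8_t): values above 255 saturate to 255.  */
static uint8_t
sat_trunc_u32_u8 (uint32_t x)
{
  return x > UINT8_MAX ? UINT8_MAX : (uint8_t) x;
}

/* Hypothetical model of the signed -> unsigned clip
   (int32_t -> uint8_t): clamp to [0, 255].  */
static uint8_t
narrow_clip_s32_u8 (int32_t x)
{
  if (x < 0)
    return 0;
  if (x > UINT8_MAX)
    return UINT8_MAX;
  return (uint8_t) x;
}

/* The same clip expressed as max (val, 0) followed by the
   unsigned saturating truncate.  */
static uint8_t
clip_via_max_sat_trunc (int32_t x)
{
  int32_t nonneg = x > 0 ? x : 0;
  return sat_trunc_u32_u8 ((uint32_t) nonneg);
}
```

Feeding -1 straight into the unsigned truncate reinterprets it as a large
unsigned value and saturates to 255, while clamping at zero first gives 0,
which matches the two results you listed.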

As the recognizer is already in match.pd form, the obvious choice would be to also do the conversion there. After all, it would only take a few more statements.

On the other hand I'm not sure if we need the recognizer to be in match.pd. If this pattern is going to be useful on the scalar side as well, then probably. What scalar code do we currently emit for the test function? Could we improve it?

One way or another, the general structure looks reasonable to me.

--
Regards
Robin
