Re: Guard use of modulo in cshift (speedup protein)

Richard Guenther Wed, 11 Apr 2012 01:27:39 -0700

On Tue, Apr 10, 2012 at 5:40 PM, Michael Matz <m...@suse.de> wrote:
> Hi,
>
> On Tue, 10 Apr 2012, Steven Bosscher wrote:
>
>> This is OK.
>
> r186283.
>
>> Do you think it would be worthwhile to do this transformation in the
>> middle end too, based on profile information for values?
>
> I'd think so, but it probably requires a new profiler that counts how
> often 0 <= A < B for every "A % B".  Just profiling the range of values
> might be misleading (because A <= N and B <= M and N <= M doesn't imply
> that A < B often holds).
>
> But it would possibly be an interesting experiment already to do such
> transformation generally (without profiling) and see what it gives on some
> benchmarks.  Just to get a feel what's on the plate.
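
For reference, the kind of guard being discussed can be sketched roughly
as below.  This is only a minimal illustration, not the code committed in
r186283; the helper name guarded_mod and its signature are made up for the
example.  The fast path requires the strict bound A < B, because A % B is
0 rather than A when A == B.

```c
#include <stddef.h>

/* Hypothetical helper (not from the patch): reduce SHIFT into [0, LEN),
   skipping the division when SHIFT is already in range.  */
static ptrdiff_t
guarded_mod (ptrdiff_t shift, ptrdiff_t len)
{
  /* Fast path: for 0 <= shift < len, shift % len == shift.  */
  if (shift >= 0 && shift < len)
    return shift;

  /* Assumption for this sketch: an empty extent maps to a shift of 0.  */
  if (len <= 0)
    return 0;

  /* Slow path: C's % truncates toward zero, so fix up negative results.  */
  shift %= len;
  if (shift < 0)
    shift += len;
  return shift;
}
```

Whether this wins depends on how often the fast path is taken, which is
what a value profiler for "A % B" would have to measure.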


The question is, of course, why on earth is a modulo operation in the
loop setup so expensive that avoiding it improves the performance of
the overall routine so much ... did you expect the code-gen difference
of your patch?

Richard.

>> IIRC value-prof
>> handles constant divmod but not ranges for modulo operations.
>

> Ciao,
> Michael.
