On 13.10.2016 13:44, Pitchumani Sivanupandi wrote:
On Monday 26 September 2016 08:19 PM, Georg-Johann Lay wrote:
On 26.09.2016 15:19, Pitchumani Sivanupandi wrote:
Attached patch for PR71676 and PR71678.

PR71676 is for the AVR target, which generates wrong code when the switch
case index is wider than 16 bits.

Switch case indices wider than SImode are checked for out of range before the
'casesi' expansion. The RTL expansion of casesi gets the index in SImode, but
the index is compared in HImode, which ignores the upper 16 bits.

Attached patch changes the expansion for casesi to make the index comparison
in SImode and code generation accordingly.
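
To illustrate the failure mode in plain C: the sketch below is only a model of
the range check, not the AVR backend code, and the function names are
hypothetical. The HImode variant looks at the low 16 bits only, so an index
such as 0x10002 is wrongly accepted for a case range of 1..4.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of the buggy check: the index arrives as a 32-bit value, but the
   subtract and compare use only its low 16 bits (HImode). */
static bool in_range_himode(uint32_t index, uint16_t lower, uint16_t range)
{
    uint16_t offset = (uint16_t)((uint16_t)index - lower);
    return offset <= range;
}

/* Model of the fixed check: subtract and compare on all 32 bits (SImode). */
static bool in_range_simode(uint32_t index, uint32_t lower, uint32_t range)
{
    uint32_t offset = index - lower;
    return offset <= range;
}
```

With lower = 1 and range = 3, in_range_himode(0x10002, 1, 3) accepts the index
while in_range_simode(0x10002, 1, 3) rejects it.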

PR71678 is an ICE because the pattern below in 'casesi' is not recognized.
(set (reg:HI 47)
     (minus:HI (subreg:HI (subreg:SI (reg:DI 44) 0) 0)
               (reg:HI 45)))

Fix of PR71676 avoids the above pattern as it changes the comparison
to SImode.

But this means that all comparisons are now performed in SImode which is a
great performance loss for most programs which will switch on 16-bit values.

IMO we need a less intrusive (w.r.t. performance) approach.

Yes.

I tried to split 'casesi' into several patterns based on the case values so
that the compare is done in less expensive modes (i.e. QI or HI). In a few
cases it is not possible without an SImode subtract/compare.

Pattern casesi will have the index in SImode. So the out of range checks will
be expensive, as the most common case values (on AVR) fit in QI/HI mode.

e.g.
  if case values in QI range
    if any of the upper three bytes of index is set
      goto out_of_range

    offset = index - lower_bound (QImode)
    if offset > case_range       (QImode)
      goto out_of_range
    goto jump_table + offset

  else if case values in HI range
    if index[2,3] is set
      goto out_of_range

    offset = index - lower_bound (HImode)
    if offset > case_range       (HImode)
      goto out_of_range
    goto jump_table + offset
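
The pseudocode above could be modelled in C roughly as follows. This is a
sketch under the assumption that all case values are non-negative and fit in
the narrow mode; dispatch_qi and dispatch_hi are hypothetical names, and the
returned offset stands in for the jump-table index.

```c
#include <stdbool.h>
#include <stdint.h>

/* Case values in QI range: reject if any upper byte of the 32-bit index is
   set, then subtract and compare in 8 bits. Returns false for out_of_range. */
static bool dispatch_qi(uint32_t index, uint8_t lower_bound,
                        uint8_t case_range, uint8_t *offset)
{
    if (index >> 8)                       /* upper three bytes must be 0 */
        return false;
    uint8_t off = (uint8_t)((uint8_t)index - lower_bound);
    if (off > case_range)                 /* unsigned compare also catches
                                             index < lower_bound (wraps) */
        return false;
    *offset = off;
    return true;
}

/* Case values in HI range: same scheme with a 16-bit subtract and compare. */
static bool dispatch_hi(uint32_t index, uint16_t lower_bound,
                        uint16_t case_range, uint16_t *offset)
{
    if (index >> 16)                      /* bytes 2 and 3 must be 0 */
        return false;
    uint16_t off = (uint16_t)((uint16_t)index - lower_bound);
    if (off > case_range)
        return false;
    *offset = off;
    return true;
}
```

Note that the upper-byte test runs at run time on every dispatch, in addition
to the narrow subtract and compare.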

This modification will not work for negative index values, because the code
to check the upper bytes of the index will be more expensive than the SImode
subtract/compare.
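
A small C model of why negative values break the upper-byte test (hypothetical
helper names, assuming a 32-bit two's complement index): a negative index has
its upper bytes set, so a plain "upper bytes are zero" test rejects it even
when it is a valid case value, whereas a full 32-bit subtract and unsigned
compare handles it.

```c
#include <stdbool.h>
#include <stdint.h>

/* The cheap pre-check from the QI scheme: true if the upper three bytes
   of the index are all zero. */
static bool upper_bytes_clear(int32_t index)
{
    return ((uint32_t)index >> 8) == 0;
}

/* Full-width check: subtract in 32 bits, then one unsigned compare. */
static bool in_range_si(int32_t index, int32_t lower_bound,
                        uint32_t case_range)
{
    uint32_t offset = (uint32_t)index - (uint32_t)lower_bound;
    return offset <= case_range;
}
```

For case values -2..2, the index -1 is 0xFFFFFFFF as a 32-bit pattern, so
upper_bytes_clear(-1) is false although in_range_si(-1, -2, 4) is true.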

So I'm trying to update the fix to use an SImode subtract/compare if the case
values include negative integers. For the others I will try to optimize as
described above. Is that approach OK?

But the above code will be executed at run time and add even more overhead, or am I missing something? If you conclude statically at expand time from the case ranges then we might hit a similar problem as with the original subreg computation.

Unfortunately, the generated code (setting cc0, a reg and pc) cannot be wrapped into an unspec or parallel and then later be rectified...

I am thinking about a new avr target pass to tidy up the code if no 32-bit computation is needed, but this will be some effort.


Johann


Alternatively we can have flags to generate shorter code for 'casesi' using
HImode subtract/compare. But correctness is not guaranteed (PR71676).

Regards,
Pitchumani


