https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113134

--- Comment #14 from JuzheZhong <juzhe.zhong at rivai dot ai> ---
(In reply to Tamar Christina from comment #13)
> (In reply to JuzheZhong from comment #12)
> > (In reply to Tamar Christina from comment #11)
> > > (In reply to JuzheZhong from comment #10)
> > > > (In reply to Tamar Christina from comment #9)
> > > > > (In reply to JuzheZhong from comment #8)
> > > > > > Suppose the loop mask is generated by whilelo instruction of ARM SVE.
> > > > > > 
> > > > > > Suppose we have 8 elements in a single whole vector.
> > > > > > 
> > > > > > mask = whilelo (0, res); if res = 6, then mask = 11111000.
> > > > > > data = 12345678
> > > > > > 
> > > > > > Then if it is early break. You are reversing both data and mask as follows:
> > > > > > 
> > > > > > new_mask = 00011111
> > > > > > new_data = 87654321
> > > > > > 
> > > > > > Then use the EXTRACT_LAST, we will get value = 1 for early break.
> > > > > > 
> > > > > > Am I right ?
> > > > > 
> > > > > Yeah, the idea being the scalar loop will then run from 1 to 6 to do any
> > > > > side effects that we couldn't apply.
> > > > > 
> > > > > We went with this approach first because it works for non-masked
> > > > > architectures too. In GCC-15 we'll try to implement staying entirely inside
> > > > > a vector loop by splitting the mask in elements until first active and
> > > > > element from first active so we can correctly mask the operations.
> > > > 
> > > > Ok. For the current approach. Isn't it the first element is always element 0?
> > > > 
> > > > Since for ARM SVE loop mask is generated by whilelo instructions, it always
> > > > set mask bit from 0 to the last active element - 1.
> > > 
> > > sure, but you can't use BIT_FIELD_REF on VLA vectors.
> > 
> > So, for length partial vector. We can use VEC_EXTRACT with index = 0 since
> > VEC_EXTRACT optab allows VLA vectors now for length target.
> 
> Sounds good :)

I wonder whether ARM SVE can also use this approach, i.e. VEC_EXTRACT with
index = 0.

I guess the only issue is when mask = all zero, that is, when there are no
active elements. What should the behavior be for early break in that case?
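
To make the comparison concrete, here is a minimal scalar sketch in C of the two
approaches discussed above. It is only a model under stated assumptions: a fixed
8-lane vector, a prefix mask in the style of whilelo (lane i is active while
start + i < end), and helper names (model_whilelo, model_reverse_*,
model_extract_last, model_vec_extract, compare_approaches) that are hypothetical
and are not GCC internals or SVE intrinsics. The no-active-lane case is reported
separately rather than given a value, since that is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LANES 8

/* Hypothetical model of a whilelo-style prefix mask:
   lane i is active while start + i < end.  */
void model_whilelo(bool mask[LANES], unsigned start, unsigned end)
{
    for (size_t i = 0; i < LANES; i++)
        mask[i] = (start + i) < end;
}

/* Reverse the lane order of a data vector.  */
void model_reverse_data(int dst[LANES], const int src[LANES])
{
    for (size_t i = 0; i < LANES; i++)
        dst[i] = src[LANES - 1 - i];
}

/* Reverse the lane order of a mask.  */
void model_reverse_mask(bool dst[LANES], const bool src[LANES])
{
    for (size_t i = 0; i < LANES; i++)
        dst[i] = src[LANES - 1 - i];
}

/* Model of "extract the element in the last active lane".
   Returns false when no lane is active; what a real target
   should produce in that case is left open here.  */
bool model_extract_last(const bool mask[LANES], const int data[LANES], int *out)
{
    assert(out != NULL);
    for (size_t i = LANES; i-- > 0;) {
        if (mask[i]) {
            *out = data[i];
            return true;
        }
    }
    return false;
}

/* Model of extracting a single lane by index.  */
int model_vec_extract(const int data[LANES], size_t idx)
{
    assert(idx < LANES);
    return data[idx];
}

/* Compare reverse + extract-last against extract of lane 0
   for a prefix mask covering lanes [0, end).
   Returns 1 if both give the same value, 0 if they differ,
   and -1 if the mask has no active lane.  */
int compare_approaches(unsigned end)
{
    const int data[LANES] = {1, 2, 3, 4, 5, 6, 7, 8};
    bool mask[LANES], rmask[LANES];
    int rdata[LANES];
    int last;

    model_whilelo(mask, 0, end);
    model_reverse_mask(rmask, mask);
    model_reverse_data(rdata, data);

    if (!model_extract_last(rmask, rdata, &last))
        return -1; /* all-zero mask: no defined answer in this model */

    return last == model_vec_extract(data, 0);
}
```

In this model, calling compare_approaches with any bound from 1 to 8 finds that
both approaches pick lane 0 of the original data, because a prefix mask always
has lane 0 active once any lane is active. Only a bound of 0 (the all-zero mask)
falls outside the equivalence.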
