On Tue, Nov 25, 2025 at 09:06:37AM -0600, Peter Bergner wrote:
> On 11/24/25 10:49 PM, Avinash Jayakar wrote:
> > As discussed, we need to relax the requirement on __vector_pair and
> > __vector_quad so that it is not tied up to MMA.
> 
> I can believe enabling __vector_pair on pre-Power10 cpus would work,
> but I don't think you can do the same for __vector_quad, given its
> use as a proxy for the Power10 MMA accumulators.  The rs6000 port
> has some nasty code that was hard to get right that automatically
> emits the MMA insns xxmtacc & xxmfacc to prime & deprime the
> accumulators on XOmode loads and stores.  That would all have to
> be disabled on pre-Power10 cpus.

Yep.  I believe the code checks whether TARGET_MMA is set before doing
the prime and de-prime operations.  But if the __vector_quad type is
just being used as a container, you probably don't want the prime and
de-prime operations if you aren't using the quad registers as MMA
registers.

> Can you remind me what problem you are trying to solve?

I believe it is to allow using the type for future crypto stuff, so that
the libraries have a consistent type.  But I agree that while
__vector_pair can be implemented on non-power10 systems (though it will
have to do 2 loads and stores), we should avoid __vector_quad.
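
To illustrate the "2 loads and stores" point, here is a rough sketch in
portable C of a 256-bit pair treated as two independent 128-bit halves.
This is not the rs6000 implementation: the names vec128_sketch,
vec_pair_sketch, pair_load_sketch and pair_store_sketch are made up for
the example, and memcpy stands in for whatever 16-byte vector load and
store a pre-Power10 target would actually emit.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one 128-bit vector register. */
typedef struct { uint8_t b[16]; } vec128_sketch;

/* Hypothetical stand-in for a 256-bit pair used purely as a container. */
typedef struct { vec128_sketch lo, hi; } vec_pair_sketch;

/* Load 32 bytes as two separate 16-byte loads (assumed lowering when no
   single paired load is available). */
static vec_pair_sketch pair_load_sketch(const void *p)
{
    vec_pair_sketch v;
    memcpy(&v.lo, p, 16);                       /* first 16-byte load  */
    memcpy(&v.hi, (const uint8_t *)p + 16, 16); /* second 16-byte load */
    return v;
}

/* Store 32 bytes as two separate 16-byte stores. */
static void pair_store_sketch(void *p, vec_pair_sketch v)
{
    memcpy(p, &v.lo, 16);                       /* first 16-byte store  */
    memcpy((uint8_t *)p + 16, &v.hi, 16);       /* second 16-byte store */
}
```

Nothing in the pair case needs accumulator state, which is why it splits
cleanly into two halves; the quad case is different because of the
prime and de-prime handling discussed above.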

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: [email protected]
