On Tue, Nov 25, 2025 at 09:06:37AM -0600, Peter Bergner wrote:
> On 11/24/25 10:49 PM, Avinash Jayakar wrote:
> > As discussed, we need to relax the requirement on __vector_pair and
> > __vector_quad so that it is not tied up to MMA.
>
> I can believe enabling __vector_pair on pre-Power10 cpus would work,
> but I don't think you can do the same for __vector_quad, given its
> use as a proxy for the Power10 MMA accumulators.  The rs6000 port
> has some nasty code that was hard to get right that automatically
> emits the MMA insns xxmtacc & xxmfacc to prime & deprime the
> accumulators on XOmode loads and stores.  That would all have to
> be disabled on pre-Power10 cpus.

Yep.  I believe the code checks whether TARGET_MMA is set before doing the
prime and de-prime operations.  But if the __vector_quad type is just being
used as a container, you probably don't want the prime and de-prime
operations, since you aren't using the quad registers as MMA registers.

> Can you remind me what problem you are trying to solve?

I believe it is to allow using the type for future crypto work, so that the
libraries have a consistent type.  But I agree that while __vector_pair can
be implemented on non-Power10 systems (though it will have to do 2 loads and
stores), we should avoid __vector_quad.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: [email protected]
