On Thursday, 6 February 2020 12:45:51 CET Lars Knoll wrote:
> One problem is, that we can only get full benefit out of those if we can
> offer them inline. That would basically imply making our qsimd_p.h header
> public and including that one from qvectornd.h and qmatrixnxn.h (so that we
> can implement the operations using the SSE/NEON intrinsics). If we do that,
> we could e.g. implement QVector4D holding a __m128 value (and the neon
> equivalent on ARM).

One option is also to declare QVector4D as 16-byte aligned. Then SSE code can
still read from it and write to it quickly, even if it isn't declared as
holding a __m128 value. (An aligned load isn't much faster than an unaligned
load on modern architectures, but aligned reads can also be used directly as
memory operands of other instructions, which saves many load instructions.)

> I personally don’t think including qsimd.h (and implicitly immintrin.h) from
> our public headers would be a problem, but I’d be happy to hear arguments
> for/against it.

I don't think it is a problem either. I just don't want to be the one
documenting it ;)

> As a side note: SSE 4.1 offers some nice additional instructions that would
> simplify some of the operations. Should we keep the minimum requirement for
> SSE at version 2, or can we raise it to 4.1?

That would be great, especially for QtCore. Though we could start by enabling
SSE4.1 by default while still offering users (Linux distros, really) the
option to force it down to SSE2 only. You could do the same with NEON, but I
think we already use that unconditionally if it is detected at configure time.

Regards
'Allan
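
PS: A minimal sketch of the 16-byte alignment idea, in case it helps the
discussion. Vec4 and add() are made-up names standing in for QVector4D and
one of its operations; this is not Qt code. It assumes an x86 target with
SSE available and falls back to plain scalar code everywhere else.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__SSE2__) || defined(_M_X64)
#  include <emmintrin.h>   // SSE/SSE2 intrinsics (pulls in xmmintrin.h)
#  define SKETCH_HAVE_SSE 1
#endif

// Hypothetical stand-in for QVector4D: four plain floats, no __m128 member,
// but declared 16-byte aligned so SIMD code can use aligned accesses.
struct alignas(16) Vec4
{
    float v[4];
};

static_assert(alignof(Vec4) == 16, "Vec4 must be 16-byte aligned");
static_assert(sizeof(Vec4) == 16, "Vec4 must stay four floats wide");

// Hypothetical helper: component-wise addition of two Vec4 values.
inline Vec4 add(const Vec4 &a, const Vec4 &b)
{
    // alignas(16) on the type is what makes the aligned loads below valid.
    assert(reinterpret_cast<std::uintptr_t>(&a) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(&b) % 16 == 0);

    Vec4 r;
#ifdef SKETCH_HAVE_SSE
    const __m128 va = _mm_load_ps(a.v);      // aligned load
    const __m128 vb = _mm_load_ps(b.v);      // aligned load
    _mm_store_ps(r.v, _mm_add_ps(va, vb));   // aligned store
#else
    // Scalar fallback for targets without SSE.
    for (int i = 0; i < 4; ++i)
        r.v[i] = a.v[i] + b.v[i];
#endif
    return r;
}
```

The public struct only carries the alignment requirement, so the intrinsics
header is needed only where the operations are implemented. Whether that is
enough to keep it out of public headers depends on how much we want inline.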
