https://bugs.kde.org/show_bug.cgi?id=339596
--- Comment #20 from Julian Seward <[email protected]> ---
(In reply to Mark Wielaard from comment #18)
> Here are some testcases for the FMA4 instructions.

Excellent.

> I haven't looked yet at the XOP instructions.
> Maybe it is an idea to do FMA4 and XOP as separate patches?

Hmm, no particular opinion on that from here. If you find it more
convenient, then yes.

> First the 256bit ymm operations aren't supported, so they have been disabled
> in the testcase for now. But I am not sure we really should enable the fma4
> cpuid bit in valgrind before we really support them.

I agree. Enabling cpuid bits before all the associated instructions have
been implemented has caused us trouble before.

> Secondly some 128bit xmm operations should clear the upper 128 bits
> of the corresponding YMM register to zeros and don't do so at the moment.

In guest_amd64_toIR.c there's a function putYMMRegLoAndZU (ZU = "Zero Upper")
which does exactly that. Maybe that would be helpful here?

> Lastly the "full 0xFF" testcases do show some differences (but the zeros,
> ones and random cases all look fine).

It may be that such values are infinities, NaNs etc and we don't handle
those quite right. It would be much preferable if we did. I think we
should indeed try to achieve bit-identical results to the hardware, and
review the cases where that seems to be problematic.

At least for initial verification, you could copy the scheme used in
randV128() in none/tests/arm64/fp_and_simd.c. This guarantees that all
F32 and F64 values embedded within vectors are normalised numbers, so you
can at least easily distinguish failures caused by incorrect handling of
non-normal values (denorms, NaNs, Infs etc) from failures for other
reasons. In the long run though I would prefer bit-identical results.
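To illustrate the idea of filling test vectors only with normalised values, here is a minimal sketch for the F32 case. It is not the code of randV128() in fp_and_simd.c; the type V128 and the helper names below are hypothetical, and the approach (forcing the biased exponent into the range 1..254) is just one assumed way of getting the same guarantee.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical 128-bit vector type, viewed as raw bytes. */
typedef struct { uint8_t b[16]; } V128;

/* Random 32-bit pattern; rand() only guarantees 15 random bits per call. */
static uint32_t rand_u32(void)
{
    return ((uint32_t)(rand() & 0x7FFF) << 17)
         ^ ((uint32_t)(rand() & 0x7FFF) << 8)
         ^  (uint32_t)(rand() & 0x7FFF);
}

/* Random normalised F32: keep a random sign and mantissa, but force the
   biased exponent into 1..254 (0 means zero/denormal, 255 means Inf/NaN). */
static float rand_normal_f32(void)
{
    uint32_t bits = rand_u32();
    uint32_t exp  = 1 + (rand_u32() % 254);
    float    f;
    bits = (bits & 0x807FFFFFu) | (exp << 23);
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Fill all four F32 lanes of a vector with normalised values. */
static void rand_v128_normal_f32(V128 *v)
{
    for (int lane = 0; lane < 4; lane++) {
        float f = rand_normal_f32();
        memcpy(&v->b[lane * 4], &f, sizeof f);
    }
}

/* Return 1 if every F32 lane is a normalised number, else 0. */
static int all_f32_lanes_normal(const V128 *v)
{
    for (int lane = 0; lane < 4; lane++) {
        float f;
        memcpy(&f, &v->b[lane * 4], sizeof f);
        if (!isnormal(f))
            return 0;
    }
    return 1;
}

/* Sanity check: generated vectors never contain non-normal lanes. */
static void self_test(int iterations)
{
    V128 v;
    for (int i = 0; i < iterations; i++) {
        rand_v128_normal_f32(&v);
        assert(all_f32_lanes_normal(&v));
    }
}
```

A testcase would call rand_v128_normal_f32() to build its inputs; by contrast, a vector filled with 0xFF bytes fails all_f32_lanes_normal(), since each lane then has an all-ones exponent. An F64 variant would follow the same pattern with an 11-bit exponent in the range 1..2046.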
