I've been running the regression tests for geography/proj. They pass on amd64 and aarch64. On NetBSD 9 i386 and NetBSD 10 i386, two of them show errors of about 5 m, where the expected agreement is "within 1 mm".

I ran paranoia (pkgsrc/benchmarks/paranoia) on 9 and 10. On 9:

  The number of  FAILUREs  encountered =       3.
  The number of  SERIOUS DEFECTs  discovered = 4.
  The number of  DEFECTs  discovered =         3.
  The number of  FLAWs  discovered =           2.

which is shocking, especially as I remember earlier NetBSD i386 versions, like 5 or 7, being ok.

So, I am curious whether anyone can run paranoia on i386 on any of 5, 6, 7, 8, 9, 10, 11, or current. To reduce traffic, please let me know offlist if you get anything much different from the above. (If your results are a disaster but not quite the same numbers, that's not so critical for now.) If you think paranoia itself is broken, that's interesting too, and probably best discussed onlist.

(I'm going to run it on earmv7hf-el too, but haven't finished proj testing there yet.)
