This is a general usability/UX remark: when there are NaNs in the input, the SciPy statistical tests return NaN as both the statistic and the p-value. This behaviour is inherited from NumPy reduction functions (such as np.sum); however, in the context of statistical tests it is very unintuitive, potentially misleading (it can easily be read as a non-significant or a highly significant result), and it can cost some time to debug.
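
For illustration, here is a minimal reproduction of what I mean (written against a recent SciPy; the exact result representation may differ between versions):

import numpy as np
from scipy import stats

group_a = np.array([1.2, 2.3, np.nan, 3.1])
group_b = np.array([0.8, 1.9, 2.5, 2.7])

# A single NaN in either sample silently turns both the statistic and the
# p-value into NaN, with nothing pointing at the offending input.
print(stats.kruskal(group_a, group_b))
# e.g. KruskalResult(statistic=nan, pvalue=nan)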
See for example:

* https://stackoverflow.com/questions/77087907/kruskal-wallis-test-always-gives-nan-values
* https://github.com/scipy/scipy/issues/20056

A simple solution that I propose: add a warning message that is emitted whenever there are NaNs in the input, so that the user is alerted immediately to the reason for the observed behaviour and to how to go about debugging it.

I am not sure about this and have not tested it, but knowing the general philosophy of R or Excel, they would simply disregard the NaNs (an implicit .dropna()) and return the desired statistics to the user. Such a radical solution is, I believe, out of scope in Python, but please consider adding at least a warning.

Kind Regards,
Mikolaj
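
P.S. To make the suggestion concrete, here is a rough sketch of the kind of check I have in mind (the helper name and the warning wording are purely hypothetical, not existing SciPy API):

import warnings
import numpy as np

def _warn_if_nan(*samples):
    # Hypothetical helper: warn when any input sample contains NaN, so the
    # user learns why the returned statistic and p-value will be NaN.
    for i, sample in enumerate(samples):
        if np.isnan(np.asarray(sample, dtype=float)).any():
            warnings.warn(
                f"Input sample {i} contains NaN values; the returned "
                "statistic and p-value will be NaN. Remove or impute the "
                "missing values before running the test.",
                RuntimeWarning,
                stacklevel=3,
            )

The affected test functions would call such a helper on their inputs before computing anything, so the return values stay exactly as they are today and only a warning is added.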