[SciPy-Dev] Adding tau-a to scipy.stats.kendalltau variants and then changing Somers' D calculation to using tau-a instead of crosstab for better significant runtime improvements

P. v.H. Fri, 19 Jan 2024 04:43:45 -0800

Hello, 

this is my first time trying to contribute, so please don't be too harsh.


When I recently used the scipy.stats.somersd function on larger data, I ran 
into serious runtime problems. I found a way to calculate Somers' D in an 
equivalent manner by using D(Y|X) = tau_a(X, Y)/tau_a(X, X), for which I added 
support for variant "a" to the scipy.stats.kendalltau function. The runtime 
improvement was significant for large datasets, where this approach was 
approx. 30 times faster. I believe the current implementation is slow because 
of its crosstab calculation, whereas kendalltau counts discordant pairs with a 
Cython implementation, which makes it much faster. 
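
To illustrate the identity D(Y|X) = tau_a(X, Y)/tau_a(X, X), here is a minimal 
pure-Python sketch. It is not the proposed SciPy patch: the helper names 
tau_a and somers_d are hypothetical, and the pair counting is a naive O(n^2) 
loop chosen for readability, not the fast Cython routine mentioned above.

```python
from itertools import combinations


def _sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / (n * (n - 1) / 2).

    Hypothetical helper for illustration only; naive O(n^2) pair loop.
    Tied pairs contribute 0 to the numerator but still count in the
    denominator, which is what distinguishes tau-a from tau-b.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two observations are required")
    score = sum(
        _sign(xi - xj) * _sign(yi - yj)
        for (xi, yi), (xj, yj) in combinations(zip(x, y), 2)
    )
    return score / (n * (n - 1) / 2)


def somers_d(x, y):
    """Somers' D(Y|X) via the ratio tau_a(X, Y) / tau_a(X, X).

    tau_a(X, X) equals the fraction of pairs not tied in X, so the ratio
    is (concordant - discordant) / (pairs not tied in X).
    Raises ZeroDivisionError if every value in x is identical.
    """
    return tau_a(x, y) / tau_a(x, x)
```

For small inputs, the result of somers_d(x, y) can be checked against 
scipy.stats.somersd(x, y).statistic, which should agree up to floating-point 
rounding if the identity is applied as described.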

It would be great to have someone I could ask if questions come up while 
submitting my contribution, and who could maybe also review my code. 

Thanks a lot and best regards from Vienna, 

Paul
_______________________________________________
SciPy-Dev mailing list -- scipy-dev@python.org
To unsubscribe send an email to scipy-dev-le...@python.org
https://mail.python.org/mailman3/lists/scipy-dev.python.org/
Member address: arch...@mail-archive.com
