[ 
https://issues.apache.org/jira/browse/STATISTICS-25?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17220111#comment-17220111
 ] 

Andreas Stefik commented on STATISTICS-25:
------------------------------------------

Hi there Gilles,

 

First, thanks so much for checking and responding so quickly. I appreciate it. 
I'm not a Python expert, but scipy's continuous distributions are here:

[https://github.com/scipy/scipy/blob/v1.5.3/scipy/stats/_continuous_distns.py]

 

My best guess is that the code for Student's t starts around line 5893, 
with _cdf as the function. That seems to call down to their special functions 
library (e.g., sc.). As far as I can tell, that connects down to the CEPHES 
library in C. I was able to track that down here:

[https://github.com/scipy/scipy/blob/master/scipy/special/cephes/stdtr.c]

 

I haven't looked in detail at their algorithm, but on brief inspection it looks 
like a different approach. How accurate it is I couldn't say, but hopefully 
that helps.
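
As a rough sanity check on the numbers in the report below, here is a small 
sketch using only the Python standard library. It rests on an assumption of 
mine: that for df = 1e38 the t distribution can be treated as the standard 
normal, which is its limit as df grows. The variable names are just for 
illustration. Note that scipy's cdf is a cumulative probability, while 
inverseCumulativeProbability is a quantile, so the sketch prints both.

```python
from statistics import NormalDist

# Assumption: for df = 1e38 the Student's t distribution is treated as the
# standard normal, which is its limit as df goes to infinity.
std_normal = NormalDist(mu=0.0, sigma=1.0)

p = 1.0 - 0.975  # the argument used in the report

# Cumulative probability P(X <= p), the analogue of scipy.stats.t.cdf(p, df).
cdf_value = std_normal.cdf(p)

# Quantile x with P(X <= x) = p, the analogue of an inverse CDF call.
quantile_value = std_normal.inv_cdf(p)

print(f"cdf({p:.3f})     = {cdf_value:.12f}")
print(f"inv_cdf({p:.3f}) = {quantile_value:.12f}")
```

Under that assumption the first value is about 0.50997, in line with the 
scipy number, and the second is about -1.96. Neither is anywhere near 
-6.46E-10.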

> T Distribution Inverse Cumulative Probability Function gives the Wrong Answer
> -----------------------------------------------------------------------------
>
>                 Key: STATISTICS-25
>                 URL: https://issues.apache.org/jira/browse/STATISTICS-25
>             Project: Apache Commons Statistics
>          Issue Type: Bug
>            Reporter: Andreas Stefik
>            Priority: Major
>
> Hi There,
> Given code like this:
>  
> import org.apache.commons.math3.analysis.UnivariateFunction;
> import org.apache.commons.math3.analysis.solvers.BrentSolver;
> import org.apache.commons.math3.distribution.TDistribution;
>
> public class Main {
>     public static void main(String[] args) {
>         double df = 1E38;
>         double t = 0.975;
>         TDistribution dist = new TDistribution(df);
>
>         double prob = dist.inverseCumulativeProbability(1.0 - t);
>
>         System.out.println("Prob: " + prob);
>     }
> }
>  
> It is possible I am misunderstanding, but that seems equivalent to:
>  
> scipy.stats.t.cdf(1.0 - 0.975, 1e38)
>  
> In Python. They give different answers. Python gives 0.509972518193, which 
> seems correct, whereas Apache Commons gives Prob: -6.462184036284304E-10. 
> That's a huge difference.
> My hunch is that as you get closer to infinity it begins to fail, but I 
> haven't checked carefully. For calls with much smaller degrees of freedom, 
> you get answers that are basically the same as Python or online calculators.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
