Hello,
I have been playing around with the statistical distributions in R, and
overall I think the accuracy is very good. However, it seems that for the
Student's t distribution, the CDF loses accuracy when evaluated at values
close to zero. For instance, I did the following in R
----------------------------------
df<-seq(1,100,by=1)
z<-c(0.50003183098839799,0.50003535533895194,0.50003675525997071,
0.50003749999985481,0.50003796066840744,0.50003827327749706,
0.50003849914427922,0.50003866990364754,0.50003880349244212 ,
0.50003891083995444,0.50003899897813187,0.50003907263208447,
0.50003913510150055,0.50003918874627440,0.50003923531785055,
0.50003927612461441,0.50003931217478748,0.50003934425324170,
0.50003937297989520,0.50003939886014204, 0.50003942229165621,
0.50003944360703978,0.50003946308016112,0.50003948094039441,
0.50003949738053710,0.50003951256485324,0.50003952663295181,
0.50003953969680248,0.50003955185925653,0.50003956322006460,
0.50003957385523301,0.50003958382054481 ,0.50003959318443636,
0.50003960200394315,0.50003961032679112,0.50003961818144815,
0.50003962562026172,0.50003963266089213,0.50003963934773465,
0.50003964569404735,0.50003965173577758,0.50003965749688895,
0.50003966298323521, 0.50003966823056478,0.50003967322766096,
0.50003967801868676,0.50003968260005904,0.50003968700228751,
0.50003969121916547,0.50003969526955183,0.500039699153400
63,0.50003970290428668,0.50003970650705731,0.50003970997149927 ,
0.50003971332909936,0.50003971654204993,0.50003971964040972,
0.50003972264367180,0.50003972553808163,0.50003972835715427,
0.50003973106835642,0.50003973370765664,0.50003973624942966,
0.50003973868896101,0.50003974107556448, 0.50003974338818691,
0.50003974563557085,0.50003974781567961,0.50003974993203681,
0.50003975199594708,0.50003975399737965,0.50003975593675354,
0.50003975782715593,0.50003975966389691,0.50003976145762119,
0.50003976321975063,0.50003976489560775 ,0.50003976655049909,
0.50003976818673812,0.50003976975798736,0.50003977127434285,
0.50003977277055756,0.50003977423495483,0.50003977566285773,
0.50003977705769798,0.50003977841313474,0.50003977975147973,
0.50003978102874791, 0.50003978230822732,0.50003978356836509,
0.50003978477872879,0.50003978596096421,0.50003978713049724,
0.50003978827577344,0.50003978935715154,0.50003979045422919,
0.50003979153680134,0.50003979256756137,0.50003979358957851,
0.50003979462027492 )
y<-pt(1e-4,df)
plot(df,(z-y)/z, type="s")
----------------------------------
Basically, I evaluate the CDF of the Student's t distribution at 1e-4, for
integer degrees of freedom from 1 to 100; the variable y contains the values
computed by R, and the variable z contains the values obtained by
Mathematica (with about 17 digits precision). The plot of relative accuracy
shows that we get only about 10 digits of accuracy with R. It seems there is
something is going wrong in the computations. Is this correct, or am I
missing something?
I am using a precompiled binary version of R on windows xp. Following
information might be helpful as well
----------------------------
> version
_
platform i386-pc-mingw32
arch i386
os mingw32
system i386, mingw32
status
major 2
minor 5.1
year 2007
month 06
day 27
svn rev 42083
language R
version.string R version 2.5.1 (2007-06-27)
> sessionInfo()
R version 2.5.1 (2007-06-27)
i386-pc-mingw32
locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
attached base packages:
[1] "stats" "graphics" "grDevices" "utils" "datasets" "methods"
"base"
>
----------------------------
Any help is appreciated.
Thanks.
[[alternative HTML version deleted]]
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.