Hello R-Users,
        I'm new to R so I apologize in advance for any big mistake I might  
be doing. I'm trying to fit a set of samples with some probabilistic  
curve, and I have an important question to ask; in particular I have  
some data, from which I calculate manually the CDF, and then I import  
them into R and try to fit: I have the x values (my original samples)  
and the y values (P(X<x)).
        To attempt the fit I've both fitdistr() and nls(), in the way you  
can read in the piece of code at the end of the email. Because the  
fit with all data doesn't work very well, I decided to take a subset  
of samples randomly chosen (for some random x, the correspondant y is  
chosen).
        The first big problem is that fitdistr and nls, in the way I use  
them in the code, they don't get me similar results. I think I'm  
making a mistake but I can't really understand which one.
        From this first issue, a second one arises because the plot with nls  
is similar (maybe not a great fit bust still...) to my original CDF  
while the plot of fitdistr is basically a straight line of constant  
value y=1. On the other side, the fitdistr outperforms in the  
Kolmogorov-Smirnov test, which for nls has values of probability  
around 0 (not a good score huh?).
        Can u please tell me if there's a major mistake in the code?

Thanks in advance,
Luca


------ BEGINNING OF CODE  
----------------------------------------------------------------
cdf.all=read.table("all_failures.cdf", header=FALSE, col.names=c 
("ttr", "cdf"), sep=":" )

allvals.x=array(t(cdf.all[1]))
allvals.y=array(t(cdf.all[2]))
library(MASS)
bestval.exp.nls=bestval.exp.fit=-1
plot(allvals.x, allvals.y)

for(it in 1:100){
        #extract random samples
        random=sort(sample(1:length(allvals.x), 15))
        somevals.x=allvals.x[c(random)]
        somevals.y=allvals.y[c(random)]
        #fit with nls and fitdistr
        fit.exp = fitdistr(somevals.y, "exponential")
        nls.exp <- nls(somevals.y ~ pexp(somevals.x, rate), start=list(rate=. 
0001), model=TRUE)
        #plot what you get out of the two fits
        lines(allvals.x, pexp(allvals.x, coef(fit.exp)), col=it)
        lines(allvals.x, pexp(allvals.x, coef(nls.exp)), col=it)
        #perform kolmogorov-smirnov test on your fit
        ks.exp.nls = ks.test(somevals.y, "pexp", coef(nls.exp))
        ks.exp.fit = ks.test(somevals.y, "pexp", coef(fit.exp))
        
        bestval.exp.fit = max(bestval.exp.fit, ks.exp.fit$p.value)
        bestval.exp.nls = max(bestval.exp.nls, ks.exp.nls$p.value)
}

print(bestval.exp.fit)
print(bestval.exp.nls)

----------END OF  
CODE--------------------------------------------------------------------

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to