Hi, I have aggregated data in 5 categories and want to fit a gamma distribution to it (works fine). My question is that on theoretical grounds I could claim that the observation in the first category (the zero count) are certain, i.e. I know for sure the number. This would then mean in fitting a gamma distribution to the remaining 4 categories.
My question is now how can I compare the fit between the two to choose what fits better? I think the AIC/BIC (or all likelihood based test) don't apply as I compare two different data sets (one with data for all 5 categories and one with the data of the 4 categories, hence the number of observations differs substantially) and similarly I think chi2 square test is not really revealing as in the number of observations is so large (the Null is rejected in both cases). Or would one of the tests still apply, claiming one is somehow nested in the other? Another I idea might be to fit to the 5 categories but restrict the estimator somehow so that it replicates the number of observations in the first category (all basically zero). I don't know if that is advisable (and if so, feasible). Any ideas or comments welcome. Below is some code with example data (but I have much more data to fit than just those). library(stats4) breaks.5 <- c(0,.25,20,40,60,120) #breaks counts.5 <- c(4422, 40225, 10973, 3145 ,3228) #number of obs, breaks.4 <- c(.25,20,40,60,120) #like above but just without the first categories counts.4 <- c( 40225, 10973, 3145 ,3228) #number of obs, ll <- function(shape, rate) #my likelihood function { z <- pgamma(breaks, shape=shape, rate=rate) -sum(counts * log(diff(z))) } #fitting the 5 categories breaks <-breaks.5 counts <- counts.5 results.mle.5<-mle(ll, start=list(shape=1, rate=1/mean(breaks))) #fitting the 4 categories breaks <-breaks.4 counts <- counts.4 results.mle.4<-mle(ll, start=list(shape=1, rate=1/mean(breaks))) Thanks, Stefan ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.