In article <[EMAIL PROTECTED]>, Jimc10 <[EMAIL PROTECTED]> wrote:

>To all who have helped me on the previous thread, thank you very much. I am
>reposting this because the question has become more focused.
>I am studying a stochastic Markov process and using a maximum likelihood
>technique to fit observed data to theoretical models. As a first step I am
>using a Monte Carlo technique to generate simulated data from a known model
>to see if my fitting method is accurate. In particular, I want to know if I
>can use this technique to determine the number of free parameters in the
>Markov model. I have been using the log(likelihood) method, which seems to
>be widely accepted.
>
>I am getting very small log(likelihood ratios) in cases where I know the
>more complex model is correct (i.e., H0 should be rejected). When I first
>observed this I tried increasing the N values, and found a decrease rather
>than an increase in the log(likelihood ratio). I now think I know why. I am
>posting in hopes of finding out whether my proposed solution is
>(1) statistical heresy, (2) so obvious that I should have realized it 6
>months ago, or (3) a plausible idea in need of validation.
>
>The likelihood function I have been using up to now, which I will call the
>FULL likelihood function, is:
>
>L = (1/sqrt(|CV-Matrix|)) * exp((-1/2) * (O-E).(CV-Matrix^-1).(O-E))
>
>where |CV-Matrix| is the determinant of the covariance matrix, (O) is the
>vector of observed values in time order, and (E) is the vector of the values
>predicted by the Markov model for the corresponding times. The covariance
>matrix is generated by the Markov model.
>
>IN A NUTSHELL: It appears that the factor (1/sqrt(|CV-Matrix|)) is the
>source of the problem. In many MLE descriptions this is a constant and drops
>out. In my case there is a big difference between the (1/sqrt(|CV-Matrix|))
>for different models (several log units). I believe this may be biasing the
>fit in some way.
>
>MY PROPOSAL: I have begun fitting my data to the following simplified
>likelihood formula:
>
>L = exp((-1/2) * (O-E).(CV-Matrix^-1).(O-E))
>
>Does this seem reasonable?
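[The two log-likelihoods above can be compared numerically. The sketch below is illustrative only: it uses a hypothetical AR(1)-style covariance, cov[i,j] = rho^|i-j|, as a stand-in for the poster's model-generated covariance matrix, with different rho values playing the role of different models. It shows that the full and simplified log-likelihoods differ exactly by the log-determinant term, and that this term can indeed differ between "models" by many log units.]

```python
import numpy as np

def full_loglik(o, e, cov):
    # log L = -1/2 log|CV| - 1/2 (O-E)' CV^-1 (O-E)
    r = o - e
    sign, logdet = np.linalg.slogdet(cov)
    return -0.5 * logdet - 0.5 * r @ np.linalg.solve(cov, r)

def simplified_loglik(o, e, cov):
    # log L = -1/2 (O-E)' CV^-1 (O-E)   (determinant factor dropped)
    r = o - e
    return -0.5 * r @ np.linalg.solve(cov, r)

# Hypothetical stand-in for a model-generated covariance: AR(1)-style,
# cov[i, j] = rho**|i - j|; two different rho play the role of two models.
n = 100
idx = np.arange(n)
cov_a = 0.9 ** np.abs(idx[:, None] - idx[None, :])
cov_b = 0.1 ** np.abs(idx[:, None] - idx[None, :])

# The determinant term -1/2 log|CV| alone differs by many log units:
for cov in (cov_a, cov_b):
    print(-0.5 * np.linalg.slogdet(cov)[1])
```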
It is highly unlikely that it would give asymptotically optimal estimators, although there are cases where this does happen. It can happen that it will be consistent and have positive efficiency, for example if the parameter effect on E is such that L would be O(n) for any wrong parameter and O(1) for the true parameter, all this in probability, and the covariance matrix does not blow up in too bad a manner.

If the major problem is with the fit of the covariance matrix, it will not be good. And if E does not involve some of the parameters, but the covariance matrix can go to infinity in those, then by doing that, log L can go to 0, which would maximize it, as it is negative. Since you say the covariance matrix varies considerably, I would suggest including it.

Maximum likelihood is ASYMPTOTICALLY optimal, in LARGE samples. It may not be good for small samples; it pays to look at how the actual likelihood function behaves. The fit is always going to improve with more parameters.

I believe your best bet would be robust approximate Bayesian analysis. This is hard to describe in a newsgroup posting, and in any case requires some user input.

--
This address is for information only. I do not claim that these views are
those of the Statistics Department or of Purdue University.
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907-1399
[EMAIL PROTECTED]  Phone: (765)494-6054  FAX: (765)494-0558
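[The degenerate-maximization failure mode described above can be seen in a toy case. Assume, hypothetically, that E does not involve a scale parameter sigma but the covariance matrix is CV = sigma^2 * I. Then inflating sigma drives the simplified log-likelihood toward its maximum of 0, while the full log-likelihood penalizes the inflation through the -n*log(sigma) determinant term. A minimal sketch under these assumptions:]

```python
import numpy as np

n = 50
rng = np.random.default_rng(0)
r = rng.normal(size=n)  # fixed residuals O - E; E does not depend on sigma

def simplified(sigma):
    # log L = -1/2 r' (sigma^2 I)^-1 r = -||r||^2 / (2 sigma^2)
    return -0.5 * (r @ r) / sigma**2

def full(sigma):
    # adds the determinant term: -1/2 log|sigma^2 I| = -n log sigma
    return simplified(sigma) - n * np.log(sigma)

# Simplified criterion "rewards" blowing up the covariance: it climbs
# toward 0 (its maximum) as sigma -> infinity, so the fit degenerates.
for s in (1.0, 10.0, 1000.0):
    print(s, simplified(s), full(s))
```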