Thank you very much for clarifying this point. My algorithm is certainly pretty bad because, as you say, I am basically looking at zeros.

One point I don't really understand: for one pollen type I have a lot of pollen collected at time 1, some at time 2, little at time 3 and none at all at time 4. I get a significant difference between times 1 and 2, but no significant difference between times 1 and 3 or between times 1 and 4. That is illogical. Maybe it is a problem with the residuals after all: they are fairly well balanced for time points with fitted values > 0, but for time points with no pollen collected there is no variance at all.

I think that if I had a very large amount of data, so that the non-zero part looked nicely continuous, I could use a zero-inflated model. But with only 4 time points and a positive part that does not fit a continuous distribution well, it is difficult. I had probably better present the data for the sparse pollen types descriptively.
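One possible reading of the "illogical" pattern above, offered as a sketch rather than a diagnosis: when a group contains only zeros, a log-link model pushes that group's coefficient towards minus infinity and its standard error grows even faster, so the Wald test for that contrast loses its power (often called the Hauck-Donner effect). The Python snippet below uses a plain Poisson model with a single factor, not the Tweedie fit discussed in this thread. The counts, the function name `wald_vs_reference` and the `floor` value are all invented for illustration.

```python
import math

def wald_vs_reference(ref_counts, grp_counts, floor=1e-10):
    """Wald test for log(mean_grp / mean_ref) in a one-factor Poisson log-link model.

    `floor` is a hypothetical stand-in for the tiny fitted mean at which an
    iterative fit stops when a group is all zeros (the true MLE is 0).
    """
    n_ref, n_grp = len(ref_counts), len(grp_counts)
    mean_ref = sum(ref_counts) / n_ref
    mean_grp = max(sum(grp_counts) / n_grp, floor)
    # Treatment-contrast coefficient on the log scale.
    beta = math.log(mean_grp) - math.log(mean_ref)
    # Var(log of a fitted Poisson group mean) is 1 / (n * mean).
    se = math.sqrt(1.0 / (n_ref * mean_ref) + 1.0 / (n_grp * mean_grp))
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, z, p

# Invented counts: abundant at time 1, fewer at time 2, none at time 4.
time1 = [40, 35, 45, 38]
time2 = [12, 9, 15, 10]
time4 = [0, 0, 0, 0]

beta2, se2, z2, p2 = wald_vs_reference(time1, time2)
beta4, se4, z4, p4 = wald_vs_reference(time1, time4)

print(f"time 2 vs 1: beta={beta2:.2f} se={se2:.3g} z={z2:.2f} p={p2:.3g}")
print(f"time 4 vs 1: beta={beta4:.2f} se={se4:.3g} z={z4:.4f} p={p4:.3g}")
```

In this toy setting the all-zero group has by far the largest estimated difference and yet the largest p-value, because the standard error explodes. If something similar is happening in the real data, a likelihood-ratio or deviance-based comparison of the models with and without the time factor would be a more trustworthy check than the per-coefficient p-values.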
Best wishes
Valérie

> Message of 04/02/13 at 13:15
> From: "Liz Pryde"
> To: "v_coudr...@voila.fr"
> Cc:
> Subject: Re: [R-sig-eco] proportion data with many zeros
>
> Hi,
> If you're using a categorical predictor, those QQ plots etc. are pretty
> useless. Just do a residuals vs fits plot and make sure the residuals look
> randomly scattered.
>
> Is the problem with the smaller pollen types just that they're very low
> across all time scales? The algorithm won't fit because you're basically
> looking at zero data, or a vector of zeroes. So you can assume that this is
> significantly different from the abundant types. This has to do with the way
> ML estimation works; it's a bit complicated.
> Some people suggest using Bayesian methods for this (and it works well), but
> it's far too complicated for what you're trying to answer.
>
> The mean-variance relationship is specified by the 'family' part of the GLM
> formula. It is essentially the error structure of your data.
> Liz
>
>
> On 04/02/2013, at 7:55 PM, v_coudr...@voila.fr wrote:
>
> > I tried to use tweedie and it again worked very well for the most abundant
> > pollen types, but when trying to fit the less abundant ones I got the error:
> > "glm.fit: algorithm did not converge".
> > I have the impression that it is hopeless to try fitting a model. But anyway,
> > thank you very much for making me aware of tweedie. I should still go a bit
> > more into the theoretical background. I just wonder about the residuals.
> > For the pollen types that can be modelled, the QQ plots don't look very
> > nice, but the residuals are relatively homogeneously distributed. It is
> > difficult to judge how good the fit is, but the results make sense with
> > regard to the raw data.
> >
> > Valérie

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology