> >>> Currently the `gaussian' family's initialization code signals an error if > >>> any response data are zero or negative and a log link is used. Given that > >>> zero or negative response data are perfectly legitimate under the GLM > >>> fitted using `gaussian("log")', this seems a bit unsatisfactory. Might > >>> it be worth changing it? > >> > >> Well, that's not really true: it says it is unable to find starting > >> values, and asks you to provide some. I don't really see why your choices > >> are satisfactory: what happens if all the data are negative for example? > > > > - then the suggested code traps it. > > Yes, but why (it can happen too)?
- isn't this (i.e. all response <=0 ) a case where the MLE's of the linear predictor must tend to minus infinity? So there's probably an argument for handling it differently [although my proposed error message is not the most informative, since modifying the starting values won't help]. best, Simon > > >> We could try even harder, but code that is almost never used tends to get > >> broken whilst no one is testing it. So if you want to pursue it I think > >> we need a comprehensive test suite. > > > > - well, I guess, but I wouldn't have thought that zeros in data modelled > > using `gaussian("log")' is such a rare occurance is it? [Or did you mean > > that `gaussian("log")' is almost never used, and should hence be kept > > simple]. > > I meant initialization code for this case would rarely get used, as I > suppose those people who do this are used to providing starting values > (or there are few or no such people). > > > > > I suppose there are arguments both ways... > > > > best, > > Simon > > > >> > >>> The current offending code from `gaussian' is: > >>> > >>> initialize = expression({ > >>> n <- rep.int(1, nobs) > >>> if (is.null(etastart) && is.null(start) && is.null(mustart) && > >>> ((family$link == "inverse" && any(y == 0)) || > >>> (family$link == "log" && any(y <= 0)))) stop( > >>> "cannot find valid starting values: please specify some") > >>> mustart <- y > >>> }) > >>> > >>> A possible replacement would be ... > >>> > >>> initialize = expression({ > >>> n <- rep.int(1, nobs) > >>> if (is.null(etastart) && is.null(start) && is.null(mustart) && > >>> ((family$link == "inverse" && all(y == 0)) || > >>> (family$link == "log" && all(y <= 0)))) stop( > >>> "cannot find valid starting values: please specify some") > >>> mustart <- y > >>> if (family$link=="log") { > >>> iy <- y<=0 > >>> if (sum(iy)) mustart[iy] <- min(y[!iy])*.5 > >>> } else if (family$link=="inverse") { > >>> iy <- y==0 > >>> if (sum(iy)) mustart[iy] <- min(abs(y[!iy]))*.5 > >>> } > >>> }) > >>> > >>> best, > >>> Simon > >>> > >>> > >>>> - Simon Wood, Mathematical Sciences, University of Bath, Bath BA2 7AY > >>>> - +44 (0)1225 386603 www.maths.bath.ac.uk/~sw283/ > >>> > >>> ______________________________________________ > >>> R-devel@r-project.org mailing list > >>> https://stat.ethz.ch/mailman/listinfo/r-devel > >>> > >>> > >> > >> -- > >> Brian D. Ripley, [EMAIL PROTECTED] > >> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ > >> University of Oxford, Tel: +44 1865 272861 (self) > >> 1 South Parks Road, +44 1865 272866 (PA) > >> Oxford OX1 3TG, UK Fax: +44 1865 272595 > >> > > > > > > -- > Brian D. Ripley, [EMAIL PROTECTED] > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ > University of Oxford, Tel: +44 1865 272861 (self) > 1 South Parks Road, +44 1865 272866 (PA) > Oxford OX1 3TG, UK Fax: +44 1865 272595 > ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel