If a graphical presentation provides improved insight, then that is sufficient justification. The existence of "better", more precise methods does not change that.
I, too, sometimes use jitter() to avoid overplotting of observations, but I think the dot-plots in de la Cruz's code are even better. It is the histogram that is misleading (due to paucity of data), not the effort to elucidate the joint behavior of zeros and ones.

http://www.esapubs.org/bulletin/backissues/086-1/bulletinjan2005.htm#et

Please try a variation that his code provides:

plot.logi.hist(independ = altitude, depend = tree, logi.mod = 1, type = "dit", boxp = TRUE, rug = TRUE, las.h = 1)

which does not use the histograms but instead uses "dit plots" to provide a helpful, visceral feel for the behavior of the observations.

Charles Annis, P.E.

[EMAIL PROTECTED]
phone: 561-352-9699
eFax: 614-455-3265
http://www.StatisticalEngineering.com

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Jari Oksanen
Sent: Thursday, September 15, 2005 3:17 AM
To: Frank E Harrell Jr
Cc: [email protected]; Beale, Colin
Subject: Re: [R] Graphical presentation of logistic regression

On Wed, 2005-09-14 at 06:29 -0500, Frank E Harrell Jr wrote:
> Beale, Colin wrote:
> > Hi,
> >
> > I wonder if anyone has written any code to implement the suggestions of
> > Smart et al (2004) in the Bulletin of the Ecological Society of America
> > for a new way of graphically presenting the results of logistic
> > regression (see
> > www.esapubs.org/bulletin/backissues/085-3/bulletinjuly2004_2column.htm#tools1
> > for the full text)? I couldn't find anything relating to this sort
> > of graphical representation of logistic models in the archives, but
> > maybe someone has solved it already? In short, Smart et al suggest that
> > a logistic regression be presented as a combination of the two
> > histograms for successes and failures (with one presented upside down at
> > the top of the figure, the other the right way up at the bottom)
> > overlaid by the probability function (ie logistic curve).
> > It's somewhat hard to describe, but is nicely illustrated in the full text version
> > above. I think it is a sensible way of presenting these results and am
> > keen to do so - at the moment I can only do this by generating the two
> > histograms and the logistic curve separately (using hist() and lines()),
> > then copying and pasting the graphs out of R and inverting one in a
> > graphics package, before overlaying the others. I'm sure this could be
> > done within R and would be a handy plotting function to develop. Has
> > anyone done so, or can anyone give me any pointers to doing this? I
> > really need to know how to invert a histogram and how to overlay this
> > with another histogram "the right way up".
> >
> > Any thoughts would be welcome.
> >
> > Thanks in advance,
> > Colin
>
> From what you describe, that is a poor way to represent the model
> except for judging discrimination ability (if the model is calibrated
> well). Effect plots, odds ratio charts, and nomograms are better. See
> the Design package for details.
>

You're correct when you say that this is a poor way to represent the model. However, you should have some understanding for us ecologists, who are simple creatures working with tangible subjects such as animals and plants (microbiologists work with less tangible things). Therefore we want to have a concrete and simple representation. After all, the example was about occurrence of an animal against a concrete environmental variable, and a concrete representation was suggested. Nomograms and the like are abstractions that you understand only after long education and training (I tried the Design package and I didn't understand the nomogram plot). I tried one concrete example with my own data, and the inverted histogram method was patently misleading (with Baz Rowlingson's neat and compact code, sorry for the repetition).
The method would be useful with dense and regular data only, but now the clearest visual cue was the uneven sampling intensity. With my limited knowledge of R facilities, I can now remember only two ways to preserve the concreteness of display in base R: jitter() to avoid overplotting of observations, and sunflowerplot() to show the amount of overplotting.

I think the Ecological Society of America would be happy to receive papers suggesting better ways to represent binary response data, if some of the knowledgeable people in this group decided to educate them (I'm not an ESA member, so I wouldn't be educated: therefore 'them' instead of 'us'). The ESA bulletin will be influential on manuscripts submitted to the Society journals in the future, and the time for action is now.

cheers, jari oksanen
--
Jari Oksanen -- Dept Biology, Univ Oulu, 90014 Oulu, Finland
Ph. +358 8 5531526, cell +358 40 5136529, fax +358 8 5531061
email [EMAIL PROTECTED], homepage http://cc.oulu.fi/~jarioksa/

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
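
For readers following the thread, the two base-R options named in it, jitter() and sunflowerplot(), might be tried along the following lines. This is only a minimal sketch under stated assumptions: the data are simulated, the names altitude and tree are borrowed from the plot.logi.hist() call in the thread purely for illustration, and none of this is de la Cruz's or Rowlingson's code.

```r
# Simulated data (an assumption for illustration; not the data discussed in the thread)
set.seed(1)
altitude <- round(runif(200, 0, 1000), -1)   # rounded to tens so that ties occur
tree <- rbinom(200, 1, plogis(-3 + 0.006 * altitude))

# Ordinary logistic regression and fitted probabilities on a regular grid
fit <- glm(tree ~ altitude, family = binomial)
grid <- data.frame(altitude = seq(0, 1000, length.out = 200))
p_hat <- predict(fit, newdata = grid, type = "response")

op <- par(mfrow = c(1, 2))

# 1. jitter(): spread the 0/1 responses vertically to reduce overplotting
plot(altitude, jitter(tree, amount = 0.05), ylim = c(-0.1, 1.1),
     xlab = "altitude", ylab = "tree (jittered)", main = "jitter()")
lines(grid$altitude, p_hat)

# 2. sunflowerplot(): petals show how many observations share a point
sunflowerplot(altitude, tree, xlab = "altitude", ylab = "tree",
              main = "sunflowerplot()")
lines(grid$altitude, p_hat)

par(op)
```

Both panels keep every observation visible as a point at (or near) 0 or 1, so uneven sampling along the x axis shows up as gaps in the points rather than as bar heights.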
