[EMAIL PROTECTED] (Dalby, James WLAP:EX) wrote in news:[EMAIL PROTECTED]:
> I have been refreshing my knowledge of outliers and influential
> observations in regression analysis and could use some clarification
> on the difference between the two. I'm aware that some outliers are
> influential while others are not, but I'm wondering whether all
> influential observations are outliers. Is it possible for an
> influential observation to not be an outlier? If you know of a graph
> that would answer this, please refer me to it.

The problem is that "outlier" is a rather poorly-defined term. By the definitions I use, it is in fact possible for an influential observation not to be an outlier.

By my definition, an outlier is an observation whose value was generated by a different process than the one that generated a supermajority of the observations, i.e. it's an indicator of inhomogeneity in the system. An influential observation, OTOH, is simply an observation that, if excluded from the modelling process, would result in a substantially different model. There's no logical reason that a completely homogeneous process (which by definition would not produce outliers) could not produce influential observations.

The presence of influential observations suggests that any model that takes them into account may be biased away from the "true" model for the underlying population. The presence of outliers (assuming that the different process isn't simply observation error), OTOH, suggests that there may not *be* One True Model for the underlying population, and that a model that takes them into account may be an average of apples and oranges (take a look at the classic Hertzsprung-Russell illustration, where inhomogeneous observations result in a model that's physically impossible).
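To make the distinction concrete, here is a minimal sketch in Python with NumPy (my choice of tools; nothing in this thread prescribes one). Every observation is drawn from a single homogeneous process, so by the definition above there are no outliers, yet the skewed predictor can still produce points with high leverage and a comparatively large Cook's distance. The function name `influence_measures`, the lognormal predictor, the coefficients and the seed are all illustrative assumptions, and Cook's distance is only one of several common ways to quantify "would result in a substantially different model".

```python
import numpy as np


def influence_measures(x, y):
    """Leverage, internally studentized residuals and Cook's distance
    for the simple linear regression y ~ 1 + x (ordinary least squares)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])      # design matrix with intercept
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # OLS coefficients
    resid = y - X @ beta
    # Leverage: diagonal of the hat matrix H = X (X'X)^-1 X'
    XtX_inv = np.linalg.inv(X.T @ X)
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    s2 = resid @ resid / (n - p)                   # residual variance estimate
    t = resid / np.sqrt(s2 * (1.0 - h))            # internally studentized residuals
    cook = t**2 * h / (p * (1.0 - h))              # Cook's distance
    return h, t, cook


# One homogeneous process (assumed for illustration): a skewed predictor and
# the same linear relationship with the same noise for every observation.
rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=60)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=60)

h, t, cook = influence_measures(x, y)
i = int(np.argmax(cook))
print(f"most influential point: index {i}, x = {x[i]:.2f}")
print(f"  leverage             = {h[i]:.3f} (average leverage is {h.mean():.3f})")
print(f"  studentized residual = {t[i]:.2f}")
print(f"  Cook's distance      = {cook[i]:.3f}")
```

Whether the flagged point counts as "substantially" influential is a judgment call that depends on the cutoff you adopt; the point of the sketch is only that influence is computed from leverage and residual size, neither of which requires the observation to have come from a different process.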
