On Tue, 13 Nov 2007 20:07:33 +0200, Anon. <[EMAIL PROTECTED]> wrote:

>Sami Ullah wrote:
>> Hey Ecologers:
>>
>> I have a various variables for running multiple linear regression model
>> using GLM. Some of my predictor variables are non-normally distributed.
>> Using multiple linear regression, I use proc-univariate to check if the
>> residuals in the regression model met the normality criteria, which the
>> model did.
>>
>> Now I am wondering if it is advisable if I can keep the skewed predictor
>> variables in the model or have to go for non-parametric analysis?
>>
>>
>The distribution of the predictor variables is irrelevant, so you can
>happily keep them in. Well, the distribution is almost irrelevant. You
>can get problems if they are co-linear (i.e. highly correlated), or if
>you have outliers (which can have a large influence on the fit).
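
A small simulation can illustrate the quoted point. This is only a sketch under assumptions: the data are made up (a lognormal predictor, a true slope of 0.5, normal errors with standard deviation 1), and the helper names skewness and fit_line are hypothetical, not taken from SAS or from the original question. It fits an ordinary least squares line and compares the skewness of the predictor with the skewness of the residuals.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: mean cubed z-score (near 0 for a symmetric sample)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


rng = random.Random(1)  # fixed seed so the run is repeatable
n = 2000

# Made-up data: a strongly right-skewed predictor ...
x = [rng.lognormvariate(0.0, 1.0) for _ in range(n)]
# ... and a response that is linear in x with normal errors.
y = [2.0 + 0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]

intercept, slope = fit_line(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

skew_x = skewness(x)
skew_res = skewness(residuals)

print(f"estimated slope:       {slope:.3f}")
print(f"skewness of predictor: {skew_x:.2f}")
print(f"skewness of residuals: {skew_res:.2f}")
```

In a run like this the predictor should come out clearly skewed while the residuals stay close to symmetric, which is the property the regression model actually relies on.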

Agree. One extra thing: I would argue that normality of explanatory
variables (predictors) is actually bad. It means that most observations
have the same (or similar) value for that explanatory variable, which
may (!) make it more difficult to find a significant effect. Bad
experimental design. Perhaps a histogram shaped like the uniform
distribution would be the best. It means that you have a similar number
of observations for each part of your sampled gradient for that
explanatory variable.

My suggestion is not to transform anything (dependent or independent),
unless you have outliers in your explanatory variables. Or if you have
something trivial like weight-length data.

>
>I've come across the impression that the predictors have to be normally
>distributed a few times, but I don't know where it originates from -
>certainly not from statistical theory.

It is a fairy tale that many people seem to believe in. The normality
of the response variable, the raw data, is another such fairy tale. The
assumption is that you have normality of your response (dependent)
variable at EACH X value, and without 30 replicates or so, you cannot
check this. And who has 30 replicates?

Kind regards,

Alain


Dr. Alain F. Zuur
First author of:

1. Analysing Ecological Data (2007).
Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
URL: www.springer.com/0-387-45967-7

2. Analysing Ecological data using GLMM and GAMM in R. (2008).
Zuur, AF, Ieno, EN, Walker, N and Smith, GM. Springer.

3. An introduction to R for the life scientists:
- With a paper submission guide - (2008).
Zuur, AF, Ieno, EN and Meesters, EHGW. Springer

Other books: http://www.brodgar.com/books.htm

Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Tel: 0044 1358 788177
Email: [EMAIL PROTECTED]
URL: www.highstat.com
URL: www.brodgar.com

>
>Bob
>
>--
>Bob O'Hara
>Department of Mathematics and Statistics
>P.O. Box 68 (Gustaf Hällströmin katu 2b)
>FIN-00014 University of Helsinki
>Finland
>
>Telephone: +358-9-191 51479
>Mobile: +358 50 599 0540
>Fax: +358-9-191 51400
>WWW: http://www.RNI.Helsinki.FI/~boh/
>Blog: http://deepthoughtsandsilliness.blogspot.com/
>Journal of Negative Results - EEB: www.jnr-eeb.org
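
P.S. The remark above about the spread of an explanatory variable can be checked with the textbook formula for the standard error of the slope in simple linear regression, SE(b) = sigma / sqrt(sum((x - mean(x))^2)). The sketch below is only an illustration: both designs, the residual standard deviation of 1 and the function name slope_standard_error are made up. The two designs have the same number of observations and the same range, but one is evenly spread and the other is clustered around the middle.

```python
import math


def slope_standard_error(x, sigma):
    """SE of the OLS slope in simple linear regression: sigma / sqrt(Sxx)."""
    x_bar = sum(x) / len(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sigma / math.sqrt(sxx)


sigma = 1.0  # assumed residual standard deviation (made up)

# Two hypothetical designs: 11 observations each, both spanning 0 to 10.
x_even = [float(i) for i in range(11)]  # flat histogram
x_clustered = [0.0, 4.0, 4.5, 5.0, 5.0, 5.0, 5.0, 5.0, 5.5, 6.0, 10.0]  # bell-shaped

se_even = slope_standard_error(x_even, sigma)
se_clustered = slope_standard_error(x_clustered, sigma)

print(f"SE of slope, evenly spread x: {se_even:.4f}")
print(f"SE of slope, clustered x:     {se_clustered:.4f}")
print(f"ratio clustered / even:       {se_clustered / se_even:.2f}")
```

With the same sample size and range, the clustered design gives the larger standard error for the slope, so the same true effect would be harder to detect.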
