On Tue, 13 Nov 2007 20:07:33 +0200, Anon. <[EMAIL PROTECTED]> wrote:

>Sami Ullah wrote:
>> Hey Ecologers:
>>
>> I have various variables for running a multiple linear regression model
>> using GLM. Some of my predictor variables are non-normally distributed.
>> Using multiple linear regression, I used proc-univariate to check whether
>> the residuals in the regression model meet the normality criteria, which
>> they did.
>>
>> Now I am wondering whether it is advisable to keep the skewed predictor
>> variables in the model, or whether I have to go for non-parametric analysis?
>>
>>
>The distribution of the predictor variables is irrelevant, so you can
>happily keep them in.  Well, the distribution is almost irrelevant.  You
>can get problems if they are co-linear (i.e. highly correlated), or if
>you have outliers (which can have a large influence on the fit).

Agreed. One extra point: I would argue that normality of explanatory
variables (predictors) is actually bad. It means that most observations
have the same (or a similar) value for that explanatory variable, with few
observations towards the ends of the gradient, which may (!) make it more
difficult to find a significant effect. That is bad experimental design.
Perhaps a histogram shaped like a uniform distribution would be best: it
means that you have a similar number of observations for each part of
your sampled gradient for that explanatory variable.
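
To illustrate the point about spread, here is a minimal sketch in Python.
It relies on the textbook result for simple linear regression that the
standard error of the slope is sigma / sqrt(Sxx), where Sxx is the sum of
squared deviations of the predictor from its mean. The sample size, the
0 to 10 gradient, the bell-shaped design (mean 5, sd 1.5) and the residual
sd of 2.0 are all assumptions made up for the example, and the function
names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def sxx(xs):
    """Sum of squared deviations of the predictor from its mean."""
    m = fmean(xs)
    return sum((x - m) ** 2 for x in xs)


def slope_se(xs, sigma):
    """Standard error of the OLS slope for a known residual sd `sigma`."""
    return sigma / sqrt(sxx(xs))


n = 50                                   # assumed sample size
probs = [(i + 0.5) / n for i in range(n)]

# Design 1: observations spread evenly along a 0..10 gradient.
uniform_x = [10 * p for p in probs]

# Design 2: observations clustered around the middle (bell-shaped),
# built from normal quantiles so the example is deterministic.
bell = NormalDist(mu=5, sigma=1.5)       # assumed centre and spread
normal_x = [bell.inv_cdf(p) for p in probs]

sigma = 2.0                              # assumed residual sd
se_uniform = slope_se(uniform_x, sigma)
se_normal = slope_se(normal_x, sigma)

print(f"slope SE, even spread:    {se_uniform:.3f}")
print(f"slope SE, clustered (bell): {se_normal:.3f}")
```

With these made-up numbers the evenly spread design should give a slope
standard error of roughly half that of the clustered design, simply
because more observations sit away from the mean of the predictor.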

My suggestion is not to transform anything (dependent or independent),
unless you have outliers in your explanatory variables, or you have
something trivial like weight-length data.
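
For the weight-length case, a common assumption is a power-law
relationship, weight = a * length^b, which becomes a straight line after
taking logs of both variables: log(weight) = log(a) + b * log(length).
The sketch below uses noise-free, made-up data (a = 0.01, b = 3 are
hypothetical values, as are the lengths) just to show that ordinary least
squares on the log-log scale recovers the exponent.

```python
from math import exp, log
from statistics import fmean


def ols(xs, ys):
    """Simple least-squares fit; returns (intercept, slope)."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical parameters and lengths, chosen only for illustration.
a_true, b_true = 0.01, 3.0
lengths = [5.0, 8.0, 12.0, 20.0, 30.0, 45.0]
weights = [a_true * length ** b_true for length in lengths]

# Fit a straight line on the log-log scale.
intercept, slope = ols([log(v) for v in lengths],
                       [log(w) for w in weights])
a_hat = exp(intercept)   # back-transform the intercept to get a

print(f"estimated exponent b: {slope:.4f}")
print(f"estimated coefficient a: {a_hat:.4f}")
```

With real data there would be scatter around the line, and the estimates
would only approximate the underlying values.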


>
>I've come across the impression that the predictors have to be normally
>distributed a few times, but I don't know where it originates from -
>certainly not from statistical theory.

It is a fairy tale that many people seem to believe in. The supposed
normality of the response variable, the raw data, is another such fairy
tale. The assumption is that you have normality of your response
(dependent) variable at EACH X value, and without 30 replicates or so,
you cannot check this. And who has 30 replicates?
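
A small simulation can make the distinction concrete. The sketch below
assumes a strongly skewed predictor (exponential), a linear effect, and
normal errors at each X value. All numbers, the seed, and the function
names are made up for the example. It compares the sample skewness of
the raw response with that of the residuals from a least-squares fit.

```python
import random
from statistics import fmean


def skewness(vals):
    """Sample skewness: third central moment over variance^1.5."""
    m = fmean(vals)
    m2 = fmean([(v - m) ** 2 for v in vals])
    m3 = fmean([(v - m) ** 3 for v in vals])
    return m3 / m2 ** 1.5


def ols(xs, ys):
    """Simple least-squares fit; returns (intercept, slope)."""
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    slope = sxy / sxx
    return my - slope * mx, slope


rng = random.Random(1)                       # fixed seed, arbitrary choice
x = [rng.expovariate(1.0) for _ in range(2000)]            # skewed predictor
y = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]     # normal errors

intercept, slope = ols(x, y)
resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

skew_x = skewness(x)
skew_y = skewness(y)
skew_resid = skewness(resid)

print(f"skewness of predictor:    {skew_x:.2f}")
print(f"skewness of raw response: {skew_y:.2f}")
print(f"skewness of residuals:    {skew_resid:.2f}")
```

Under these assumptions the raw response inherits the skew of the
predictor, while the residuals stay close to symmetric, so a histogram of
the raw response says little about whether the model assumption holds.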

Kind regards,
Alain



Dr. Alain F. Zuur
First author of:   

1. Analysing Ecological Data (2007).  
Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
URL: www.springer.com/0-387-45967-7


2. Analysing Ecological data using GLMM and GAMM in R. (2008). 
Zuur, AF, Ieno, EN, Walker, N and Smith, GM. Springer.


3. An introduction to R for the life scientists: - With a paper submission 
guide - (2008).
Zuur, AF, Ieno, EN and Meesters, EHGW. Springer


Other books: http://www.brodgar.com/books.htm


Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd.
6 Laverock road
UK - AB41 6FN Newburgh
Tel: 0044 1358 788177
Email: [EMAIL PROTECTED]
URL: www.highstat.com
URL: www.brodgar.com

>
>Bob
>
>--
>Bob O'Hara
>Department of Mathematics and Statistics
>P.O. Box 68 (Gustaf Hällströmin katu 2b)
>FIN-00014 University of Helsinki
>Finland
>
>Telephone: +358-9-191 51479
>Mobile: +358 50 599 0540
>Fax:  +358-9-191 51400
>WWW:  http://www.RNI.Helsinki.FI/~boh/
>Blog: http://deepthoughtsandsilliness.blogspot.com/
>Journal of Negative Results - EEB: www.jnr-eeb.org
>=========================================================================