On Wed, 12 Feb 2003 07:35:59 +0100, "Rafal" <[EMAIL PROTECTED]>
wrote:

> Hello
> 
> I have a network of sample plots distributed in a regular grid in
> researched area. On each plot I measured my dependent variable and set of
> 'independent' (as independent as real world relations can be where
> everything seems to be related to everything). 

Well, you are crossing the two meanings of 'independent'.
I try to ignore the one where it is the contrast to 'dependent'
(where 'dependent' means 'predicted'). But *you* need to pay special
attention to the other meaning, where 'independent' says
that the variables are not part of an autocorrelated series,
such as Time (in one dimension) or Geography (in 2D).

It sounds as if you will not have good significance tests, owing to
this sort of dependency: plots that are neighbors on a regular grid
are likely to resemble each other.
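
A common check for this sort of dependency is Moran's I, computed on
the regression residuals laid out on the plot grid. The sketch below is
only an illustration and is not taken from this thread: the function
name morans_i_grid, the choice of "rook" neighbors (plots sharing an
edge), and the toy grids are all assumptions. Values near +1 suggest
that neighboring plots are alike, values near 0 suggest little spatial
pattern, and values near -1 suggest that neighbors alternate.

```python
import numpy as np

def morans_i_grid(values):
    """Moran's I for a 2-D array on a regular grid (hypothetical helper).

    Assumes rook neighbors: each cell is linked to the cells directly
    above, below, left and right of it, all with weight 1.
    """
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    # Cross-products over horizontally and vertically adjacent pairs.
    horiz = (z[:, :-1] * z[:, 1:]).sum()
    vert = (z[:-1, :] * z[1:, :]).sum()
    n_pairs = z[:, :-1].size + z[:-1, :].size
    # Each pair appears twice in the symmetric weight matrix, so the
    # factor of 2 cancels between the numerator and the weight total.
    return (z.size / n_pairs) * (horiz + vert) / (z ** 2).sum()

# Toy grids standing in for residuals on a 10 x 10 network of plots.
gradient = np.add.outer(np.arange(10), np.arange(10))   # smooth trend
checker = np.indices((10, 10)).sum(axis=0) % 2 * 2 - 1  # alternating +1/-1
noise = np.random.default_rng(0).normal(size=(20, 20))  # no spatial pattern

i_gradient = morans_i_grid(gradient)
i_checker = morans_i_grid(checker)
i_noise = morans_i_grid(noise)
print(i_gradient, i_checker, i_noise)
```

If the residuals from the fitted model give a clearly positive value,
the usual standard errors and p-values from the regression are probably
too optimistic.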

>                                       Now I am using multivariate
> regression to create a model  allowing to estimate dependent variable from
> set of independent. Dependent variable has normal distribution. The most
> correlated with dependent variable seems to be height above sea level
> although its distribution is very different from normal because of shape of
> area where research was made. My questions  are:
> 1)Should I try to transform it to obtain distribution more similar to normal

To (1): Probably not. Thom posted one reason, that it is the
distribution of the residuals that matters, not the raw distribution
of a predictor. Is there a rational basis for a transformation,
relative to what you are studying? For instance, if "altitude"
matters because of absolute air pressure, I imagine that "air
pressure" might enter as a log-function rather than as the raw
measurements. A transformation that is totally arbitrary -- I mostly
feel that way about what is called the "started logarithm",
log(x+c) -- is going to be hard to explain, unless you can tuck it
into a black-box of "fuzzy logic" or "neural nets".

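The point about residuals can be seen in a small simulation. This is a
sketch under stated assumptions, not an analysis of the poster's data:
the exponential "altitude" values, the coefficients, the sample size
and the helper name skewness are all invented for illustration. The
predictor is strongly skewed, yet the least-squares residuals come out
close to symmetric because the errors themselves are normal.

```python
import numpy as np

def skewness(v):
    """Sample skewness: third central moment over variance**1.5."""
    d = np.asarray(v, dtype=float)
    d = d - d.mean()
    return (d ** 3).mean() / (d ** 2).mean() ** 1.5

rng = np.random.default_rng(0)
n = 2000

# Assumed data: a right-skewed predictor and a response that depends on
# it linearly, with normally distributed errors.
altitude = rng.exponential(scale=300.0, size=n)
y = 5.0 + 0.01 * altitude + rng.normal(0.0, 1.0, size=n)

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n), altitude])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ coef

skew_x = skewness(altitude)
skew_resid = skewness(residuals)
print("skewness of predictor:", skew_x)
print("skewness of residuals:", skew_resid)
print("estimated slope:", coef[1])
```

So a non-normal predictor is not, by itself, a reason to transform;
what deserves a look is the residuals after fitting.
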
> as literature suggests?

And you want to be careful in reading that literature.
Arbitrary transformations, into rank-orders or otherwise, are
suggested for the purpose of providing a powerful
univariate test. If you are moving to modeling, you usually
want only the rational transformations (as I just described),
so that the model will still make sense.


> 2) When forward or backward stepwise regression is used, is there a limit
> for the number of independent variables one can take into procedure? Does
> the literature which suggests taking 5 to 20 times fewer variables than cases
> speak about final equation obtained from finished regression procedure or
> amount before stepwise procedure?

Check my stats-faq for comments posted a few years
ago, by various people, on why stepwise regression
is a bad idea for most applications. You could also
search sci.stat.* with groups.google.com for more
recent comments.


-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html