Hi all,

I have a question for the list regarding grouping (binning) of the
independent variable in a linear regression. This is routinely done
(at least in limnology) in studies involving so-called biomass
size-spectra. I'm aware of other (better) methods to fit non-linear
models. However, I need to compare my results with older literature
where this method is used widely, and I'd like to know first if the
method has a problem or if it is outright wrong.

My independent variable is mean body size of the individuals of a
species (M) and the dependent is either biomass (B, g/m2) or
population density (D, indiv/m2) of the species. Body size is
lognormally distributed, and the number of species in the sample is
~100. The model to fit is: D = aM^b. First, data are log-transformed in
order to apply linear least-squares regression. So the model becomes
log(D) = log(a) + b log(M). The appropriateness of this transformation
and possible bias in the estimation of parameters have been discussed
before (Zar, Smith, others), so my question is not about that. After
log-transforming, sizes are grouped into evenly spaced categories, and
the densities/biomasses for all sizes within a size group are summed
up. So, the independent variable becomes the center of each
log-size-bin, and the dependent becomes the sum of all log-densities
for each size-bin. Obviously, the number of data points is reduced from the
original N to the number of size groups/bins used. After grouping, the
log-log model is fitted by least-squares regression.
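
To make the comparison concrete, the two procedures could be sketched in
Python roughly as follows. This is only a minimal sketch under my own
assumptions, not the code used in the literature: the function names
(fit_unbinned, fit_binned), the use of log10, the default of 8 bins, and
the synthetic data are all hypothetical. In particular, I assume here
that densities are summed within each bin and the sum is then
log-transformed; summing the log-densities instead would be a different
procedure.

```python
import numpy as np


def fit_unbinned(M, D):
    """Fit log10(D) = log10(a) + b*log10(M) using one point per species."""
    b, log_a = np.polyfit(np.log10(M), np.log10(D), 1)
    return log_a, b


def fit_binned(M, D, n_bins=8):
    """Fit the same log-log model after grouping species into size bins.

    Bins are evenly spaced on the log10(M) axis. Densities are summed
    within each bin (assumption), and the log10 of each sum is regressed
    on the bin centers. Empty bins are dropped.
    """
    log_m = np.log10(M)
    edges = np.linspace(log_m.min(), log_m.max(), n_bins + 1)
    # Bin index per species; clip so the largest size falls in the last bin.
    idx = np.clip(np.digitize(log_m, edges) - 1, 0, n_bins - 1)
    sums = np.bincount(idx, weights=D, minlength=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    keep = sums > 0
    b, log_a = np.polyfit(centers[keep], np.log10(sums[keep]), 1)
    return log_a, b


if __name__ == "__main__":
    # Hypothetical data: ~100 species, lognormal body sizes, power-law
    # density with multiplicative noise. Values are illustrative only.
    rng = np.random.default_rng(0)
    M = rng.lognormal(mean=0.0, sigma=2.0, size=100)
    D = 2.0 * M ** -0.75 * rng.lognormal(mean=0.0, sigma=0.5, size=100)
    print("unbinned (log_a, b):", fit_unbinned(M, D))
    print("binned   (log_a, b):", fit_binned(M, D))
```

Note that in the binned version the summed density in a bin depends on
how many species fall into it, so with lognormally distributed sizes the
fitted slope and intercept need not match the per-species fit.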

So my questions are:
Is this binning of a log-transformed variable statistically
appropriate for this problem?
Wouldn't it be better to use the size and density of each
species directly, without any grouping?

Thanks in advance for any suggestion or literature.
Cheers

Francisco de Castro
Potsdam University
