### [R] Deviance function in regression trees

```Hello all. I have heard over and over that CART and its various tree-like
brethren are non-parametric techniques. The chapter on tree-based models in
Chambers and Hastie states that tree-based models can be generalized (GTMs),
in a manner similar to GLMs, by specifying a deviance function for
distributions other than the Gaussian error distribution (section 9.4.3). I
have an application in which the response variable is a continuous variable
representing tree counts within a unit area and thus would be best described
by a Poisson distribution. The error distribution for these data is not
Gaussian. If this is the case, will the Gaussian error distribution used in
most regression tree packages be appropriate? Are there ways to specify the
error distribution in R, or should I log-transform the response variable? If
the specification of the error distribution in regression trees is important,
then are these techniques truly non-parametric? Thanks for your input.

Solomon Dobrowski
Tahoe Environmental Research Center (TERC)
John Muir Institute of the Environment
University of California, Davis
530 754 9354

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
and provide commented, minimal, self-contained, reproducible code.

```
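On the question of specifying the error distribution in R, here is a minimal sketch, assuming the `rpart` package is installed. `rpart()` accepts `method = "poisson"`, which splits on a Poisson deviance rather than squared error. The data are simulated, and the column names `elev`, `slope`, and `count` are hypothetical stand-ins for real plot data, not anything from the original post.

```r
# Minimal sketch: Poisson regression tree with rpart on simulated counts.
# `elev`, `slope`, and `count` are hypothetical names used for illustration.
library(rpart)

set.seed(1)
n <- 500
dat <- data.frame(
  elev  = runif(n, 1000, 3000),  # hypothetical predictor that drives the mean
  slope = runif(n, 0, 40)        # hypothetical predictor with no real effect
)
# Mean count jumps at elev = 2000, a structure a tree can recover.
dat$count <- rpois(n, lambda = ifelse(dat$elev > 2000, 8, 2))

# method = "poisson" uses a Poisson deviance as the splitting criterion.
fit_pois <- rpart(count ~ elev + slope, data = dat, method = "poisson")

# method = "anova" is the usual least-squares regression tree, for comparison.
fit_anova <- rpart(count ~ elev + slope, data = dat, method = "anova")

# For a Poisson tree, predict() returns the estimated rate in each leaf.
rate_hat <- predict(fit_pois, newdata = dat)
print(fit_pois)
```

If the sampled areas differ in size, the `rpart` documentation describes a two-column response (exposure and event count) for the Poisson method, so that rates are estimated per unit of exposure. Whether a Poisson deviance suits a given data set is still a modelling judgment; this only shows where the choice is made.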

### Re: [R] Deviance function in regression trees

```The short answer is that a Poisson distribution is a discrete
distribution: if that is appropriate to your data the rpart function (in
the package of that name) has a suitable option.

(In reply to Solomon Dobrowski's message of Mon, 28 Aug 2006, reproduced in
full above.)

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595
