Hi Philipp,

Disregard my previous message. This is a bug, and your patch looks fine. More
below.

On 13 June 2007 at 15:36, Philipp Benner wrote:
| Package: r-cran-rpart
| Version: 3.1.35-1
| Severity: normal
| Tags: patch
| 
| 
| While implementing an algorithm using rpart() I discovered the following
| problem:
| 
| (sample script rpart.bug.R)
| 
--------------------------------------------------------------------------------
| library(rpart)
| 
| pima    <- read.table("pima-indians-diabetes.r.data")
| n       <- nrow(pima)
| formula <- class ~ pregnant + glucose + blood.pressure + triceps +
| insulin + bmi + pedigree + age
| 
| adaboost <- function() {
|   w <- abs(rnorm(n))
|   hyp <- rpart(formula, data=pima, weights=w, method="class")
|   prev <- hyp
| }
| 
--------------------------------------------------------------------------------
| 
--------------------------------------------------------------------------------
| > source("rpart.bug.R")
| > adaboost()
| Error in eval(expr, envir, enclos) : object "w" not found
| > w <- abs(rnorm(n)) # global definition of w (with random values)
| > adaboost()         # now it works
| > remove(w)          # removing w again
| > adaboost()
| Error in eval(expr, envir, enclos) : object "w" not found
| > debug(rpart)
| > c <- adaboost()
| debugging in: rpart(formula, frame, weight, method = "class")
| debug: {
| [...]
| debug: m <- eval(m, parent.frame())
| Browse[1]> 
| Error in eval(expr, envir, enclos) : object "weight" not found
| 
--------------------------------------------------------------------------------
| 
| If rpart() is used in a function (so not at the toplevel environment) it
| will use the wrong weight vector and even fail if it is not defined in
| the top environment. I fixed that bug and attached the patch to this
| mail. However, I'm not completely sure whether the patch is correct
| although it seems to work.

I re-examined this with a self-contained example following the rpart help
example:
-----------------------------------------------------------------------------
library(rpart)

myformula <- formula(Kyphosis ~ Age + Number + Start)
mydata <- kyphosis
myweight <- abs(rnorm(nrow(mydata)))

adaboostGood <- function(mydata, myformula, myweight) {
  hyp <- rpart(myformula, data=mydata, weights=myweight, method="class")
  prev <- hyp
}
adaboostGood(mydata, myformula, myweight)
rm(myweight)


adaboostBad <- function(mydata, myformula) {
  myweight <- abs(rnorm(nrow(mydata)))
  hyp <- rpart(myformula, data=mydata, weights=myweight, method="class")
  prev <- hyp
}
adaboostBad(mydata, myformula)
-----------------------------------------------------------------------------

As you say, adaboostGood works not because the argument is passed, but
because myweight is found in the calling environment.  When it is removed
there, but defined locally as in adaboostBad, it does indeed fail.  The 'm'
object inside rpart seems to need the weights set explicitly, as your patch
does.  One could wrap that assignment in a test that weights is neither
missing() nor is.null(), I suppose.
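
To make that suggestion concrete, here is a minimal sketch of the guarded
assignment.  It does not use rpart itself: toy_fit() is a hypothetical
stand-in that only mimics the match.call()/model.frame idiom visible in the
patch context, and caller(), local_w and the 'patched' switch are likewise
names I made up for illustration.

```r
# Hypothetical stand-in for the match.call()/model.frame idiom in rpart();
# this is NOT the rpart source, just the same pattern in miniature.
toy_fit <- function(formula, data, weights = NULL, patched = TRUE) {
  m <- match.call(expand.dots = FALSE)
  m$patched <- NULL                    # not a model.frame argument
  # Guarded form of the patch: replace the unevaluated expression by its
  # value only when a weight vector was actually supplied.
  if (patched && !missing(weights) && !is.null(weights)) {
    m$weights <- weights
  }
  m[[1]] <- as.name("model.frame")
  m <- eval(m, parent.frame())
  model.weights(m)                     # weights as seen by model.frame
}

# The weights exist only inside this function, as in adaboostBad.
caller <- function(data, formula, patched) {
  local_w <- as.numeric(seq_len(nrow(data)))
  toy_fit(formula, data, weights = local_w, patched = patched)
}

f <- y ~ x                             # formula created at top level
d <- data.frame(y = c(1, 2, 3, 4), x = c(2, 1, 4, 3))

# With the guarded assignment the local weights are picked up.
res_patched <- caller(d, f, patched = TRUE)

# Without it, model.frame looks for local_w in the data and then in the
# formula's environment, finds neither, and signals an error.
res_unpatched <- tryCatch(caller(d, f, patched = FALSE),
                          error = function(e) e)
```

The guard leaves the call untouched when no weights are given, so the
unweighted case should behave as before.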

I am CCing Prof Ripley who maintains rpart upstream.  

Regards, Dirk

| Regards
| 
| 
| -- System Information:
| Debian Release: lenny/sid
|   APT prefers testing
|   APT policy: (990, 'testing'), (500, 'unstable'), (500, 'stable')
| Architecture: powerpc (ppc)
| 
| Kernel: Linux 2.6.18
| Locale: [EMAIL PROTECTED], [EMAIL PROTECTED] (charmap=ISO-8859-15)
| Shell: /bin/sh linked to /bin/bash
| 
| Versions of packages r-cran-rpart depends on:
| ii  libc6                       2.3.6.ds1-13 GNU C Library: Shared libraries
| ii  r-base-core                 2.4.1-2      GNU R core of statistical computin
| ii  r-cran-survival             2.31-1       GNU R package for survival analysi
| 
| r-cran-rpart recommends no packages.
| 
| -- no debconf information
| 
| -- 
| Philipp Benner
| --- rpart/R/rpart.old.s       2007-06-09 12:59:13.000000000 +0200
| +++ rpart/R/rpart.s   2007-06-09 13:23:35.000000000 +0200
| @@ -2,7 +2,7 @@
|  #
|  #  The recursive partitioning function, for S
|  #
| -rpart <- function(formula, data, weights, subset,
| +rpart <- function(formula, data, subset, weights = NULL,
|                  na.action=na.rpart, method, model=FALSE, x=FALSE, y=TRUE,
|                  parms, control, cost, ...)
|  {
| @@ -17,6 +17,7 @@
|       m$x <- m$y <- m$parms <- m$... <- NULL
|       m$cost <- NULL
|       m$na.action <- na.action
| +        m$weights <- weights
|       m[[1]] <- as.name("model.frame")
|       m <- eval(m, parent.frame())
|       }

-- 
Hell, there are no rules here - we're trying to accomplish something. 
                                                  -- Thomas A. Edison

