Re: [R] Rpart query

Achim Zeileis Tue, 12 Oct 2010 02:50:29 -0700

On Mon, 11 Oct 2010, jagdeesh_mn wrote:


Hi,

Being a novice this is my first usage of R.

I am trying to use rpart for building a decision tree in R. And I have the
following dataframe


Outlook   Temp  Humidity  Windy  Class
Sunny     75    70        Yes    Play
Sunny     80    90        Yes    Don't Play
Sunny     85    85        No     Don't Play
Sunny     72    95        No     Don't Play
Sunny     69    70        No     Play
Overcast  72    90        Yes    Play
Overcast  83    78        No     Play
Overcast  64    65        Yes    Play
Overcast  81    75        No     Play

The first line is the header. When I use the formula,

"CART<-rpart(Class ~ Outlook + Temp + Humidity + Windy, data=dataframe)"

and try to plot the values of CART using plot(CART), I get the following
error,

"Error in plot.rpart(CART) : fit is not a tree, just a root".

Am I missing something here? Any help would be greatly appreciated. Btw, the
dataframe was obtained by reading a csv which shouldn't be an issue.

The error message says it all: In this tiny data set rpart() decides that
it doesn't split the data at all and thus just retains a root and not a
tree.

If you want to make rpart() split the data, you can modify some of its
hyperparameters, e.g., the minimum number of observations required to
attempt a split (the "minsplit" argument, which defaults to 20).
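
As a minimal sketch of this, assume the nine rows from the question are typed in by hand. The object names golf, fit_default and fit_small are made up for illustration, and xval = 0 is only there to skip cross-validation, which is not meaningful on nine rows. With the defaults, rpart() keeps just the root; with a smaller minsplit it grows a tree that can be plotted.

```r
library("rpart")

## the nine rows from the question, entered by hand
golf <- data.frame(
  Outlook  = factor(c(rep("Sunny", 5), rep("Overcast", 4))),
  Temp     = c(75, 80, 85, 72, 69, 72, 83, 64, 81),
  Humidity = c(70, 90, 85, 95, 70, 90, 78, 65, 75),
  Windy    = factor(c("Yes", "Yes", "No", "No", "No",
                      "Yes", "No", "Yes", "No")),
  Class    = factor(c("Play", "Don't Play", "Don't Play", "Don't Play",
                      "Play", "Play", "Play", "Play", "Play"))
)

## defaults: minsplit = 20 exceeds the 9 observations, so no split is tried
fit_default <- rpart(Class ~ ., data = golf, method = "class")
nrow(fit_default$frame)  # a single row in $frame means "just a root"

## allow splits in very small nodes (assumed settings, for illustration only)
fit_small <- rpart(Class ~ ., data = golf, method = "class",
  control = rpart.control(minsplit = 2, minbucket = 1, xval = 0))
fit_small

## plotting now works because the fit has more than one node
plot(fit_small, margin = 0.1)
text(fit_small, use.n = TRUE)
```

Settings this permissive will fit the nine rows closely, so the resulting tree says little about new data.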

The data above are often used in machine learning textbooks to introduce
the concept of recursive partitioning. They are also provided in the
"RWeka" package. However, many (statistical) recursive partitioning
algorithms will by default consider the data too small to attempt
splitting.


## load RWeka and data
library("RWeka")
weather <- read.arff(system.file("arff", "weather.arff",
  package = "RWeka"))

## J4.8 tree (Java implementation of C4.5, revision 8)
j48 <- J48(play ~ ., data = weather)
j48

## RPart tree (R implementation of CART)
library("rpart")
rp <- rpart(play ~ ., data = weather, minsplit = 5)
plot(rp)
text(rp)

## Conditional inference tree
library("party")
ct <- ctree(play ~ ., data = weather,
  control = ctree_control(minsplit = 5, mincriterion = 0.3))
plot(ct)

As you see, all trees have different opinions about how the data should be
split. However, in this tiny data set, nothing could be considered
statistically significant.
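
To get a rough feel for the significance point, here is a sketch that runs a plain Fisher exact test of Outlook against Class on the nine rows from the original question. This is a stand-in chosen for illustration, not the permutation test that ctree() uses internally, and the object names (outlook, cls, tab, ft) are hypothetical.

```r
## Outlook and Class columns of the nine rows from the question
outlook <- factor(c(rep("Sunny", 5), rep("Overcast", 4)))
cls     <- factor(c("Play", "Don't Play", "Don't Play", "Don't Play",
                    "Play", "Play", "Play", "Play", "Play"))

## 2 x 2 contingency table: all Overcast days are "Play"
tab <- table(outlook, cls)
tab

## exact test of independence between Outlook and Class
ft <- fisher.test(tab)
ft$p.value  # about 0.17, i.e., not significant at the usual 5% level
```

Even a split that looks perfectly clean on one side (all Overcast days are "Play") is compatible with chance when there are only nine observations.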

I would recommend using a larger data set to try to understand how the
different algorithms work.


hth,
Z

-Jagdeesh


--
View this message in context: 
http://r.789695.n4.nabble.com/Rpart-query-tp2991198p2991198.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

