Re: [R] Cross-validation error with tune and with rpart

Prof Brian Ripley Sat, 31 Dec 2011 06:15:39 -0800

On 31/12/2011 12:34, Israel Saeta Pérez wrote:

Hello list,


I'm trying to build classifiers for a certain task using several
methods, one of them being decision trees. My doubts arise when I try to
estimate the cross-validation error of the generated tree:

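library(rpart)
# grow a deliberately large tree, then prune at the CP with the lowest xerror
tree <- rpart(y ~ ., data = data.frame(xsel, y), cp = 0.00001)
ptree <- prune(tree,
               cp = tree$cptable[which.min(tree$cptable[, "xerror"]), "CP"])
ptree$cptable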


            CP nsplit rel error xerror       xstd
1  0.33120000      0    1.0000 1.0000 0.02856022
2  0.08640000      1    0.6688 0.6704 0.02683544
3  0.02986667      2    0.5824 0.5856 0.02584564
4  0.02880000      5    0.4928 0.5760 0.02571738
5  0.01920000      6    0.4640 0.5168 0.02484761
6  0.01440000      8    0.4256 0.5056 0.02466708
7  0.00960000     12    0.3552 0.5024 0.02461452
8  0.00880000     15    0.3264 0.4944 0.02448120
9  0.00800000     17    0.3088 0.4768 0.02417800
10 0.00480000     25    0.2448 0.4672 0.02400673


If I got it right, "xerror" stands for the cross-validation error (10-fold
by default), which is pretty high (0.4672 out of 1).

You didn't get it right. Please read the documentation, or contemplate why the first line is exactly one. In any case, that table is not about error rates for the final tree: it is part of the model selection step (to cross-validate the final tree you would need to include the choice of pruning inside the cross-validation).
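
To illustrate the point about including the choice of pruning inside the cross-validation, here is a minimal sketch and not a definitive recipe. It assumes the built-in iris data in place of the poster's xsel and y, which are not available in this thread, and the helper name cv_pruned_rpart is hypothetical. Each outer fold grows a tree, picks the CP by minimum xerror, prunes, and only then predicts on the held-out rows.

```r
library(rpart)

# Hypothetical helper (not part of rpart): outer k-fold cross-validation in
# which growing AND pruning are both repeated on every training fold.
cv_pruned_rpart <- function(formula, data, k = 10, cp = 1e-5) {
  response <- all.vars(formula)[1]
  # random assignment of each row to one of k outer folds
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  wrong <- logical(nrow(data))
  for (i in seq_len(k)) {
    train <- data[folds != i, , drop = FALSE]
    test  <- data[folds == i, , drop = FALSE]
    # model selection uses the training rows only
    fit    <- rpart(formula, data = train, method = "class", cp = cp)
    best   <- which.min(fit$cptable[, "xerror"])
    pruned <- prune(fit, cp = fit$cptable[best, "CP"])
    # the held-out rows played no part in choosing the pruning level
    pred <- predict(pruned, newdata = test, type = "class")
    wrong[folds == i] <- as.character(pred) != as.character(test[[response]])
  }
  mean(wrong)  # misclassification rate of the whole grow-and-prune procedure
}

set.seed(1)  # assumption: fixed seed only so the illustration is repeatable
err <- cv_pruned_rpart(Species ~ ., iris)
err
```

The value returned is an absolute misclassification rate for the complete procedure, so it is not on the same scale as the xerror column of a single tree's cptable.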

Did you look up the rpart technical report or one of the books explaining its output? Google 'rpart technical report' if you need to find it.
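
As a hedged illustration of the scaling that the documentation describes: the "rel error", "xerror" and "xstd" columns are expressed relative to the root node error, which is the figure printcp() reports as "Root node error". The sketch below assumes the built-in iris data, since the root node error for the tree in the question is not given in this thread, and the column names resub_error, cv_error and cv_se are made up for the example.

```r
library(rpart)

set.seed(1)  # assumption: fixed seed so the internal cross-validation is repeatable
fit <- rpart(Species ~ ., data = iris, method = "class", cp = 1e-5)

# Root node error: the misclassification rate of a tree with no splits
root_err <- fit$frame$dev[1] / fit$frame$n[1]

# Rescale the relative columns of the CP table to absolute error rates
cpt <- fit$cptable
abs_tab <- cbind(cpt[, c("CP", "nsplit")],
                 resub_error = cpt[, "rel error"] * root_err,
                 cv_error    = cpt[, "xerror"]    * root_err,
                 cv_se       = cpt[, "xstd"]      * root_err)
abs_tab
```

Even after rescaling, these figures belong to the model selection step and are not a cross-validated error for the pruned tree that is finally chosen.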


[...]

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
