I have some questions about using the rpart library. With the data described below, plotcp shows 'xerror' increasing as 'cp' decreases, with a very large 'xstd' (plot attached). What causes this, or is it not a problem at all? I expected 'xerror' to decrease as 'cp' gets smaller. Also, what does 'xstd' really tell us? If the error bars for 'xerror' at different 'cp' values overlap, does that mean we get no significant improvement in the misclassification rate when we split the tree?
My data have two classes, with 138 observations and 129 attributes. Here is what I did:
dim(man.dat[,c(1,8:136)])
[1] 138 130
man.dt1 <- rpart(Target~.,data=man.dat[,c(1,8:136)], method='class',cp=1e-5, parms=list(split='information'))
plotcp(man.dt1)
printcp(man.dt1)
Classification tree:
rpart(formula = Target ~ ., data = man.dat[, c(1, 8:136)], method = "class",
    parms = list(split = "information"), cp = 1e-05)

Variables actually used in tree construction:
[1] CHX.V  CYN.Cu SPF.Bi

Root node error: 25/138 = 0.18116

n= 138

       CP nsplit rel error xerror    xstd
1 0.18667      0      1.00   1.00 0.18098
2 0.00001      3      0.44   1.12 0.18897
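
As a rough check on what 'xstd' measures, the sketch below recomputes it from the numbers printed above. It assumes (this is an assumption, not something stated in the output) that xerror times the root node error count is the number of cross-validated misclassifications, and that this count is treated as binomial out of n. The variable names are made up for illustration. The last lines apply the common one-standard-error comparison between the 3-split tree and the root.

```r
# Values copied from the printcp() output above.
n        <- 138            # number of observations
root_err <- 25             # misclassified at the root node (25/138)
xerror   <- c(1.00, 1.12)  # cross-validated error, relative to the root

# Assumption: xerror * root_err is the count of cross-validated
# misclassifications, treated as a binomial count out of n.
x_miss <- xerror * root_err
p_hat  <- x_miss / n

# Binomial standard deviation of the count, rescaled by the root error
# count so that it is on the same relative scale as xerror.
xstd <- sqrt(n * p_hat * (1 - p_hat)) / root_err
print(round(xstd, 5))

# One-standard-error comparison (a rule of thumb, not a formal test):
# is the 3-split tree better than the root by more than one xstd?
improves <- xerror[2] + xstd[2] < xerror[1]
print(improves)
```

If the recomputed values agree with the printed 'xstd' column, that suggests 'xstd' is simply the sampling uncertainty of the cross-validated error on the relative scale. With only 25 minority-class cases, an 'xstd' near 0.18 would then be expected, and overlapping error bars would indicate that cross-validation cannot distinguish the split tree from the root.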
I would appreciate your help on this,
Weidong
[Attachment: plotcp.pdf (Adobe PDF document)]