Hi, dear R community,
I am writing the following function to build one data set (tree.pred) and
one vector (valid.out) inside a loop. Later, I want to use the data set
produced by this loop to plot curves. I have tried return() and list(), but
I cannot access tree.pred or valid.out after the function has run.
auc.tree<- function(msplit,mbucket) {
tree.pred<-data.frame()
valid.out<-vector()
for(i in 1:10) {
cat('Fold ', i, '\n')
out.fold.c <-((i-1)*c.each.part +1):(i*c.each.part)
out.fold.n <-((i-1)*n.each.part +1):(i*n.each.part)
train.cv <- n.cc[-out.fold.c, c(2:401, 418)]
train.nv <- n.nn[-out.fold.n, c(2:401, 418)]
train.v<-rbind(train.cv, train.nv) # training data for feature selection
# grow tree
fit.dimer <- rpart(out ~ ., method="class", data=train.v)
at<-grep("<leaf>", fit.dimer$frame[, "var"], value=FALSE, ignore.case=TRUE)
varr<-as.character(unique(fit.dimer$frame[-at, "var"]))
train.cc <- n.cc[-out.fold.c, ]
valid.cc <- n.cc[out.fold.c, ]
train.nn <- n.nn[-out.fold.n,]
valid.nn <- n.nn[out.fold.n,]
train<-rbind(train.cc, train.nn) #training data
valid<-rbind(valid.cc, valid.nn) # validation data
# create a data set containing the following variables
myvar<-names(gh5_h) %in% c(varr, "num_cell","num_genes","position",
"acid_per", "base_per", "charge_per", "hydrophob_per", "polar_per", "out")
train<-train[myvar] # update training set
valid<-valid[myvar]
control<-rpart.control(xval=10, cp=0.01, minsplit=msplit, minbucket=mbucket) # control the size of the initial tree
tree.fit <- rpart(out ~ ., method="class", data=train,
control=control) # model fitting
p.tree<- prune(tree.fit,
cp=tree.fit$cptable[which.min(tree.fit$cptable[,"xerror"]),"CP"]) # prune the tree
#get the prediction for the valid data set.
tree.pred.r <-predict(p.tree, newdata=valid, type="prob")
valid.r<-valid$out
tree.pred <-rbind(tree.pred, tree.pred.r)
valid.out<-c(valid.out, valid.r)
cat('Dim of tree.pred', dim(tree.pred), 'length of valid.out',
length(valid.out), '\n' )
}
list(tree.pred)
list(valid.out)
cat('Minsplit ', msplit, 'Minbucket', mbucket, '\n')
cat('10-cross validation is done! \n')
}
> auc.tree(5, 5)
Fold 1
Dim of tree.pred 141 2 length of valid.out 141
Fold 2
Dim of tree.pred 282 2 length of valid.out 282
Fold 3
Dim of tree.pred 423 2 length of valid.out 423
Fold 4
Dim of tree.pred 564 2 length of valid.out 564
Fold 5
Dim of tree.pred 705 2 length of valid.out 705
Fold 6
Dim of tree.pred 846 2 length of valid.out 846
Fold 7
Dim of tree.pred 987 2 length of valid.out 987
Fold 8
Dim of tree.pred 1128 2 length of valid.out 1128
Fold 9
Dim of tree.pred 1269 2 length of valid.out 1269
Fold 10
Dim of tree.pred 1410 2 length of valid.out 1410
Minsplit 5 Minbucket 5
10-cross validation is done!
If I use return(), the result is printed on the screen, but I still cannot
use it for plotting. Can anyone help me with this? Thanks so much!
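For reference, an R function returns the value of its last evaluated expression (or of an explicit return()), so both objects can be handed back together as one named list. Calling the function without assigning its result only prints that result, whereas assigning it keeps the objects for later use. Below is a minimal sketch of that pattern, not a rewrite of auc.tree(): the function name cv.collect, its arguments, and the random numbers standing in for the rpart predictions are all hypothetical.

```r
# Hypothetical stand-in for auc.tree(): random numbers replace the rpart
# predictions so that the sketch runs without the original data.
cv.collect <- function(n.folds = 10, n.per.fold = 5) {
  pred.parts <- vector("list", n.folds)  # one data frame per fold
  out.parts  <- vector("list", n.folds)  # one outcome vector per fold
  for (i in seq_len(n.folds)) {
    p <- runif(n.per.fold)               # fake class probabilities
    pred.parts[[i]] <- data.frame(p0 = 1 - p, p1 = p)
    out.parts[[i]]  <- rbinom(n.per.fold, 1, 0.5)
  }
  # The last expression is the function's value: one named list with both.
  list(tree.pred = do.call(rbind, pred.parts),
       valid.out = unlist(out.parts))
}

set.seed(1)
res <- cv.collect()      # assign the result instead of letting it print
dim(res$tree.pred)       # 50 2
length(res$valid.out)    # 50
```

After the assignment, res$tree.pred and res$valid.out are ordinary objects in the workspace and can be passed to plot() or any other function.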
--
Sincerely,
Changbin
--
Changbin Du
DOE Joint Genome Institute
Bldg 400 Rm 457
2800 Mitchell Dr
Walnut Creek, CA 94598
Phone: 925-927-2856