Re: [R] caret: Error when using rpart and CV != LOOCV

2012-05-17 Thread Max Kuhn
Dominik, There are a number of formulations of this statistic (see the Kvålseth[*] reference below). I tend to think of R^2 as the proportion of variance explained by the model[**]. With the traditional formula, it is possible to get negative proportions (if there are extreme outliers in the
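
A minimal base-R sketch of two common formulations, using made-up data (the vectors `obs` and `pred` are illustrative and not taken from the thread). It suggests how the traditional formula, 1 - SSres/SStot, can turn negative when the predictions are badly wrong, while the squared-correlation form stays between 0 and 1.

```r
# Illustrative data: the predictions run in the opposite direction to the observations.
obs  <- c(1, 2, 3, 4, 5)
pred <- c(5, 4, 3, 2, 1)

# Traditional formulation: 1 - SSres / SStot
ss_res  <- sum((obs - pred)^2)
ss_tot  <- sum((obs - mean(obs))^2)
r2_trad <- 1 - ss_res / ss_tot

# Squared-correlation formulation
r2_cor <- cor(obs, pred)^2

print(c(traditional = r2_trad, squared_cor = r2_cor))
```

The two statistics can disagree sharply on the same data, so it matters which one a package reports.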

Re: [R] caret: Error when using rpart and CV != LOOCV

2012-05-17 Thread Dominik Bruhn
Hi Max, thanks again for the answer. I checked the caret implementation and you were right. If the predictions of the model are constant (i.e. sd(pred)==0), then the implementation returns NA for the rSquare (in postResample). This is mainly because the caret implementation uses `cor` (from the
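
The behaviour described above can be reproduced in base R without caret. This is only a sketch with made-up numbers: it mimics a squared-correlation R^2 via `cor` and is not caret's actual `postResample` code.

```r
obs  <- c(28.1, 31.4, 29.9, 33.0)  # made-up hold-out observations
pred <- rep(30.5, length(obs))     # made-up case: every prediction is identical

# The predictions have no spread at all.
pred_sd <- sd(pred)

# cor() cannot standardise a constant vector: it warns and returns NA,
# so the squared correlation is NA as well.
r2 <- suppressWarnings(cor(obs, pred)^2)

print(pred_sd)
print(is.na(r2))
```

Any summary that averages such per-resample values without removing the NA entries would itself come out as NA.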

Re: [R] caret: Error when using rpart and CV != LOOCV

2012-05-16 Thread Max Kuhn
More information is needed to be sure, but it is most likely that some of the resampled rpart models produce the same prediction for the hold-out samples (likely the result of no viable split being found). Almost every incarnation of R^2 requires the variance of the prediction. This particular

Re: [R] caret: Error when using rpart and CV != LOOCV

2012-05-16 Thread Dominik Bruhn
Thanks Max for your answer. First, I do not understand your post. Why is it a problem if two of the predictions match? From the formula for calculating R^2 I can see that there will be a division by zero iff the total sum of squares is 0. This is only true if the predictions of all the predicted points from

Re: [R] caret: Error when using rpart and CV != LOOCV

2012-05-16 Thread Dominik Bruhn
Sorry for the follow-up, but I dug deeper into the problem. My text on the R^2 was wrong: In my opinion, and at least according to Wikipedia, R^2 yields a division by zero iff SStot (the total sum of squares) is zero. SStot is the sum of the squared differences between the observed (not the predicted)
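
The point about SStot can be checked with a small base-R sketch. The helper `r2_traditional` is hypothetical (it is not part of caret) and the data are made up. It suggests that constant predictions alone do not break the traditional formula; only constant observations do.

```r
# Traditional R^2 = 1 - SSres / SStot (hypothetical helper, not caret code)
r2_traditional <- function(obs, pred) {
  ss_res <- sum((obs - pred)^2)
  ss_tot <- sum((obs - mean(obs))^2)
  1 - ss_res / ss_tot
}

# Constant predictions, varying observations: SStot > 0, so the result is finite.
r2_const_pred <- r2_traditional(obs = c(1, 2, 3), pred = c(2, 2, 2))

# Constant observations: SStot == 0, so the formula divides by zero.
# R does not raise an error here; the result is -Inf (or NaN for 0/0).
r2_const_obs <- r2_traditional(obs = c(3, 3, 3), pred = c(2, 3, 4))

print(c(const_pred = r2_const_pred, const_obs = r2_const_obs))
```

Note that R signals the division by zero through a non-finite value rather than an error, so a check such as `is.finite()` is needed to detect it.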

Re: [R] caret: Error when using rpart and CV != LOOCV

2012-05-16 Thread Max Kuhn
Dominik, See this line:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  30.37   30.37   30.37   30.37   30.37   30.37

The variance of the predictions is zero. caret computes R^2 from the correlation between the observed data and the predictions, which uses sd(pred) which

[R] caret: Error when using rpart and CV != LOOCV

2012-05-15 Thread Dominik Bruhn
Hi, I ran into the following problem when trying to build an rpart model with any resampling method other than LOOCV. Originally, I wanted to use k-fold partitioning, but every partitioning except LOOCV throws the following warning: Warning message: In nominalTrainWorkflow(dat = trainData, info = trainInfo,