> I have read that it is best to select the complexity parameter which minimises the cross-validated (x) error of the model, but elsewhere I have read that the optimum cp is the first value on the left above the '1+SE' line of the complexity parameter plot.
If you plot x = complexity vs y = cross-validated error, the plot will in theory decline sharply as you go from left to right, then hit a minimum, and then rise. In practice there is often a "flat" area around the minimum. The idea is that within the flat area, which model is the actual numeric minimum is pretty random, and one would like to pick the smallest model from among the "nearly tied" ones. The 1 SE rule formalizes this: take the smallest tree whose cross-validated error is within one standard error of the minimum.

Tree models are so very interpretable that one's tendency is to keep too many branches. Cross-validation, and in particular the 1 SE rule, prune more of the tree away than we'd like, often leaving nothing but a root node.

Terry T.
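
As a rough illustration of the 1 SE rule described above, here is a minimal sketch using the cptable stored in an rpart fit. The kyphosis data (shipped with rpart), the seed, and cp = 0.001 are assumptions chosen only to make the example self-contained; they are not from the original question. The cross-validated columns depend on the random folds, so the selected row can change with the seed.

```r
library(rpart)

# Illustrative choices (assumptions): example data, seed, and a small cp
# so that the tree is grown larger than we expect to keep.
set.seed(1)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class", cp = 0.001)

# One row per candidate subtree, ordered from smallest tree to largest.
# Columns: CP, nsplit, rel error, xerror, xstd.
tab <- fit$cptable

# Row with the smallest cross-validated error.
imin <- which.min(tab[, "xerror"])

# 1 SE threshold: minimum xerror plus its standard error.
thresh <- tab[imin, "xerror"] + tab[imin, "xstd"]

# Smallest tree (first row) whose xerror is at or below the threshold.
i1se <- min(which(tab[, "xerror"] <= thresh))

# Prune back to the cp value of that row.
pruned <- prune(fit, cp = tab[i1se, "CP"])

print(tab)
cat("min-xerror row:", imin, " 1 SE row:", i1se, "\n")
```

plotcp(fit) draws the same table as a plot, with a dotted horizontal line at the threshold. If i1se comes out as 1, the pruned tree is just the root node, which is the outcome mentioned above.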