On page 409 of "Applied Predictive Modeling" by Max Kuhn, it states that the gbm function can accomodate only two class problems when referring to the distribution parameter.
>From gbm help re: the distribution parameter: Currently available options are "gaussian" (squared error), "laplace" (absolute loss), "tdist" (t-distribution loss), "bernoulli" (logistic regression for 0-1 outcomes), "huberized" (huberized hinge loss for 0-1 outcomes), "multinomial" (classification when there are more than 2 classes), "adaboost" (the AdaBoost exponential loss for 0-1 outcomes), "poisson" (count outcomes), "coxph" (right censored observations), "quantile", or "pairwise" (ranking measure using the LambdaMart algorithm). I would have thought that huberized and multinomial would also be possible. Is that not so? In any case, how would anything different from bernoulli (the default) be specified when using the caret train function since distribution appears not to be among the list of parameters that caret recognises? > getModelInfo("gbm")[["gbm"]]$parameters parameter class label 1 n.trees numeric # Boosting Iterations 2 interaction.depth numeric Max Tree Depth 3 shrinkage numeric Shrinkage 4 n.minobsinnode numeric Min. Terminal Node Size Is that a limitation of the caret package? Or is there something I'm not getting? -- ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ___ Patrick Connolly {~._.~} Great minds discuss ideas _( Y )_ Average minds discuss events (:_~*~_:) Small minds discuss people (_)-(_) ..... Eleanor Roosevelt ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~. ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.