Re: [R] Random Forest, Variable Mismatch

Peter Langfelder Sat, 15 Feb 2014 09:16:05 -0800

On Sat, Feb 15, 2014 at 8:43 AM, Lorenzo Isella
<lorenzo.ise...@gmail.com> wrote:
> Dear All,
> I am a bit puzzled.
> I am developing a random forest model.
> The data is large and it involves hundreds of predictors, but the code I have
> written is relatively simple.
> After training my random forest model, I apply it to a new data set to
> carry out some predictions, as you can see below
>
>
> response_validation <- predict(rf,newdata=mydata,
>                                type="response")
>
> but I get this error message
>
> Error in predict.randomForest(rf, newdata = mydata, type = "response") :
>   variables in the training data missing in newdata


This error is thrown when predictor columns used in training cannot be
found by name in the new data. Make sure every column name in your
original data also appears, spelled identically, in the new data
'mydata'.
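
A quick way to find the offending columns is to compare the two sets of
names directly. This is only a sketch in base R: 'train_x' and the toy
columns below are made-up stand-ins for your training predictors, so
substitute your own data frames.

```r
# Hypothetical stand-ins for the training predictors and the new data.
train_x <- data.frame(age = c(23, 35, 41),
                      income = c(40, 52, 61),
                      score = c(0.1, 0.7, 0.4))
mydata  <- data.frame(age = c(30, 28),
                      Income = c(45, 50),
                      score = c(0.3, 0.9))

# Predictor names present in training but absent from the new data.
missing_in_new <- setdiff(names(train_x), names(mydata))
print(missing_in_new)

# Names present only in the new data often reveal a typo or a
# difference in case.
extra_in_new <- setdiff(names(mydata), names(train_x))
print(extra_in_new)
```

If 'missing_in_new' is not empty, those are the names that need fixing
in 'mydata'.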


>
> I am confused because I checked that there is no missing data in either my
> training or my test data sets and the data types of the columns of both
> the test and train data sets are perfectly identical.

Matching column types is not enough - the column names must match as well.
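
To illustrate with made-up data (the data frames and column names here
are hypothetical): a check on types alone can pass while the names still
differ. If you are sure the columns are in the same order, copying the
names over is one possible fix, though correcting them at the source is
safer.

```r
# Hypothetical data: same column types, one name differing in case.
train_x <- data.frame(age = c(23, 35), income = c(40, 52))
mydata  <- data.frame(age = c(30, 28), Income = c(45, 50))

# Column types agree position by position ...
types_match <- identical(unname(sapply(train_x, class)),
                         unname(sapply(mydata, class)))

# ... but the names do not.
names_match <- identical(names(train_x), names(mydata))

# Assumption: columns are known to be in the same order, so the
# training names can be copied onto the new data.
if (types_match && !names_match) {
  names(mydata) <- names(train_x)
}

# Every training name should now be found in the new data.
all_present <- all(names(train_x) %in% names(mydata))
```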

HTH,

Peter

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
