Hi,

I am a graduate student applying published R scripts to compare the 
classification accuracy of 2 predictive models, one built using discriminant 
function analysis and one using random forests (webpage link for these scripts 
is provided below).  The purpose of these models is to predict the biotic 
integrity of streams.  Specifically, I am trying to compare the classification 
accuracy (i.e., prediction of group membership) of both the DFA and RF models 
using k-fold cross-validation for the following metrics: area under the ROC 
curve (AUC), percent correctly classified, specificity, sensitivity, and Kappa. 
I would also like to obtain the F statistic, Wilks' lambda, and MSE or RMSE for 
the random forest models, as the script does not contain code to compute them.  
I think I need to use 
the caret package to obtain the classification accuracy, but I keep getting 
error messages when I apply the train function to my data.  As I am relatively 
new to R and my thesis committee is unable to help as they are also unfamiliar 
with R, I thought it best to ask for help.  Would someone be willing 
to help me?


Thanks,
Robin

http://www.epa.gov/wed/pages/models/rivpacs/rivpacs.htm


> TrainDataDFAgrps2 <-predcal
> TrainClassesDFAgrps2 <-grp.2;
> DFAgrps2Fit1 <- train(TrainDataDFAgrps2, TrainClassesDFAgrps2,
+  method = "lda",
+ tuneLength = 10,
+ trControl = trainControl(method = "cv"));
Error in train.default(TrainDataDFAgrps2, TrainClassesDFAgrps2, method = "lda",  :
  wrong model type for regression
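
A possible explanation for this error, offered as a sketch rather than a confirmed diagnosis: caret's train() decides between regression and classification from the class of the outcome. grp.2 prints as plain 1s and 2s, so it is probably numeric, which train() treats as a regression problem, and "lda" only supports classification. Converting the outcome to a factor is the usual remedy. The example below uses simulated stand-ins (predictors_demo, grp_demo and grp_factor are hypothetical names, not objects from the published scripts). If predcal really contains the non-numeric Reference_Test column shown in the clip, that column would probably also need to be dropped before fitting.

```r
# Hypothetical stand-ins: the real predcal and grp.2 are not reproduced here.
set.seed(42)
n <- 109
predictors_demo <- data.frame(
  ELEV_m    = rnorm(n, mean = 600,  sd = 100),
  Precip_mm = rnorm(n, mean = 1500, sd = 250),
  Temp_CX10 = rnorm(n, mean = 140,  sd = 10)
)
grp_demo <- sample(c(1, 2), n, replace = TRUE)  # numeric codes, like grp.2

# A numeric outcome makes train() assume regression, which "lda" does not
# support. A factor requests classification. The labels are valid R names so
# that class probabilities can be requested later if needed.
grp_factor <- factor(grp_demo, levels = c(1, 2), labels = c("grp1", "grp2"))

# Run the cross-validation only if caret and MASS are installed.
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("MASS", quietly = TRUE)) {
  fit_lda <- caret::train(
    x = predictors_demo, y = grp_factor,
    method = "lda",
    trControl = caret::trainControl(method = "cv", number = 10)
  )
  print(fit_lda$results)  # cross-validated Accuracy and Kappa
}
```

tuneLength is left out here because caret's "lda" method has no tuning parameters.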

> RFgrps2Fit1 <- train(TrainDataRFgrps2, TrainClassesRFgrps2,
+  method = "rf",
+ tuneLength = 10,
+ trControl = trainControl(method = "cv"));
There were 50 or more warnings (use warnings() to see the first 50)
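
The warnings cannot be diagnosed without the output of warnings(), but two causes seem plausible: a numeric outcome with only two distinct values (randomForest then fits a regression forest and warns about it), and tuneLength = 10 asking for more mtry values than there are predictors. The sketch below assumes a two-level factor outcome and limits mtry to the number of predictors. It also combines caret's twoClassSummary() and defaultSummary() so that each fold reports ROC AUC, sensitivity, specificity, accuracy (percent correctly classified) and Kappa. The data and the names predictors_demo, grp_factor and five_metric_summary are hypothetical.

```r
# Hypothetical stand-ins for the poster's predictors and group labels.
set.seed(42)
n <- 109
predictors_demo <- data.frame(
  ELEV_m    = rnorm(n, mean = 600,  sd = 100),
  Precip_mm = rnorm(n, mean = 1500, sd = 250),
  Temp_CX10 = rnorm(n, mean = 140,  sd = 10)
)
grp_factor <- factor(sample(c("grp1", "grp2"), n, replace = TRUE))

# Combine two built-in caret summaries: ROC, Sens, Spec from twoClassSummary()
# and Accuracy, Kappa from defaultSummary(). Sens refers to the first level.
five_metric_summary <- function(data, lev = NULL, model = NULL) {
  c(caret::twoClassSummary(data, lev, model),
    caret::defaultSummary(data, lev, model))
}

# pROC is listed because some caret versions use it to compute the AUC.
pkgs <- c("caret", "randomForest", "pROC")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  ctrl <- caret::trainControl(
    method = "cv", number = 10,
    classProbs = TRUE,                  # needed for ROC-based metrics
    summaryFunction = five_metric_summary
  )
  fit_rf <- caret::train(
    x = predictors_demo, y = grp_factor,
    method = "rf", metric = "ROC",
    tuneGrid = data.frame(mtry = seq_len(ncol(predictors_demo))),
    trControl = ctrl
  )
  print(fit_rf$results)  # one row of cross-validated metrics per mtry value
}
```

The same trainControl object can be passed to a train() call with method = "lda", so both models are scored on the same metrics. Wilks' lambda and the F statistic belong to discriminant analysis and have no direct random forest counterpart; a classification forest reports an out-of-bag error rate and confusion matrix rather than MSE.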

Excerpt of predcal (one row per element of grp.2; too much data to display in full):
> predcal
          Reference_Test HUC12_AREA_HA_log10 ELEV_m M_Slp_sqt Precip_mm Temp_CX10
2370                   R                 3.7  588.0       2.2      1751       148
559                    R                 4.0  643.1       1.8      1674       141
2062                   R                 4.0  643.1       1.8      1674       141
2467                   R                 4.0  643.1       1.8      1674       141
1176                   R                 3.9  694.3       2.4      1534       131
1840                   R                 3.9  694.3       2.4      1534       131
2052                   R                 3.9  694.3       2.4      1534       131
1174                   R                 4.1  605.0       2.1      1382       138
1841                   R                 4.1  605.0       2.1      1382       138
2051                   R                 4.1  605.0       2.1      1382       138
1831                   R                 4.1  363.9       1.7       937       156


grp.2:
> grp.2
 [1] 1 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 1 2 1 1
[45] 2 2 1 1 1 1 1 1 1 2 2 1 1 1 2 2 1 2 2 1 1 1 2 2 2 2 2 2 1 1 1 2 2 2 1 2 2 2 2 2 2 2 2 1
[89] 1 2 2 2 2 2 1 1 2 2 2 1 2 1 2 2 1 2 1 1 2








______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
