Hello,

I have trained an SVM on some training data and would like to use the model 
to predict a binary outcome for new data.

The input data frame contains several numeric and factor variables. Usually I 
construct the input matrix of the entities to be predicted with a Perl script 
that writes it to a file (since the data comes from different sources and some 
text processing is needed). This file is then read via read.table within R. I 
may want to perform prediction on many new cases or on a single new case.

There are now two problems:

1. If the constructed matrix for the cases to be predicted does not contain all 
the factor levels that were used to build the model (the factor levels found in 
the training set), the svm throws an error (Error in scale ...).

I've tried to convert the columns to factors with the full set of levels, but 
instead of getting the level labels I get the numeric codes:

> tmp <- sapply(11:15, function(i) factor(new.dat[,i], 
> levels=c('A','C','G','T')))
> tmp
      [,1] [,2] [,3] [,4] [,5]
 [1,]    3    4    4    2    2
 [2,]    4    2    2    1    1
 [3,]    2    1    1    1    1
 [4,]    1    1    1    1    1
 [5,]    1    1    2    1    3
 [6,]    2    1    3    4    3
 [7,]    3    4    3    3    1
 [8,]    3    3    1    4    1
 [9,]    1    4    1    1    4
[10,]    1    1    4    4    4

> new.dat[,14]
 [1] "C" "A" "A" "A" "A" "T" "G" "T" "A" "T"
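
One possible explanation, offered as a sketch rather than a confirmed diagnosis: sapply() simplifies its result to a matrix, and a matrix cannot carry factor columns, so the factor attributes are lost. Using lapply() and assigning the list back into the data frame keeps each column as a factor with the full level set. The data, the column names p1/p2 and the variable nuc.levels below are made up for illustration.

```r
# Made-up stand-in for the new cases: character columns of nucleotides.
new.dat <- data.frame(p1 = c("C", "A", "T"),
                      p2 = c("G", "G", "A"),
                      stringsAsFactors = FALSE)

nuc.levels <- c("A", "C", "G", "T")   # levels assumed to match the training set
cols <- c("p1", "p2")                 # hypothetical names of the factor columns

# lapply() returns a list, so every element stays a factor; assigning the
# list back replaces the selected columns of the data frame.
new.dat[cols] <- lapply(new.dat[cols], factor, levels = nuc.levels)

str(new.dat)          # both columns are now factors
levels(new.dat$p1)    # all four levels, including the unobserved "G"
```

With positional indices as in the session above, the same idea would read new.dat[11:15] <- lapply(new.dat[11:15], factor, levels = nuc.levels).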


2. When reading a data frame with the variables and factors for a single new 
case (one row), read.table always treats the variables as strings (variables 
and factors), and worse - one of the factors contains a level named 'T' that is 
replaced by TRUE during read.table. I've tried as.is = T and F, and the result 
for the single row data frame is the same (T is replaced by TRUE). I'm running 
R 2.1.0.
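
A minimal sketch of one way around the T/TRUE conversion, assuming the column layout is known in advance: read.table() guesses each column's type from its contents, and a lone "T" is a valid logical value; as.is only controls the character-to-factor step, so it does not prevent this. Declaring the types through colClasses skips the guess. The three-column layout and the inline text are invented for the example; in practice the file name written by the Perl script would be passed instead of text =.

```r
# One-row input, given as text so the example is self-contained.
one.row <- "0.5 T A"

# Without colClasses the column types are guessed, so "T" is read as logical.
guessed <- read.table(text = one.row)
class(guessed$V2)

# Declare the types up front: numeric for the measurement, character for the
# columns that will become factors.
fixed <- read.table(text = one.row,
                    colClasses = c("numeric", "character", "character"))

# Convert to factors with the levels assumed from the training set.
nuc.levels <- c("A", "C", "G", "T")
fixed$V2 <- factor(fixed$V2, levels = nuc.levels)
fixed$V3 <- factor(fixed$V3, levels = nuc.levels)

str(fixed)
```

Reading the columns as character first and converting afterwards, rather than using colClasses = "factor", is deliberate here: it lets the full level set be imposed even when the single row contains only one of the levels.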

Any suggestions how to read a data frame (with at least one row) and to treat 
factor columns as such, and how to adjust the factor levels before passing the 
data frame to predict.svm?

        thanks in advance,
        kind regards,

        Arne



