[R] Nominal variables in SVM?

Noah Silverman Wed, 12 Aug 2009 13:49:04 -0700

Hi,

The answers to my previous question about nominal variables have led me to a more important question.


What is the "best practice" way to feed a nominal variable to an SVM?

For example:
color = ("red", "blue", "green")

I could translate that into an index so I wind up with
color = (1, 2, 3)

But my concern is that the SVM will now treat the values as numeric points on a range (so that "green" is somehow three times "red") rather than as discrete conditions.
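To make the concern concrete, here is a minimal sketch in base R. The integer codes are the hypothetical index from the example, not anything an SVM package assigns. It only shows that the codes impose unequal distances between colors, which is what a distance- or kernel-based method would see.

```r
# Hypothetical integer index for the three colors (an assumption for
# illustration; the ordering is arbitrary).
codes <- c(red = 1, blue = 2, green = 3)

# Pairwise Euclidean distances between the coded values.
d <- as.matrix(dist(codes))

# "red" ends up twice as far from "green" as from "blue",
# although nothing about the colors justifies that ordering.
d["red", "blue"]
d["red", "green"]
```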

Another thought would be to create 3 binary variables from the single color variable, so I have:


red = (0,1)
blue = (0,1)
green = (0,1)

An example fed to the SVM would have one positive and two negative values to indicate the color value:

i.e. for a blue example:
red = 0, blue = 1, green = 0
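The three-binary-variable idea above can be sketched in base R with model.matrix(). This is only an illustration on made-up data; the variable names and the choice to build the columns by hand rather than rely on a particular SVM package are assumptions.

```r
# Hypothetical toy data; a factor is R's type for nominal variables.
color <- factor(c("red", "blue", "green", "blue"),
                levels = c("red", "blue", "green"))

# "- 1" drops the intercept so that every level gets its own 0/1 column.
dummies <- model.matrix(~ color - 1)

colnames(dummies)  # one column per color, prefixed with the variable name
dummies[2, ]       # the second observation is "blue"
```

Each row then contains exactly one 1, in the column of its color.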

Or, do any of the SVM packages intelligently handle this internally so that I don't have to mess with it? If so, do I need to be concerned about a different "translation" of the data if the test data set doesn't contain exactly the same values as the training set?

For example:
training data = color ("red", "blue", "green")
test data = color ("red", "green")

How would I be sure that the "red" and "green" examples get encoded the same way in both sets, so that the SVM is accurate?
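One way this mismatch could be avoided, sketched in base R under the assumption that the encoding is done by hand: rebuild the test factor with the levels taken from the training data, so both sets expand to the same columns. The object names are hypothetical.

```r
# Hypothetical training and test values from the example above.
train_color <- factor(c("red", "blue", "green"))
test_raw    <- c("red", "green")

# Reuse the training levels so "blue" still exists as a level in the
# test factor, even though no test row has that value.
test_color <- factor(test_raw, levels = levels(train_color))

# Expand both sets to 0/1 columns; unused levels become all-zero columns.
train_x <- model.matrix(~ color - 1, data.frame(color = train_color))
test_x  <- model.matrix(~ color - 1, data.frame(color = test_color))

identical(colnames(train_x), colnames(test_x))
```

Without the explicit levels argument, the test factor would have only two levels and "red" and "green" could land in different columns than in the training data.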


Thanks in advance!!

-N

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.