Hi Steve,

Thank you very much for your reply. Your code is more readable and clearer than mine.

Could you please help me with these questions?

1) The formula is an alternative to the y parameter in svm(). Is that correct?
2) I forgot to remove the class label from the dataset, and I also gave the program the class label in the formula parameter, yet the program still works! Could you please clarify this point for me?

Cheers,
Amy

> Date: Wed, 6 Jan 2010 18:44:13 -0500
> Subject: Re: [R] svm
> From: mailinglist.honey...@gmail.com
> To: amy_4_5...@hotmail.com
> CC: r-help@r-project.org
>
> Hi Amy,
>
> On Wed, Jan 6, 2010 at 4:33 PM, Amy Hessen <amy_4_5...@hotmail.com> wrote:
> > Hi Steve,
> >
> > Thank you very much for your reply.
> >
> > I'm trying to do something systematic/general in the program so that I can
> > try different datasets without changing much in the program (without knowing
> > the name of the class label, which differs from one dataset to
> > another).
> >
> > Could you please tell me your opinion about this code:
> >
> > library(e1071)
> >
> > mydata<-read.delim("the_whole_dataset.txt")
> >
> > class_label <- names(mydata)[1] # I'll always put the class label in the first column.
> >
> > myformula <- formula(paste(class_label,"~ ."))
> >
> > x <- subset(mydata, select = - mydata[, 1])
> >
> > mymodel<-(svm(myformula, x, cross=3))
> >
> > summary(mymodel)
> >
> > ################
>
> Since you're not doing anything funky with the formula, a preference
> of mine is to just skip this way of calling SVM and go "straight" to
> the svm(x,y,...) method:
>
> R> mydata <- as.matrix(read.delim("the_whole_dataset.txt"))
> R> train.x <- mydata[,-1]
> R> train.y <- mydata[,1]
>
> R> mymodel <- svm(train.x, train.y, cross=3, type="C-classification")
> ## or
> R> mymodel <- svm(train.x, train.y, cross=3, type="eps-regression")
>
> As an aside, I also like to be explicit about the type="" parameter to
> tell what I want my SVM to do (regression or classification). If it's
> not specified, the SVM picks which one to do based on whether or not
> your y vector is a factor (does classification), or not
> (does regression).
>
> > Do I have to do the same steps with the testing set? i.e. the testing set must not
> > contain the label too, but must have the same structure as the training set?
> > Is that correct?
>
> I guess you'll want to report your accuracy/MSE/something on your
> model for your testing set? Just load the data in the same way, then
> use `predict` to calculate the metric you're after. You'll have to have
> the labels for your data to do that, though, e.g.:
>
> testdata <- as.matrix(read.delim('testdata.txt'))
> test.x <- testdata[,-1]
> test.y <- testdata[,1]
> preds <- predict(mymodel, test.x)
>
> Let's assume you're doing classification, so let's report the accuracy:
>
> acc <- sum(preds == test.y) / length(test.y)
>
> Does that help?
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
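
On question 2, here is a minimal sketch in base R of how a formula of the form `label ~ .` is expanded. It assumes that svm()'s formula interface follows the usual R modelling convention of building a model frame from the data; the data frame `toy` and its columns `label`, `x1` and `x2` are made up purely for illustration.

```r
# Hypothetical toy data: class label in the first column, predictors after it.
toy <- data.frame(label = factor(c("a", "b", "a", "b")),
                  x1 = c(1.0, 2.0, 1.5, 2.5),
                  x2 = c(0.3, 0.1, 0.4, 0.2))

# Build the formula from the name of the first column, as in the quoted code.
class_label <- names(toy)[1]
myformula <- as.formula(paste(class_label, "~ ."))

# Expanding "." against the data gives every column except the response.
predictors <- attr(terms(myformula, data = toy), "term.labels")
print(predictors)

# The model frame still needs the label column in the data to supply the response.
mf <- model.frame(myformula, data = toy)
response <- model.response(mf)
print(response)
```

If that assumption holds, the label column has to stay in the data passed alongside the formula, since the left-hand side is looked up there, and the "." only stands for the remaining columns. That would explain why the program runs even though the label was not removed.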
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.