Pau,

Sorry for coming back to you on this again. I am getting confused about your interpretation of 3). It is obvious from your code that increasing C results in a *smaller* number of SVs, which seems to contradict your interpretation: "*Increasing the value of C (...) forces the creation of a more accurate model. A more accurate model is built by adding more SVs.*"

In addition, I have learned that the number of SVs grows as C decreases because many of them are bounded SVs (those with alpha = C; recall that 0 < alpha <= C), while SVs with alpha strictly smaller than C are called free SVs. Here is another question: is the complexity of the boundary determined by the total number of SVs (bounded + free), or by the free SVs only?
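For what it's worth, here is a quick check I put together (a sketch reusing your toy data, and assuming that in e1071 `model$coefs` holds y_i * alpha_i for each SV, so a coefficient whose absolute value equals the cost flags a bounded SV; for the linear kernel the weight vector can then be recovered as w = t(model$coefs) %*% model$SV and the margin width is 2/||w||):

```r
# Sketch (assumes the e1071 package is installed): for a range of costs,
# fit a linear SVM, split its SVs into bounded (alpha = C) and free
# (0 < alpha < C), and report the geometric margin width 2/||w||.
library(e1071)

# Toy data from Pau's earlier email: two classes in the plane.
m1 <- matrix(c(0, 0, 0, 1, 1, 2, 1, 2, 3, 2, 3, 3, 0, 1, 2, 3, 0, 1, 2, 3,
               1, 2, 3, 2, 3, 3, 0, 0, 0, 1, 1, 2, 4, 4, 4, 4, 0, 1, 2, 3,
               1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, -1, -1),
             ncol = 3)
df <- data.frame(X1 = m1[, 1], X2 = m1[, 2], Y = factor(m1[, 3]))

for (cost in c(1e-3, 1e-1, 1e+1, 1e+3)) {
  model <- svm(Y ~ ., data = df, type = "C-classification",
               kernel = "linear", cost = cost, scale = FALSE)
  # model$coefs holds y_i * alpha_i for each SV, so |coef| == cost
  # (up to numerical tolerance) identifies the bounded SVs.
  bounded <- sum(abs(model$coefs) > cost - 1e-8)
  free    <- nrow(model$SV) - bounded
  # Linear-kernel weight vector: w = sum_i alpha_i y_i x_i.
  w <- t(model$coefs) %*% model$SV
  cat(sprintf("cost = %8g  SVs = %2d (bounded %2d, free %2d)  2/||w|| = %.3f\n",
              cost, nrow(model$SV), bounded, free, 2 / sqrt(sum(w^2))))
}
```

If this accounting is right, a small C should show a wide margin with many bounded SVs (margin violators), while a large C should show a narrow margin carried by a few free SVs sitting exactly on it.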
Thanks a bunch,
-Jack

On Thu, Jul 15, 2010 at 4:17 AM, Pau Carrio Gaspar <paucar...@gmail.com> wrote:

> Hi Jack,
>
> To 1) and 2): they are telling you the same thing. I recommend that you
> read the first sections of the article; it is very well written and
> clear. There you will read about duality.
>
> To 3): I interpret the scatter plots this way: *Increasing the value of
> C (...) forces the creation of a more accurate model.* A more accurate
> model is built by adding more SVs (until we get a convex hull of the
> data).
>
> Hope it helps.
> Regards,
> Pau
>
> 2010/7/14 Jack Luo <jluo.rh...@gmail.com>
>
>> Pau,
>>
>> Thanks a lot for your email, I found it very helpful. Please see below
>> for my reply, thanks.
>>
>> -Jack
>>
>> On Wed, Jul 14, 2010 at 10:36 AM, Pau Carrio Gaspar
>> <paucar...@gmail.com> wrote:
>>
>>> Hello Jack,
>>>
>>> 1) Why did you think that "larger C is more prone to overfitting than
>>> smaller C"?
>>
>> *There is a statement at the link http://www.dtreg.com/svm.htm:
>>
>> "To allow some flexibility in separating the categories, SVM models
>> have a cost parameter, C, that controls the trade-off between allowing
>> training errors and forcing rigid margins. It creates a soft margin
>> that permits some misclassifications. Increasing the value of C
>> increases the cost of misclassifying points and forces the creation of
>> a more accurate model that may not generalize well."
>>
>> My understanding is that this means a larger C may not generalize well
>> (is prone to overfitting).*
>>
>>> 2) If you look at the formulation of the quadratic programming
>>> problem, you will see that C rules the error of the "cutting plane"
>>> (and overfitting). Therefore, for a high C you allow the "cutting
>>> plane" to cut the set worse, so the SVM needs fewer points to build
>>> it. A proper explanation is in Kristin P. Bennett and Colin Campbell,
>>> "Support Vector Machines: Hype or Hallelujah?", SIGKDD Explorations,
>>> 2, 2, 2000, 1-13.
>>> http://www.idi.ntnu.no/emner/it3704/lectures/papers/Bennett_2000_Support.pdf
>>
>> *Could you be more specific about this? I don't quite understand.*
>>
>>> 3) You might find these plots useful:
>>>
>>> library(e1071)
>>>
>>> m1 <- matrix( c(
>>>   0, 0, 0, 1, 1, 2, 1, 2, 3, 2, 3, 3, 0, 1, 2, 3, 0, 1, 2, 3,
>>>   1, 2, 3, 2, 3, 3, 0, 0, 0, 1, 1, 2, 4, 4, 4, 4, 0, 1, 2, 3,
>>>   1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1, -1, -1
>>> ), ncol = 3 )
>>>
>>> Y <- m1[, 3]
>>> X <- m1[, 1:2]
>>> df <- data.frame( X, Y )
>>>
>>> par( mfcol = c(4, 2) )
>>> for ( cost in c( 1e-3, 1e-2, 1e-1, 1e0, 1e+1, 1e+2, 1e+3 ) ) {
>>>   # Fit a linear soft-margin SVM at this cost and mark its SVs in red.
>>>   model.svm <- svm( Y ~ ., data = df, type = "C-classification",
>>>                     kernel = "linear", cost = cost, scale = FALSE )
>>>   plot( x = 0, ylim = c(0, 5), xlim = c(0, 3),
>>>         main = paste( "cost:", cost, "#SV:", nrow(model.svm$SV) ) )
>>>   points( m1[ m1[, 3] > 0, 1 ], m1[ m1[, 3] > 0, 2 ], pch = 3, col = "green" )
>>>   points( m1[ m1[, 3] < 0, 1 ], m1[ m1[, 3] < 0, 2 ], pch = 4, col = "blue" )
>>>   points( model.svm$SV[, 1 ], model.svm$SV[, 2 ], pch = 18, col = "red" )
>>> }
>>
>> *Thanks a lot for the code, I really appreciate it. I have run it, but
>> I am not sure how I should interpret the scatter plots, although it is
>> obvious that the number of SVs decreases as the cost increases.*
>>
>>> Regards,
>>> Pau
>>>
>>> 2010/7/14 Jack Luo <jluo.rh...@gmail.com>
>>>
>>>> Hi,
>>>>
>>>> I have a question about the parameter C (cost) in the svm function in
>>>> e1071. I thought a larger C was more prone to overfitting than a
>>>> smaller C, and hence would lead to more support vectors. However,
>>>> using the Wisconsin breast cancer example at the link
>>>> http://planatscher.net/svmtut/svmtut.html
>>>> I found that the largest cost has the fewest support vectors, which
>>>> is contrary to what I expected. Please see the scripts below. Am I
>>>> misunderstanding something here?
>>>>
>>>> Thanks a bunch,
>>>>
>>>> -Jack
>>>>
>>>> > model1 <- svm(databctrain, classesbctrain, kernel = "linear", cost = 0.01)
>>>> > model2 <- svm(databctrain, classesbctrain, kernel = "linear", cost = 1)
>>>> > model3 <- svm(databctrain, classesbctrain, kernel = "linear", cost = 100)
>>>> > model1
>>>>
>>>> Call:
>>>> svm.default(x = databctrain, y = classesbctrain, kernel = "linear",
>>>>     cost = 0.01)
>>>>
>>>> Parameters:
>>>>    SVM-Type:  C-classification
>>>>  SVM-Kernel:  linear
>>>>        cost:  0.01
>>>>       gamma:  0.1111111
>>>>
>>>> Number of Support Vectors:  99
>>>>
>>>> > model2
>>>>
>>>> Call:
>>>> svm.default(x = databctrain, y = classesbctrain, kernel = "linear",
>>>>     cost = 1)
>>>>
>>>> Parameters:
>>>>    SVM-Type:  C-classification
>>>>  SVM-Kernel:  linear
>>>>        cost:  1
>>>>       gamma:  0.1111111
>>>>
>>>> Number of Support Vectors:  46
>>>>
>>>> > model3
>>>>
>>>> Call:
>>>> svm.default(x = databctrain, y = classesbctrain, kernel = "linear",
>>>>     cost = 100)
>>>>
>>>> Parameters:
>>>>    SVM-Type:  C-classification
>>>>  SVM-Kernel:  linear
>>>>        cost:  100
>>>>       gamma:  0.1111111
>>>>
>>>> Number of Support Vectors:  44
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.