Thanks a lot, Andy.
I read that paper and followed the instructions, but ran into a lot of
peculiarities:
1. When I use the "tune" function for "svm", the error as a function of
"cost" has several local minima rather than a single global one. So I don't
know which minimum to follow in order to refine my search grid and do a more
detailed search in a smaller, more focused range. Please see below.
- Detailed performance results:
           cost      error
1  0.0004882813 0.05065909
2  0.0005608879 0.05122727
3  0.0006442910 0.04895130
4  0.0007400960 0.04725000
5  0.0008501470 0.04497078
6  0.0009765625 0.04497078
7  0.0011217757 0.04497078
8  0.0012885819 0.04440260
9  0.0014801920 0.04155844
10 0.0017002941 0.03985065
11 0.0019531250 0.04099675
12 0.0022435515 0.04327273
13 0.0025771639 0.04099675
14 0.0029603839 0.03929221
15 0.0034005881 0.03986039
16 0.0039062500 0.04157143
17 0.0044871029 0.04099675
18 0.0051543278 0.04042857
19 0.0059207678 0.03871753
20 0.0068011763 0.03871429
21 0.0078125000 0.03985065
22 0.0089742059 0.04042532
23 0.0103086556 0.04042532
24 0.0118415357 0.04099675
25 0.0136023526 0.04042532
26 0.0156250000 0.04440260
2. I first tried 2^(-15:15) and found the best "cost" to be around 2^(-8).
Then I reduced the range and ran "tune" on cost values 2^(-11:-6), which
returned a best "cost" of 2^(-9) instead of 2^(-8). Then I ran it on
2^seq(-11, -6, by = 0.2), and the best "cost" was 2^(-7.2). Each time the
best "cost" is at a different value, and with the many local optima shown
above I don't know what range I should focus on for the next step...
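One way to read the table in item 1, sketched in base R: the lowest error
(row 20) is barely distinguishable from several of its neighbours, so it may
be more informative to list every "cost" whose error lies within a small
tolerance of the minimum than to chase a single value. The tolerance of 0.002
and the names "perf", "tol" and "near_best" are assumptions made for
illustration only; the errors are copied from the table.

```r
# Cost grid as used in the tune() call: 2^seq(-11, -6, by = 0.2), 26 values.
cost <- 2^seq(-11, -6, by = 0.2)

# Cross-validation errors copied from the "Detailed performance results" table.
error <- c(0.05065909, 0.05122727, 0.04895130, 0.04725000, 0.04497078,
           0.04497078, 0.04497078, 0.04440260, 0.04155844, 0.03985065,
           0.04099675, 0.04327273, 0.04099675, 0.03929221, 0.03986039,
           0.04157143, 0.04099675, 0.04042857, 0.03871753, 0.03871429,
           0.03985065, 0.04042532, 0.04042532, 0.04099675, 0.04042532,
           0.04440260)
perf <- data.frame(cost = cost, error = error)

# Row with the single lowest error.
best <- which.min(perf$error)

# All rows whose error is within an (assumed, arbitrary) tolerance of the minimum.
tol <- 0.002
near_best <- perf[perf$error <= perf$error[best] + tol, ]

print(perf[best, ])
print(near_best)
# Span of the near-optimal region on the log2 scale used for the grid.
print(log2(range(near_best$cost)))
```

If many rows qualify, that suggests a flat region rather than a set of
distinct optima, and differences inside it may be mostly cross-validation
noise.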
The code I've used is as below:
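    library(e1071)  # provides svm(), tune() and tune.control()

    # 10-fold cross-validation (the tune.control default) over a fine cost grid
    obj <- tune(svm, x, y,
                ranges = list(cost = 2^seq(-11, -6, by = 0.2)),
                tunecontrol = tune.control(sampling = "cross"),
                kernel = "linear")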
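A possible variation, offered only as a sketch: because each call to "tune"
draws a new random cross-validation partition, the best "cost" can move from
run to run. Repeating the call several times and averaging the error per
"cost" value may give a steadier picture. The iris data, the coarser grid,
the seed and the names "n_rep", "errs" and "avg" below are stand-ins chosen
for illustration; they are not the x and y used above.

```r
library(e1071)

# Stand-in data (assumption): iris instead of the original x and y.
data(iris)
x <- as.matrix(iris[, 1:4])
y <- iris$Species

# Coarser grid than in the original call, to keep the sketch quick.
costs <- 2^seq(-11, -6, by = 1)

set.seed(1)   # arbitrary seed so the sketch is reproducible
n_rep <- 5    # number of independent tune() runs (assumed value)

# One column of cross-validation errors per run; rows follow the order of 'costs'.
errs <- sapply(seq_len(n_rep), function(i) {
  obj <- tune(svm, x, y,
              ranges = list(cost = costs),
              tunecontrol = tune.control(sampling = "cross"),
              kernel = "linear")
  obj$performances$error
})

# Mean and spread of the error across runs for each cost value.
avg <- data.frame(cost = costs,
                  mean_error = rowMeans(errs),
                  sd_error = apply(errs, 1, sd))
print(avg)
```

Comparing "mean_error" differences against "sd_error" gives a rough sense of
whether two "cost" values can be told apart at all.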
------------------------
What can I do now?
Thanks a lot!
On 2/28/06, Liaw, Andy <[EMAIL PROTECTED]> wrote:
>
> You might find
> http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf
> helpful.
>
>
> Parameter tuning is essential for avoiding overfitting.
>
> Andy
>
> -----Original Message-----
> *From:* Michael [mailto:[EMAIL PROTECTED]
> *Sent:* Tuesday, February 28, 2006 3:30 PM
> *To:* Liaw, Andy
> *Cc:* [email protected]
> *Subject:* Re: [R] does svm have a CV to obtain the best "cost" parameter?
>
> Hi Andy,
>
> Thanks a lot for your answer! So what do I do if the model overfits?
>
> Thanks a lot!
>
> On 2/28/06, Liaw, Andy < [EMAIL PROTECTED]> wrote:
> >
> > From: Michael
> > >
> > > Hi all,
> > >
> > > I am using the "svm" command in the e1071 package.
> > >
> > > Does it have an automatic way of setting the "cost" parameter?
> >
> > See ?best.svm in that package.
> >
> > > I changed a few values for the "cost" parameter but I hope there is a
> > > systematic way of obtaining the best "cost" value.
> > >
> > > I noticed that there is a "cross" (Cross validation)
> > > parameter in the "svm"
> > > function.
> > >
> > > But I did not see how it can be used to optimize the "cost" parameter.
> >
> > >
> > > By the way, what does a 0 training error and a high testing
> > > error mean?
> > > Varying "cross=5", or "cross=10", etc. does not change the
> > > training error
> > > and testing error at all. How to improve?
> >
> > Overfitting, which varying the validation method will not solve.
> >
> > Andy
> >
> > > Thanks a lot!
> > >
> > > M.
> > >
> > >
> > > ______________________________________________
> > > [email protected] mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > > http://www.R-project.org/posting-guide.html
> > >
> > >
> >
> >
> >
> > ------------------------------------------------------------------------------
> > Notice: This e-mail message, together with any attachments, contains
> > information of Merck & Co., Inc. (One Merck Drive, Whitehouse Station, New
> > Jersey, USA 08889), and/or its affiliates (which may be known outside the
> > United States as Merck Frosst, Merck Sharp & Dohme or MSD and in Japan, as
> > Banyu) that may be confidential, proprietary copyrighted and/or legally
> > privileged. It is intended solely for the use of the individual or entity
> > named on this message. If you are not the intended recipient, and have
> > received this message in error, please notify us immediately by reply e-mail
> > and then delete it from your system.
> >
> > ------------------------------------------------------------------------------
> >
>
>