Dear All,
I'm using the scikit-learn SVM (the one derived from LibSVM, I think). I have 
a dataset and a training set; the training set is taken from within the 
dataset and amounts to roughly 10% of it.
Before training my SVM, it is suggested to scale the data so that it has zero 
mean and unit variance.
There are two options (sketched in code below):
- scale the training set, train the SVM, scale the whole dataset, then classify the dataset;
- scale the whole dataset, take the training set from it, train the SVM, then classify the dataset.
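Roughly, in code (X, y_train and train_idx are just illustrative names for the 
full dataset, the training labels and the training-row indices):

    from sklearn import svm
    from sklearn.preprocessing import scale

    # Option 1: scale the training set and the whole dataset independently
    X_train = scale(X[train_idx])          # zero mean / unit variance computed on the subset alone
    clf = svm.SVC().fit(X_train, y_train)
    predictions = clf.predict(scale(X))    # whole dataset rescaled with its own statistics

    # Option 2: scale the whole dataset first, then take the training set from it
    X_all = scale(X)
    clf = svm.SVC().fit(X_all[train_idx], y_train)
    predictions = clf.predict(X_all)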
The second seems more logical to me than the first, but it turns out I get 
much better results with the first option than with the second! 
Is this normal? It's probably a dumb question, but I don't have much 
experience with this.
To scale the data, I use sklearn.preprocessing.scale(MyData).
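For completeness, a third variant I could also test would be to fit the scaler 
on the training set only and reuse its statistics on the whole dataset (a rough 
sketch with StandardScaler, same illustrative names as above):

    from sklearn import svm
    from sklearn.preprocessing import StandardScaler

    scaler = StandardScaler().fit(X[train_idx])       # mean/std estimated from the training rows only
    clf = svm.SVC().fit(scaler.transform(X[train_idx]), y_train)
    predictions = clf.predict(scaler.transform(X))    # the same statistics applied to everything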
Any suggestion, or test that I could try, is really welcome!
Thanks,
Solimyr