I am currently playing with the RandomTreesEmbedding in scikit-learn on the
MNIST data. If I train the RandomTreesEmbedding on a random matrix of shape
(1000, 784), use it to transform the MNIST data, and then feed the result into
a linear SVM, the SVM training time is about 37 s and the accuracy is about 93%.
On the other hand, if I train the RandomTreesEmbedding on the first 1000
samples of the dataset and then feed the transformed data into the linear SVM,
the training time drops to about 3 s and the accuracy rises to about 97%.
Both transformed datasets have the same level of sparsity. The increase in
accuracy might not be surprising, but I wonder why the training time drops so
drastically. Any ideas?
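For reference, a minimal sketch of the two-stage setup described above, using
scikit-learn's RandomTreesEmbedding and LinearSVC. The array sizes and
n_estimators=10 are illustrative assumptions, and small random arrays stand in
for MNIST so the snippet is self-contained; this is not the original
experiment's configuration.

```python
import numpy as np
from sklearn.ensemble import RandomTreesEmbedding
from sklearn.svm import LinearSVC

rng = np.random.RandomState(0)
X_fit = rng.rand(1000, 784)            # random matrix used only to fit the embedding
X_data = rng.rand(200, 784)            # stand-in for the actual MNIST samples
y_data = rng.randint(0, 10, size=200)  # stand-in labels

# Stage 1: fit the unsupervised embedding, then transform the data into a
# sparse one-hot encoding of the leaf each sample lands in, one per tree.
embedder = RandomTreesEmbedding(n_estimators=10, random_state=0)
embedder.fit(X_fit)
X_sparse = embedder.transform(X_data)  # scipy CSR matrix of leaf indicators

# Stage 2: train a linear SVM on the sparse leaf indicators.
clf = LinearSVC(random_state=0)
clf.fit(X_sparse, y_data)

# Each sample activates exactly one leaf per tree, so nnz = n_samples * n_trees.
print(X_sparse.shape, X_sparse.nnz)
```

Because each row has exactly n_estimators non-zero entries regardless of what
the embedding was fitted on, the sparsity level alone does not distinguish the
two setups.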
--
Caleb
On Sunday, March 2, 2014 11:34 PM, Olivier Grisel <olivier.gri...@ensta.org>
wrote:
Inverting 0 and 1 does not change the problem mathematically, but it does
change the number of multiply-add operations performed when computing the dot
products or Euclidean norms involved in building the columns of the kernel
matrix, whenever a sparse representation of the input data is used.
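This can be illustrated with a toy binary matrix (the shape and density below
are made-up for illustration): flipping 0s and 1s leaves the learning problem
equivalent, but it drastically changes the number of stored entries (nnz) in a
CSR representation, and the cost of sparse dot products scales with nnz.

```python
import numpy as np
from scipy import sparse

rng = np.random.RandomState(0)
# A mostly-zero binary matrix: only ~5% of entries are 1.
dense = (rng.rand(100, 50) < 0.05).astype(np.float64)
A = sparse.csr_matrix(dense)

# Inverting 0 and 1 is mathematically equivalent for the problem, but now
# almost every entry is non-zero and must be stored and multiplied explicitly.
A_inv = sparse.csr_matrix(1.0 - dense)

print(A.nnz, A_inv.nnz)  # sparse arithmetic cost is proportional to nnz
```

Since every entry is non-zero in exactly one of the two matrices, the two nnz
counts always sum to the total number of entries.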
--
Olivier
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general