Hi Olivier,
Thanks for your advice. Maybe I should rephrase my question. The basic
situation is shown below.
          T1
       |--------> LinearSVC (longer training time, about 30s)
data --|
       |  T2
       |--------> LinearSVC (significantly shorter training time, about 3s)
I transform my data with two different transformations, T1 and T2, and then
feed each result into a LinearSVC. What I found is that the classifier trains
significantly faster on the data transformed with T2.
Since both transformed datasets have the same number of instances, we are
looking at factors other than the number of instances that affect the training
time. So my question is basically: what are the properties of a dataset that
make the LinearSVC training time shorter?
In my case, T1 is a RandomForestEmbedding trained on random data and T2 is a
RandomForestEmbedding trained on the original data. Since T1(data) and
T2(data) are both sparse matrices of similar size, I am wondering what other
factors make the training time significantly shorter.
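
To make the setup concrete, here is a minimal sketch of the comparison. It is
not my actual code: the dataset is synthetic (make_classification), the sizes
and parameters are made up for illustration, and I am assuming the transformer
is sklearn.ensemble.RandomTreesEmbedding. The helper names describe and
fit_and_time are hypothetical. It prints a few sparsity statistics for each
transformed matrix next to the LinearSVC fit time, so the two cases can be
compared side by side.

```python
import time

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomTreesEmbedding
from sklearn.svm import LinearSVC


def describe(X):
    """Return basic shape and sparsity statistics for a scipy sparse matrix."""
    n_rows, n_cols = X.shape
    return {
        "shape": (n_rows, n_cols),
        "nnz": int(X.nnz),
        "density": X.nnz / (n_rows * n_cols),
        # Number of columns that contain at least one non-zero entry.
        "used_columns": int(np.unique(X.nonzero()[1]).size),
    }


def fit_and_time(X, y):
    """Fit a LinearSVC and return (elapsed seconds, iteration count or None)."""
    clf = LinearSVC(max_iter=5000, random_state=0)
    start = time.perf_counter()
    clf.fit(X, y)
    elapsed = time.perf_counter() - start
    # n_iter_ is not available in older scikit-learn releases, hence getattr.
    return elapsed, getattr(clf, "n_iter_", None)


# Synthetic stand-in for the original data (assumption, not the real dataset).
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Random data with the same shape and value range, used only to fit T1.
rng = np.random.RandomState(0)
X_random = rng.uniform(X.min(), X.max(), size=X.shape)

t1 = RandomTreesEmbedding(n_estimators=10, random_state=0).fit(X_random)
t2 = RandomTreesEmbedding(n_estimators=10, random_state=0).fit(X)

results = {}
for name, embedding in (("T1", t1), ("T2", t2)):
    Z = embedding.transform(X)  # sparse one-hot encoding of the leaf indices
    stats = describe(Z)
    stats["seconds"], stats["n_iter"] = fit_and_time(Z, y)
    results[name] = stats
    print(name, stats)
```

On a toy problem this small the timings are dominated by noise, so the sketch
only shows which quantities I am comparing, not the 30s versus 3s gap itself.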
---
Caleb
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general