On 09/13/2012 09:27 PM, Lars Buitinck wrote:
> 2012/9/13 Dimitrios Pritsos<[email protected]>:
>> There is a great difference in the performance of the SVM.fit() method
>> (OneClassSVM in particular) depending on the input. When the input is a
>> sparse matrix, training is extremely slow even for a very small amount of
>> data, i.e. a 180x1000 matrix where 1000 is the number of features and 180
>> is the number of samples. On the other hand, an array input of the same
>> size is quite fast, even faster than the Libsvm Python API as I recall.
>>
>> Is that normal, or have I encountered some sort of bug?
> No, that's not normal. Good you tell us...
>
> 1. How slow is slow?

More than 10 minutes to train each class. In some cases it ran for more
than 30 minutes, so I stopped it. With dense arrays, the same matrix needs
only a few seconds per class.

> 2. Is it equally slow when you use the deprecated
> sklearn.svm.sparse.OneClassSVM?

I don't know. I can test it and let you know once the test that is
currently running has finished.

> 3. How sparse is your data? I.e., how many zeros are there in X?
>

It is normally very sparse: maybe 30% or fewer of the variables are
non-zero, because the data derive from documents (web pages in particular)
with a corpus vocabulary of 1000 to 120000 terms/tokens.
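
The dense-versus-sparse comparison discussed above could be timed with a
minimal sketch like the one below. The 180x1000 shape and the roughly 30%
density are taken from the figures in this thread; the synthetic random
data, the seed, the kernel and nu settings, and the helper name time_fit
are assumptions made only for illustration and do not reproduce the actual
corpus or the original settings.

```python
import time

import numpy as np
import scipy.sparse as sp
from sklearn.svm import OneClassSVM


def time_fit(X):
    """Fit a OneClassSVM on X and return (elapsed seconds, fitted model).

    time_fit is a hypothetical helper; kernel and nu are assumed values.
    """
    clf = OneClassSVM(kernel="rbf", nu=0.1)
    start = time.perf_counter()
    clf.fit(X)
    return time.perf_counter() - start, clf


rng = np.random.RandomState(0)

# Synthetic stand-in for the real data: 180 samples, 1000 features,
# about 30% of the entries non-zero, stored in CSR format.
X_sparse = sp.random(180, 1000, density=0.3, format="csr", random_state=rng)
# The same values as a dense NumPy array.
X_dense = X_sparse.toarray()

dense_seconds, dense_clf = time_fit(X_dense)
sparse_seconds, sparse_clf = time_fit(X_sparse)

print("dense fit:  %.3f s" % dense_seconds)
print("sparse fit: %.3f s" % sparse_seconds)
print("fraction of non-zeros: %.2f" % (X_sparse.nnz / float(X_dense.size)))
```

Running the same script against the real matrix (in place of the random
one) would show whether the slowdown depends on the data or only on the
input type.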

Regards
