2012/1/23 Lars Buitinck <[email protected]>:
> 2012/1/23 Dimitrios Pritsos <[email protected]>:
>> I will give it a try; however, in some of my tests I had a memory management
>> problem. As I recall, it was mostly because of NumPy functions that
>> might ask PyTables to load everything into main memory. I guess some
>> loops and some slicing might solve the problem.
>
> No experience with PyTables, sorry.
>
>> However, I will first try to figure out how to use linear_model.SGDClassifier,
>> which is supposed to be capable of being trained in stages. Plus, since I am
>> using a linear kernel, it won't affect my results.
>
> Is that an SVC(kernel="linear") or a LinearSVC? The latter should be
> able to handle a 50k samples array if the number of features is kept
> within some bound (a few 100k should certainly be fine).

Indeed, SVC will not scale to 50k samples; only LinearSVC will. In any
case, I found SGDClassifier (with the fit method) to be much faster
than LinearSVC or LogisticRegression (i.e. any liblinear-based
model). Discrete naive Bayes models are sometimes even faster.
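
For the "trained in stages" use case mentioned above, here is a minimal
sketch, assuming a scikit-learn version in which SGDClassifier exposes
partial_fit. The helper name train_in_chunks, the chunk size and the
synthetic data are made up for illustration; in practice each chunk
would be a slice read from disk (e.g. from PyTables) so that only one
chunk is held in memory at a time.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier


def train_in_chunks(X, y, chunk_size=1000, random_state=0):
    """Fit a linear model incrementally, one slice of rows at a time.

    Hypothetical helper: X only needs to support row slicing.
    """
    clf = SGDClassifier(loss="hinge", random_state=random_state)
    # partial_fit must be told the full set of labels on the first call,
    # because a single chunk may not contain every class.
    classes = np.unique(y)
    for start in range(0, X.shape[0], chunk_size):
        stop = start + chunk_size
        clf.partial_fit(X[start:stop], y[start:stop], classes=classes)
    return clf


# Synthetic, linearly separable data standing in for the real dataset.
rng = np.random.RandomState(0)
X = rng.randn(5000, 20)
y = (X[:, 0] + X[:, 1] > 0).astype(int)

clf = train_in_chunks(X, y)
print("training accuracy:", clf.score(X, y))
```

Note that this makes a single pass over the data; several passes (with
the chunks visited in a different order each time) may be needed to get
results comparable to calling fit on the whole array.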

Dimitrios: also, if you are trying to work with scipy.sparse CSR
matrices, be careful to read the docstring of the classifier: the
supported input formats are changing quite a bit in the current master.
We are trying to merge all classifier implementations so that they
accept both dense numpy arrays and sparse CSR matrices as input, but
this is still a work in progress. Sometimes the classifier that supports
the sparse variant is kept separate in a `.sparse` subpackage.
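
As a rough illustration of the dense/sparse point, the sketch below
assumes a scikit-learn version where the same SGDClassifier class
already accepts both input types (which, as said above, depends on the
version you have installed, so check the docstring). The data is
synthetic and the variable names are only for illustration.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.linear_model import SGDClassifier

# Build a mostly-zero dense array, then a CSR copy of the same data.
rng = np.random.RandomState(0)
X_dense = rng.randn(200, 10)
X_dense[np.abs(X_dense) < 1.0] = 0.0
y = (X_dense[:, 0] > 0).astype(int)

# Other scipy.sparse formats can be converted with their .tocsr() method.
X_csr = sp.csr_matrix(X_dense)

# Same estimator, same parameters, two input formats.
clf_dense = SGDClassifier(random_state=0).fit(X_dense, y)
clf_sparse = SGDClassifier(random_state=0).fit(X_csr, y)

# Fraction of samples on which the two fitted models agree.
agreement = np.mean(clf_dense.predict(X_dense) == clf_sparse.predict(X_csr))
print("agreement between dense and sparse fits:", agreement)
```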

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
