On 01/23/2012 09:11 PM, Olivier Grisel wrote:
> 2012/1/23 Dimitrios Pritsos<[email protected]>:
>> However, when I do the same test using partial_fit() for the same
>> sub-set of my Data Set (see above) I am getting ~20%.
>>
>> Any Suggestions?
> Do a grid search to find the best alpha on SGDClassifier (and on C for
> the LinearSVC classifier). For instance:
>
>>>> from sklearn.grid_search import GridSearchCV
>>>> from sklearn.linear_model import SGDClassifier
>>>> from sklearn.datasets import fetch_20newsgroups_vectorized
>>>> twenty = fetch_20newsgroups_vectorized()
>>>> param_grid = {'alpha': [1e-3, 1e-4, 1e-5]}
>>>> gs = GridSearchCV(SGDClassifier(), param_grid).fit(twenty.data, twenty.target)
>>>> gs.best_estimator_
> SGDClassifier(alpha=0.0001, class_weight=None, eta0=0.0, fit_intercept=True,
>         learning_rate='optimal', loss='hinge', n_iter=5, n_jobs=1,
>         penalty='l2', power_t=0.5, rho=0.85, seed=0, shuffle=False,
>         verbose=0, warm_start=False)
>>>> gs.best_score_
> 0.8575220898001239
>
> You can also include 'n_iter': [5, 10, 50] and 'class_weight':
> ['auto', None] in the param_grid but beware of the combinatorial
> explosion in computation time.
>
> Don't worry about partial_fit as your data will fit in memory with the
> CSR format.
>

Thank you very much for the advice. I will try this too (today!).
However, it seems that I might need to use partial_fit() in the near
future, after I collect/crawl a new corpus.
So my question is: was my result (20%) due to some sort of bug in
partial_fit(), or to my incorrect use of this function?
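
To make the question concrete, here is a minimal sketch of the calling
pattern I understand partial_fit() to expect. It is an illustration only,
not the code that produced the 20% figure: the synthetic data from
make_classification, the batch size, and the number of passes are all
assumptions chosen for the example. The two points it tries to show are
that the full list of classes is declared on the first call, and that a
single call to partial_fit() performs only one pass over the samples it is
given, so several shuffled passes may be needed to get close to what fit()
achieves.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

# Synthetic stand-in for the real corpus (assumption: 4 well-separated classes).
X, y = make_classification(n_samples=2000, n_features=50, n_informative=20,
                           n_classes=4, n_clusters_per_class=1,
                           class_sep=2.0, random_state=0)
X_train, y_train = X[:1500], y[:1500]
X_test, y_test = X[1500:], y[1500:]

# partial_fit() cannot discover labels it has not seen yet, so the
# complete set of classes is declared up front.
classes = np.unique(y)
clf = SGDClassifier(alpha=1e-4, random_state=0)

rng = np.random.RandomState(0)
batch_size = 100   # hypothetical chunk size
n_epochs = 5       # hypothetical number of passes over the training data

for epoch in range(n_epochs):
    # Reshuffle on every pass so that no batch is dominated by one class.
    order = rng.permutation(len(y_train))
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        clf.partial_fit(X_train[idx], y_train[idx], classes=classes)

accuracy = clf.score(X_test, y_test)
print("held-out accuracy: %.3f" % accuracy)
```

If my own loop differs from this (a single pass only, or batches sorted by
category), that might already explain the gap, but I would appreciate
confirmation.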

Best Regards,

Dimitrios

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
