Hi all,
I've got a text classification problem on which LogisticRegression
consistently outperforms SGDClassifier(loss="log") by a few percentage
points on the smallish datasets [O(10^5) points] I've been using for
initial development and testing. The dataset I'll ultimately be training
on is big enough [O(10^9) points to begin with, and growing incrementally
from there] that I'll want to do online learning with
SGDClassifier.partial_fit().
What I want to know is this: can I train an initial LogisticRegression
classifier, then use its coef_ to initialize an SGDClassifier(loss="log")
that would subsequently be updated via partial_fit() as new/more data come
in? Or is there stuff going on under the hood that would preclude this?
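
To make the idea concrete, here is a rough sketch of what I have in mind.
Everything in it is an assumption for illustration: the toy data, the
hyperparameters, and the trick of calling partial_fit() once on a tiny batch
so that the SGDClassifier allocates its internal state before coef_ and
intercept_ are overwritten. It does not try to match the regularization
strength (C vs. alpha) or the learning-rate schedule between the two models.
Newer scikit-learn releases spell the loss "log_loss" rather than "log", so
the sketch picks the name based on the installed version.

```python
import re

import numpy as np
import sklearn
from sklearn.linear_model import LogisticRegression, SGDClassifier

# The logistic loss is named "log" in older releases and "log_loss" from 1.1 on.
major, minor = map(int, re.match(r"(\d+)\.(\d+)", sklearn.__version__).groups())
LOG_LOSS = "log_loss" if (major, minor) >= (1, 1) else "log"

# Made-up dense data standing in for the real text features.
rng = np.random.RandomState(0)
X = rng.randn(2000, 20)
w_true = rng.randn(20)
y = (X @ w_true + 0.5 * rng.randn(2000) > 0).astype(int)
X_init, y_init = X[:1000], y[:1000]   # "initial" batch for LogisticRegression
X_new, y_new = X[1000:], y[1000:]     # data that arrives later

# Step 1: batch-train the initial model.
lr = LogisticRegression(C=1.0, max_iter=1000).fit(X_init, y_init)

# Step 2: let SGDClassifier set up its attributes with one small
# partial_fit() call, then overwrite the weights with the batch solution.
sgd = SGDClassifier(loss=LOG_LOSS, alpha=1e-4, random_state=0)
sgd.partial_fit(X_init[:10], y_init[:10], classes=lr.classes_)
sgd.coef_ = lr.coef_.copy()
sgd.intercept_ = lr.intercept_.copy()

# Right after the copy, both models should predict identically.
warm_start_agrees = bool(np.array_equal(sgd.predict(X_init), lr.predict(X_init)))

# Step 3: keep updating online as new mini-batches come in.
batch_size = 200
for start in range(0, len(X_new), batch_size):
    stop = start + batch_size
    sgd.partial_fit(X_new[start:stop], y_new[start:stop])

accuracy_after_updates = sgd.score(X_new, y_new)
```

Whether the first few SGD steps then wander away from the copied solution
(because of the step size or the different penalty scaling) is exactly the
part I'm unsure about.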
Thanks!
Fred.
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general