Hello,
    I am trying to classify a large document set with LinearSVC, and I get good
accuracy. However, I was wondering how to optimize the prediction interface to
this classifier. For example, if I have a predict interface that accepts a raw
document and uses a precomputed classifier object, the time taken to predict is
non-trivial: in my case, vectorizing the document took about 7s but predicting
only about 0.46s. Hence the question is how to efficiently scale this type of model.
[e.g. Is there a way to store the vocabulary so that, whenever a prediction
request for a new document comes in, the tf-idf transform uses this stored
vocabulary?]
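
To make the question concrete, this is roughly the setup I have in mind. It is
only a minimal sketch: the toy corpus, labels, and file names are placeholders,
and I am assuming joblib for persisting the fitted objects.

    import joblib
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC

    # toy stand-ins for the real corpus, just to keep the sketch self-contained
    docs_train = ["good movie", "bad movie", "great film", "terrible film"]
    y_train = [1, 0, 1, 0]

    # --- training time (done once) ---
    vectorizer = TfidfVectorizer()
    X_train = vectorizer.fit_transform(docs_train)
    clf = LinearSVC().fit(X_train, y_train)

    # persist the fitted vectorizer (which carries the vocabulary and idf
    # weights) together with the classifier
    joblib.dump(vectorizer, "vectorizer.joblib")
    joblib.dump(clf, "classifier.joblib")

    # --- prediction time, e.g. in a separate process ---
    vectorizer = joblib.load("vectorizer.joblib")
    clf = joblib.load("classifier.joblib")

    def predict(raw_document):
        # transform() only applies the stored vocabulary/idf weights;
        # nothing is re-fitted for the incoming document
        X = vectorizer.transform([raw_document])
        return clf.predict(X)[0]

    print(predict("a great movie"))

Is reusing the fitted vectorizer like this the intended way to avoid paying the
vectorization cost of re-fitting, or is there a better pattern for serving many
predict requests?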
Ark.

