2012/7/20 Philipp Singer <[email protected]>:
> Everything works fine now. The sad thing, though, is that I still can't
> really improve the classification results. The only thing I can achieve
> is higher recall for the classes that work well in the background model,
> but precision drops at the same time. Overall, I stay at about the same
> average score when incorporating the background model.
>
> If anyone has any further ideas, please let me know ;)

Well, since Gael already mentioned semi-supervised training using
label propagation: I have an old pull request, still unmerged mostly
for API reasons, that implements semi-supervised training of Naive
Bayes using an EM algorithm:

    https://github.com/scikit-learn/scikit-learn/pull/430

I've seen improvements in F1 score when doing text classification with
this algorithm. It may take some work to get this up to speed with the
latest scikit-learn, though.
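
For reference, the core idea is roughly this: fit Naive Bayes on the
labeled documents, then alternate between labeling the unlabeled
documents with the current model and refitting on the combined set.
A minimal sketch, assuming sparse bag-of-words matrices (X_labeled,
X_unlabeled, e.g. from a CountVectorizer) and hard label assignments
rather than the soft posteriors a full EM implementation would use:

    import numpy as np
    from scipy.sparse import vstack
    from sklearn.naive_bayes import MultinomialNB

    def em_naive_bayes(X_labeled, y_labeled, X_unlabeled, n_iter=10):
        # Initialize on the labeled documents only.
        clf = MultinomialNB().fit(X_labeled, y_labeled)
        for _ in range(n_iter):
            # "E-step": label the unlabeled documents with the current
            # model (hard assignments; soft EM would use predict_proba
            # as fractional weights instead).
            y_pseudo = clf.predict(X_unlabeled)
            # "M-step": refit on labeled + pseudo-labeled documents.
            X_all = vstack([X_labeled, X_unlabeled])
            y_all = np.concatenate([y_labeled, y_pseudo])
            clf = MultinomialNB().fit(X_all, y_all)
        return clf

Whether this helps depends a lot on how well the unlabeled data matches
the labeled data; it can just as easily reinforce the classifier's
existing mistakes.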

(Just out of curiosity, which topic models did you try? I'm looking
into these for my own projects.)

-- 
Lars Buitinck
Scientific programmer, ILPS
University of Amsterdam
