Hi Philipp. First, you should make sure that all the features have approximately the same scale. For example, they could all lie between zero and one. If the LDA features are much smaller than the others, they will probably receive very little weight.
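A minimal sketch of that rescaling, assuming the topic features come as a dense array and the tfidf features as a sparse matrix. The variable names and the random stand-in data are made up for illustration, and ``MinMaxScaler`` is only one way to bring the topic columns into [0, 1]:

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import MinMaxScaler

rng = np.random.RandomState(0)

# Hypothetical stand-ins for the real data: sparse tf-idf matrices with
# values in [0, 1) and dense topic features on a much smaller scale.
X_tfidf_train = sparse.random(20, 50, density=0.1, format="csr", random_state=rng)
X_tfidf_test = sparse.random(5, 50, density=0.1, format="csr", random_state=rng)
X_topics_train = rng.rand(20, 4) * 1e-3
X_topics_test = rng.rand(5, 4) * 1e-3

# Fit the scaler on the training topics only, then reuse it on the test set
# so that both sets are rescaled with the same per-column min and max.
scaler = MinMaxScaler()
topics_train_scaled = scaler.fit_transform(X_topics_train)
topics_test_scaled = scaler.transform(X_topics_test)

# Append the rescaled topic columns to the sparse tf-idf columns.
X_train = sparse.hstack(
    [X_tfidf_train, sparse.csr_matrix(topics_train_scaled)]
).tocsr()
X_test = sparse.hstack(
    [X_tfidf_test, sparse.csr_matrix(topics_test_scaled)]
).tocsr()
```

The combined ``X_train`` and ``X_test`` can then be passed to the classifier exactly as before. Test values may fall slightly outside [0, 1] because the scaler only sees the training data.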
Which LDA package did you use? I am not very experienced with this kind of model, but it might help to look at some univariate statistics, like ``feature_selection.chi2``, to see whether the LDA features are actually helpful.

Cheers,
Andy

----- Original Message -----
From: "Philipp Singer" <kill...@gmail.com>
To: scikit-learn-general@lists.sourceforge.net
Sent: Friday, 14 September 2012 13:47:30
Subject: [Scikit-learn-general] Combining TFIDF and LDA features

Hey there!

I have seen a few research papers in the past that combined tfidf-based features with LDA topic model features and increased their accuracy by a useful margin.

I now wanted to do the same. As a simple first step I just appended the topic features to the existing tfidf features of each train and test sample and ran my standard LinearSVC on the result - oh, btw, thanks that the confusion with dense and sparse is now resolved in 0.12 ;)

The problem now is that the overall results are essentially the same: some classes perform better and some worse.

I am not exactly sure whether this is a data problem or comes from my lack of understanding of such feature extension techniques. Is it possible that the huge number of tfidf features somehow overrules the rather small number of topic features? Do I maybe have to do some feature modification, because tfidf and LDA features are of a different nature?

Maybe it is also due to the classifier and I need something else?

I would be happy if someone could shed a little light on my problems ;)

Regards,
Philipp

------------------------------------------------------------------------------
Got visibility?
Most devs has no idea what their production app looks like.
Find out how fast your code is with AppDynamics Lite.
http://ad.doubleclick.net/clk;262219671;13503038;y?
http://info.appdynamics.com/FreeJavaPerformanceDownload.html
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general