2011/12/15 Joël Schaerer <joel.schae...@gmail.com>:
>> Last summer, I spoke to some folks who combined text and SIFT[1]
>> features for classifying images on Flickr. They just concatenated the
>> feature vectors end-to-end and trained a single SVM on the result,
>> with pretty good performance. So this is not necessarily a bad idea.
>>
>> The text features being numerous is not the problem. If none of them
>> turn out to be very discriminative, but some of your other features
>> are, then the text features should be largely ignored by a classifier
>> trained on a mix of features. So I'd first try this simple approach
>> before digging into classifier combinations.
>>
>> Be sure to ask around on http://metaoptimize.com/qa for more advice.
>>
>> [1] https://en.wikipedia.org/wiki/Scale-invariant_feature_transform
>>
>
> Thanks for your answer! I guess I'll have to try then. One concern I
> didn't mention in my original post is that my training set is going to
> be rather limited (on the order of 50 data points at the beginning), so
> I think there is a significant risk of attributing a lot of importance
> to some text features that don't deserve any.
>
> In any case, I'd be interested in an answer to my technical question: is
> there a scikit-learn way of combining classifiers?

There is currently no way to combine classifiers that are trained on
samples with different feature shards and identical target variables.

On a related topic, generic bagging and boosting meta-estimators are
definitely on the roadmap, and early work using decision trees is
already available in the sklearn.ensemble package. This will likely
evolve quite a bit in the coming weeks. However, this works by combining
classifiers that are all trained on the same input features and target
variables (although in boosting they don't necessarily see all the
samples).

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
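
For readers who want to try both options discussed in this thread, here is a minimal sketch, not an official scikit-learn recipe. It is written against a present-day scikit-learn and SciPy rather than the 2011 release. The data is synthetic, and every name in it (X_text, X_other, clf_text, clf_other, predict_late_fusion) is hypothetical. The first part concatenates the feature blocks and trains a single linear SVM, as suggested in the quoted message. The second part combines two classifiers by hand, one per feature shard, by averaging their predicted probabilities. The choice of logistic regression and of a plain average are assumptions made for illustration only.

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC

# Synthetic stand-in data (assumption): 50 samples, two balanced classes.
rng = np.random.RandomState(0)
n_samples = 50
y = np.arange(n_samples) % 2

# Hypothetical "text" shard: sparse and high-dimensional, pure noise here.
X_text = sparse.random(n_samples, 200, density=0.05, format="csr",
                       random_state=rng)
# Hypothetical "other" shard: a few dense features shifted by the class label.
X_other = rng.randn(n_samples, 5) + 2.0 * y[:, np.newaxis]

# Option 1: concatenate the feature vectors and train a single linear SVM.
X_all = sparse.hstack([X_text, sparse.csr_matrix(X_other)], format="csr")
svm_all = LinearSVC().fit(X_all, y)
pred_concat = svm_all.predict(X_all)

# Option 2: train one classifier per shard on the same targets, then
# average their class probabilities by hand.
clf_text = LogisticRegression().fit(X_text, y)
clf_other = LogisticRegression().fit(X_other, y)


def predict_late_fusion(X_text_new, X_other_new):
    """Average the two shard-specific probability estimates and pick
    the most probable class for each sample."""
    proba = (clf_text.predict_proba(X_text_new)
             + clf_other.predict_proba(X_other_new)) / 2.0
    return clf_text.classes_[np.argmax(proba, axis=1)]


pred_fused = predict_late_fusion(X_text, X_other)
```

With only about 50 training points, predictions on the training data say little about either option, so comparing them with cross-validation would be the sensible next step.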