2011/12/15 Joël Schaerer <joel.schae...@gmail.com>:
>> Last summer, I spoke to some folks who combined text and SIFT[1]
>> features for classifying images on Flickr. They just concatenated the
>> feature vectors end-to-end and trained a single SVM on the result,
>> with pretty good performance. So this is not necessarily a bad idea.
>>
>> The text features being numerous is not the problem. If none of them
>> turn out to be very discriminative, but some of your other features
>> are, then the text features should be largely ignored by a classifier
>> trained on a mix of features. So I'd first try this simple approach
>> before digging into classifier combinations.
>>
>> Be sure to ask around on http://metaoptimize.com/qa for more advice.
>>
>> [1] https://en.wikipedia.org/wiki/Scale-invariant_feature_transform
>>
>
> Thanks for your answer! I guess I'll have to try then. One concern I
> didn't mention in my original post is that my training set is going to
> be rather limited (on the order of 50 data points at the beginning), so
> I think there is a significant risk of attributing a lot of importance
> to some text features that don't have any.
>
> In any case, I'd be interested in an answer to my technical question: is
> there a scikits-learn way of combining classifiers?

There is currently no built-in way in scikit-learn to combine classifiers
that are trained on samples with different feature shards and identical
target variables.
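
That said, such a combination can be written by hand. The following is only
a minimal sketch, not a scikit-learn feature: it assumes two hypothetical
feature shards (X_text and X_other, filled with made-up toy data), fits one
LogisticRegression per shard on the same target, and averages the predicted
class probabilities. All variable names and the toy data are assumptions made
for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)

# Hypothetical toy data: 50 samples, two feature shards, one shared target.
n_samples = 50
y = np.array([0, 1] * (n_samples // 2))
X_text = rng.rand(n_samples, 200)                            # stand-in for text features (pure noise here)
X_other = rng.randn(n_samples, 5) + 2.0 * y[:, np.newaxis]   # stand-in for the other, informative features

train, test = np.arange(0, 40), np.arange(40, 50)

# Fit one classifier per feature shard, both on the same target.
clf_text = LogisticRegression(max_iter=1000).fit(X_text[train], y[train])
clf_other = LogisticRegression(max_iter=1000).fit(X_other[train], y[train])

# Combine by averaging the per-class probabilities of the two classifiers.
proba = (clf_text.predict_proba(X_text[test])
         + clf_other.predict_proba(X_other[test])) / 2.0

# Map the most probable column back to a class label.
y_pred = clf_text.classes_[np.argmax(proba, axis=1)]
```

The simpler approach quoted above (concatenating the feature vectors and
training a single classifier) would instead stack the shards column-wise, for
example with np.hstack for dense arrays or scipy.sparse.hstack for sparse
text features, before fitting one model.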

On a related topic, generic bagging and boosting meta-estimators are
definitely on the roadmap, and early work using decision trees is
already available in the sklearn.ensemble package. This will likely
evolve quite a bit in the coming weeks. However, these work by combining
classifiers that are all trained on the same input features and target
variables (although in boosting they don't necessarily see all the
samples).
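
To illustrate the kind of combination meant here, bagging can be sketched by
hand as follows. This is an assumption-laden toy example, not the
sklearn.ensemble implementation: every tree is fit on a bootstrap resample of
the same feature matrix and target, and predictions are combined by majority
vote. The data and variable names are made up for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.RandomState(0)

# Hypothetical toy data: 50 samples, 5 features, binary target in {0, 1}.
n_samples = 50
y = np.array([0, 1] * (n_samples // 2))
X = rng.randn(n_samples, 5) + 2.0 * y[:, np.newaxis]

n_estimators = 10
trees = []
for _ in range(n_estimators):
    # Bootstrap: draw n_samples row indices with replacement.
    idx = rng.randint(0, n_samples, n_samples)
    trees.append(DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx]))

# Each row holds one tree's predictions; shape (n_estimators, n_samples).
votes = np.array([tree.predict(X) for tree in trees])

# Majority vote for 0/1 labels (ties are resolved in favour of class 1).
y_pred = (votes.mean(axis=0) >= 0.5).astype(int)
```

All trees see the same columns of X; only the rows differ between
resamples, which is why this does not address the different-feature-shards
case.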

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
