Hi Andy and Joel,

Thanks for the heads-up and the discussion. I agree that it is more of an 
engineering tool, which is why I did not consider asking about it initially. 
However, it seems that some people were interested in it (probably primarily 
Kagglers), so I just wanted to know whether something like this could be 
useful in the scikit-learn library.

I would be happy to add it as an example and/or an implementation, since I am 
a big fan of scikit-learn and would like to give something back if I can :)

I can dig into some literature on the weekend and see what I can find. But my 
feeling is that -- like Andy said -- it is more of an engineering tool (in 
contrast to bagging and AdaBoost).

So, shall I go ahead and open an issue in the GitHub repo to continue the 
discussion? 

Andy, could you give me a quick follow-up on the transform method? I am 
wondering what transform should return in this case.
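
To make the question concrete: one option would be for transform to return 
each classifier's predicted label per sample, which a second model could then 
learn from. Below is a minimal, dependency-free sketch under that assumption; 
it is only an illustration, not the code from my blog post. EnsembleClassifier, 
clfs, and weights follow the snippet quoted below, while ConstantClassifier is 
a hypothetical stand-in for a fitted model, and the tie-breaking rule is an 
arbitrary choice.

```python
from collections import defaultdict


class EnsembleClassifier:
    """Weighted majority-rule ensemble (illustrative sketch, not library code)."""

    def __init__(self, clfs, weights=None):
        self.clfs = clfs
        # Default to an unweighted vote: every classifier counts once.
        self.weights = weights if weights is not None else [1] * len(clfs)

    def fit(self, X, y):
        for clf in self.clfs:
            clf.fit(X, y)
        return self

    def transform(self, X):
        # Assumption: one row per sample, one column per classifier,
        # each entry being that classifier's predicted label.
        per_clf = [list(clf.predict(X)) for clf in self.clfs]
        return [list(row) for row in zip(*per_clf)]

    def predict(self, X):
        labels = []
        for row in self.transform(X):
            votes = defaultdict(float)
            for label, weight in zip(row, self.weights):
                votes[label] += weight
            # Highest total weight wins; ties go to the smallest label
            # so that the result is deterministic.
            labels.append(min(votes, key=lambda k: (-votes[k], k)))
        return labels


class ConstantClassifier:
    """Hypothetical stand-in for a fitted model: always predicts one label."""

    def __init__(self, label):
        self.label = label

    def fit(self, X, y):
        return self

    def predict(self, X):
        return [self.label for _ in X]


X = [[0.0], [1.0]]
y = [0, 1]
members = [ConstantClassifier(0), ConstantClassifier(1), ConstantClassifier(1)]

eclf = EnsembleClassifier(clfs=members, weights=[1, 1, 1]).fit(X, y)
votes = eclf.transform(X)    # per-classifier labels, one row per sample
majority = eclf.predict(X)   # weighted majority label per sample
```

Returning class probabilities instead of labels would be another plausible 
choice; I am not sure which one fits scikit-learn's conventions better.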

Best,
Sebastian

> On Jan 14, 2015, at 9:07 AM, Andy <t3k...@gmail.com> wrote:
> 
> On 01/14/2015 02:06 AM, Joel Nothman wrote:
>> I wonder if these ensembles, while common, are too non-standard. Are there 
>> well-analysed variants of these models in the literature, or standard ways 
>> to configure them? If not, perhaps this is best presented as an example 
>> rather than available in the library...
> Well, there is "stacking" but that is rarely used in practice, I think.
> FeatureUnion is also more of an engineering tool than a theoretical one...
> 
>> 
>> On 14 January 2015 at 13:21, Andy <t3k...@gmail.com> wrote:
>> Hi Sebastian.
>> I think this might be useful as these types of algorithms are often used
>> in competitions.
>> It would also be nice to provide a transform method, so that one could
>> also learn another model on top
>> (like here:
>> http://zacstewart.com/2014/08/05/pipelines-of-featureunions-of-pipelines.html).
>> 
>> Cheers,
>> Andy
>> 
>> 
>> On 01/10/2015 07:13 PM, Sebastian Raschka wrote:
>> > Hi,
>> >
>> > I wrote a short blog post about implementing a conservative majority rule 
>> > ensemble classifier in scikit-learn, and someone asked me whether this 
>> > would be interesting for the scikit-learn library.
>> >
>> > The idea behind it is quite simple: it uses the weighted or unweighted 
>> > majority rule over different classification models (naive Bayes, logistic 
>> > regression, random forests, etc.) to predict the class label.
>> >
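>> > from sklearn import cross_validation
>> > from sklearn.ensemble import RandomForestClassifier
>> > from sklearn.linear_model import LogisticRegression
>> > from sklearn.naive_bayes import GaussianNB
>> >
>> > # EnsembleClassifier is the class from the blog post linked below;
>> > # X and y are the feature matrix and the class labels.
>> > clf1 = LogisticRegression()
>> > clf2 = RandomForestClassifier()
>> > clf3 = GaussianNB()
>> >
>> > eclf = EnsembleClassifier(clfs=[clf1, clf2, clf3], weights=[1, 1, 1])
>> >
>> > for clf, label in zip([clf1, clf2, clf3, eclf],
>> >                       ['Logistic Regression', 'Random Forest',
>> >                        'naive Bayes', 'Ensemble']):
>> >     scores = cross_validation.cross_val_score(clf, X, y, cv=5,
>> >                                               scoring='accuracy')
>> >     print("Accuracy: %0.2f (+/- %0.2f) [%s]"
>> >           % (scores.mean(), scores.std(), label))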
>> >
>> > (more details in the blog post: 
>> > http://sebastianraschka.com/Articles/2014_ensemble_classifier.html)
>> >
>> > If you would consider this as useful, let me know, and I would be happy to 
>> > contribute it to the scikit-learn library.
>> >
>> > Best,
>> > Sebastian
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > Scikit-learn-general mailing list
>> > Scikit-learn-general@lists.sourceforge.net 
>> > <mailto:Scikit-learn-general@lists.sourceforge.net>
>> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general 
>> > <https://lists.sourceforge.net/lists/listinfo/scikit-learn-general>
