On 01/14/2015 02:06 AM, Joel Nothman wrote:
I wonder if these ensembles, while common, are too non-standard. Are there well-analysed variants of these models in the literature, or standard ways to configure them? If not, perhaps this is best presented as an example rather than available in the library...
Well, there is "stacking" but that is rarely used in practice, I think. FeatureUnion is also more of an engineering tool than a theoretical one...
On 14 January 2015 at 13:21, Andy <t3k...@gmail.com> wrote:

Hi Sebastian.
I think this might be useful, as these types of algorithms are often used in competitions.
It would also be nice to provide a transform method, so that one could also learn another model on top (like here: http://zacstewart.com/2014/08/05/pipelines-of-featureunions-of-pipelines.html).

Cheers,
Andy

On 01/10/2015 07:13 PM, Sebastian Raschka wrote:
> Hi,
>
> I wrote a short blog post about implementing a conservative majority rule ensemble classifier in scikit-learn, and someone asked me whether this would be interesting for the scikit-learn library.
>
> The idea behind it is quite simple: Using the weighted or unweighted majority rule from different classification models (naive Bayes, Logistic Regression, Random Forests etc.) to predict the class label.
>
> clf1 = LogisticRegression()
> clf2 = RandomForestClassifier()
> clf3 = GaussianNB()
>
> eclf = EnsembleClassifier(clfs=[clf1, clf2, clf3], weights=[1,1,1])
>
> for clf, label in zip([clf1, clf2, clf3, eclf], ['Logistic Regression', 'Random Forest', 'naive Bayes', 'Ensemble']):
>     scores = cross_validation.cross_val_score(clf, X, y, cv=5, scoring='accuracy')
>     print("Accuracy: %0.2f (+/- %0.2f) [%s]" % (scores.mean(), scores.std(), label))
>
> (more details in the blog post: http://sebastianraschka.com/Articles/2014_ensemble_classifier.html)
>
> If you would consider this as useful, let me know, and I would be happy to contribute it to the scikit-learn library.
>
> Best,
> Sebastian
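For readers following the thread, the weighted majority rule described in the quoted message can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the EnsembleClassifier from the blog post: the function name weighted_majority_vote, the toy labels, and the tie-breaking rule are all hypothetical choices made for this sketch. Each classifier contributes one predicted label for a sample, and the label with the largest total weight wins.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights=None):
    """Return the label with the largest total weight for one sample.

    predictions: one predicted label per classifier.
    weights: one non-negative weight per classifier (defaults to all 1,
             which gives the unweighted majority rule).
    Hypothetical helper for illustration; not part of scikit-learn.
    """
    if weights is None:
        weights = [1] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per prediction")

    # Sum the weights of all classifiers that voted for each label.
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight

    # Assumption of this sketch: ties go to the smallest label.
    best = max(totals.values())
    return min(label for label, total in totals.items() if total == best)


votes = ["spam", "ham", "spam"]                  # toy predictions from three classifiers
print(weighted_majority_vote(votes))             # unweighted: "spam" wins 2 to 1
print(weighted_majority_vote(votes, [1, 3, 1]))  # weighted: "ham" wins 3 to 2
```

With equal weights this reduces to a plain majority vote; raising the weight of one classifier lets it overrule the others, which is the role the weights=[1,1,1] argument appears to play in the quoted snippet.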
_______________________________________________ Scikit-learn-general mailing list Scikit-learn-general@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/scikit-learn-general