2013/1/11 Gilles Louppe <[email protected]>:
> Hi there,
>
> I agree with you on having long-term goals. We should indeed define
> where we want the library to go.
>
> Before going into such introspection though, I think we should get
> more insight into how our current user base is using Scikit-Learn. I
> have been involved in the project for more than a year and a half,
> and I have to say that I unfortunately don't know our users well
> (besides the unhappy ones who report bugs).
>
> - How many users do we actually have? A few dozen? Hundreds? Thousands?
> - What are they actually using the library for?
> - How are they doing that?
> - Is that what we want, and how we want them to do it?
> - ... does this match our vision for the project?

+1

We need to come up with a survey form using Google Docs forms (now
renamed to Google Drive forms) and advertise it here and over our
social networks.

http://google.about.com/od/toolsfortheoffice/ss/forms_googledoc.htm

Any volunteers to start a draft? We should hold off on circulating it
over social networks until we agree on the questions, but we can share
it on the mailing list to collaboratively edit and review the survey
together.

> Those are some questions for which I would be curious to get answers.
>
> I got some insight this year when I made my students use Scikit-Learn
> for assignments in our local Machine Learning course. They started
> from zero knowledge of Python and of Machine Learning, and they ended
> up tackling a real-world problem (I made them compete locally on the
> Impermium dataset, trying to detect insults in social commentary).
> Once the assignments were done, I asked them to give me some feedback
> about Scikit-Learn to see what they liked and disliked. I was planning
> on sending you an email with that feedback, but here is the
> opportunity. So here it is:
>
> + They quickly got acquainted with the library. It was easy and
>   straightforward for most of them. (I actually received far fewer
>   questions than when we were using Matlab.)
> + They found the documentation very well structured and very helpful.
> + They were glad to find nearly all the algorithms we study in class
>   (but not all, though).
> + They liked the well-structured and common API between the
>   estimators. This indeed made the library a lot easier to use and
>   learn.
> - Some had a hard time understanding the error messages. I indeed agree
>   that some error messages may look cryptic to novices.
> - Not all estimators are sparse-compatible.
> - Some basic estimators are missing. They complained about the lack of
>   neural networks.
>   Some generic ensemble methods are also missing
>   (Stacking and Bagging are two easy but very useful ensemble methods
>   that we should have, in my opinion).
> - Some tried to implement their own estimators, but they failed to make
>   them compatible with our grid-search module. (I think what is missing
>   here is some documentation on the expected interface.)
>
> Overall this feedback is very positive. If our goal is to build a
> toolbox such that non-experts can quickly get their hands on machine
> learning and have results quite easily, then I think we are not that
> far from that!

Thanks very much for the feedback.

> However, this also highlights some important features lacking
> in the library that I think we should fix before going for
> 1.0.
>
> What is your opinion on this?

Indeed, so high priority for the 1.0:

- Python 3 support
- polish support for sparse matrices + better error messages on invalid inputs
- polish grid search / model selection + better documentation on the
  expected interface
- add missing yet basic ensemble strategies: bagging and stacking

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
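
On the grid-search point raised in the feedback, here is a minimal sketch (not taken from this thread) of the interface a hand-written estimator generally needs so that a parameter search can rebuild and evaluate it: constructor arguments stored unchanged as attributes, `get_params`, `set_params`, `fit`, `predict` and `score`. The class `MajorityClassifier`, its `tie_break` parameter and the helper `clone_with` are hypothetical names used only for illustration. The sketch deliberately avoids importing scikit-learn so that it runs on its own; in practice, inheriting from `sklearn.base.BaseEstimator` provides `get_params` and `set_params` for you.

```python
class MajorityClassifier:
    """Toy estimator (hypothetical): predicts the most frequent training label."""

    def __init__(self, tie_break="first"):
        # Store constructor arguments unchanged, under the same attribute name,
        # so that get_params() can hand them back and a copy can be rebuilt.
        self.tie_break = tie_break

    def get_params(self, deep=True):
        # Map each constructor argument name to its current value.
        return {"tie_break": self.tie_break}

    def set_params(self, **params):
        # Update hyper-parameters in place and return self to allow chaining.
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError("Invalid parameter %r" % name)
            setattr(self, name, value)
        return self

    def fit(self, X, y):
        # Count the labels, then keep those that reach the highest count.
        counts = {}
        for label in y:
            counts[label] = counts.get(label, 0) + 1
        best = max(counts.values())
        candidates = [label for label in y if counts[label] == best]
        # "first" keeps the earliest tied label in y, anything else the latest.
        if self.tie_break == "first":
            self.majority_ = candidates[0]
        else:
            self.majority_ = candidates[-1]
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]

    def score(self, X, y):
        # Accuracy: fraction of predictions equal to the true label.
        predictions = self.predict(X)
        hits = sum(1 for p, t in zip(predictions, y) if p == t)
        return hits / float(len(y))


def clone_with(estimator, **overrides):
    """Hypothetical helper: build an unfitted copy with some parameters changed."""
    params = estimator.get_params()
    params.update(overrides)
    return type(estimator)(**params)


# Hand-rolled loop over candidate values, standing in for a parameter search.
X = [[0], [1], [2], [3]]
y = ["a", "b", "a", "b"]
scores = {}
for tie_break in ("first", "last"):
    model = clone_with(MajorityClassifier(), tie_break=tie_break).fit(X, y)
    scores[tie_break] = model.score(X, y)
```

The design point is that nothing is validated or transformed in `__init__`: a search can only rebuild a candidate from what `get_params` returns, so any argument that is renamed or altered on the way in is lost when the estimator is copied.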
