Hi there, I agree with you on having long-term goals. We should indeed define where we want the library to go.
Before going into such introspection though, I think we should get more insight into how our current user base is using Scikit-Learn. I have been involved in the project for more than a year and a half, and I have to say that I unfortunately don't know our users well (besides the unhappy ones who report bugs).

- How many users do we actually have? A few dozen? Hundreds? Thousands?
- What are they actually using the library for?
- How are they doing that?
- Is that what we want, and how we want them to do it?
- ... does this match our vision for the project?

Those are some questions to which I would be curious to get answers.

I got some insight this year when I made my students use Scikit-Learn for assignments in our local Machine Learning course. They started from zero knowledge of Python and Machine Learning, and they ended up tackling a real-world problem (I made them compete locally on the Impermium dataset, trying to detect insults in social commentary). Once the assignments were done, I asked them for feedback about Scikit-Learn to see what they liked and disliked. I was planning on sending you an email with that feedback, but here is the opportunity. So here it is:

+ They quickly got acquainted with the library. It was easy and straightforward for most of them. (I actually received far fewer questions than when we were using Matlab.)
+ They found the documentation very well structured and very helpful.
+ They were glad to find nearly all the algorithms we study in class (but not all, though).
+ They liked the well-structured and common API between the estimators. This indeed made the library a lot easier to use and learn.
- Some had a hard time understanding the error messages. I agree that some error messages may look cryptic to novices.
- Not all estimators are sparse-compatible.
- Some basic estimators are missing. They complained about the lack of neural networks.
  Some generic ensemble methods are also missing (Stacking and Bagging are two easy but very useful ensemble methods that we should have, in my opinion).
- Some tried to implement their own estimators, but failed to make them compatible with our grid-search module. (I think what is missing here is some documentation on the expected interface.)

Overall this feedback is very positive. If our goal is to build a toolbox such that non-experts can quickly get their hands on machine learning and obtain results quite easily, then I think we are not that far from that! However, this also highlights some important features lacking in the library that I think we should fix before going for 1.0.

What is your opinion on this?

Cheers,

Gilles

On 11 January 2013 01:19, Olivier Grisel <olivier.gri...@ensta.org> wrote:
> 2013/1/11 Vlad Niculae <zephy...@gmail.com>:
>> I completely agree with everyone regarding 1.0 and I really think we should
>> make a clear list of issues for this (just saying API is pretty vague).
>> However there is life after the 1.0, and I think Andy's message was more
>> about that kind of long-term decisions.
>
> Agreed.
>
>> We should avoid getting features that aren't used by users, and equally well
>> features that aren't of interest to active developers. I don't feel like
>> scikit-learn is at risk at the moment, but we must avoid ending up with
>> (more?) semi-orphaned modules that most developers are afraid to touch in
>> case an issue is reported.
>
> Very true.
>
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general