On 01/18/2012 11:44 PM, Gael Varoquaux wrote:
> On Wed, Jan 18, 2012 at 11:37:15PM +0100, Andreas wrote:
>
>> Having this feature might get us a LOT of attention.
>> But this is really not a simple project.
>>
> Before trying to jump to the super fancy features, I'd rather have a
> polished and versatile version of the scikit.

I totally agree - I tried to do as much polishing as I could over the
last couple of weeks. There is still a lot to do. I opened some issues
today and yesterday to track stuff that seemed important to me.

I have no experience with GSoC and I will totally bow to your wisdom
there. My thinking was that single algorithms are more "project-like"
than doing polishing here and there. There is important refactoring
being done by Lars and Mathieu at the moment, which is really great.
But I wouldn't give that to someone as a project.

> There are many things that I
> find that we haven't explored right. For instance these are my personal
> pain points:
>
> * we don't have an online learning framework.
>
> * Our model selection framework is still weak
>
>   - see
>     https://github.com/scikit-learn/scikit-learn/pull/443#issuecomment-3231270
>
>   - also, the difficulty of doing nested cross-validation with a specific
>     cross-validation strategy,
>
> * we are light on the semi-supervised API
>
> * our parameter naming is not uniform-enough across models.
>
> All these are points that I'd like to see addressed, because I fear that
> they could all induce a change in API or conventions.

I noticed some cross-validation issues, but not all the ones you
mentioned. We should maybe plan a bit more on that.

About online and semi-supervised learning: I feel these are two specific
sub-fields that many people are interested in but that are not central
to machine learning. I am not sure I would want the scikit API to focus
on these. If you go to a machine learning conference, I'm pretty sure
there will be more people working on structured learning than on
semi-supervised and online learning.

Don't get me wrong. I don't want to quickly force structured learning
into the scikit. It is a long-term goal of mine to have this in a nice,
accessible form. I just wanted to mention it as an option.

> And I'd like API
> and conventions to be stabilized, to be able to push out a 1.0 (I am
> talking 6 months to 1 year horizon).
>
>
I couldn't agree more!

Cheers,
Andy

_______________________________________________
Scikit-learn-general mailing list
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
