On 01/18/2012 11:44 PM, Gael Varoquaux wrote:
> On Wed, Jan 18, 2012 at 11:37:15PM +0100, Andreas wrote:
>    
>> Having this feature might get us a LOT of attention.
>> But this is really not a simple project.
>>      
> Before trying to jump to the super fancy features, I'd rather have a
> polished and versatile version of the scikit.
I totally agree - I have tried to do as much polishing as I could over
the last couple of weeks.
There is still a lot to do. I opened some issues today and
yesterday to track stuff that seemed important to me.

I have no experience with GSoC and I will totally bow
to your wisdom there. My thinking was that single
algorithms are more "project-like" than doing polishing here and
there.

There is important refactoring being done by Lars and Mathieu
at the moment which is really great. But I wouldn't give that
to someone as a project.


>   There are many things that I
> find that we haven't explored right. For instance these are my personal
> pain points:
>
>   * we don't have an online learning framework.
>
>   * Our model selection framework is still weak
>
>      - see https://github.com/scikit-learn/scikit-learn/pull/443#issuecomment-3231270
>
>      - also, the difficulty of doing nested cross-validation with a specific
>         cross-validation strategy,
>
>   * we are light on the semi-supervised API
>
>   * our parameter naming is not uniform enough across models.
>
> All these are points that I'd like to see addressed, because I fear that
> they could all induce a change in API or conventions.
I noticed some cross-validation issues, but not all of the ones you mentioned.
We should maybe plan that a bit more.

About online and semi-supervised learning:
I feel these are two specific sub-fields that many people are interested
in but that are not central to machine learning.
I am not sure I would want the scikit api to focus on these.
If you go to a machine learning conference, I'm pretty sure
there will be more people working on structured learning
than on semi-supervised and online learning.

Don't get me wrong. I don't want to quickly force structured
learning into the scikit. It is a long-term goal of mine to
have this in a nice accessible form. I just wanted to mention
it as an option.

>   And I'd like API
> and conventions to be stabilized, to be able to push out a 1.0 (I am
> talking 6 months to 1 year horizon).
>
>    
I couldn't agree more!

Cheers,
Andy

