Re: [Scikit-learn-general] GSoC 2012 pre-application

Vlad Niculae Thu, 05 Apr 2012 05:26:23 -0700

Hi everyone,

I have updated my proposal thanks to your excellent suggestions.


I also pointed out the style of optimization that will be applied by linking to 
my blog post on optimizing orthogonal matching pursuit code. Unfortunately, this 
will also put the bug I introduced right in front of everyone's eyes. I hope it 
doesn't look too bad… does it? :)

I have yet to point out the obvious low-hanging fruit. What do you suggest?

Do you think the proposal makes it clear enough that it's not just about making 
stuff run faster, but also about setting up a benchmarking system, making sure 
things stay fast, and making new code easy to benchmark?

I plan to submit tonight.

Regards,
Vlad

On Apr 4, 2012, at 21:32, Olivier Grisel wrote:

> On 4 April 2012 at 20:19, Alexandre Gramfort
> <alexandre.gramf...@inria.fr> wrote:
>> hello vlad,
>> 
>> hope you're doing better.
>> 
>> My gut feeling reading the proposal is that you clearly know what you're
>> talking about, as you know the code base well, but I think you should be
>> more specific about where the low-hanging fruit is and which modules
>> deserve some love in terms of speed.
> 
> Maybe you could state explicitly that the work will include a
> scalability profile of all the available models:
> 
> Pick a selection of ~5 different datasets with very different
> n_samples, n_features and sparsity profiles, and compile a list of all
> the estimators that are able to converge to a usable model in less
> than 1s, 10s, 100s or 1000s, and with less than 1GB of memory, for
> instance.
> 
> This kind of high-level information would be a really nice complement
> to the table in [1], for instance.
> 
> [1] http://scikit-learn.org/dev/modules/clustering.html
> 
> While doing so, you could use the cProfile / line_profiler modules
> to help identify low-hanging fruit.
> 
> -- 
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
> 
> ------------------------------------------------------------------------------
> Better than sec? Nothing is better than sec when it comes to
> monitoring Big Data applications. Try Boundary one-second 
> resolution app monitoring today. Free.
> http://p.sf.net/sfu/Boundary-dev2dev
> _______________________________________________
> Scikit-learn-general mailing list
> Scikit-learn-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
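
A minimal sketch of the scalability profile Olivier describes above, under some loud assumptions: the names `BUDGETS`, `smallest_budget` and `scalability_profile` are hypothetical, the "estimators" are plain callables standing in for an estimator's fit method, and the datasets are toy lists rather than real data with varying n_samples, n_features and sparsity. It only buckets wall-clock fit time into the 1s / 10s / 100s / 1000s budgets; the 1GB memory limit is not measured here.

```python
import time

# Time budgets in seconds, taken from the suggestion quoted above.
BUDGETS = (1, 10, 100, 1000)


def smallest_budget(elapsed, budgets=BUDGETS):
    """Return the smallest budget that `elapsed` seconds is strictly under.

    Returns None when the run exceeded every budget.
    """
    for budget in sorted(budgets):
        if elapsed < budget:
            return budget
    return None


def scalability_profile(fit_functions, datasets, budgets=BUDGETS):
    """Time every (estimator, dataset) pair and bucket it by time budget.

    fit_functions: dict mapping a name to a callable taking one dataset
                   (hypothetical stand-in for an estimator's fit method).
    datasets:      dict mapping a name to a dataset object.
    """
    table = {}
    for est_name, fit in fit_functions.items():
        for data_name, data in datasets.items():
            start = time.perf_counter()
            fit(data)
            elapsed = time.perf_counter() - start
            table[(est_name, data_name)] = smallest_budget(elapsed, budgets)
    return table


if __name__ == "__main__":
    # Toy stand-ins: real use would pass estimator fit methods and real data.
    toy_datasets = {"small": list(range(1000)), "large": list(range(100000))}
    toy_fits = {"sum": sum, "sort": sorted}
    for key, budget in scalability_profile(toy_fits, toy_datasets).items():
        print(key, "fits under", budget, "s")
```

The resulting table maps each (estimator, dataset) pair to the smallest budget it fits under, which is the kind of high-level summary that could sit next to an existing comparison table.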


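For the profiling step mentioned above, here is a small sketch using only the standard-library cProfile and pstats modules. The helper `profile_call` and the function `toy_fit` are hypothetical names made up for illustration; `toy_fit` stands in for whatever estimator code is being inspected. line_profiler is a separate third-party package and is not used here.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run `func` under cProfile.

    Returns (result, report), where report is the text of the `top`
    entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def toy_fit(n):
    # Hypothetical stand-in for an estimator's fit method.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, report = profile_call(toy_fit, 100000)
    print(report)
```

Sorting by cumulative time tends to surface the outer functions that dominate a run; line-level detail would then need a line-by-line profiler on the functions this report points to.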
