2012/5/6 Vlad Niculae <[email protected]>:
> Hello everybody,
>
> I will start my effort for my GSoC project for this year, as discussed, with 
> making the linear models faster where applicable, most importantly in 
> multi-task regression problems.
>
> The plan (to be piloted now and, hopefully, nailed down towards the
> middle of the summer) is:
>
> 1. Choose some datasets for benchmarking the regression problem.
> These need to explore as many of the possible gotchas as we can: wide X, tall 
> X, sparse X, etc. Maybe use our generators.

Yes: condition number and non-linear separability (convex blobs and
folded layers) would be interesting too.
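
As a rough illustration of the kind of scenario grid discussed above (tall,
wide, sparse and badly conditioned X), here is a minimal standard-library
sketch. The names `make_dataset`, `SCENARIOS`, `density` and `correlation`
are hypothetical and only serve the example; in practice the scikit-learn
generators (e.g. `make_regression`) would presumably be used instead.

```python
import random


def make_dataset(n_samples, n_features, density=1.0, correlation=0.0, seed=0):
    """Return (X, y) as plain Python lists for a noisy linear problem.

    density     -- expected fraction of non-zero entries in X
    correlation -- in [0, 1); values close to 1 make the columns nearly
                   identical, which makes X badly conditioned
    """
    rng = random.Random(seed)
    true_w = [rng.gauss(0, 1) for _ in range(n_features)]
    X = []
    for _ in range(n_samples):
        base = rng.gauss(0, 1)  # component shared by every feature of this row
        row = []
        for _ in range(n_features):
            value = correlation * base + (1 - correlation) * rng.gauss(0, 1)
            row.append(value if rng.random() < density else 0.0)
        X.append(row)
    # Targets are a linear function of X plus a little Gaussian noise.
    y = [sum(x * w for x, w in zip(row, true_w)) + rng.gauss(0, 0.1)
         for row in X]
    return X, y


# Hypothetical scenario grid; the sizes are kept tiny so the sketch runs fast.
SCENARIOS = {
    "tall": dict(n_samples=200, n_features=10),
    "wide": dict(n_samples=10, n_features=200),
    "sparse": dict(n_samples=100, n_features=100, density=0.05),
    "ill_conditioned": dict(n_samples=100, n_features=20, correlation=0.99),
}

datasets = {name: make_dataset(**params) for name, params in SCENARIOS.items()}

for name, (X, y) in datasets.items():
    print("%-16s X: %d x %d" % (name, len(X), len(X[0])))
```

Keeping the scenarios in a single dictionary means new gotchas can be added
without touching whatever code consumes the datasets.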


> 2. Set up a (pilot) benchmark runner using these datasets.
> This will slowly build up into a nice speed.pypy.org-like (but hopefully
> cleaner) interface so we can monitor the overall performance of the scikit.

We should also include a cProfile output in the report (maybe along
with a line profile of the interesting functions).
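
A minimal sketch of what one entry of such a report could look like, using
only the standard library: wall-clock timings plus the top of a cProfile
listing. `benchmark` and `fit_dummy` are hypothetical names, and `fit_dummy`
is just a stand-in workload, not a scikit-learn estimator. Line profiling is
not covered here since it needs a third-party tool.

```python
import cProfile
import io
import pstats
import time


def fit_dummy(X, y):
    """Placeholder workload (computes X^T y) standing in for a real fit."""
    n_features = len(X[0])
    return [sum(row[j] * yi for row, yi in zip(X, y))
            for j in range(n_features)]


def benchmark(func, *args, repeats=3, top=5):
    """Time func(*args) several times, then profile one extra call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)

    # Profile a separate call so profiler overhead does not skew the timings.
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)

    return {"best_seconds": min(timings), "profile": stream.getvalue()}


X = [[float(i + j) for j in range(20)] for i in range(50)]
y = [float(i) for i in range(50)]

report = benchmark(fit_dummy, X, y)
print("best of 3 runs: %.6f s" % report["best_seconds"])
print(report["profile"])
```

Returning the profile as a string rather than printing it directly makes it
easy to embed in whatever report format the runner ends up producing.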

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
