2012/5/6 Vlad Niculae <[email protected]>:
> Hello everybody,
>
> I will start my effort for my GSoC project for this year, as discussed, with
> making the linear models faster where applicable, most importantly in
> multi-task regression problems.
>
> The plan (which will be piloted now, and towards the middle of the summer,
> hopefully will get nailed down), is:
>
> 1. Choose some datasets for benchmarking the regression problem.
> These need to explore as many of the possible gotchas as we can: wide X, tall
> X, sparse X, etc. Maybe use our generators.

Yes: condition number and non-linear separability (convex blobs and folded
layers) would be interesting too.

> 2. Set up a (pilot) benchmark runner using these datasets.
> This will slowly build up into a nice speed.pypy -like (but hopefully
> cleaner) interface so we can monitor the overall performance of the scikit.

We should also include a cProfile output in the report (maybe along with a
line profile of the interesting functions).

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
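
To make the "benchmark runner plus cProfile output" idea from this thread a bit more concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: `run_benchmark`, `run_all`, `toy_fit` and the `SHAPES` table are hypothetical names, and `toy_fit` is a pure-Python stand-in for an estimator's `fit` call, not scikit-learn code. The shapes only mimic the "tall X" and "wide X" cases mentioned above.

```python
import cProfile
import io
import pstats
import time


def run_benchmark(name, func, *args, top=5, **kwargs):
    """Time one call to ``func`` and capture a cProfile summary for it."""
    profiler = cProfile.Profile()
    start = time.perf_counter()
    profiler.enable()
    try:
        func(*args, **kwargs)
    finally:
        profiler.disable()
    # Note: the wall-clock time includes the profiler's own overhead.
    elapsed = time.perf_counter() - start

    # Render the ``top`` most expensive entries (by cumulative time) as text.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return {"name": name, "seconds": elapsed, "profile": buffer.getvalue()}


def toy_fit(n_samples, n_features):
    """Hypothetical stand-in workload: builds the Gram matrix X^T X."""
    X = [[(i * j) % 7 + 1.0 for j in range(n_features)]
         for i in range(n_samples)]
    return [[sum(row[a] * row[b] for row in X) for b in range(n_features)]
            for a in range(n_features)]


# Hypothetical dataset shapes (n_samples, n_features) for two of the cases.
SHAPES = {"tall": (400, 5), "wide": (5, 60)}


def run_all(shapes=SHAPES):
    """Run the stand-in workload once per dataset shape."""
    return [run_benchmark(name, toy_fit, n, p)
            for name, (n, p) in shapes.items()]


if __name__ == "__main__":
    for report in run_all():
        print("%s: %.4f s" % (report["name"], report["seconds"]))
        print(report["profile"])
```

In a real runner the stand-in would presumably be replaced by fitting each linear model on each generated dataset, and the per-run dictionaries could be stored so that timings can be compared across revisions.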
