2012/1/23 Gael Varoquaux <[email protected]>:
> On Mon, Jan 23, 2012 at 02:17:21PM +0100, Olivier Grisel wrote:
>> Hehe, that would be nice but I am afraid Gael won't let me do this as
>> part of the main scikit repository: large-scale examples mean
>> large-scale datasets ;)
>
> Why can't we just generate data? The goal is to get the idea across, not
> to solve SETI@HOME on our users' laptops :).

Indeed we could extend / refactor the multilabel dataset generator to
output arbitrarily big sparse CSR data with a text document structure.
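
A rough sketch of what such a generator could look like, to make the idea
concrete. Everything here is an assumption for illustration:
`make_sparse_text_like` and its parameters are made-up names, not an
existing scikit-learn function, and for brevity it assigns a single class
per document rather than multiple labels. Word frequencies follow a
Zipf-like distribution that is shuffled per class, and the CSR arrays are
built directly so no dense matrix is ever allocated.

```python
import numpy as np
import scipy.sparse as sp


def make_sparse_text_like(n_samples=1000, n_features=10000, n_classes=5,
                          mean_length=50, random_state=0):
    """Hypothetical helper: bag-of-words-like CSR matrix plus class labels."""
    rng = np.random.RandomState(random_state)

    # Zipf-like word frequencies: a few frequent words, a long tail of rare ones.
    zipf = 1.0 / np.arange(1, n_features + 1)
    zipf /= zipf.sum()
    # Each class gets its own permutation, i.e. its own set of frequent words.
    word_probs = np.array([rng.permutation(zipf) for _ in range(n_classes)])

    y = rng.randint(n_classes, size=n_samples)
    # Document lengths; the +1 guarantees that no document is empty.
    lengths = rng.poisson(mean_length, size=n_samples) + 1

    # Build the three CSR arrays row by row.
    indptr, indices, data = [0], [], []
    for label, length in zip(y, lengths):
        words = rng.choice(n_features, size=length, p=word_probs[label])
        uniq, counts = np.unique(words, return_counts=True)
        indices.extend(uniq)   # column ids, sorted within the row
        data.extend(counts)    # word counts
        indptr.append(len(indices))

    X = sp.csr_matrix(
        (np.array(data, dtype=np.float64),
         np.array(indices, dtype=np.int32),
         np.array(indptr, dtype=np.int32)),
        shape=(n_samples, n_features))
    return X, y
```

Calling `X, y = make_sparse_text_like(n_samples=100000)` would then give a
matrix whose memory footprint grows with the number of non-zero counts
rather than with `n_samples * n_features`.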

That would be nice for benchmarks too. I'll add it to my TODO list of
interesting-stuff-but-not-that-high-a-priority-so-if-you-want-you-can-implement-it-yourself-before-i-do.

-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
