2013/12/9 Nick Pentreath <[email protected]>:
> This is a cool idea. And it is fairly straightforward. I hacked up an
> illustration this evening: https://gist.github.com/MLnick/7880766
>
> The better approach would be to amend the sklearn svmlight code to accept
> iterables of strings in addition to file handles, and then pretty much no
> additional code should be required (though since that part is in Cython I am
> not sure, I'm just assuming it should work by eyeballing for now).

Indeed. A related evolution of the svmlight loader, though not directly
useful for the Spark integration, is seekable chunk reading with byte
offsets:

https://github.com/scikit-learn/scikit-learn/pull/935

Still, you might want to keep that piece of code in mind.
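
To make the idea quoted above concrete (feeding the loader an iterable of
strings rather than a file handle), here is a minimal pure-Python sketch. It
is not the sklearn Cython implementation; the function name
parse_svmlight_lines is hypothetical, and the sketch assumes plain
"label index:value ..." lines, skipping comments and qid tokens.

```python
def parse_svmlight_lines(lines):
    """Parse an iterable of svmlight-formatted strings (hypothetical helper).

    Returns (labels, data, indices, indptr): the labels plus the three
    arrays of a CSR sparse matrix, as plain Python lists.
    """
    labels, data, indices, indptr = [], [], [], [0]
    for line in lines:
        # Drop a trailing comment, then surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue  # blank or comment-only line
        parts = line.split()
        labels.append(float(parts[0]))
        for token in parts[1:]:
            if token.startswith("qid:"):
                continue  # query ids are ignored in this sketch
            idx, value = token.split(":", 1)
            indices.append(int(idx))
            data.append(float(value))
        # Mark where this row ends in the flat data/indices lists.
        indptr.append(len(indices))
    return labels, data, indices, indptr


# Any iterable of strings works: a list, a generator, an open file.
example = ["1 1:0.5 3:2.0", "# only a comment", "-1 2:1.5 # trailing"]
labels, data, indices, indptr = parse_svmlight_lines(example)
```

Because the function only iterates over its argument, it could in principle
be applied to each partition of lines on the Spark side; that wiring is left
out here.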

> Olivier, might it make sense to put on the Wiki page for the event a few
> ideas of what to look at / tackle? Not sure if this is usually done or
> helpful etc for these events.

Yes, sure, please feel free to go ahead.

-- 
Olivier

_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general