I was wondering which would work better for distributing cross-validation
jobs: IPython parallel or Spark? I tried IPython parallel in the past, but
I remember having some issues with jobs crashing.

Michal

On 29/11/13 12:40, scikit-learn-general-requ...@lists.sourceforge.net wrote:
>> > 2013/11/27 Nick Pentreath <nick.pentre...@gmail.com>:
>> > CC'ing Spark Dev list
>> >
>> > I have been thinking about this for quite a while and would really love to
>> > see this happen.
>> >
>> > Most of my pipeline ends up in Scala/Spark these days - which I love, but
>> > it is partly because I am reliant on custom Hadoop input formats that are
>> > just way easier to use from Scala/Java - but I still use Python a lot for
>> > data analysis and interactive work. There is some good stuff happening
>> > with Breeze in Scala and MLlib in Spark (and IScala) but the breadth just
>> > doesn't compare as yet - not to mention IPython and plotting!
>> >
>> > There is a PR that was just merged into PySpark to allow arbitrary
>> > serialization protocols between the Java and Python layers. I hope to try
>> > to use this to allow PySpark users to pull data from arbitrary Hadoop
>> > InputFormats with minimum fuss. This I believe will open the way for many
>> > (including me!) to use PySpark directly for virtually all distributed data
>> > processing without "needing" to use Java
>> > (https://github.com/apache/incubator-spark/pull/146)
>> > (http://mail-archives.apache.org/mod_mbox/incubator-spark-dev/201311.mbox/browser).


_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
