How does Julia interact with Spark? I would be interested, mainly because I 
find Scala syntax a little obscure, and it would be great to see actual 
numbers comparing Scala, Python, and Julia workloads. 

> On Feb 1, 2014, at 16:08, Aureliano Buendia <[email protected]> wrote:
> 
> A much (much) better solution than Python (and also Scala, if that doesn't 
> make you upset) is Julia.
> 
> Libraries like numpy and scipy are bloated when compared with Julia's C-like 
> performance. Julia comes with everything that numpy+scipy come with, plus 
> more, minus the performance hit.
> 
> I hope we can see official support for Julia on Spark very soon.
> 
> 
>> On Thu, Jan 30, 2014 at 4:30 PM, nileshc <[email protected]> wrote:
>> Hi there,
>> 
>> *Background:*
>> I need to do some matrix multiplication stuff inside the mappers, and am trying
>> to choose between Python and Scala for writing the Spark MR jobs. I'm
>> equally fluent with Python and Java, and find Scala pretty easy too for what
>> it's worth. Going with Python would let me use numpy + scipy, which is
>> blazing fast when compared to Java libraries like Colt etc. Configuring Java
>> with BLAS seems to be a pain when compared to scipy (direct apt-get
>> installs, or pip).
>> 
>> *Question:*
>> I posted a couple of comments on this answer at StackOverflow:
>> http://stackoverflow.com/questions/17236936/api-compatibility-between-scala-and-python.
>> Basically it states that as of Spark 0.7.2, the Python API would be slower
>> than Scala. What's the performance scenario now? The fork issue seems to be
>> fixed. How about serialization? Can it match Java/Scala Writable-like
>> serialization (having knowledge of object type beforehand, reducing I/O)
>> performance? Also, a probably silly question - loops seem to be slow in
>> Python in general; do you think this can turn out to be an issue?
>> 
>> Bottom line, should I choose Python for computation-intensive algorithms like
>> PageRank? Scipy gives me an edge, but does the framework kill it?
>> 
>> Any help, insights, benchmarks will be much appreciated. :)
>> 
>> Cheers,
>> Nilesh
>> 
>> 
>> 
>> --
>> View this message in context: 
>> http://apache-spark-user-list.1001560.n3.nabble.com/Python-API-Performance-tp1048.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
