Hi Jeff, > they relied on a 3rd party to containerize a Pythonprogram for transmission
That is due to the pecularities of Python's serialization module than anything intrinsic to creating a Spark binding. (E.g. Python's pickle format doesn't have support for serializing code and closures, so some extra code was required.) This isn't an issue in Julia since Base.serialize() already has the needed functionality. An initial implementation of a Spark binding done in the same style as PySpark is available at http://github.com/jey/Spock.jl -Jey On Tue, May 19, 2015 at 11:51 AM, Jeff Waller <[email protected]> wrote: > Is this the part where I say Julia-Spark again? > > I think this is pretty doable in time. It will likely be more or less a > port of PySpark since Julia > and Python are similar in capability. I think I counted about 6K lines > (including comments). > > According to the pyspark presentation, they relied on a 3rd party to > containerize a Python > program for transmission -- I think I'm remembering this right. That might > be a problem to > overcome.
