Spark.jl provides Julia bindings for Apache Spark, by far the most popular computational framework in the Hadoop ecosystem. Find it at:
https://github.com/dfdx/Spark.jl

There's still *a lot* of work to do (the Spark API is *huge*), but Spark.jl already supports:

- map and reduce functions, as well as map_partitions, map_partitions_with_index, count, collect and others;
- text files on local disk and HDFS (and, theoretically, any Hadoop-compatible file system);
- local, Standalone and Mesos masters (YARN is quite different, though I'm working hard to add it as well);
- adding custom JARs and other files.

See the Roadmap <https://github.com/dfdx/Spark.jl/issues/1> for detailed status and nearest plans. Since Spark's API is so wide and it's hard to prioritize, *I heavily encourage users to submit bugs and feature requests*. Feel free to open new issues or add a +1 to push a feature. And as usual, bug reports and pull requests are welcome too.

*Question to the community:* should this package be transferred to some Julia organization (e.g. JuliaParallel) to make it easier to discover?
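To give a feel for the supported primitives, a tiny line-length aggregation over a local text file might look like the sketch below. This is illustrative only: it assumes the `SparkContext`, `text_file`, `map` and `reduce` names as exposed by Spark.jl, a running JVM, and a reachable master; the exact API may shift as the package evolves.

```julia
using Spark
Spark.init()                              # start the JVM bridge

# connect to a local master; Standalone/Mesos URLs work here too
sc = SparkContext(master="local")

# build an RDD from a text file (HDFS paths are also supported)
rdd = text_file(sc, "file:///var/log/syslog")

# per-line transformation, then a cluster-side aggregation
lens  = map(rdd, line -> length(line))
total = reduce(lens, +)                   # total characters across lines

close(sc)
```

The same pattern extends to `map_partitions` and friends when a function should see a whole partition (an iterator of lines) rather than one record at a time.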
