Say I want to build a complete Spark distribution against Hadoop 2.6+ as fast as possible from scratch.
This is what I’m doing at the moment:

    ./make-distribution.sh -T 1C -Phadoop-2.6

-T 1C instructs Maven to spin up 1 thread per available core. This takes around 20 minutes on an m3.large instance.

I see that spark-ec2, on the other hand, builds Spark as follows <https://github.com/amplab/spark-ec2/blob/a990752575cd8b0ab25731d7820a55c714798ec3/spark/init.sh#L21-L22> when you deploy Spark at a specific git commit:

    sbt/sbt clean assembly
    sbt/sbt publish-local

This seems slower than using make-distribution.sh, actually.

Is there a faster way to do this?

Nick