Hmm, it seems you have to use the Hadoop API to create the Parquet file in order to get it to parallelise locally, which is quite weird considering the blocks are visible to Spark. Anyhow, some of my derived files are now created through this API and all is hunky dory.
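For reference, the write looks roughly like the sketch below: an RDD of Avro GenericRecords saved through Spark's saveAsNewAPIHadoopFile with parquet-avro's AvroParquetOutputFormat. The schema, object name, output path and partition count are just made up for illustration, so adjust them to your own setup.

import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import parquet.avro.AvroParquetOutputFormat

object WriteParquetViaSpark {
  def main(args: Array[String]) {
    val sc = new SparkContext("local[4]", "write-parquet-via-spark")

    // Toy Avro schema -- substitute your real record schema.
    val schemaJson =
      """{"type":"record","name":"Event","fields":[{"name":"id","type":"long"}]}"""
    val job = new Job(sc.hadoopConfiguration)
    AvroParquetOutputFormat.setSchema(job, new Schema.Parser().parse(schemaJson))

    // Build the records on the workers. Avro's Schema isn't serialisable,
    // so re-parse it per partition instead of capturing it in the closure.
    val records = sc.parallelize(1L to 100000L, 8).mapPartitions { ids =>
      val schema = new Schema.Parser().parse(schemaJson)
      ids.map { id =>
        val record: GenericRecord = new GenericData.Record(schema)
        record.put("id", id)
        (null, record) // the key is ignored by the Parquet output format
      }
    }

    // Writing through the Hadoop output format produces one Parquet
    // part-file per partition, which Spark then reads in parallel.
    records.saveAsNewAPIHadoopFile(
      "data/events.parquet",
      classOf[Void],
      classOf[GenericRecord],
      classOf[AvroParquetOutputFormat],
      job.getConfiguration)

    sc.stop()
  }
}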
Lesson of the day: generate your files using a Spark context :D
