Re: Spark Streaming recover from Checkpoint with Spark SQL

2015-03-12 Thread Marius Soutier
Thanks, the new guide did help - instantiating the SQLContext inside foreachRDD did the trick for me, but the SQLContext singleton works as well. Now the only problem left is that spark.driver.port is not retained after starting from a checkpoint, so my Actor receivers are running on a random
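
For readers following the thread, here is a minimal sketch of the lazily instantiated SQLContext singleton pattern described in the streaming programming guide, combined with StreamingContext.getOrCreate for checkpoint recovery. It assumes Spark 1.3-era APIs. The checkpoint directory, the socket source, the Record case class, and the object names are hypothetical placeholders and not taken from the original job; the table named "table" is assumed to be registered already.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Lazily instantiated singleton: the SQLContext is created on first use in
// each driver run instead of being captured in the checkpointed closure.
object SQLContextSingleton {
  @transient private var instance: SQLContext = _

  def getInstance(sc: SparkContext): SQLContext = synchronized {
    if (instance == null) instance = new SQLContext(sc)
    instance
  }
}

// Hypothetical record type, for illustration only.
case class Record(word: String)

object CheckpointedSqlJob {
  // Hypothetical checkpoint location.
  val checkpointDir = "/tmp/streaming-checkpoint"

  // All DStream setup lives in this function so that getOrCreate can either
  // call it (fresh start) or skip it (recovery from the checkpoint).
  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("CheckpointedSqlJob")
    val ssc = new StreamingContext(conf, Seconds(10))
    ssc.checkpoint(checkpointDir)

    // Hypothetical source; the original job used Actor receivers.
    val lines = ssc.socketTextStream("localhost", 9999)

    lines.foreachRDD { rdd =>
      // Obtain the SQLContext inside foreachRDD, from the RDD's own SparkContext.
      val sqlContext = SQLContextSingleton.getInstance(rdd.sparkContext)
      import sqlContext.implicits._

      // Assumes a table named "table" has already been registered.
      rdd.map(Record(_)).toDF().insertInto("table", overwrite = true)
    }
    ssc
  }

  def main(args: Array[String]): Unit = {
    // Rebuilds the context from the checkpoint if one exists, else creates it.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```

The point of the pattern is that nothing in the foreachRDD closure holds a reference to an SQLContext built outside it, so the closure can be restored from the checkpoint and still obtain a working context on the first batch.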

Spark Streaming recover from Checkpoint with Spark SQL

2015-03-11 Thread Marius Soutier
Hi, I’ve written a Spark Streaming Job that inserts into a Parquet table, using stream.foreachRDD(_.insertInto("table", overwrite = true)). Now I’ve added checkpointing; everything works fine when starting from scratch. When starting from a checkpoint, however, the job doesn’t work and produces the

Re: Spark Streaming recover from Checkpoint with Spark SQL

2015-03-11 Thread Marius Soutier
Forgot to mention, it works when using .foreachRDD(_.saveAsTextFile("")). On 11.03.2015, at 18:35, Marius Soutier mps@gmail.com wrote: Hi, I’ve written a Spark Streaming Job that inserts into a Parquet table, using stream.foreachRDD(_.insertInto("table", overwrite = true)). Now I’ve added

Re: Spark Streaming recover from Checkpoint with Spark SQL

2015-03-11 Thread Tathagata Das
Can you show us the code that you are using? This might help: a quick preview of the updated streaming programming guide for 1.3, which will be up soon. http://people.apache.org/~tdas/spark-1.3.0-temp-docs/streaming-programming-guide.html#dataframe-and-sql-operations TD On Wed, Mar 11,