Re: Automatic driver restart does not seem to be working in Spark Standalone

2015-11-25 Thread Kay-Uwe Moosheimer
Tested with Spark 1.5.2 – works perfectly when the exit code is non-zero,
and does not restart when the exit code equals zero.


From:  Prem Sure 
Date:  Wednesday, 25 November 2015 19:57
To:  SRK 
Cc:  
Subject:  Re: Automatic driver restart does not seem to be working in Spark
Standalone

I think automatic driver restart will happen if the driver fails with a
non-zero exit code, and you submit with:
  --deploy-mode cluster
  --supervise
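For reference, a full submit command with those flags might look like the sketch below; the master URL, application class, and jar path are placeholders, not values from this thread.

```shell
# Hypothetical example: submit in cluster mode with supervision, so the
# standalone master restarts the driver if it exits with a non-zero code.
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --supervise \
  --class com.example.MyApp \
  /path/to/my-app.jar
```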


On Wed, Nov 25, 2015 at 1:46 PM, SRK  wrote:
> Hi,
> 
> I am submitting my Spark job with the supervise option as shown below. When I
> kill the driver and the app from the UI, the driver does not restart
> automatically. This is in cluster mode. Any suggestion on how to make
> automatic driver restart work would be of great help.
> 
> --supervise
> 
> 
> Thanks,
> Swetha
> 
> 
> 
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Automatic-driver-restart-does-not-seem-to-be-working-in-Spark-Standalone-tp25478.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
> 





How to use data from Database and reload every hour

2015-11-05 Thread Kay-Uwe Moosheimer
I have the following problem.
We have MySQL and a Spark cluster.
We need to load five different sets of validation instructions (several
thousand entries each) and use this information on the executors to decide
whether content from Kafka streaming belongs to process A or process B.
The streaming data from Kafka are JSON messages, and the validation info from
MySQL says "if field a is this and field b is that, then process A", and so
on.
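The field-matching rule described above could be sketched roughly as follows; the class and method names are illustrative, and the rule map stands in for one row of the MySQL validation table.

```java
import java.util.Map;

// Hypothetical sketch: route a parsed JSON message to process A or B.
// `rulesForA` maps a field name to the value it must have for process A.
public class Router {
    public static String route(Map<String, String> msg,
                               Map<String, String> rulesForA) {
        // Process A only if every rule field matches the message's value.
        boolean matchesA = rulesForA.entrySet().stream()
            .allMatch(e -> e.getValue().equals(msg.get(e.getKey())));
        return matchesA ? "processA" : "processB";
    }
}
```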

The tables on MySQL are changing over time and we have to reload the data
every hour.
I tried to use broadcasting, where I load the data and store it in HashSets
and HashMaps (Java code), but it's not possible to redistribute the data once
it has been broadcast.

What would be the best way to resolve my problem?
Use native JDBC in the executor task and load the data – can the executor
store this data in HashSets etc. for the next call, so that I only load the
data every hour?
Or are there other possibilities?
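The executor-side option asked about above could be sketched as a per-JVM singleton cache with a time-to-live; this is only an illustration, not a confirmed solution from the thread. The class name and TTL are assumptions, and the Supplier stands in for the real JDBC query against MySQL.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: each executor JVM keeps one copy of the validation
// rules and reloads them at most once per hour, so tasks between reloads
// reuse the cached map instead of querying MySQL again.
public class ValidationCache {
    private static Map<String, String> rules = null;
    private static long loadedAt = 0L;
    private static final long TTL_MS = 60L * 60L * 1000L; // one hour

    public static synchronized Map<String, String> get(
            Supplier<Map<String, String>> loader) {
        long now = System.currentTimeMillis();
        if (rules == null || now - loadedAt >= TTL_MS) {
            rules = loader.get(); // in real use: run the JDBC query here
            loadedAt = now;
        }
        return rules;
    }
}
```

A task would call `ValidationCache.get(...)` at the start of each partition; only the first call per hour on each executor hits the database.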