Re: Amazon Elastic Cache + Spark Streaming

2017-09-22 Thread ayan guha
AWS ElastiCache supports Memcached and Redis. Spark has a Redis connector which I believe you can use to connect to ElastiCache.

On Sat, Sep 23, 2017 at 5:08 AM, Saravanan Nagarajan wrote:
> Hello,
> Anybody tried amazon elastic
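A minimal sketch of what that might look like, assuming the spark-redis connector is on the classpath and using a hypothetical ElastiCache Redis endpoint (the host name and key names below are made up for illustration):

```scala
import org.apache.spark.sql.SparkSession

// Point spark-redis at the (hypothetical) ElastiCache Redis endpoint.
val spark = SparkSession.builder()
  .appName("elasticache-example")
  .config("spark.redis.host", "my-cluster.abc123.use1.cache.amazonaws.com")
  .config("spark.redis.port", "6379")
  .getOrCreate()

import spark.implicits._

// Write a small DataFrame to Redis hashes under the "users" key prefix.
val users = Seq(("u1", "alice"), ("u2", "bob")).toDF("id", "name")
users.write
  .format("org.apache.spark.sql.redis")
  .option("table", "users")
  .option("key.column", "id")
  .save()
```

Note the DataFrame API shown here is from later spark-redis releases; the connector available in 2017 was largely RDD-based, so treat this as a shape, not a drop-in.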

Re: What are factors need to Be considered when upgrading to Spark 2.1.0 from Spark 1.6.0

2017-09-22 Thread Gokula Krishnan D
Thanks for the reply. I forgot to mention that our batch ETL jobs are in core Spark.

> On Sep 22, 2017, at 3:13 PM, Vadim Semenov wrote:
> 1. 40s is pretty negligible unless you run your job very frequently, there can be many factors that influence that.

Re: What are factors need to Be considered when upgrading to Spark 2.1.0 from Spark 1.6.0

2017-09-22 Thread Vadim Semenov
1. 40s is pretty negligible unless you run your job very frequently; there can be many factors that influence that.
2. Try to compare the CPU time instead of the wall-clock time.
3. Check the stages that got slower and compare the DAGs.
4. Test with dynamic allocation disabled.

On Fri, Sep 22,
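For point 4, a sketch of how one might pin executors for an apples-to-apples comparison; the executor counts, sizes, and jar name here are hypothetical placeholders, not a recommendation:

```
spark-submit \
  --conf spark.dynamicAllocation.enabled=false \
  --num-executors 20 \
  --executor-cores 4 \
  --executor-memory 8g \
  your-etl-job.jar
```

With dynamic allocation off, both the 1.6.0 and 2.1.0 runs get the same fixed resources, which removes executor ramp-up time as a variable in the timing comparison.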

Amazon Elastic Cache + Spark Streaming

2017-09-22 Thread Saravanan Nagarajan
Hello, Anybody tried amazon elastic cache.Just give me some pointers. Thanks!

What are factors need to Be considered when upgrading to Spark 2.1.0 from Spark 1.6.0

2017-09-22 Thread Gokula Krishnan D
Hello All, Currently our batch ETL jobs are on Spark 1.6.0 and we are planning to upgrade to Spark 2.1.0. With minor code changes (like configuration and the SparkSession API) we were able to execute the existing jobs on Spark 2.1.0. But we noticed that job completion timings are much better in Spark 1.6.0 but no
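The "minor code changes" mentioned above typically amount to replacing the 1.6-style contexts with a `SparkSession`; a minimal sketch (the app name is a placeholder):

```scala
// Spark 1.6 style:
//   val sc = new SparkContext(conf)
//   val sqlContext = new SQLContext(sc)

// Spark 2.x: SparkSession unifies SQLContext and HiveContext.
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("batch-etl")
  .getOrCreate()

// The underlying SparkContext is still available when needed:
val sc = spark.sparkContext
```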

RE: plotting/resampling timeseries data

2017-09-22 Thread Brian Wylie
@vermanuraq Great thanks, just what I needed. I knew I was missing something simple. Cheers, -brian

Re: graphframes on cluster

2017-09-22 Thread Imran Rajjad
Sorry for posting without complete information. I am connecting to the Spark cluster with the driver program as the backend of a web application. This is intended to listen to job progress and do some other work. Below is how I am connecting to the cluster:

sparkConf = new SparkConf().setAppName("isolated
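For the "listen to job progress" part, one common approach (a sketch, assuming `sc` is the driver's `SparkContext`) is to register a `SparkListener`:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd, SparkListenerJobStart}

// Register a listener on the driver's SparkContext to observe job progress.
sc.addSparkListener(new SparkListener {
  override def onJobStart(jobStart: SparkListenerJobStart): Unit =
    println(s"Job ${jobStart.jobId} started with ${jobStart.stageInfos.size} stages")

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit =
    println(s"Job ${jobEnd.jobId} finished: ${jobEnd.jobResult}")
})
```

In a web-application backend, the `println` calls would presumably be replaced by pushing updates to whatever channel the frontend polls or subscribes to.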

Re: Checkpoints not cleaned using Spark streaming + watermarking + kafka

2017-09-22 Thread MathieuP
The expected setting to clean these files is spark.sql.streaming.minBatchesToRetain. More info on structured streaming settings: https://github.com/jaceklaskowski/spark-structured-streaming-book/blob/master/spark-sql-streaming-properties.adoc
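A hypothetical spark-defaults.conf fragment showing how this might be set (the value 100 is Spark's documented default, shown here only as an example):

```
# Retain metadata for only the most recent N batches, so older
# checkpoint/metadata files become eligible for cleanup.
spark.sql.streaming.minBatchesToRetain  100
```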