Re: Questions about caching

2019-01-01 Thread Gourav Sengupta
Hi Andrew, If you use the Spark UI, then all your questions are already answered there. Let me know if you need any help browsing the UI to look at the contents that are cached. Regards, Gourav On Tue, 11 Dec 2018, 17:13 Andrew Melo wrote: > Greetings, Spark Aficionados- > I'm working on a project to

Python

2019-01-01 Thread Gourav Sengupta
Hi, Can I please confirm which version of Python 3.x is supported by Spark 2.4? Regards, Gourav

Re: Back pressure not working on streaming

2019-01-01 Thread JF Chen
Yes, 10 is a very low value for testing the initial rate. And from this article https://www.linkedin.com/pulse/enable-back-pressure-make-your-spark-streaming-production-lan-jiang/, it seems Spark back pressure is not available for the direct DStream? So, max rate per partition is the only available back pressure

Re: Back pressure not working on streaming

2019-01-01 Thread Dillon Bostwick
Unsubscribe On Tue, Jan 1, 2019 at 10:03 PM JF Chen wrote: > I have set spark.streaming.backpressure.enabled to true, > spark.streaming.backpressure.initialRate to 10. > Once my application started, it received 32 million messages on the first batch. > My application runs every 300 seconds,

Re: Back pressure not working on streaming

2019-01-01 Thread HARSH TAKKAR
There is a separate property for max rate; by default it is not set, so if you want to limit the max rate you should give that property a value. Initial rate = 10 means it will pick only 10 records per receiver in the batch interval when you start the process. Depending upon the consumption
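The properties being discussed in this thread might be set like this in spark-defaults.conf (a sketch only; the maxRatePerPartition value of 1000 is an arbitrary example, and spark.streaming.kafka.maxRatePerPartition applies to the direct Kafka DStream API):

```properties
# Enable backpressure so Spark adjusts the ingestion rate dynamically
spark.streaming.backpressure.enabled        true
# Rate used only for the first batch, before the rate estimator has feedback
spark.streaming.backpressure.initialRate    10
# Hard per-partition, per-second ceiling for the direct Kafka stream (example value)
spark.streaming.kafka.maxRatePerPartition   1000
```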

Back pressure not working on streaming

2019-01-01 Thread JF Chen
I have set spark.streaming.backpressure.enabled to true and spark.streaming.backpressure.initialRate to 10. Once my application started, it received 32 million messages on the first batch. My application runs every 300 seconds, with 32 Kafka partitions. So what is the max rate if I set the initial rate to
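The arithmetic behind the question above can be sketched as follows. This is a rough model, assuming spark.streaming.kafka.maxRatePerPartition caps the records consumed per Kafka partition per second for a direct stream; the partition count (32) and batch interval (300 s) come from the thread, and the rate of 100 is a hypothetical example value:

```python
# Rough model of the per-batch record cap for a direct Kafka stream.
# Assumption: spark.streaming.kafka.maxRatePerPartition limits records
# consumed per Kafka partition per second.

def max_records_per_batch(max_rate_per_partition: int,
                          num_partitions: int,
                          batch_interval_secs: int) -> int:
    """Upper bound on records a single micro-batch can ingest."""
    return max_rate_per_partition * num_partitions * batch_interval_secs

# Values from the thread: 32 Kafka partitions, 300-second batches.
# With a hypothetical maxRatePerPartition of 100 records/sec:
cap = max_records_per_batch(100, 32, 300)
print(cap)  # 960000 -- far below the 32 million seen when no cap is set
```

With no max rate configured there is no ceiling at all, which is consistent with the 32-million-record first batch described above.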