One solution would be for my actor to read the scheduling delay and go to sleep when it grows too large. Is this possible?
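For reference, a minimal sketch of that idea, assuming the Spark 1.6-era `StreamingListener` API (the `Throttle` object, flag name, and threshold are made up for illustration, not part of Spark): a listener watches each completed batch's scheduling delay and flips a shared flag that the actor polls before producing more data.

```scala
import java.util.concurrent.atomic.AtomicBoolean

import org.apache.spark.streaming.scheduler.{StreamingListener, StreamingListenerBatchCompleted}

// Shared flag the actor polls before calling store(); name is illustrative.
object Throttle {
  val paused = new AtomicBoolean(false)
}

// Flip the flag whenever the batch scheduling delay crosses a threshold.
class DelayListener(maxDelayMs: Long) extends StreamingListener {
  override def onBatchCompleted(batch: StreamingListenerBatchCompleted): Unit = {
    val delay = batch.batchInfo.schedulingDelay.getOrElse(0L)
    Throttle.paused.set(delay > maxDelayMs)
  }
}

// Registration (ssc is the StreamingContext):
//   ssc.addStreamingListener(new DelayListener(maxDelayMs = 5000))
//
// Inside the actor, before producing each record:
//   while (Throttle.paused.get()) Thread.sleep(100)
//   store(record)
```

This keeps the producer loop simple at the cost of polling; whether the 100 ms sleep and 5-second delay threshold are appropriate depends on the batch interval.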

From: Lin Zhao <l...@exabeam.com>
Date: Wednesday, January 27, 2016 at 5:28 PM
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: Spark streaming flow control and back pressure

I have an actor receiver that reads data and calls "store()" to save it to Spark. I was hoping spark.streaming.receiver.maxRate and spark.streaming.backpressure.enabled would make the method block when needed to avoid overflowing the pipeline, but they don't. My actor pumps millions of lines to Spark even when backpressure and the rate limit are in effect. While that data flows slowly into the input blocks, the data already created sits around and causes memory problems.

Is there a guideline for handling this? What's the best way for my actor to know it should slow down so it doesn't keep creating millions of messages? Blocking the store() call seems acceptable.

Thanks, Lin
