I have an actor receiver that reads data and calls store() to save it to Spark. I was hoping that spark.streaming.receiver.maxRate and spark.streaming.backpressure.enabled would block that call when needed to avoid overflowing the pipeline, but they don't: my actor keeps pumping millions of lines into Spark even while backpressure and the rate limit are in effect. While that data trickles slowly into the input blocks, the data already produced sits around and causes memory problems.
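To give a rough idea of the setup, here is a simplified sketch (LineReceiver and readNextLine() are placeholder names, not the actual code):

```scala
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// Simplified sketch of the receiver; readNextLine() stands in for
// wherever the data actually comes from.
class LineReceiver extends Receiver[String](StorageLevel.MEMORY_AND_DISK_2) {

  override def onStart(): Unit = {
    new Thread("line-receiver") {
      override def run(): Unit = {
        while (!isStopped()) {
          val line = readNextLine() // produces data much faster than Spark ingests it
          store(line)               // in my runs this does not block, even with maxRate / backpressure set
        }
      }
    }.start()
  }

  override def onStop(): Unit = { /* nothing to clean up in this sketch */ }

  private def readNextLine(): String = ??? // placeholder for the real data source
}
```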
Is there a guideline for handling this? What's the best way for my actor to know it should slow down so it doesn't keep creating millions of messages? Having the store() call block would seem acceptable. Thanks, Lin