So as you were maybe thinking, it only happens with the combination:

Direct Stream only + backpressure = works as expected

4x Receiver on Topic A + Direct Stream on Topic B + backpressure = the
direct stream is throttled even in the absence of scheduling delay

This is using Spark 1.5.0 on CDH. After it's been running for several
minutes if I look at "Input Metadata" I can see that the direct stream is
consuming 1 record / partition / sec. I have maxrate set at 10,000
records / partition / sec.

I'll file a bug today unless someone has any ideas?

Thanks!

Jeff

On Fri, Sep 9, 2016 at 5:54 PM, Jeff Nadler <jnad...@srcginc.com> wrote:

> Yes I'll test that next.
>
> On Sep 9, 2016 5:36 PM, "Cody Koeninger" <c...@koeninger.org> wrote:
>
>> Does the same thing happen if you're only using direct stream plus back
>> pressure, not the receiver stream?
>>
>> On Sep 9, 2016 6:41 PM, "Jeff Nadler" <jnad...@srcginc.com> wrote:
>>
>>> Maybe this is a pretty esoteric implementation, but I'm seeing some bad
>>> behavior with backpressure plus multiple Kafka streams / direct streams.
>>>
>>> Here's the scenario:
>>> We have 1 Kafka topic using the reliable receiver (4 receivers, union
>>> the result). In the same app, we consume another Kafka topic using a
>>> direct stream.
>>>
>>> This may seem strange, but it's necessary in my application to work
>>> around another problem: Maxrate is set globally in SparkConf. IMO It
>>> would be more flexible if we could set maxrate for each stream
>>> independently. Since directstream uses a different config parameter for
>>> maxrate, we get the desired result.
>>>
>>> A bit hacky I know.
>>>
>>> Anyway, we recently turned on backpressure. It works as expected for
>>> the receiver-based stream. For the direct stream, it starts out at the
>>> maxrate (as expected) on the first batch. Then it ratchets down the
>>> consumption until it is eventually consuming 1 record / second / partition.
>>>
>>> This happens even though there's no scheduling delay, and the
>>> receiver-based stream does not appear to be throttled.
>>>
>>> Anyone ever see anything like this?
>>>
>>> Thanks!
>>>
>>> Jeff Nadler
>>> Aerohive Networks
>>>
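
For anyone trying to reproduce this, the configuration described in the thread could be sketched roughly as below. This is only an illustration, not the actual application config: the three property names are the standard Spark Streaming settings for backpressure, the receiver rate cap, and the direct stream's per-partition rate cap; the 10000 value comes from the message above; the receiver value and the `to_submit_args` helper are hypothetical.

```python
# Rough sketch of the settings described in the thread. Anything marked
# "hypothetical" is an assumption and does not come from the original message.
SETTINGS = {
    # Enables backpressure for every stream in the application.
    "spark.streaming.backpressure.enabled": "true",
    # Rate cap for receiver-based streams, in records/sec per receiver.
    # Hypothetical value: the thread does not say what was used.
    "spark.streaming.receiver.maxRate": "5000",
    # Rate cap for the Kafka direct stream, in records/sec per partition.
    # Value taken from the thread (10,000 records / partition / sec).
    "spark.streaming.kafka.maxRatePerPartition": "10000",
}


def to_submit_args(settings):
    """Render a settings dict as a list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(settings.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    # Print the arguments on one line so they can be pasted into a command.
    print(" ".join(to_submit_args(SETTINGS)))
```

The two rate caps are separate properties, which is what lets the receiver-based stream and the direct stream be limited independently in the setup described above.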