Hi Ken and Andrey, Thanks for the response. I think there is a confusion that the writes to Cassandra are happening within the Async function. In my test, the async function is just a pass-through without doing any work.
So any Cassandra related batching or buffering should not be the cause for this. Thanks, Seed On Tue, Mar 19, 2019 at 12:35 PM Ken Krugler <[email protected]> wrote: > Hi Seed, > > It’s a known issue that Flink doesn’t report back pressure properly for > AsyncFunctions, due to how it monitors the output collector to gather back > pressure statistics. > > But that wouldn’t explain how you get a faster processing with the > AsyncFunction inserted into your workflow. > > I haven’t looked at how the Cassandra sink handles batching, but if the > AsyncFunction somehow caused fewer, bigger Cassandra writes to happen then > that’s one (serious hand waving) explanation. > > — Ken > > On Mar 18, 2019, at 7:48 PM, Seed Zeng <[email protected]> wrote: > > Flink Version - 1.6.1 > > In our application, we consume from Kafka and sink to Cassandra in the > end. We are trying to introduce a custom async function in front of the > Sink to carry out some customized operations. In our testing, it appears > that the Async function is not generating backpressure to slow down our > Kafka Source when Cassandra becomes unhappy. Essentially compared to an > almost identical job where the only difference is the lack of the Async > function, Kafka source consumption speed is much higher under the same > settings and identical Cassandra cluster. The experiment is like this. > > Job 1 - without async function in front of Cassandra > Job 2 - with async function in front of Cassandra > > Job 1 is backpressured because Cassandra cannot handle all the writes and > eventually slows down the source rate to 6.5k/s. > Job 2 is slightly backpressured but was able to run at 14k/s. > > Is the AsyncFunction somehow not reporting the backpressure correctly? > > Thanks, > Seed > > > -------------------------- > Ken Krugler > +1 530-210-6378 > http://www.scaleunlimited.com > Custom big data solutions & training > Flink, Solr, Hadoop, Cascading & Cassandra > >
