Hi Ken and Andrey,

Thanks for the response. I think there is a confusion that the writes to
Cassandra are happening within the Async function.
In my test, the async function is just a pass-through without doing any
work.

So any Cassandra related batching or buffering should not be the cause for
this.

Thanks,

Seed

On Tue, Mar 19, 2019 at 12:35 PM Ken Krugler <[email protected]>
wrote:

> Hi Seed,
>
> It’s a known issue that Flink doesn’t report back pressure properly for
> AsyncFunctions, due to how it monitors the output collector to gather back
> pressure statistics.
>
> But that wouldn’t explain how you get a faster processing with the
> AsyncFunction inserted into your workflow.
>
> I haven’t looked at how the Cassandra sink handles batching, but if the
> AsyncFunction somehow caused fewer, bigger Cassandra writes to happen then
> that’s one (serious hand waving) explanation.
>
> — Ken
>
> On Mar 18, 2019, at 7:48 PM, Seed Zeng <[email protected]> wrote:
>
> Flink Version - 1.6.1
>
> In our application, we consume from Kafka and sink to Cassandra in the
> end. We are trying to introduce a custom async function in front of the
> Sink to carry out some customized operations. In our testing, it appears
> that the Async function is not generating backpressure to slow down our
> Kafka Source when Cassandra becomes unhappy. Essentially compared to an
> almost identical job where the only difference is the lack of the Async
> function, Kafka source consumption speed is much higher under the same
> settings and identical Cassandra cluster. The experiment is like this.
>
> Job 1 - without async function in front of Cassandra
> Job 2 - with async function in front of Cassandra
>
> Job 1 is backpressured because Cassandra cannot handle all the writes and
> eventually slows down the source rate to 6.5k/s.
> Job 2 is slightly backpressured but was able to run at 14k/s.
>
> Is the AsyncFunction somehow not reporting the backpressure correctly?
>
> Thanks,
> Seed
>
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> Custom big data solutions & training
> Flink, Solr, Hadoop, Cascading & Cassandra
>
>

Reply via email to