Hello Everyone,

I'm running a streaming application using Flink 1.11 and EMR 6.01. My use
case is reading files from a s3 bucket, filter file contents ( say record)
and enrich each record. Filter records and output to a sink.
I'm reading 6k files per 15mints and the total number of records is 3
billion/15mints. I'm using a flat map operator to convert the file into
records and enrich records in a synchronous call.

*Problem* : My application fails (Checkpoint timeout) to run if i add more
filter criteria(operator). I suspect the system is not able to scale (CPU
util as still 20%) because of the synchronous call. I want to convert this
flat map to an asynchronous call using AsyncFunction. I was looking for
something like an AsyncCollector.collect
<https://nightlies.apache.org/flink/flink-docs-release-1.3/api/java/org/apache/flink/streaming/api/functions/async/collector/AsyncCollector.html#collect-java.util.Collection->

https://nightlies.apache.org/flink/flink-docs-release-1.3/api/java/org/apache/flink/streaming/api/functions/async/AsyncFunction.html
to complement my current synchronous implementation using flatmap but it
seems like this is not available in Flink 1.11.

*Question* :
Could someone please help me with converting this flatmap operation to an
asynchronous call?

Please let me know if you have any questions.

Best,

Reply via email to