Hi, Sundar

1. I think you might need to jstack the java process that is a bottleneck
and find where the task is stuck.
2. Could you share the code that your job looked like? I think maybe it
could help people to know what exactly happened.

Best,
Guowei


sundar <[email protected]> 于2020年5月20日周三 上午5:24写道:

> Initially, we had a flink pipeline which did the following -
> Kafka source -> KeyBy ID -> Map -> Kafka Sink.
>
> We now want to enrich this pipeline with a side input from a file. Similar
> to the recommendation  here
> <
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Best-pattern-for-achieving-stream-enrichment-side-input-from-a-large-static-source-td25771.html#a25780>
>
> ,
> we modified our pipeline to have the following flow.
>
> Kafka ---------------------------------> KeyBy ID------ -----Co flat map
> -----> Sink
>
> |
>
> |
> Continuous File Monitoring Function ->KeyBy ID ------
> (ID, static data)
>
> The file is relatively small( a few GBs) but not small enough to fit in RAM
> of the driver node. So we cannot load it there.
>
> The issue I am facing is that we are having high back pressure in this new
> pipeline after adding the file source. The initial job(without side inputs)
> worked fine with good through put. However, after adding the second source
> and a co flat Map function, even 4X ing the amount of parallelism does not
> solve the problem and has high back pressure.
> 1. The back pressure is high between the Kafka source and the coFlatMap
> function.
> Is this expected?  Could some one help with the reasoning behind why this
> new pipeline is much more resource intensive than the original pipeline?
> 2. Are there any caveats to keep in mind when connecting a bounded stream
> with an unbounded stream?
> 3. Also, is this the recommended design pattern for enriching a stream with
> a static data set? Is there a better way to solve the problem?
>
> Also we are using Flink-1.6.3 so we do not have the ability to use any API
> advancements in the future versions of Flink.
>
> Thanks!
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>

Reply via email to