Extremely large job serialization produced by union operator

杨力 Fri, 09 Mar 2018 02:01:17 -0800

I wrote a flink-sql app with following topography.

KafkaJsonTableSource -> SQL -> toAppendStream -> Map -> JDBCAppendTableSink
KafkaJsonTableSource -> SQL -> toAppendStream -> Map -> JDBCAppendTableSink
...
KafkaJsonTableSource -> SQL -> toAppendStream -> Map -> JDBCAppendTableSink


I have a dozen of TableSources And tens of SQLs. As a result, the number of
JDBCAppendTableSink times parallelism, that is the number of concurrent
connections to database, is too large for the database server to handle. So
I tried union DataStreams before connecting them to the TableSink.

KafkaJsonTableSource -> SQL -> toAppendStream -> Map
\
KafkaJsonTableSource -> SQL -> toAppendStream -> Map --- union ->
JDBCAppendTableSink
... /
KafkaJsonTableSource -> SQL -> toAppendStream -> Map

With this strategy, job submission failed with an OversizedPayloadException
of 104 MB. Increasing akka.framesize helps to avoid this exception, but job
submission hangs and times out.

I can't understand why a simple union operator would serialize to such a
large message. Can I avoid this problem?
Or can I change some configuration to fix the submission time out?

Regards,
Bill

Extremely large job serialization produced by union operator

Reply via email to