[ https://issues.apache.org/jira/browse/FLINK-34694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Roman Boyko updated FLINK-34694: -------------------------------- Affects Version/s: (was: 1.16.3) (was: 1.17.2) (was: 1.19.0) (was: 1.18.1) > Delete num of associations for streaming outer join > --------------------------------------------------- > > Key: FLINK-34694 > URL: https://issues.apache.org/jira/browse/FLINK-34694 > Project: Flink > Issue Type: Improvement > Components: Table SQL / Runtime > Reporter: Roman Boyko > Priority: Major > Attachments: image-2024-03-15-19-51-29-282.png, > image-2024-03-15-19-52-24-391.png > > > Currently in StreamingJoinOperator (non-window) in case of OUTER JOIN the > OuterJoinRecordStateView is used to store additional field - the number of > associations for every record. This leads to store additional Tuple2 and > Integer data for every record in outer state. > This functionality is used only for sending: > * -D[nullPaddingRecord] in case of first Accumulate record > * +I[nullPaddingRecord] in case of last Revoke record > The overhead of storing additional data and updating the counter for > associations can be avoided by checking the input state for these events. > > The proposed solution can be found here - > [https://github.com/rovboyko/flink/commit/1ca2f5bdfc2d44b99d180abb6a4dda123e49d423] > > According to the nexmark q20 test (changed to OUTER JOIN) it could increase > the performance up to 20%: > * Before: > !image-2024-03-15-19-52-24-391.png! > * After: > !image-2024-03-15-19-51-29-282.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)