I guess I could implement a solution which is not static and extends
the OneInputStreamOperator Flink operator.
https://github.com/felipegutierrez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/MqttSensorDataSkewedCombinerByKeySkewedDAG.java#L84
Best,
Felipe
*--*
*-- Feli
thanks All for your suggestions!
I am not sure if the option 3 that Fabian said I will need to change the
Flink source code or it can be implemented on top of Flink.
-
3) One approach to improve the processing of skewed data, is to change how
keyed state is handled.
Flink's
Just a small addition:
If two hot keys fall into two key groups which are being processed by the
same TM, then it could help to change the parallelism, because then the key
group mapping might be different.
If two hot keys fall into the same key group, you can adjust the max
parallelism which def
Hi Felipe,
three comments:
1) applying rebalance(), shuffle(), or rescale() before a keyBy() has no
effect:
keyBy() introduces a hash partitioning such that any data partitioning that
you do immediately before keyBy() is destroyed.
You only change the distribution for the call of the key extracto
Hi,
I am studying data skew processing in Flink and how I can change the
low-level control of physical partition (
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/#physical-partitioning)
in order to have an even processing of tuples. I have created synthetic
skewed data