Re: Data skew after keyBy in Flink

Caizhi Weng Tue, 26 Oct 2021 03:39:23 -0700

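Hi!

Flink SQL already has several built-in ways to mitigate skew, such as local-global aggregation; see [1] for details. If you must use the streaming API,
you can optimize along the same lines.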
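
To make the local-global idea concrete, here is a minimal framework-free sketch in plain Python. It is not Flink API code: `partition`, `local_aggregate`, `global_aggregate`, `shuffled_load`, the sample keys, and the parallelism values are all hypothetical names and numbers chosen for illustration, and the CRC32-based `partition` only stands in for keyBy's hash partitioning. Each upstream subtask pre-aggregates its own mini-batch, so a hot key crosses the shuffle once per subtask instead of once per record.

```python
from collections import Counter
import zlib


def partition(key, parallelism):
    # Stand-in for hash partitioning after keyBy (assumption: not Flink's real key-group logic).
    return zlib.crc32(key.encode("utf-8")) % parallelism


def local_aggregate(mini_batch):
    # Local phase: collapse one subtask's mini-batch into one (key, partial_count) per distinct key.
    return list(Counter(mini_batch).items())


def global_aggregate(partials):
    # Global phase: merge the partial counts that arrive after the shuffle.
    totals = Counter()
    for key, count in partials:
        totals[key] += count
    return dict(totals)


def shuffled_load(items, parallelism, key_of):
    # Count how many items each downstream partition would receive.
    load = [0] * parallelism
    for item in items:
        load[partition(key_of(item), parallelism)] += 1
    return load


# Hypothetical skewed input: one hot key plus a few cold keys (100 records in total).
records = ["hot"] * 90 + ["a", "b", "c", "d", "e"] * 2

# Four hypothetical upstream subtasks, each holding a slice of the input.
upstream = [records[i::4] for i in range(4)]

# Local phase runs before the shuffle, global phase after it.
partials = [p for batch in upstream for p in local_aggregate(batch)]
result = global_aggregate(partials)

naive_load = shuffled_load(records, 2, lambda r: r)          # shuffle every raw record
two_phase_load = shuffled_load(partials, 2, lambda p: p[0])  # shuffle only the partials

print("naive load per partition:    ", naive_load)
print("two-phase load per partition:", two_phase_load)
```

The final counts are the same as counting the raw records directly; only the volume sent to the hot partition changes. For exact deduplication by id, the same shape might apply with the local phase emitting the set of distinct ids seen in a batch instead of a count, though whether that helps depends on how much of the skew comes from repeated ids.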


[1]
https://ci.apache.org/projects/flink/flink-docs-master/zh/docs/dev/table/tuning/#local-global-%e8%81%9a%e5%90%88

xiazhl <yuankuo....@qq.com.invalid> wrote on Tue, Oct 26, 2021 at 2:31 PM:

> Hello everyone!
> I would like to ask for help with a data skew problem that appears after using keyBy.
>
>
> Background: we use the Flink streaming API to process and extract data, and write the results to physical storage.
> Processing amplifies the data volume by roughly 10x.
> Because the data contains a large number of duplicates, we use Flink state to deduplicate exactly by id. Before deduplication, the data is partitioned with keyBy on id.
>
>
> Problem: after keyBy the data is skewed, with a skew ratio of about 3:1 (high:low).
> Does anyone have a good approach for handling this?
