Re: flink key by 逻辑疑问

2022-05-29 文章 Shengkai Fang
Hi.

会根据 key 的 hash 值分配到固定个数的 keygroup 之中的。简单来说,跟HashMap>
有点相似。金竹老师有一篇文章详细解释了[1]。

如果想看实现的话,可以从 KeyGroupStreamPartitioner 入手来看看 Table 层是怎么做的。

Best,
Shengkai

[1] https://developer.aliyun.com/article/667562

Peihui He  于2022年5月29日周日 11:55写道:

> Hi, all
>
> 请教下大家,flink key by 后 使用process 来处理数据。现在有个问题:
> 当key不限量的情况下,比如uuid,这种情况下,下游都会创建一个process 对象来处理数据不?
> 如果这样的话,是不是没多久就会oom呢?
>
> 大家有熟悉这块相关flink 源码不?求指导,想自己观察下。
>
> Best Regards!
>


Re: flink key by 逻辑疑问

2022-05-29 文章 yidan zhao
下游的process算子数取决于并行度,parallelism。而不是你的数据key有多少。

Peihui He  于2022年5月29日周日 11:55写道:
>
> Hi, all
>
> 请教下大家,flink key by 后 使用process 来处理数据。现在有个问题:
> 当key不限量的情况下,比如uuid,这种情况下,下游都会创建一个process 对象来处理数据不?
> 如果这样的话,是不是没多久就会oom呢?
>
> 大家有熟悉这块相关flink 源码不?求指导,想自己观察下。
>
> Best Regards!


Re: GlobalCommitter in Flink's two-phase commit

2022-05-29 文章 Jing Ge
Hi,

1. What are the general usage scenarios of GlobalCommitter?
- GlobalCommitter is used for creating and committing an aggregated
committable. It is part of a 2-phase-commit protocol. One use case is the
compaction of small files.

2. Why should GlobalCommitter be removed in the new version of the api?
- As FLIP-191 described, there are many different requirement from
different downstream systems, e.g. Iceberg, Delta lake, Hive. One
GlobalCommitter could not cover all of them. If you take a look at the
SinkV1Adapter source code, you will see that
StandardSinkTopologies#addGlobalCommitter, which is recommended to replace
the usage of GlobalCommitter, is used to take care of the  post commit
topology.

Best regards,
Jing

On Tue, May 24, 2022 at 9:11 AM di wu <676366...@qq.com> wrote:

> Hello
> Regarding the GlobalCommitter in Flink's two-phase commit,
> I see it was introduced in FLIP-143, but it seems to have been removed
> again in FLP-191 and marked as Deprecated in the source code.
> I haven't found any relevant information about the use of GlobalCommitter.
>
> There are two questions I would like to ask:
> 1. What are the general usage scenarios of GlobalCommitter?
> 2. Why should GlobalCommitter be removed in the new version of the api?
>
> Thanks && Regards,
> di.wu
>
>