Hi Arvid,

The second picture shows the metrics of the upstream operator. As you can
see in the first picture, the upstream operator has a parallelism of 150. I
expect bytes sent to be about 9 * bytes received, since 9 downstream
operators are connected to it.
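
To make the expectation concrete, here is a rough sketch in plain Python
(not Flink code). The function name, the record count, and the record size
are made up for illustration, and it assumes bytes received is simply
records * record size with no serialization overhead. It only compares how
many copies of each record a rebalance versus a broadcast would produce.

```python
def bytes_sent(records, record_size, num_downstream_ops, parallelism, mode):
    """Total bytes emitted upstream for a given shuffle mode (illustrative only)."""
    if mode == "rebalance":
        # Round-robin: one copy per downstream operator, sent to a single subtask.
        copies_per_record = num_downstream_ops
    elif mode == "broadcast":
        # Every subtask of every downstream operator gets its own copy.
        copies_per_record = num_downstream_ops * parallelism
    else:
        raise ValueError(f"unknown mode: {mode}")
    return records * record_size * copies_per_record


# Hypothetical workload: 1000 records of 100 bytes each.
received = 1000 * 100
rebalance = bytes_sent(1000, 100, 9, 150, "rebalance")
broadcast = bytes_sent(1000, 100, 9, 150, "broadcast")

print(rebalance / received)  # 9.0, the ratio I expect
print(broadcast / received)  # 1350.0, close to the ratio the UI shows
```

Under these assumptions the ratio I see in the UI matches the broadcast
case rather than the rebalance case.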

Hi Caizhi,
Let me create a minimal reproducible DAG and post an update here.

On Thu, Dec 16, 2021 at 4:03 AM Arvid Heise <ar...@apache.org> wrote:

> Hi,
>
> Could you please clarify which operator we see in the second picture?
>
> If you are showing the upstream operator, then this has only parallelism
> 1, so there shouldn't be multiple subtasks.
> If you are showing the downstream operator, then the metric would refer to
> the HASH and not REBALANCE.
>
> On Tue, Dec 14, 2021 at 2:55 AM Caizhi Weng <tsreape...@gmail.com> wrote:
>
>> Hi!
>>
>> This doesn't seem to be the expected behavior. A rebalance shuffle should
>> send each record to only one of the parallel subtasks, not to all of them.
>>
>> If possible, could you please explain what your Flink job is doing and,
>> preferably, share your user code so that others can look into this case?
>>
>> On Sat, Dec 11, 2021 at 01:11, tao xiao <xiaotao...@gmail.com> wrote:
>>
>>> Hi team,
>>>
>>> I have one operator that is connected to 9 downstream operators using
>>> rebalance. Each operator has a parallelism of 150 [1]. I assume each
>>> message from the upstream operator is sent to one parallel instance of
>>> each of the 9 receiving operators, so the total bytes sent should be
>>> roughly 9 times the bytes received in the upstream operator's metrics.
>>> However, the Flink UI shows that bytes sent is much higher than that: it
>>> is about 150 * 9 * bytes received [2]. It looks to me as if every message
>>> is duplicated to each parallel instance of all receiving operators, as
>>> broadcast does. Is this correct?
>>>
>>>
>>>
>>> [1] https://imgur.com/cGyb0QO
>>> [2] https://imgur.com/SFqPiJA
>>> --
>>> Regards,
>>> Tao
>>>
>>

-- 
Regards,
Tao
