Re: [DISCUSS] Add ROUND_ROBIN DistributionMode to core API

Maximilian Michels Thu, 05 Mar 2026 06:12:54 -0800

Hi Han,

Let me try to contextualize this further. Today, we have three
DistributionModes [1]:


- NONE (forward as-is)
- HASH (hash partitioning by partition value)
- RANGE (range-partitioning for skewed partition values)

The idea is to add another DistributionMode, ROUND_ROBIN, which
distributes records evenly across all writers. As the name suggests,
this mode is suitable for spreading data evenly when there is no
partitioning [2].

The existing NONE distribution mode can create skewed writes with
small file sizes, e.g. in streaming writes. ROUND_ROBIN would address
that by distributing the data evenly across writers.
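
To make the intended semantics concrete, here is a minimal sketch of
round-robin assignment. It is not the Iceberg or Flink implementation:
the Mode enum is a hypothetical stand-in for
org.apache.iceberg.DistributionMode with the proposed value added, and
RoundRobinSelector and distribute are names made up for illustration.

```java
import java.util.Arrays;

class RoundRobinSketch {

  /** Hypothetical stand-in for the core enum with the proposed value added. */
  enum Mode {
    NONE,
    HASH,
    RANGE,
    ROUND_ROBIN
  }

  /** Hands out writer indices 0..numWriters-1 in a repeating cycle. */
  static final class RoundRobinSelector {
    private final int numWriters;
    private int next = 0;

    RoundRobinSelector(int numWriters) {
      if (numWriters <= 0) {
        throw new IllegalArgumentException("numWriters must be positive");
      }
      this.numWriters = numWriters;
    }

    int nextWriter() {
      int writer = next;
      next = (next + 1) % numWriters;
      return writer;
    }
  }

  /** Counts how many of numRecords records each writer would receive. */
  static int[] distribute(int numRecords, int numWriters) {
    RoundRobinSelector selector = new RoundRobinSelector(numWriters);
    int[] counts = new int[numWriters];
    for (int i = 0; i < numRecords; i++) {
      counts[selector.nextWriter()]++;
    }
    return counts;
  }

  public static void main(String[] args) {
    // 10 records over 4 writers: per-writer counts differ by at most one.
    System.out.println(Arrays.toString(distribute(10, 4)));
  }
}
```

With this assignment, per-writer record counts never differ by more
than one, regardless of how the records arrived upstream.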

Now, it is true that users can already instruct engines to apply
round-robin distribution before writing to Iceberg. However, adding
the DistributionMode would make this much more straightforward and
programmatically accessible across all engines. The amount of work to
implement the new mode would be reasonable.

Cheers,
Max

[1] 
https://github.com/apache/iceberg/blob/ed8a16bbeb549b0286d3c229beb5a0cf165f2f4b/api/src/main/java/org/apache/iceberg/DistributionMode.java#L39
[2] If there are partition values or identifier fields, engines switch
to the HASH or RANGE distribution mode, as is already the case for
NONE today. This avoids partition fanout.

On Wed, Mar 4, 2026 at 5:37 PM Han You via dev <[email protected]> wrote:
>
> Hi Iceberg community,
>
> I'd like to propose adding a new `ROUND_ROBIN` value to the core 
> `DistributionMode` enum and would appreciate the community's input.
>
> ### Background
>
> In Flink's `DynamicIcebergSink`, `DistributionMode.NONE` currently 
> distributes records across writer subtasks in a round-robin fashion. This is 
> inconsistent with the regular `IcebergSink`, where `NONE` means records are 
> forwarded directly to writers without any redistribution. The round-robin 
> behavior in `DynamicIcebergSink` forces records through a hash shuffle, which 
> incurs serdes and network cost, even when no meaningful distribution is 
> needed. In high-throughput pipelines, these overheads can be a significant 
> bottleneck.
>
> By giving the round-robin behavior its own explicit mode (`ROUND_ROBIN`), we 
> can make `NONE` consistently mean "no redistribution" across all Flink sinks. 
> In `DynamicIcebergSink`, this enables operator chaining via a forward edge, 
> eliminating the serdes and shuffle overhead entirely.
>
> ### Why this is a core API change
>
> The `DynamicRecord` API already uses the core `DistributionMode` enum to 
> specify distribution per-record. Adding a new mode to distinguish between 
> "truly no distribution" (`NONE`) and "distribute evenly across writers" 
> (`ROUND_ROBIN`) requires adding the new value to the core enum in 
> `org.apache.iceberg.DistributionMode`.
>
> ### Flink consensus
>
> This has been discussed on PR #15433 [1] with the committers (Maximilian 
> Michels and Peter Vary). There is agreement on the Flink side that:
> - `NONE` should mean true forward/no-redistribution
> - `ROUND_ROBIN` should take over the current round-robin behavior of `NONE` 
> in `DynamicIcebergSink`
>
> The discussion point for this thread is specifically about adding 
> `ROUND_ROBIN` to the core API enum and its impact on other engine 
> integrations.
>
> ### Handling in Spark
>
> Spark's connector API does not have a round-robin distribution concept. The 
> current PR maps `ROUND_ROBIN` to `Distributions.unspecified()`, the same as 
> `NONE`, so that existing Spark tests pass. However, I'd appreciate the 
> community's thoughts on which approach is preferred:
>
> 1. Treat as alias for NONE: `ROUND_ROBIN` maps to 
> `Distributions.unspecified()` like currently in the PR. Pragmatic, but 
> potentially confusing if users expect actual round-robin behavior.
> 2. Fail explicitly: Throw an error if a table is configured with 
> `ROUND_ROBIN` in Spark, until a proper implementation (e.g. using Spark's 
> `repartition()`) is added.
>
> ### Other engine writers
>
> Other engines’ Iceberg writer integrations will also need to handle the new 
> enum value. Engine maintainers will need to consider whether `ROUND_ROBIN` 
> has a meaningful distinct behavior in their context or whether it should be 
> treated as an alias/error.
>
> Looking forward to the community's feedback.
>
> [1] https://github.com/apache/iceberg/pull/15433
>
> Best,
> Han
