Oh, that is perfect! Thank you! I think we should be good to add this feature then. We are currently testing this patch internally and once we are happy with it, we can submit it as a PR to the main repository, if that is okay with you.
On Thu, Feb 26, 2026 at 5:33 PM Jingsong Li <[email protected]> wrote:

> Hi Mike,
>
> For the second scenario, here is an option:
> 'commit.strict-mode.last-safe-snapshot'. If you are using
> RescaleAction, it will set this option to check this scenario.
>
> Best,
> Jingsong
>
> On Thu, Feb 26, 2026 at 2:27 PM Mike Dias <[email protected]> wrote:
> >
> > Thanks, Jingsong!
> >
> > It seems we already check the number of buckets being equal when
> > committing here ->
> > https://github.com/apache/paimon/blob/e1eeec56954c19ed78fd0bd4a46e0a332443397d/paimon-core/src/main/java/org/apache/paimon/operation/commit/ConflictDetection.java#L219
> >
> > I think that should capture the first scenario where:
> >
> > writer starts
> > rescale starts
> > rescale commits
> > writer commits -> fails because the number of buckets changed
> >
> > I don't think it would address the second scenario where:
> >
> > rescale starts
> > writer starts
> > writer commits
> > rescale commits -> previous commit is overwritten
> >
> > Is my understanding correct? Not sure if it is possible to detect the
> > second scenario, though... users will need to ensure that no writer is
> > running/started during the rescaling process.
> >
> >
> > On Thu, Feb 26, 2026 at 3:24 PM Jingsong Li <[email protected]> wrote:
> >>
> >> Hi Mike,
> >>
> >> This is a good question.
> >>
> >> As far as I know, Paimon does not strictly check that all partitions
> >> must have the same number of buckets. It is possible to achieve
> >> different buckets for different partitions, but it is more complex. We
> >> may need to scan the manifests when writing to ensure that the number
> >> of buckets written to the partitions is the same as before, otherwise
> >> it will cause inconsistent data correctness issues.
> >>
> >> Best,
> >> Jingsong
> >>
> >> On Mon, Feb 16, 2026 at 1:19 PM Mike Dias via dev <[email protected]> wrote:
> >> >
> >> > Hi Paimon maintainers,
> >> >
> >> > I'm looking to implement a change that would allow different partitions
> >> > within a PK fixed-bucket table to have different bucket counts, primarily
> >> > to support highly skewed partitions with more/fewer buckets.
> >> >
> >> > We would use dynamic buckets to handle skew, but we really need multiple
> >> > writers writing to the same active partitions in both streaming and batch,
> >> > which doesn't seem to be something we could easily support with dynamic
> >> > buckets without coordinating changes to the bucket index file...
> >> >
> >> > On the fixed-buckets side, though, it seems we are in a good spot to
> >> > implement per-partition bucketing, and this rescale doc
> >> > <https://paimon.apache.org/docs/1.3/maintenance/rescale-bucket/> suggests
> >> > we can already do that for partitions that aren't receiving writes.
> >> > Unfortunately, our partitions are not time-based, and most of them are
> >> > always receiving writes...
> >> >
> >> > Hence, we would need to adapt the current code to allow writers to look up
> >> > the bucket counts from the manifest partition rather than relying on the
> >> > global table bucket count.
> >> >
> >> > That brings me to the following questions:
> >> >
> >> > 1. *Can we actually do this?:* Are there architectural reasons why
> >> > bucket counts must be uniform across all partitions? Are there assumptions
> >> > elsewhere in the codebase that depend on a single global bucket count?
> >> > 2. *Concurrent writers:* If multiple writers are active, they each
> >> > independently load the partition bucket mapping at initialization, which
> >> > creates a risk of inconsistency if a rescale operation completes between
> >> > when different writers load their mappings. This is not too different from
> >> > the existing behavior, but with a global bucket count, it is much easier to
> >> > safeguard against it. Do you have ideas on how we could mitigate this issue
> >> > or warn users against this pitfall?
> >> > 3. *Read path:* On the read side, does the scan/split logic already
> >> > handle partitions with heterogeneous bucket counts, or would changes be
> >> > needed there as well?
> >> >
> >> >
> >> > Any guidance on gotchas or prior art in this area would be greatly
> >> > appreciated. Happy to share the full diff or open a draft PR if that would
> >> > be easier to review.
> >> >
> >> > --
> >> > Thanks,
> >> > Mike Dias
> >
> >
> >
> > --
> > Thanks,
> > Mike Dias
>

--
Thanks,
Mike Dias
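
P.S. To make the two interleavings from the thread above concrete, here is a minimal toy model in Java. It is not Paimon code: `ToyTable`, `commit`, and both scenario methods are hypothetical names invented for illustration, and the "strict" check is only a simplified reading of the idea behind 'commit.strict-mode.last-safe-snapshot' (abort if anything was committed after the snapshot the job started from). The real option may apply narrower rules.

```java
import java.util.OptionalLong;

/** Toy model of the two writer/rescale interleavings. Not Paimon code. */
final class CommitConflictSketch {

    /** Hypothetical in-memory stand-in for a table's latest snapshot. */
    static final class ToyTable {
        long latestSnapshotId = 1;
        int numBuckets;

        ToyTable(int numBuckets) {
            this.numBuckets = numBuckets;
        }

        /** Returns true if the commit is accepted, false if it is rejected. */
        boolean commit(int committerBuckets, boolean overwrite, OptionalLong lastSafeSnapshot) {
            // Simplified strict-mode idea (assumption): reject if any snapshot
            // newer than the one this job started from already exists.
            if (lastSafeSnapshot.isPresent() && latestSnapshotId > lastSafeSnapshot.getAsLong()) {
                return false;
            }
            // Bucket-count check: a regular write must match the current count.
            if (!overwrite && committerBuckets != numBuckets) {
                return false;
            }
            numBuckets = committerBuckets;
            latestSnapshotId++;
            return true;
        }
    }

    /** Scenario 1: the rescale commits first, then the writer tries to commit. */
    static boolean writerCommitAfterRescale() {
        ToyTable table = new ToyTable(4);
        int writerBuckets = table.numBuckets;        // writer starts, sees 4 buckets
        table.commit(8, true, OptionalLong.empty()); // rescale overwrites with 8 buckets
        return table.commit(writerBuckets, false, OptionalLong.empty());
    }

    /** Scenario 2: the writer commits first, then the rescale tries to commit. */
    static boolean rescaleCommitAfterWriter(boolean strict) {
        ToyTable table = new ToyTable(4);
        long rescaleBase = table.latestSnapshotId;    // rescale starts from snapshot 1
        table.commit(4, false, OptionalLong.empty()); // writer commits snapshot 2
        OptionalLong guard = strict ? OptionalLong.of(rescaleBase) : OptionalLong.empty();
        return table.commit(8, true, guard);
    }

    public static void main(String[] args) {
        System.out.println("scenario 1, writer commit accepted: " + writerCommitAfterRescale());
        System.out.println("scenario 2, rescale commit accepted (no guard): " + rescaleCommitAfterWriter(false));
        System.out.println("scenario 2, rescale commit accepted (guard): " + rescaleCommitAfterWriter(true));
    }
}
```

In this model the bucket-count check alone rejects the late writer in the first scenario, while the second scenario is only caught once the rescale job remembers the snapshot it started from and refuses to commit over anything newer.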
