Hey team, here is the PR for this feature -> https://github.com/apache/paimon/pull/7865
Looking forward to getting your feedback!

On Thu, Feb 26, 2026 at 5:52 PM Jingsong Li <[email protected]> wrote:

> Look forward to hearing from you.
>
> Best,
> Jingsong
>
> On Thu, Feb 26, 2026 at 2:37 PM Mike Dias <[email protected]> wrote:
> >
> > Oh, that is perfect! Thank you!
> >
> > I think we should be good to add this feature then. We are currently
> > testing this patch internally and once we are happy with it, we can
> > submit it as a PR to the main repository, if that is okay with you.
> >
> > On Thu, Feb 26, 2026 at 5:33 PM Jingsong Li <[email protected]> wrote:
> >>
> >> Hi Mike,
> >>
> >> For the second scenario, here is an option:
> >> 'commit.strict-mode.last-safe-snapshot'. If you are using
> >> RescaleAction, it will set this option to check this scenario.
> >>
> >> Best,
> >> Jingsong
> >>
> >> On Thu, Feb 26, 2026 at 2:27 PM Mike Dias <[email protected]> wrote:
> >> >
> >> > Thanks, Jingsong!
> >> >
> >> > It seems we already check the number of buckets being equal when
> >> > committing here ->
> >> > https://github.com/apache/paimon/blob/e1eeec56954c19ed78fd0bd4a46e0a332443397d/paimon-core/src/main/java/org/apache/paimon/operation/commit/ConflictDetection.java#L219
> >> >
> >> > I think that should capture the first scenario where:
> >> >
> >> > writer starts
> >> > rescale starts
> >> > rescale commits
> >> > writer commits -> fails because the number of buckets changed
> >> >
> >> > I don't think it would address the second scenario where:
> >> >
> >> > rescale starts
> >> > writer starts
> >> > writer commits
> >> > rescale commits -> previous commit is overwritten
> >> >
> >> > Is my understanding correct? Not sure if it is possible to detect
> >> > the second scenario, though... users will need to ensure that no
> >> > writer is running/started during the rescaling process.
> >> >
> >> > On Thu, Feb 26, 2026 at 3:24 PM Jingsong Li <[email protected]> wrote:
> >> >>
> >> >> Hi Mike,
> >> >>
> >> >> This is a good question.
> >> >>
> >> >> As far as I know, Paimon does not strictly check that all partitions
> >> >> must have the same number of buckets. It is possible to achieve
> >> >> different buckets for different partitions, but it is more complex.
> >> >> We may need to scan the manifests when writing to ensure that the
> >> >> number of buckets written to the partitions is the same as before,
> >> >> otherwise it will cause data correctness issues.
> >> >>
> >> >> Best,
> >> >> Jingsong
> >> >>
> >> >> On Mon, Feb 16, 2026 at 1:19 PM Mike Dias via dev <[email protected]> wrote:
> >> >> >
> >> >> > Hi Paimon maintainers,
> >> >> >
> >> >> > I'm looking to implement a change that would allow different
> >> >> > partitions within a PK fixed-bucket table to have different bucket
> >> >> > counts, primarily to support highly skewed partitions with
> >> >> > more/fewer buckets.
> >> >> >
> >> >> > We would use dynamic buckets to handle skew, but we really need
> >> >> > multiple writers writing to the same active partitions in both
> >> >> > streaming and batch, which doesn't seem to be something we could
> >> >> > easily support with dynamic buckets without coordinating changes
> >> >> > to the bucket index file...
> >> >> >
> >> >> > On the fixed-buckets side, though, it seems we are in a good spot
> >> >> > to implement per-partition bucketing, and this rescale doc
> >> >> > <https://paimon.apache.org/docs/1.3/maintenance/rescale-bucket/>
> >> >> > suggests we can already do that for partitions that aren't
> >> >> > receiving writes. Unfortunately, our partitions are not
> >> >> > time-based, and most of them are always receiving writes...
> >> >> >
> >> >> > Hence, we would need to adapt the current code to allow writers to
> >> >> > look up the bucket counts from the manifest partition rather than
> >> >> > relying on the global table bucket count.
> >> >> >
> >> >> > That brings me to the following questions:
> >> >> >
> >> >> >    1. *Can we actually do this?:* Are there architectural reasons
> >> >> >    why bucket counts must be uniform across all partitions? Are
> >> >> >    there assumptions elsewhere in the codebase that depend on a
> >> >> >    single global bucket count?
> >> >> >    2. *Concurrent writers:* If multiple writers are active, they
> >> >> >    each independently load the partition bucket mapping at
> >> >> >    initialization, which creates a risk of inconsistency if a
> >> >> >    rescale operation completes between when different writers load
> >> >> >    their mappings. This is not too different from the existing
> >> >> >    behavior, but with a global bucket count, it is much easier to
> >> >> >    safeguard against it. Do you have ideas on how we could mitigate
> >> >> >    this issue or warn users against this pitfall?
> >> >> >    3. *Read path:* On the read side, does the scan/split logic
> >> >> >    already handle partitions with heterogeneous bucket counts, or
> >> >> >    would changes be needed there as well?
> >> >> >
> >> >> > Any guidance on gotchas or prior art in this area would be greatly
> >> >> > appreciated. Happy to share the full diff or open a draft PR if
> >> >> > that would be easier to review.
> >> >> >
> >> >> > --
> >> >> > Thanks,
> >> >> > Mike Dias
> >> >
> >> > --
> >> > Thanks,
> >> > Mike Dias
> >
> > --
> > Thanks,
> > Mike Dias
>
--
Thanks,
Mike Dias
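
For anyone skimming the thread, the two interleavings discussed above can be modelled with a small toy sketch. This is only an illustration under stated assumptions: ToyTable, commit_write, commit_rescale and the last_safe_snapshot parameter are hypothetical names, and the checks are a simplification, not Paimon's ConflictDetection logic or the actual behaviour of 'commit.strict-mode.last-safe-snapshot'.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class CommitConflict(Exception):
    """Raised when the toy table rejects a commit."""


@dataclass
class ToyTable:
    # Hypothetical model: partition name -> committed bucket count.
    buckets: Dict[str, int] = field(default_factory=dict)
    snapshot_id: int = 0  # id of the latest committed snapshot

    def commit_write(self, partition: str, writer_buckets: int) -> int:
        # Reject the write if the bucket count changed since the writer started.
        current = self.buckets.get(partition, writer_buckets)
        if current != writer_buckets:
            raise CommitConflict(
                f"{partition}: writer assumed {writer_buckets} buckets, "
                f"table now has {current}"
            )
        self.buckets[partition] = writer_buckets
        self.snapshot_id += 1
        return self.snapshot_id

    def commit_rescale(
        self,
        partition: str,
        new_buckets: int,
        last_safe_snapshot: Optional[int] = None,
    ) -> int:
        # Assumed strict-mode semantics: refuse to overwrite when any snapshot
        # was committed after the one the rescale job started from.
        if last_safe_snapshot is not None and self.snapshot_id > last_safe_snapshot:
            raise CommitConflict(
                f"{partition}: snapshot {self.snapshot_id} is newer than "
                f"last safe snapshot {last_safe_snapshot}"
            )
        self.buckets[partition] = new_buckets
        self.snapshot_id += 1
        return self.snapshot_id


def first_scenario() -> bool:
    """Writer starts, rescale commits, writer commits. True if writer is rejected."""
    table = ToyTable({"p1": 4})
    writer_buckets = table.buckets["p1"]          # writer starts
    table.commit_rescale("p1", 8)                 # rescale starts and commits
    try:
        table.commit_write("p1", writer_buckets)  # writer commits
    except CommitConflict:
        return True
    return False


def second_scenario(strict: bool) -> bool:
    """Rescale starts, writer commits, rescale commits. True if rescale is rejected."""
    table = ToyTable({"p1": 4})
    base = table.snapshot_id                      # rescale starts
    writer_buckets = table.buckets["p1"]          # writer starts
    table.commit_write("p1", writer_buckets)      # writer commits
    try:
        table.commit_rescale("p1", 8, last_safe_snapshot=base if strict else None)
    except CommitConflict:
        return True
    return False


if __name__ == "__main__":
    print("scenario 1, writer rejected:", first_scenario())
    print("scenario 2, rescale rejected (no strict check):", second_scenario(False))
    print("scenario 2, rescale rejected (strict check):", second_scenario(True))
```

In this toy model the bucket-count comparison alone catches the first interleaving, while the second one is only caught when the rescale commit also carries the snapshot it started from.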
