I do like this proposal because it essentially avoids all the issues that
Robert mentions: instead of having the server resolve conflicts, it just
lets a client decide in advance which commits would succeed. More advanced
automatic or server-side deconfliction is a good future direction, but I
think it’s orthogonal to this proposal.
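
To make that concrete, here is a rough sketch of how I read the
mutual-consent check. The `Commit` fields, the writer IDs, and the function
names are all my own assumptions for illustration; none of them come from
the doc or from the Polaris codebase.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """Hypothetical commit request; fields are illustrative, not a Polaris API."""
    writer_id: str
    # Writers whose concurrent commits this writer agreed, in advance, to tolerate.
    consents_to: frozenset = field(default_factory=frozenset)


def mutually_consent(a: Commit, b: Commit) -> bool:
    """True only if each commit names the other's writer as acceptable."""
    return b.writer_id in a.consents_to and a.writer_id in b.consents_to


def may_commit_on_top(incoming: Commit, landed: list[Commit]) -> bool:
    """Allow the incoming commit only if it mutually consents with every
    conflicting commit that already landed. Only the declared consent is
    inspected; the table changes themselves are never examined."""
    return all(mutually_consent(incoming, other) for other in landed)


a = Commit("writer-a", frozenset({"writer-b"}))
b = Commit("writer-b", frozenset({"writer-a"}))
c = Commit("writer-c")  # declared no consent at all

print(may_commit_on_top(b, [a]))  # True: a and b consent to each other
print(may_commit_on_top(c, [a]))  # False: falls back to a normal conflict
```

The point is that the server only compares what the clients declared up
front, so none of the hard cases Robert lists need to be solved by the
server in this scheme.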


On Tue, Jul 8, 2025 at 3:19 AM Robert Stupp <sn...@snazy.de> wrote:

> The general idea of resolving commit conflicts in Polaris is fine.
> However, I am missing some information about the tricky details.
>
> The tricky part is how to detect and in turn how to resolve those
> conflicts. That requires knowledge of the change being performed and
> its context.
>
> While it sounds simple to let one "append" operation succeed on top of
> another conflicting "append" operation, it's in practice not that
> simple. At least Iceberg's sequence numbers get in the way here,
> because you'd get duplicate sequence IDs in that case.
> Other cases, for example writes with "merge on write" are even
> trickier (one commit deletes existing and adds new data files) -
> resolving two conflicting "merge on write" operations is very
> difficult (or say: extremely expensive as you'd have to perform a
> `diff` on the data).
> Another case that comes to mind is one schema change ("happens
> before") dropping a column plus an append referring to the dropped
> column ("happens after").
>
> In Nessie we've been thinking about this problem for quite a while
> [1], the outcome every time was that it would be an awesome feature,
> but a lot of necessary contextual information (aka what Iceberg stores
> and what Iceberg provides in commits) is missing.
>
> IMHO we should think about the actual conflict resolution first and
> have the necessary changes in Iceberg.
>
> [1] https://github.com/projectnessie/nessie/issues/2513
>
>
> On Tue, Jul 8, 2025 at 1:27 AM Dmitri Bourlatchkov <di...@apache.org>
> wrote:
> >
> > ... but the Polaris Server still has to reconcile the metadata for
> > conflicting changes (before it commits), right?
> >
> > Thanks,
> > Dmitri.
> >
> > On Mon, Jul 7, 2025 at 7:22 PM Eric Maynard <eric.w.mayn...@gmail.com>
> > wrote:
> >
> > > Hi Dmitri, thanks for checking the doc out.
> > >
> > > Indeed, in this implementation, the server does not apply any "decision
> > > logic" at all to the commits. Or perhaps it's more accurate to say that
> > > the decision logic applied is only to inspect the commits and check for
> > > their mutual consent to deconflict. The server trusts this mutual consent.
> > >
> > > There's a small section about other strategies at the end of the doc --
> > > essentially, I think we could implement various deconfliction strategies
> > > and allow them to be mixed together, like we do with the FileIOFactory
> > > implementations for example.
> > >
> > > --EM
> > >
> > > On Mon, Jul 7, 2025 at 3:57 PM Dmitri Bourlatchkov <di...@apache.org>
> > > wrote:
> > >
> > > > Hi Eric,
> > > >
> > > > This sounds like an interesting approach to me.
> > > >
> > > > I wonder how much decision logic you envision Polaris performing for
> > > > deconfliction. Is it mostly approval based on submitted "Writer" ID
> > > > checks, or will Polaris validate actual table changes?
> > > >
> > > > I added some comments to the doc too.
> > > >
> > > > Thanks,
> > > > Dmitri.
> > > >
> > > > On Mon, Jul 7, 2025 at 6:33 PM Eric Maynard <eric.w.mayn...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > Wanted to share this short design doc
> > > > > <https://docs.google.com/document/d/1tkqBOYtkcA7fbDmhIAE6_6Jmus5WwP6vS6jA_JHp4Ms>
> > > > > for a simple method of allowing conflicting commits to both be
> > > > > committed. If implemented, this would allow e.g. two writers doing
> > > > > append-only operations to a table in Polaris to always succeed.
> > > > >
> > > > > If you're interested, please take a look. In the meantime, I'll be
> > > > > preparing a small draft PR to serve as a reference implementation.
> > > > >
> > > > > --EM
> > > > >
> > > >
> > >
>
