The general idea of resolving commit conflicts in Polaris is fine. However, I'm missing some information about the tricky details.
The tricky part is how to detect and, in turn, how to resolve those conflicts. That requires knowledge of the change being performed and its context. While it sounds simple to let one "append" operation succeed on top of another conflicting "append" operation, in practice it is not that simple. At the very least, Iceberg's sequence numbers get in the way here, because you'd end up with duplicate sequence numbers in that case.

Other cases, for example writes with "merge on write", are even trickier (one commit deletes existing data files and adds new ones). Resolving two conflicting "merge on write" operations is very difficult (or rather: extremely expensive, as you'd have to perform a `diff` on the data). Another case that comes to mind is a schema change dropping a column ("happens before") plus an append referring to the dropped column ("happens after").

In Nessie we've been thinking about this problem for quite a while [1]. The outcome every time was that it would be an awesome feature, but a lot of the necessary contextual information (i.e. what Iceberg stores and what Iceberg provides in commits) is missing.

IMHO we should think about the actual conflict resolution first and get the necessary changes into Iceberg.

[1] https://github.com/projectnessie/nessie/issues/2513

On Tue, Jul 8, 2025 at 1:27 AM Dmitri Bourlatchkov <di...@apache.org> wrote:

> ... but the Polaris Server still has to reconcile the metadata for
> conflicting changes (before it commits), right?
>
> Thanks,
> Dmitri.
>
> On Mon, Jul 7, 2025 at 7:22 PM Eric Maynard <eric.w.mayn...@gmail.com>
> wrote:
>
> > Hi Dmitri, thanks for checking the doc out.
> >
> > Indeed, in this implementation, the server does not apply any "decision
> > logic" at all to the commits. Or perhaps it's more accurate to say that the
> > decision logic applied is only to inspect the commits and check for their
> > mutual consent to deconflict. The server trusts this mutual consent.
> >
> > There's a small section about other strategies at the end of the doc --
> > essentially, I think we could implement various deconfliction strategies
> > and allow them to be mixed together, like we do with the FileIOFactory
> > implementations for example.
> >
> > --EM
> >
> > On Mon, Jul 7, 2025 at 3:57 PM Dmitri Bourlatchkov <di...@apache.org>
> > wrote:
> >
> > > Hi Eric,
> > >
> > > This sounds like an interesting approach to me.
> > >
> > > I wonder how much decision logic do you envision Polaris to perform for
> > > de-conflictling? Is it mostly approving based submitted "Writer" ID checks
> > > or will Polaris validate actual table changes?
> > >
> > > I added some comments to the doc too.
> > >
> > > Thanks,
> > > Dmitri.
> > >
> > > On Mon, Jul 7, 2025 at 6:33 PM Eric Maynard <eric.w.mayn...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > Wanted to share this short design doc
> > > > <https://docs.google.com/document/d/1tkqBOYtkcA7fbDmhIAE6_6Jmus5WwP6vS6jA_JHp4Ms>
> > > > for a simple method of allowing conflicting commits to both be committed. If
> > > > implemented, this would allow e.g. two writers doing append-only operations
> > > > to a table in Polaris to always succeed.
> > > >
> > > > If you're interested, please take a look. In the meantime, I'll be
> > > > preparing a small draft PR to serve as a reference implementation.
> > > >
> > > > --EM
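
PS: To make the sequence-number concern from the top of this mail concrete, here is a minimal sketch. It is a deliberately simplified model, not Iceberg or Polaris code: the `Commit` class and both `accept_*` functions are hypothetical names, and I'm assuming each writer simply claims "last sequence number it read, plus one".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical, simplified stand-in for a table commit."""
    writer: str
    base_sequence_number: int  # last sequence number the writer saw

    @property
    def sequence_number(self) -> int:
        # Assumption: a writer claims the next number after the state it read.
        return self.base_sequence_number + 1


def accept_both_unchanged(commits):
    """Hypothetical server that applies consenting commits exactly as submitted."""
    return [c.sequence_number for c in commits]


def accept_with_reassignment(last_sequence_number, commits):
    """Hypothetical server that re-stamps each commit when it is applied."""
    assigned = []
    for _ in commits:
        last_sequence_number += 1
        assigned.append(last_sequence_number)
    return assigned


# Two writers start from the same table state (sequence number 41).
a = Commit("writer-a", 41)
b = Commit("writer-b", 41)

unchanged = accept_both_unchanged([a, b])
reassigned = accept_with_reassignment(41, [a, b])
has_duplicates = len(set(unchanged)) != len(unchanged)

print(unchanged, reassigned, has_duplicates)
```

In this toy model, accepting both appends as submitted yields the same sequence number twice, while re-stamping avoids that but means the server has to rewrite what the writers submitted, which is exactly the reconciliation work Dmitri asked about.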