mzzz-zzm opened a new issue, #977:
URL: https://github.com/apache/iceberg-go/issues/977

   ### Apache Iceberg version
   
   main (development)
   
   ### Please describe the bug 🐞
   
   When a writer loads a table that has **no snapshot yet** on the target branch
   (empty table, or a branch that the writer is about to create), the
   `newConflictContext` helper returns an **empty** `concurrent` list — it 
treats
   the situation as "no concurrent activity". If another writer then commits a
   snapshot between the first writer's load time and its commit time, the 
conflict
   validators never see those concurrent commits. A `RowDelta` carrying
   equality-delete files will be accepted even though it should be rejected 
under
   `SERIALIZABLE` isolation.
   
   ## Affected component
   
   `table/conflict_validation.go` — `newConflictContext()`
   
   ## Steps to reproduce
   
   The test is self-contained and lives in **package `table`** (internal test) 
so
   it can call the unexported `newConflictContext` directly.  The helpers it 
uses
   (`newConflictTestMetadata`, `MainBranch`) are already in
   `table/conflict_validation_test.go` on `main`.
   
   Save the file below as `table/bug_repro_empty_table_conflict_test.go` in an
   unmodified checkout of `main` and run:
   
   ```
   go test ./table/ -run TestBugRepro_EmptyTableConflictDetectionBypassed -v
   ```
   
   Expected output on unfixed upstream:
   
   ```
   --- FAIL: TestBugRepro_EmptyTableConflictDetectionBypassed
       bug_repro_empty_table_conflict_test.go:76:
           Error:    "[]" should have 1 item(s), but has 0
           Messages: snapshot 42 must appear in concurrent when base has no 
branch head
   ```
   
   Full source of `table/bug_repro_empty_table_conflict_test.go`:
   
   ```go
   package table
   
   import (
        "testing"
   
        "github.com/stretchr/testify/assert"
        "github.com/stretchr/testify/require"
   )
   
   func TestBugRepro_EmptyTableConflictDetectionBypassed(t *testing.T) {
        // Writer loaded the table before any snapshot was committed on main.
        base := newConflictTestMetadata(t, nil)
   
        // Meanwhile another writer committed snapshot 42.
        head := int64(42)
        current := newConflictTestMetadata(t, &head)
   
        ctx, err := newConflictContext(base, current, MainBranch, nil, true)
        require.NoError(t, err)
   
        // Snapshot 42 was committed while Writer had no view of the branch.
        // It MUST appear in ctx.concurrent so that conflict validators (e.g.
        // RowDelta's serializable check) can inspect the data files it added.
        //
        // BUG: on upstream, ctx.concurrent is empty — snapshot 42 is invisible.
        // Any equality-delete commit from the empty-base writer will be 
accepted
        // without validating against the concurrent data.
        assert.Len(t, ctx.concurrent, 1,
                "snapshot 42 must appear in concurrent when base has no branch 
head")
   }
   ```
   
   ## Root cause
   
   The relevant section of `table/conflict_validation.go` on `main`:
   
   ```go
   baseHead := base.SnapshotByName(branch)
   if baseHead == nil {
       // Writer has no view of this branch yet (e.g. creating it) —
       // by definition there are no concurrent commits to validate against.
       return &conflictContext{current: current, branch: branch, fs: fs, 
caseSensitive: caseSensitive}, nil
   }
   ```
   
   When `baseHead` is `nil` the code returns immediately with an empty
   `concurrent` list, reasoning that there was "nothing to compare". But a nil
   `baseHead` simply means the writer did not see any snapshot when it loaded 
the
   table. Snapshots committed *after* that load but *before* this commit are 
still
   real concurrent activity that validators must examine.
   
   ## Concrete scenario where this matters
   
   1. Table is empty. Worker A begins building an equality-delete commit
      (e.g. a streaming DELETE job that marks rows as deleted by field value).
   2. Worker B appends new data files to the same partition and commits first.
      The table now has snapshot B.
   3. Worker A calls `Commit`. Under `SERIALIZABLE` isolation, Worker A's
      equality-delete must be rejected: Worker B added data that Worker A's
      delete predicate may not cover — Worker A must reload and rebuild.
   4. **Bug:** `newConflictContext` sees `baseHead == nil` and returns zero
      concurrent snapshots. The `RowDelta` validator never checks Worker B's
      data files. Worker A's commit is accepted and the equality deletes are
      applied against data that the writer never saw.
   
   ## Expected behaviour
   
   When a writer started from an empty branch (no prior snapshot), all snapshots
   currently on that branch were committed after the writer loaded the table and
   are therefore concurrent. The returned `conflictContext.concurrent` list must
   contain all of them so validators can decide whether a conflict exists.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to