mithilgirish commented on PR #1104:
URL: https://github.com/apache/iceberg-go/pull/1104#issuecomment-4506942462

   All five concerns are valid and addressed in the updated patch.
   
   * **Error Propagation:** Fixed. Errors from `WalkDir`, `DeleteFiles`, and 
individual `Remove` calls are now collected via `errors.Join` and returned. 
`os.IsNotExist` is safely ignored. The declared `(err error)` return now 
carries actual signal.
   * **No Silent Fallbacks:** Fixed. The `LoadTable` fallback in SQL, Glue, and 
Hive that was quietly downgrading `--purge` on any non-`ErrNoSuchTable` error 
has been removed. The catalog now returns a wrapped error immediately, so the 
table entry is not dropped unless the purge can actually be attempted.
   * **Tracking External Paths:** Fixed. `PurgeTableFiles` now unions the base 
`Location()` walk with a full crawl of active metadata, manifests, and 
snapshots. This mirrors the PyIceberg/Java approach and catches files under 
custom `write.data.path` / `write.metadata.path`.
   
   Holding the push on the external-path tracking change until we make a call 
on **Option A vs B**:
   * **Option A:** Promote `getReferencedFiles` in `table/orphan_cleanup.go` to 
a public method on `Table`. No duplication, logic is already correct, but the 
diff touches `table/`.
   * **Option B:** Private helper in `catalog/internal/utils.go` via public 
metadata APIs. PR stays isolated to `catalog/`, but introduces duplication that 
needs to stay in sync.
   
   * **Test Coverage:** `TestPurgeTable` now writes real Parquet data plus a 
mock `stats.puffin` outside the table root and asserts that both are fully 
wiped. The Glue and Hive mock purge tests have been removed: with no filesystem 
backend, they were not covering `PurgeTableFiles` at all.
   * **PurgeableTable GoDoc:** Documents the best-effort model, type-assertion 
pattern, the un-loadable-on-failure warning, and the client-side vs REST 
(`purgeRequested=true`) distinction.
   
   Let me know your preference on Option A or B, and I will push the updated 
patch.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

