amogh-jahagirdar commented on code in PR #6651:
URL: https://github.com/apache/iceberg/pull/6651#discussion_r1115168152
##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkCatalog.java:
##########
@@ -760,9 +760,12 @@ private Table loadFromPathIdentifier(PathIdentifier ident) {
} else if (branch != null) {
Snapshot branchSnapshot = table.snapshot(branch);
- Preconditions.checkArgument(
- branchSnapshot != null, "Cannot find snapshot associated with branch name: %s", branch);
- return new SparkTable(table, branchSnapshot.snapshotId(), !cacheEnabled);
+
+ // It's possible that the branch does not exist when performing writes to new branches.
+ // Load table should still succeed when spark is performing the write.
+ // Reads performed on non-existing branches will fail at a later point
+ Long branchSnapshotId = branchSnapshot == null ? null : branchSnapshot.snapshotId();
+ return new SparkTable(table, branchSnapshotId, !cacheEnabled);
Review Comment:
@namrathamyske @rdblue @aokolnychyi @jackye1995 I'm removing this check
because it prevents writing to new branches. Spark calls Catalog#loadTable
when planning the write, and at that point the new branch does not exist yet,
so the validation that the branch snapshot exists fails. I added a
[test](https://github.com/apache/iceberg/pull/6651/files#diff-1ddb496eb334873920605601cefbc7ebdda631855d72da3a030c0f6f89ba6043R181)
to validate that a read on a nonexistent branch still fails (albeit later,
when trying to build the scan).
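
To illustrate the behavior change, here is a minimal, self-contained sketch of the "load succeeds, read fails later" pattern. It does not use the Iceberg or Spark APIs: `FakeTable`, `LoadedTable`, `load`, `planScan`, and `commitWrite` are hypothetical stand-ins invented for this example, and the error message is an assumption, not the one the real scan produces.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins only; these are NOT the Iceberg or Spark classes.
final class BranchLoadSketch {

  /** Minimal table model: maps a branch name to its snapshot ID. */
  static final class FakeTable {
    private final Map<String, Long> branches = new HashMap<>();

    FakeTable withBranch(String name, long snapshotId) {
      branches.put(name, snapshotId);
      return this;
    }

    /** Returns null when the branch does not exist yet. */
    Long snapshotIdFor(String branch) {
      return branches.get(branch);
    }
  }

  /** Loaded handle; the snapshot ID is null for a branch that is not created yet. */
  static final class LoadedTable {
    final String branch;
    final Long snapshotId;

    LoadedTable(String branch, Long snapshotId) {
      this.branch = branch;
      this.snapshotId = snapshotId;
    }

    /** Read path: the existence check is deferred to this point. */
    long planScan() {
      if (snapshotId == null) {
        // Assumed message, for illustration only.
        throw new IllegalArgumentException("Cannot read from missing branch: " + branch);
      }
      return snapshotId;
    }

    /** Write path: allowed on a missing branch, which it creates. */
    void commitWrite(FakeTable table, long newSnapshotId) {
      table.withBranch(branch, newSnapshotId);
    }
  }

  /** Load no longer validates that the branch exists. */
  static LoadedTable load(FakeTable table, String branch) {
    return new LoadedTable(branch, table.snapshotIdFor(branch));
  }

  public static void main(String[] args) {
    FakeTable table = new FakeTable().withBranch("main", 1L);
    LoadedTable loaded = load(table, "audit"); // succeeds although "audit" is missing
    try {
      loaded.planScan();
    } catch (IllegalArgumentException e) {
      System.out.println("read failed: " + e.getMessage());
    }
    loaded.commitWrite(table, 2L);
    System.out.println("after write: " + load(table, "audit").planScan());
  }
}
```

The trade-off in this sketch is the same one described above: dropping the eager check lets the write path proceed on a new branch, at the cost of surfacing an invalid-branch read error later than load time.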
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]