amogh-jahagirdar commented on code in PR #6651:
URL: https://github.com/apache/iceberg/pull/6651#discussion_r1115168152
##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/SparkCatalog.java:
##########
@@ -760,9 +760,12 @@ private Table loadFromPathIdentifier(PathIdentifier ident)
{
} else if (branch != null) {
Snapshot branchSnapshot = table.snapshot(branch);
-    Preconditions.checkArgument(
-        branchSnapshot != null, "Cannot find snapshot associated with branch name: %s", branch);
-    return new SparkTable(table, branchSnapshot.snapshotId(), !cacheEnabled);
+
+    // It's possible that the branch does not exist when performing writes to new branches.
+    // Load table should still succeed when spark is performing the write.
+    // Reads performed on non-existing branches will fail at a later point
+    Long branchSnapshotId = branchSnapshot == null ? null : branchSnapshot.snapshotId();
+    return new SparkTable(table, branchSnapshotId, !cacheEnabled);
Review Comment:
@namrathamyske @rdblue @aokolnychyi I'm removing this check because it
prevents writing to new branches. Catalog#loadTable is called by Spark when
planning the write, and the validation check that the branch snapshot exists
fails at that point. I added a test to validate that a read on a non-existent
branch still fails (albeit later, when building the scan).
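
   The deferred-validation pattern described above can be sketched as follows.
   This is a simplified, hypothetical model (not the actual Iceberg/Spark API):
   loading tolerates a missing branch so write planning can proceed, while a
   scan over a missing branch fails with the same error message the removed
   precondition used.

   ```java
   import java.util.HashMap;
   import java.util.Map;

   class BranchAwareTable {
     // branch name -> snapshot id; illustrative stand-in for table metadata
     private final Map<String, Long> branchSnapshots = new HashMap<>();

     void createBranch(String branch, long snapshotId) {
       branchSnapshots.put(branch, snapshotId);
     }

     // Load tolerates a missing branch: writes to a new branch must be able
     // to load the table before the branch exists.
     Long loadBranchSnapshotId(String branch) {
       return branchSnapshots.get(branch); // null when the branch does not exist
     }

     // Reads validate later, when the scan is built.
     long buildScan(String branch) {
       Long snapshotId = branchSnapshots.get(branch);
       if (snapshotId == null) {
         throw new IllegalArgumentException(
             "Cannot find snapshot associated with branch name: " + branch);
       }
       return snapshotId;
     }
   }
   ```

   With this shape, `loadBranchSnapshotId("new-branch")` returns null instead
   of throwing, and only `buildScan` rejects a branch that was never created.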
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]