rdblue commented on a change in pull request #1738: URL: https://github.com/apache/iceberg/pull/1738#discussion_r520004493
########## File path: spark3/src/main/java/org/apache/spark/sql/connector/catalog/SupportsRowLevelOperations.java ##########
@@ -0,0 +1,27 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import org.apache.spark.sql.connector.write.LogicalWriteInfo;
+import org.apache.spark.sql.connector.write.RowLevelOperationsBuilder;
+
+public interface SupportsRowLevelOperations extends Table {

Review comment:

I like replace because it is in the context of Spark operations, not Iceberg. For Spark, this is planning an operation to replace what was read, and I think it describes that case well. And in Spark, overwrite is used to configure a normal write operation in the builder; since this uses a completely different builder, it seems like it should use a different name. I agree that it isn't what we use "replace" for in Iceberg. This would be an overwrite in Iceberg.

> We still want to propagate the expression that was used to scan for checking concurrently appended files under serializable isolation.

Okay, interesting point. So you're saying that we would use this to plan all row-level operations, and this would be the only way to get a `DeltaWriter`?
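The scan-based path under discussion might look roughly like the following. This is only a sketch with hypothetical stand-in types (`RowLevelOperationsBuilder`, `DeltaWriter`, and their methods here are illustrative, not the actual Spark or Iceberg APIs); the point it illustrates is that the builder carries the scan expression forward, so a serializable-isolation commit can re-check for concurrently appended files matching that expression.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the interfaces under discussion; not real Spark APIs.
public class RowLevelSketch {
  interface RowLevelOperationsBuilder {
    // Capture the expression that was used to scan the table.
    RowLevelOperationsBuilder scanFilter(String filterExpr);
    DeltaWriter buildDeltaWriter();
  }

  interface DeltaWriter {
    void delete(String rowId);
    // Under serializable isolation, the commit must verify that no files
    // matching the original scan filter were appended concurrently.
    String commitSummary();
  }

  static class SketchBuilder implements RowLevelOperationsBuilder {
    private String filter = "true";

    public RowLevelOperationsBuilder scanFilter(String filterExpr) {
      this.filter = filterExpr;
      return this;
    }

    public DeltaWriter buildDeltaWriter() {
      final String captured = filter;
      return new DeltaWriter() {
        private final List<String> deleted = new ArrayList<>();

        public void delete(String rowId) {
          deleted.add(rowId);
        }

        public String commitSummary() {
          // The captured filter travels with the writer, so the commit can
          // use it for the serializable-isolation conflict check.
          return "deleted=" + deleted.size() + ", conflictCheckFilter=" + captured;
        }
      };
    }
  }

  public static void main(String[] args) {
    DeltaWriter writer = new SketchBuilder()
        .scanFilter("date = '2020-11-09'")
        .buildDeltaWriter();
    writer.delete("row-1");
    writer.delete("row-2");
    // prints: deleted=2, conflictCheckFilter=date = '2020-11-09'
    System.out.println(writer.commitSummary());
  }
}
```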
My main question is whether we think that is going to make it harder to build a `DeltaWriter` that doesn't support copy-on-write. Take a JDBC source, for example. It has no reason to use copy-on-write because all of its operations are row-level. So would we be over-complicating that source just to avoid passing an extra filter to a `DeltaWriter` that is created using a normal `WriteBuilder`? I think it would be. The simpler case for a JDBC implementation is to add `SupportsDelta` to `WriteBuilder` and not require a separate builder. But we would also need an interface for passing the filter to the write for both `WriteBuilder` and `ReplaceBuilder`, and I agree that it is a bit cleaner to rely on this information coming from the scan configuration itself.
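The simpler alternative described above, for a row-level-only source like JDBC, could be sketched as follows. `SupportsDelta` here is the mixin being proposed in the discussion, but the shapes of all these interfaces are assumptions for illustration, not the actual Spark connector API: the source opts into delta writes through its normal `WriteBuilder` rather than implementing a separate builder.

```java
// Hypothetical sketch of mixing SupportsDelta into a normal WriteBuilder;
// these types are stand-ins, not the actual Spark interfaces.
public class DeltaMixinSketch {
  interface WriteBuilder {
    String buildWriteDescription();
  }

  // Optional mixin: sources that can apply row-level deltas directly opt in.
  interface SupportsDelta extends WriteBuilder {
    String buildDeltaDescription();
  }

  // A JDBC-style builder: every operation is row-level, so there is no
  // copy-on-write path and no need for a separate builder type.
  static class JdbcWriteBuilder implements SupportsDelta {
    public String buildWriteDescription() {
      return "INSERT batch";
    }

    public String buildDeltaDescription() {
      return "UPDATE/DELETE by key";
    }
  }

  public static void main(String[] args) {
    WriteBuilder builder = new JdbcWriteBuilder();
    // Planner-side check: take the delta path only when the source opts in.
    String plan = (builder instanceof SupportsDelta)
        ? ((SupportsDelta) builder).buildDeltaDescription()
        : builder.buildWriteDescription();
    // prints: UPDATE/DELETE by key
    System.out.println(plan);
  }
}
```

The trade-off this makes visible is the one raised in the comment: the mixin keeps a row-level-only source simple, but the filter then has to be handed to the write through some extra interface instead of arriving via the scan configuration.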
