aokolnychyi commented on a change in pull request #2841:
URL: https://github.com/apache/iceberg/pull/2841#discussion_r730155316



##########
File path: api/src/main/java/org/apache/iceberg/actions/ConvertDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for converting the equality delete files according to a convert strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.
+ */
+public interface ConvertDeleteFiles extends SnapshotUpdate<ConvertDeleteFiles, ConvertDeleteFiles.Result> {
+
+  /**
+   * Convert the equality deletes to the position deletes.
+   *
+   * @return this for method chaining
+   */
+  ConvertDeleteFiles convertEqualityDeletes();
+
+  /**
+   * A filter for choosing the equality deletes to convert.
+   *
+   * @param expression An iceberg expression used to choose deletes.
+   * @return this for method chaining
+   */
+  ConvertDeleteFiles filter(Expression expression);
+
+  /**
+   * The action result that contains a summary of the execution.
+   */
+  interface Result {
+    /**
+     * Returns the count of the deletes that been converted.
+     */
+    int convertedDeleteFilesCount();

Review comment:
       Not a strong opinion, but we have kept each action result separate so far. I think it probably makes sense here too.

##########
File path: api/src/main/java/org/apache/iceberg/actions/ConvertDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for converting the equality delete files according to a convert strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.
+ */
+public interface ConvertDeleteFiles extends SnapshotUpdate<ConvertDeleteFiles, ConvertDeleteFiles.Result> {

Review comment:
       Do we plan to support any conversions other than equality -> position? According to the doc, it seems to be the only conversion we consider.
   
   If so, would it make sense to call it `ConvertEqualityDeleteFiles`? Just asking. Then we could drop the method below.
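
   A minimal sketch of what the suggested rename might look like, assuming the action only ever converts equality deletes to position deletes. `Expression` and `SnapshotUpdate` are simplified stand-ins here (not the real Iceberg types), and `NoopConvertEqualityDeleteFiles` is a hypothetical implementation that exists only to show the call chain. The point is that once the direction is part of the type name, a separate `convertEqualityDeletes()` method is no longer needed.

```java
// Hypothetical stand-ins so the sketch compiles without Iceberg on the classpath.
interface Expression {}

interface SnapshotUpdate<ThisT, R> {
  R execute();
}

// With the direction in the type name, no convertEqualityDeletes() toggle is needed.
interface ConvertEqualityDeleteFiles
    extends SnapshotUpdate<ConvertEqualityDeleteFiles, ConvertEqualityDeleteFiles.Result> {

  // A filter for choosing the equality deletes to convert.
  ConvertEqualityDeleteFiles filter(Expression expression);

  interface Result {
    int convertedDeleteFilesCount();
  }
}

// Hypothetical no-op implementation, only here to show the call chain.
final class NoopConvertEqualityDeleteFiles implements ConvertEqualityDeleteFiles {
  private Expression filter;

  @Override
  public ConvertEqualityDeleteFiles filter(Expression expression) {
    this.filter = expression;
    return this;
  }

  @Override
  public Result execute() {
    // Converts nothing, so the reported count is zero.
    return () -> 0;
  }
}
```

   A caller would then write something like `action.filter(expr).execute()` without first selecting a conversion kind.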

##########
File path: api/src/main/java/org/apache/iceberg/actions/ConvertDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for converting the equality delete files according to a convert strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.

Review comment:
       I am not sure the second sentence is accurate. It applies more to the rewrite action.

##########
File path: api/src/main/java/org/apache/iceberg/actions/ConvertDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for converting the equality delete files according to a convert strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.
+ */
+public interface ConvertDeleteFiles extends SnapshotUpdate<ConvertDeleteFiles, ConvertDeleteFiles.Result> {
+
+  /**
+   * Convert the equality deletes to the position deletes.
+   *
+   * @return this for method chaining
+   */
+  ConvertDeleteFiles convertEqualityDeletes();
+
+  /**
+   * A filter for choosing the equality deletes to convert.
+   *
+   * @param expression An iceberg expression used to choose deletes.
+   * @return this for method chaining
+   */
+  ConvertDeleteFiles filter(Expression expression);
+
+  /**
+   * The action result that contains a summary of the execution.
+   */
+  interface Result {
+    /**
+     * Returns the count of the deletes that been converted.
+     */
+    int convertedDeleteFilesCount();
+
+    /**
+     * Returns the count of the added position delete files.
+     */
+    int addedDeleteFilesCount();

Review comment:
       If we think this action will always be about converting equality to position, then these methods can be a bit more specific, just like their Javadoc.
   
   For example, we can call them `convertedEqualityDeleteFilesCount` and `addedPositionDeleteFilesCount`.
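
   As a rough illustration of the more specific naming, the result interface could look like the sketch below. The two method names come from the comment above; the enclosing interface is trimmed down to just the nested `Result`, and `SimpleResult` is a hypothetical value holder added only so the example can be exercised.

```java
// Trimmed-down sketch: only the nested Result is shown.
interface ConvertEqualityDeleteFiles {

  // The action result that contains a summary of the execution.
  interface Result {
    // Returns the count of equality delete files that were converted.
    int convertedEqualityDeleteFilesCount();

    // Returns the count of position delete files that were added.
    int addedPositionDeleteFilesCount();
  }
}

// Hypothetical immutable holder for the two counters.
final class SimpleResult implements ConvertEqualityDeleteFiles.Result {
  private final int converted;
  private final int added;

  SimpleResult(int converted, int added) {
    this.converted = converted;
    this.added = added;
  }

  @Override
  public int convertedEqualityDeleteFilesCount() {
    return converted;
  }

  @Override
  public int addedPositionDeleteFilesCount() {
    return added;
  }
}
```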

##########
File path: api/src/main/java/org/apache/iceberg/actions/RewriteDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for rewriting the position delete files according to a rewrite strategy.

Review comment:
       nit: In the doc, we are still debating whether a rewrite of equality deletes makes sense. I agree it is useful to have a generic name for this action and I like `RewriteDeleteFiles`. Should we ensure the Javadoc is generic too? What about dropping `position` in the class doc?

##########
File path: api/src/main/java/org/apache/iceberg/actions/RewriteDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for rewriting the position delete files according to a rewrite strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.
+ */
+public interface RewriteDeleteFiles extends SnapshotUpdate<RewriteDeleteFiles, RewriteDeleteFiles.Result> {
+
+  /**
+   * Combine the position deletes.
+   *
+   * @return this for method chaining
+   */
+  RewriteDeleteFiles combinePositionDeletes();

Review comment:
       Question. There is consensus that bin-packing position deletes is probably the most useful rewrite. However, we may add a rewrite that would also reshuffle deletes, not only bin-pack them. We don't know whether that will be needed or when we would implement it, but what about calling this `binPackPositionDeletes`, similar to `binPack` in data compaction? That way, we can add another strategy later, as `combineXXX` seems vague.
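
   One way to picture the naming suggestion is the sketch below. Only `binPackPositionDeletes` comes from the comment above; `strategyName()` and `RecordingRewriteDeleteFiles` are hypothetical additions so the chosen strategy can be observed, and the interface is reduced to the single strategy method. A later strategy could be added as another explicitly named method next to it.

```java
// Reduced sketch of the action: one explicitly named strategy method.
interface RewriteDeleteFiles {
  // Selects bin-packing of position deletes; returns this for method chaining.
  RewriteDeleteFiles binPackPositionDeletes();

  // Hypothetical accessor, added only to make the selected strategy visible.
  String strategyName();
}

// Hypothetical implementation that just records which strategy was chosen.
final class RecordingRewriteDeleteFiles implements RewriteDeleteFiles {
  private String strategy = "none";

  @Override
  public RewriteDeleteFiles binPackPositionDeletes() {
    this.strategy = "binPackPositionDeletes";
    return this;
  }

  @Override
  public String strategyName() {
    return strategy;
  }
}
```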

##########
File path: api/src/main/java/org/apache/iceberg/actions/RewriteDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for rewriting the position delete files according to a rewrite strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.
+ */
+public interface RewriteDeleteFiles extends SnapshotUpdate<RewriteDeleteFiles, RewriteDeleteFiles.Result> {
+
+  /**
+   * Combine the position deletes.
+   *
+   * @return this for method chaining
+   */
+  RewriteDeleteFiles combinePositionDeletes();
+
+  /**
+   * A filter for choosing deletes to rewrite.
+   *
+   * @param expression An iceberg expression used to choose deletes.
+   * @return this for method chaining
+   */
+  RewriteDeleteFiles filter(Expression expression);
+
+  /**
+   * The action result that contains a summary of the execution.
+   */
+  interface Result {

Review comment:
       I assume we will compact each partition separately so we may want to provide some file group stats in the future, just like we do in data compaction. Too early to tell whether that will be useful, though. Just something to keep in mind.

##########
File path: core/src/main/java/org/apache/iceberg/actions/ConvertDeleteStrategy.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import java.util.Map;
+import java.util.Set;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.FileScanTask;
+import org.apache.iceberg.Table;
+
+public interface ConvertDeleteStrategy {

Review comment:
       @RussellSpitzer, we may want to rename `RewriteStrategy` to something data-specific since we are about to add more rewrite strategies.

##########
File path: core/src/main/java/org/apache/iceberg/actions/ConvertDeleteStrategy.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import java.util.Map;
+import java.util.Set;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.FileScanTask;
+import org.apache.iceberg.Table;
+
+public interface ConvertDeleteStrategy {
+
+  /**
+   * Returns the name of this convert deletes strategy
+   */
+  String name();
+
+  /**
+   * Returns the table being modified by this convert strategy
+   */
+  Table table();
+
+  /**
+   * Returns a set of options which this convert strategy can use. This is an 
allowed-list and any options not
+   * specified here will be rejected at runtime.
+   */
+  Set<String> validOptions();
+
+  /**
+   * Sets options to be used with this strategy
+   */
+  RewriteDeleteStrategy options(Map<String, String> options);
+
+  /**
+   * Select the delete files to convert.
+   *
+   * @return iterable of original delete file to be converted.
+   */
+  Iterable<DeleteFile> selectDeleteFiles();
+
+  /**
+   * Define how to convert the deletes.
+   *
+   * @param deleteFilesToConvert a group of files to be converted together
+   * @return iterable of delete files used to replace the original delete 
files.
+   */
+  Iterable<DeleteFile> convertDeleteFiles(Iterable<DeleteFile> 
deleteFilesToConvert);
+
+  /**
+   * Groups delete files into lists which will be processed in a single 
executable unit. Each group will end up being
+   * committed as an independent set of changes. This creates the jobs which 
will eventually be run as by the underlying
+   * Action.
+   *
+   * @param dataFiles iterable of data files that contain the DeleteFile to be 
converted
+   * @return iterable of lists of FileScanTasks which will be processed 
together
+   */
+  Iterable<Iterable<FileScanTask>> planDeleteFileGroups(Iterable<FileScanTask> 
dataFiles);

Review comment:
       Hm, I am not sure I get why we are accepting `dataFiles` here. It seems 
to be borrowed from the data compaction and does not match the design doc. I 
agree we should split delete files into group that will be processed 
independently but I am not sure what data files we will pass here.

##########
File path: core/src/main/java/org/apache/iceberg/actions/ConvertDeleteStrategy.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import java.util.Map;
+import java.util.Set;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.FileScanTask;
+import org.apache.iceberg.Table;
+
+public interface ConvertDeleteStrategy {
+
+  /**
+   * Returns the name of this convert deletes strategy
+   */
+  String name();
+
+  /**
+   * Returns the table being modified by this convert strategy
+   */
+  Table table();
+
+  /**
+   * Returns a set of options which this convert strategy can use. This is an allowed-list and any options not
+   * specified here will be rejected at runtime.
+   */
+  Set<String> validOptions();
+
+  /**
+   * Sets options to be used with this strategy
+   */
+  RewriteDeleteStrategy options(Map<String, String> options);
+
+  /**
+   * Select the delete files to convert.
+   *
+   * @return iterable of original delete file to be converted.
+   */
+  Iterable<DeleteFile> selectDeleteFiles();
+
+  /**
+   * Define how to convert the deletes.
+   *
+   * @param deleteFilesToConvert a group of files to be converted together
+   * @return iterable of delete files used to replace the original delete files.
+   */
+  Iterable<DeleteFile> convertDeleteFiles(Iterable<DeleteFile> deleteFilesToConvert);
+
+  /**
+   * Groups delete files into lists which will be processed in a single executable unit. Each group will end up being
+   * committed as an independent set of changes. This creates the jobs which will eventually be run as by the underlying
+   * Action.
+   *
+   * @param dataFiles iterable of data files that contain the DeleteFile to be converted
+   * @return iterable of lists of FileScanTasks which will be processed together
+   */
+  Iterable<Iterable<FileScanTask>> planDeleteFileGroups(Iterable<FileScanTask> dataFiles);

Review comment:
       In the case of data compaction, it is an iterable of lists of data 
files, each list to be processed independently.



