stevenzwu commented on a change in pull request #2841:
URL: https://github.com/apache/iceberg/pull/2841#discussion_r730121720



##########
File path: core/src/main/java/org/apache/iceberg/actions/ConvertDeleteStrategy.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import java.util.Map;
+import java.util.Set;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.FileScanTask;
+import org.apache.iceberg.Table;
+
+public interface ConvertDeleteStrategy {
+
+  /**
+   * Returns the name of this convert deletes strategy
+   */
+  String name();
+
+  /**
+   * Returns the table being modified by this convert strategy
+   */
+  Table table();
+
+  /**
+   * Returns a set of options which this convert strategy can use. This is an allowed-list and any options not
+   * specified here will be rejected at runtime.
+   */
+  Set<String> validOptions();
+
+  /**
+   * Sets options to be used with this strategy
+   */
+  RewriteDeleteStrategy options(Map<String, String> options);
+
+  /**
+   * Select the delete files to convert.
+   *
+   * @return iterable of original delete file to be converted.
+   */
+  Iterable<DeleteFile> selectDeleteFiles();
+
+  /**
+   * Define how to convert the deletes.
+   *
+   * @param deleteFilesToConvert a group of files to be converted together
+   * @return iterable of delete files used to replace the original delete files.
+   */
+  Iterable<DeleteFile> convertDeleteFiles(Iterable<DeleteFile> deleteFilesToConvert);
+
+  /**
+   * Groups delete files into lists which will be processed in a single executable unit. Each group will end up being
+   * committed as an independent set of changes. This creates the jobs which will eventually be run as by the underlying
+   * Action.
+   *
+   * @param dataFiles iterable of data files that contain the DeleteFile to be converted
+   * @return iterable of lists of FileScanTasks which will be processed together
+   */
+  Iterable<Iterable<FileScanTask>> planDeleteFileGroups(Iterable<FileScanTask> dataFiles);

Review comment:
       What about returning `Iterable<CombinedScanTask>` here instead?
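
A rough sketch of what that signature could look like. `FileScanTask` and `CombinedScanTask` below are cut-down stand-ins for the Iceberg interfaces so the snippet compiles on its own; `path()`, `ConvertDeleteStrategySketch`, and `OneTaskPerGroup` are hypothetical names used only for illustration, not part of this PR.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Stand-in for org.apache.iceberg.FileScanTask; path() is a hypothetical accessor.
interface FileScanTask {
  String path();
}

// Stand-in for org.apache.iceberg.CombinedScanTask, reduced to files().
interface CombinedScanTask {
  Collection<FileScanTask> files();
}

// Hypothetical variant of the strategy method with the suggested return type.
interface ConvertDeleteStrategySketch {
  Iterable<CombinedScanTask> planDeleteFileGroups(Iterable<FileScanTask> dataFiles);
}

// Toy planner (assumption): every task becomes its own group.
class OneTaskPerGroup implements ConvertDeleteStrategySketch {
  @Override
  public Iterable<CombinedScanTask> planDeleteFileGroups(Iterable<FileScanTask> dataFiles) {
    List<CombinedScanTask> groups = new ArrayList<>();
    for (FileScanTask task : dataFiles) {
      Collection<FileScanTask> files = Collections.singletonList(task);
      groups.add(() -> files);
    }
    return groups;
  }
}
```

The nested `Iterable<Iterable<FileScanTask>>` would collapse into a single named type, which may make the grouping easier to read at call sites.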

##########
File path: core/src/main/java/org/apache/iceberg/actions/ConvertDeleteStrategy.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import java.util.Map;
+import java.util.Set;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.FileScanTask;
+import org.apache.iceberg.Table;
+
+public interface ConvertDeleteStrategy {

Review comment:
       It might be helpful to add some Javadoc here. I assume this is for converting equality deletes to position deletes, as described in the design doc?

##########
File path: core/src/main/java/org/apache/iceberg/actions/RewriteDeleteStrategy.java
##########
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import java.util.Map;
+import java.util.Set;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.Table;
+
+public interface RewriteDeleteStrategy {
+
+  /**
+   * Returns the name of this rewrite deletes strategy
+   */
+  String name();
+
+  /**
+   * Returns the table being modified by this rewrite strategy
+   */
+  Table table();
+
+  /**
+   * Returns a set of options which this rewrite strategy can use. This is an allowed-list and any options not
+   * specified here will be rejected at runtime.
+   */
+  Set<String> validOptions();
+
+  /**
+   * Sets options to be used with this strategy
+   */
+  RewriteDeleteStrategy options(Map<String, String> options);
+
+  /**
+   * Select the delete files to rewrite.
+   *
+   * @return iterable of original delete file to be replaced.
+   */
+  Iterable<DeleteFile> selectDeleteFiles();
+
+  /**
+   * Define how to rewrite the deletes.
+   *
+   * @param deleteFilesToRewrite a group of files to be rewritten together
+   * @return iterable of delete files used to replace the original delete files.
+   */
+  Iterable<DeleteFile> rewriteDeleteFiles(Iterable<DeleteFile> deleteFilesToRewrite);
+
+  /**
+   * Groups into lists which will be processed in a single executable unit. Each group will end up being
+   * committed as an independent set of changes. This creates the jobs which will eventually be run as by the underlying
+   * Action.
+   *
+   * @param deleteFiles iterable of DeleteFile to be rewritten
+   * @return iterable of lists of FileScanTasks which will be processed together
+   */
+  Iterable<Iterable<DeleteFile>> planDeleteFileGroups(Iterable<DeleteFile> deleteFiles);

Review comment:
       Wondering why this method plans with `DeleteFile` while `ConvertDeleteStrategy` plans with `FileScanTask`.

       Also wondering whether both interfaces in this PR could use `RewriteFileGroup` as the return type? @RussellSpitzer

##########
File path: api/src/main/java/org/apache/iceberg/actions/ConvertDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for converting the equality delete files according to a convert strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.
+ */
+public interface ConvertDeleteFiles extends SnapshotUpdate<ConvertDeleteFiles, ConvertDeleteFiles.Result> {
+
+  /**
+   * Convert the equality deletes to the position deletes.
+   *
+   * @return this for method chaining
+   */
+  ConvertDeleteFiles convertEqualityDeletes();
+
+  /**
+   * A filter for choosing the equality deletes to convert.
+   *
+   * @param expression An iceberg expression used to choose deletes.
+   * @return this for method chaining
+   */
+  ConvertDeleteFiles filter(Expression expression);
+
+  /**
+   * The action result that contains a summary of the execution.
+   */
+  interface Result {
+    /**
+     * Returns the count of the deletes that been converted.
+     */
+    int convertedDeleteFilesCount();

Review comment:
       Wondering if both interfaces can share the same `Result` interface. Could both call this `removedDeleteFilesCount`, which would also match the other method, `addedDeleteFilesCount`?
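
Something along these lines, as a sketch only. `DeleteFilesActionResult`, `ConvertDeleteFilesSketch`, `RewriteDeleteFilesSketch`, and `SimpleDeleteFilesResult` are hypothetical names made up for this comment; only the two method names come from the discussion above.

```java
// Hypothetical shared result type exposing the two counts under common names.
interface DeleteFilesActionResult {
  /** Number of delete files removed (converted or rewritten) by the action. */
  int removedDeleteFilesCount();

  /** Number of delete files written by the action. */
  int addedDeleteFilesCount();
}

// Each action could keep its own nested Result while inheriting the shared methods.
interface ConvertDeleteFilesSketch {
  interface Result extends DeleteFilesActionResult {}
}

interface RewriteDeleteFilesSketch {
  interface Result extends DeleteFilesActionResult {}
}

// Minimal immutable implementation usable by both actions.
final class SimpleDeleteFilesResult
    implements ConvertDeleteFilesSketch.Result, RewriteDeleteFilesSketch.Result {
  private final int removed;
  private final int added;

  SimpleDeleteFilesResult(int removed, int added) {
    this.removed = removed;
    this.added = added;
  }

  @Override
  public int removedDeleteFilesCount() {
    return removed;
  }

  @Override
  public int addedDeleteFilesCount() {
    return added;
  }
}
```

Keeping the nested `Result` interfaces would leave room for action-specific fields later without changing the shared method names.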



