chenjunjiedada commented on a change in pull request #2841:
URL: https://github.com/apache/iceberg/pull/2841#discussion_r730232003



##########
File path: core/src/main/java/org/apache/iceberg/actions/ConvertDeleteStrategy.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import java.util.Map;
+import java.util.Set;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.FileScanTask;
+import org.apache.iceberg.Table;
+
+public interface ConvertDeleteStrategy {
+
+  /**
+   * Returns the name of this convert deletes strategy
+   */
+  String name();
+
+  /**
+   * Returns the table being modified by this convert strategy
+   */
+  Table table();
+
+  /**
+   * Returns a set of options which this convert strategy can use. This is an allowed-list and any options not
+   * specified here will be rejected at runtime.
+   */
+  Set<String> validOptions();
+
+  /**
+   * Sets options to be used with this strategy
+   */
+  RewriteDeleteStrategy options(Map<String, String> options);
+
+  /**
+   * Select the delete files to convert.
+   *
+   * @return iterable of original delete file to be converted.
+   */
+  Iterable<DeleteFile> selectDeleteFiles();
+
+  /**
+   * Define how to convert the deletes.
+   *
+   * @param deleteFilesToConvert a group of files to be converted together
+   * @return iterable of delete files used to replace the original delete files.
+   */
+  Iterable<DeleteFile> convertDeleteFiles(Iterable<DeleteFile> deleteFilesToConvert);
+
+  /**
+   * Groups delete files into lists which will be processed in a single executable unit. Each group will end up being
+   * committed as an independent set of changes. This creates the jobs which will eventually be run as by the underlying
+   * Action.
+   *
+   * @param dataFiles iterable of data files that contain the DeleteFile to be converted
+   * @return iterable of lists of FileScanTasks which will be processed together
+   */
+  Iterable<Iterable<FileScanTask>> planDeleteFileGroups(Iterable<FileScanTask> dataFiles);

Review comment:
       There is one minor difference between rewriting position deletes and 
converting equality deletes: the convert action needs planning and the rewrite 
action does not. That's why we use `FileScanTask` here and `DeleteFile` in the 
rewrite action. This has been changed back and forth. 
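
A minimal sketch of why the two planning inputs differ, assuming simplified stand-in types rather than the real Iceberg interfaces. `PlanningSketch`, `planRewriteGroups`, `planConvertGroups`, and the `groupSize` parameter are hypothetical names used only for illustration. Rewrite planning can group the delete files directly, while convert planning has to start from scan tasks, because only those tie each equality delete to the data files it applies to.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the Iceberg API: DeleteFile and FileScanTask below
// are simplified stand-ins for the interfaces of the same name.
final class PlanningSketch {
  static final class DeleteFile {
    final String path;

    DeleteFile(String path) {
      this.path = path;
    }
  }

  static final class FileScanTask {
    final String dataFilePath;
    final List<DeleteFile> deletes;

    FileScanTask(String dataFilePath, List<DeleteFile> deletes) {
      this.dataFilePath = dataFilePath;
      this.deletes = deletes;
    }
  }

  // Rewrite: the input is already the delete files, so grouping is a plain
  // partition of that list into chunks of at most groupSize files.
  static List<List<DeleteFile>> planRewriteGroups(List<DeleteFile> deleteFiles, int groupSize) {
    if (groupSize < 1) {
      throw new IllegalArgumentException("groupSize must be at least 1");
    }
    List<List<DeleteFile>> groups = new ArrayList<>();
    for (int start = 0; start < deleteFiles.size(); start += groupSize) {
      int end = Math.min(start + groupSize, deleteFiles.size());
      groups.add(new ArrayList<>(deleteFiles.subList(start, end)));
    }
    return groups;
  }

  // Convert: each equality delete must be matched against the data files it
  // applies to, so the input is scan tasks. Here tasks are grouped by the path
  // of each delete file they reference (an illustrative grouping choice).
  static Map<String, List<FileScanTask>> planConvertGroups(List<FileScanTask> tasks) {
    Map<String, List<FileScanTask>> groups = new LinkedHashMap<>();
    for (FileScanTask task : tasks) {
      for (DeleteFile delete : task.deletes) {
        groups.computeIfAbsent(delete.path, p -> new ArrayList<>()).add(task);
      }
    }
    return groups;
  }
}
```

The interface under review returns `Iterable<Iterable<FileScanTask>>`; the map in this sketch only makes the delete-to-task association visible.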

##########
File path: core/src/main/java/org/apache/iceberg/actions/RewriteDeleteStrategy.java
##########
@@ -0,0 +1,74 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import java.util.Map;
+import java.util.Set;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.Table;
+
+public interface RewriteDeleteStrategy {
+
+  /**
+   * Returns the name of this rewrite deletes strategy
+   */
+  String name();
+
+  /**
+   * Returns the table being modified by this rewrite strategy
+   */
+  Table table();
+
+  /**
+   * Returns a set of options which this rewrite strategy can use. This is an allowed-list and any options not
+   * specified here will be rejected at runtime.
+   */
+  Set<String> validOptions();
+
+  /**
+   * Sets options to be used with this strategy
+   */
+  RewriteDeleteStrategy options(Map<String, String> options);
+
+  /**
+   * Select the delete files to rewrite.
+   *
+   * @return iterable of original delete file to be replaced.
+   */
+  Iterable<DeleteFile> selectDeleteFiles();
+
+  /**
+   * Define how to rewrite the deletes.
+   *
+   * @param deleteFilesToRewrite a group of files to be rewritten together
+   * @return iterable of delete files used to replace the original delete files.
+   */
+  Iterable<DeleteFile> rewriteDeleteFiles(Iterable<DeleteFile> deleteFilesToRewrite);
+
+  /**
+   * Groups into lists which will be processed in a single executable unit. Each group will end up being
+   * committed as an independent set of changes. This creates the jobs which will eventually be run as by the underlying
+   * Action.
+   *
+   * @param deleteFiles iterable of DeleteFile to be rewritten
+   * @return iterable of lists of FileScanTasks which will be processed together
+   */
+  Iterable<Iterable<DeleteFile>> planDeleteFileGroups(Iterable<DeleteFile> deleteFiles);

Review comment:
       There is one difference between rewriting position deletes and 
converting equality deletes: the planning. 

##########
File path: core/src/main/java/org/apache/iceberg/actions/ConvertDeleteStrategy.java
##########
@@ -0,0 +1,75 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import java.util.Map;
+import java.util.Set;
+import org.apache.iceberg.DeleteFile;
+import org.apache.iceberg.FileScanTask;
+import org.apache.iceberg.Table;
+
+public interface ConvertDeleteStrategy {

Review comment:
       Will do.

##########
File path: api/src/main/java/org/apache/iceberg/actions/ConvertDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for converting the equality delete files according to a convert strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.
+ */
+public interface ConvertDeleteFiles extends SnapshotUpdate<ConvertDeleteFiles, ConvertDeleteFiles.Result> {

Review comment:
       Makes sense to me. The function below should be used in the original 
API definition, which has both the rewrite and the convert pieces.

##########
File path: api/src/main/java/org/apache/iceberg/actions/ConvertDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for converting the equality delete files according to a convert strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.

Review comment:
       Hm, I think the first one should be enough. Let me delete them.

##########
File path: api/src/main/java/org/apache/iceberg/actions/RewriteDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for rewriting the position delete files according to a rewrite strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.
+ */
+public interface RewriteDeleteFiles extends SnapshotUpdate<RewriteDeleteFiles, RewriteDeleteFiles.Result> {
+
+  /**
+   * Combine the position deletes.
+   *
+   * @return this for method chaining
+   */
+  RewriteDeleteFiles combinePositionDeletes();

Review comment:
       I called this combineXXX because I think bin-packing is a kind of 
combine action. But this may indeed be vague if we add something like reshuffle 
in the future. Let me make this more specific.
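
A minimal sketch of what a more specific method name could look like, assuming a simplified fluent action rather than the real Iceberg interfaces. Every name here (`NamingSketch`, `binPackPositionDeletes`, `InMemoryRewrite`, `rewrittenDeleteFilesCount`, and the `filesPerOutput` parameter) is hypothetical and not taken from the PR. Naming the method after the strategy leaves room for a later sibling such as a reshuffle method without changing what the existing one means.

```java
// Hypothetical sketch, not the Iceberg API: the names below only illustrate
// a strategy-specific method on a fluent action.
final class NamingSketch {
  interface Result {
    int rewrittenDeleteFilesCount();

    int addedDeleteFilesCount();
  }

  interface RewritePositionDeleteFiles {
    // The strategy-specific name says how the deletes are combined.
    RewritePositionDeleteFiles binPackPositionDeletes();

    Result execute();
  }

  // Toy implementation that only counts files; it does no real I/O.
  static final class InMemoryRewrite implements RewritePositionDeleteFiles {
    private final int inputFiles;
    private final int filesPerOutput;
    private boolean binPack = false;

    InMemoryRewrite(int inputFiles, int filesPerOutput) {
      if (inputFiles < 0 || filesPerOutput < 1) {
        throw new IllegalArgumentException("inputFiles must be >= 0 and filesPerOutput >= 1");
      }
      this.inputFiles = inputFiles;
      this.filesPerOutput = filesPerOutput;
    }

    @Override
    public RewritePositionDeleteFiles binPackPositionDeletes() {
      this.binPack = true;
      return this;
    }

    @Override
    public Result execute() {
      if (!binPack) {
        throw new IllegalStateException("No rewrite strategy selected");
      }
      // Ceiling division: each output file absorbs up to filesPerOutput inputs.
      final int added = (inputFiles + filesPerOutput - 1) / filesPerOutput;
      final int rewritten = inputFiles;
      return new Result() {
        @Override
        public int rewrittenDeleteFilesCount() {
          return rewritten;
        }

        @Override
        public int addedDeleteFilesCount() {
          return added;
        }
      };
    }
  }
}
```

With these stand-ins, `new NamingSketch.InMemoryRewrite(5, 2).binPackPositionDeletes().execute()` reports 5 rewritten files and 3 added files.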

##########
File path: api/src/main/java/org/apache/iceberg/actions/RewriteDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for rewriting the position delete files according to a rewrite strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.
+ */
+public interface RewriteDeleteFiles extends SnapshotUpdate<RewriteDeleteFiles, RewriteDeleteFiles.Result> {
+
+  /**
+   * Combine the position deletes.
+   *
+   * @return this for method chaining
+   */
+  RewriteDeleteFiles combinePositionDeletes();
+
+  /**
+   * A filter for choosing deletes to rewrite.
+   *
+   * @param expression An iceberg expression used to choose deletes.
+   * @return this for method chaining
+   */
+  RewriteDeleteFiles filter(Expression expression);
+
+  /**
+   * The action result that contains a summary of the execution.
+   */
+  interface Result {

Review comment:
       Thanks. We could always improve the API later.

##########
File path: api/src/main/java/org/apache/iceberg/actions/ConvertDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for converting the equality delete files according to a convert strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.
+ */
+public interface ConvertDeleteFiles extends SnapshotUpdate<ConvertDeleteFiles, ConvertDeleteFiles.Result> {
+
+  /**
+   * Convert the equality deletes to the position deletes.
+   *
+   * @return this for method chaining
+   */
+  ConvertDeleteFiles convertEqualityDeletes();
+
+  /**
+   * A filter for choosing the equality deletes to convert.
+   *
+   * @param expression An iceberg expression used to choose deletes.
+   * @return this for method chaining
+   */
+  ConvertDeleteFiles filter(Expression expression);
+
+  /**
+   * The action result that contains a summary of the execution.
+   */
+  interface Result {
+    /**
+     * Returns the count of the deletes that been converted.
+     */
+    int convertedDeleteFilesCount();

Review comment:
       So we prefer to keep the result separate, right?

##########
File path: api/src/main/java/org/apache/iceberg/actions/ConvertDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for converting the equality delete files according to a convert strategy.
+ * Generally used for optimizing the sizing and layout of delete files within a table.
+ */
+public interface ConvertDeleteFiles extends SnapshotUpdate<ConvertDeleteFiles, ConvertDeleteFiles.Result> {
+
+  /**
+   * Convert the equality deletes to the position deletes.
+   *
+   * @return this for method chaining
+   */
+  ConvertDeleteFiles convertEqualityDeletes();
+
+  /**
+   * A filter for choosing the equality deletes to convert.
+   *
+   * @param expression An iceberg expression used to choose deletes.
+   * @return this for method chaining
+   */
+  ConvertDeleteFiles filter(Expression expression);
+
+  /**
+   * The action result that contains a summary of the execution.
+   */
+  interface Result {
+    /**
+     * Returns the count of the deletes that been converted.
+     */
+    int convertedDeleteFilesCount();
+
+    /**
+     * Returns the count of the added position delete files.
+     */
+    int addedDeleteFilesCount();

Review comment:
       Makes sense to me. Will do.

##########
File path: api/src/main/java/org/apache/iceberg/actions/RewriteDeleteFiles.java
##########
@@ -0,0 +1,59 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.actions;
+
+import org.apache.iceberg.expressions.Expression;
+
+/**
+ * An action for rewriting the position delete files according to a rewrite strategy.

Review comment:
       Instead of making the API and doc generic, how about making them more 
specific? I mean changing `RewriteDeleteFiles` to `RewritePositionDeleteFiles`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


