szehon-ho commented on code in PR #5469:
URL: https://github.com/apache/iceberg/pull/5469#discussion_r941891416
##########
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/actions/ExpireSnapshotsSparkAction.java:
##########
@@ -140,19 +142,39 @@ public ExpireSnapshotsSparkAction deleteWith(Consumer<String> newDeleteFunc) {
*
* <p>This does not delete data files. To delete data files, run {@link #execute()}.
*
- * <p>This may be called before or after {@link #execute()} is called to return the expired file
- * list.
+ * <p>This may be called before or after {@link #execute()} to return the expired files.
*
* @return a Dataset of files that are no longer referenced by the table
+ * @deprecated since 1.0.0, will be removed in 1.1.0; use {@link #expireFiles()} instead.
*/
+ @Deprecated
public Dataset<Row> expire() {
- if (expiredFiles == null) {
+ // rely on the same query execution to reuse shuffles
+ QueryExecution queryExecution = expiredFileDS().queryExecution();
+ return new Dataset<>(queryExecution, RowEncoder.apply(queryExecution.analyzed().schema()));
+ }
+
+ /**
+ * Expires snapshots and commits the changes to the table, returning a Dataset of files to delete.
+ *
+ * <p>This does not delete data files. To delete data files, run {@link #execute()}.
+ *
+ * <p>This may be called before or after {@link #execute()} to return the expired files.
Review Comment:
Looking at this more, I think it is correct but a bit confusing for a user
(for example, should they call it before or after execute()?). Would it make
more sense to have a method expiredFiles() that returns the saved dataset from
execute(), and simply not have it work before execute()?
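
To make the suggestion concrete, here is a rough sketch of the contract I have in mind. It is not the actual action: ExpireActionSketch is a hypothetical stand-in, a List<String> of paths replaces Dataset<Row> so it runs without Spark, and throwing IllegalStateException is only one assumed way of "not working" before execute().

```java
import java.util.List;

// Hypothetical stand-in for the Spark action; names and types are assumptions.
final class ExpireActionSketch {
  private final List<String> candidates;
  private List<String> expiredFiles; // stays null until execute() has run

  ExpireActionSketch(List<String> candidates) {
    this.candidates = List.copyOf(candidates);
  }

  // Stand-in for execute(): computes the expired files once and saves them.
  void execute() {
    this.expiredFiles = candidates;
  }

  // Returns the result saved by execute(); fails fast if called too early.
  List<String> expiredFiles() {
    if (expiredFiles == null) {
      throw new IllegalStateException("expiredFiles() requires execute() to be called first");
    }
    return expiredFiles;
  }

  public static void main(String[] args) {
    ExpireActionSketch action = new ExpireActionSketch(List.of("a.parquet", "b.parquet"));
    try {
      action.expiredFiles();
    } catch (IllegalStateException e) {
      System.out.println("rejected before execute(): " + e.getMessage());
    }
    action.execute();
    System.out.println("expired files: " + action.expiredFiles());
  }
}
```

With this shape there is only one valid call order, so the Javadoc no longer has to explain "before or after".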
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]