szehon-ho commented on code in PR #4229:
URL: https://github.com/apache/iceberg/pull/4229#discussion_r879931194
##########
docs/spark/spark-procedures.md:
##########
@@ -275,6 +275,11 @@ See the [`RewriteDataFiles` Javadoc](../../../javadoc/{{%
icebergVersion %}}/org
and <br/> [`SortStrategy` Javadoc](../../../javadoc/{{% icebergVersion
%}}/org/apache/iceberg/actions/SortStrategy.html#field.summary)
for list of all the supported options for this action.
+!!! Note
+ This procedure can also read the delete files created by merge-on-read
mode if present
Review Comment:
How about removing 'created by merge-on-read mode'? I don't think we ever say
'merge-on-read' mode anywhere in the docs.
##########
docs/spark/spark-procedures.md:
##########
@@ -275,6 +275,11 @@ See the [`RewriteDataFiles` Javadoc](../../../javadoc/{{%
icebergVersion %}}/org
and <br/> [`SortStrategy` Javadoc](../../../javadoc/{{% icebergVersion
%}}/org/apache/iceberg/actions/SortStrategy.html#field.summary)
for list of all the supported options for this action.
+!!! Note
+ This procedure can also read the delete files created by merge-on-read
mode if present
+ and rewrite to new data files by masking the result from delete files.
+ By default, delete files will not be considered for compaction. In order
to change this behavior, need to set `delete-file-threshold`.
Review Comment:
Do we need to add a section documenting the various strategy settings,
like this one?
##########
docs/spark/spark-procedures.md:
##########
@@ -275,6 +275,11 @@ See the [`RewriteDataFiles` Javadoc](../../../javadoc/{{%
icebergVersion %}}/org
and <br/> [`SortStrategy` Javadoc](../../../javadoc/{{% icebergVersion
%}}/org/apache/iceberg/actions/SortStrategy.html#field.summary)
for list of all the supported options for this action.
+!!! Note
+ This procedure can also read the delete files created by merge-on-read
mode if present
+ and rewrite to new data files by masking the result from delete files.
Review Comment:
Instead of "masking the result", I'm thinking we should stay closer to the language in
https://iceberg.apache.org/spec/#version-2-row-level-deletes
How about:
```
This procedure can also read applicable delete files to delete rows from the original data file during the rewrite.
```
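For illustration only, here is a toy Python model of the behavior the note describes: a data file is picked up for rewrite only when a threshold on its associated delete files is set and met, and the rewrite drops the deleted rows. This is a sketch, not Iceberg's implementation. `DataFile`, `rewrite`, and the idea of modeling each delete file as a set of row positions are all hypothetical names and simplifications; only the `delete-file-threshold` option name comes from the diff above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class DataFile:
    """Hypothetical stand-in for a data file and its applicable delete files."""
    rows: List[str]
    # Each set models one delete file: the row positions it marks as deleted.
    delete_files: List[Set[int]] = field(default_factory=list)


def rewrite(data_file: DataFile, delete_file_threshold: Optional[int] = None) -> DataFile:
    """Return a compacted copy, or the same object if the file is not selected.

    Assumption: with no threshold set, delete files never trigger a rewrite,
    mirroring the "by default ... not considered" wording in the note.
    """
    if delete_file_threshold is None or len(data_file.delete_files) < delete_file_threshold:
        return data_file
    # Union all deleted positions, then keep only rows that were not deleted.
    deleted = set().union(*data_file.delete_files)
    kept = [row for pos, row in enumerate(data_file.rows) if pos not in deleted]
    return DataFile(rows=kept)


original = DataFile(rows=["a", "b", "c", "d"], delete_files=[{1}, {3}])
untouched = rewrite(original)                            # no threshold: left as-is
compacted = rewrite(original, delete_file_threshold=2)   # two delete files meet the threshold
```

In this model the rewritten file carries no delete files, since the deleted rows have already been removed from its contents.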
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]