dramaticlly commented on code in PR #13837:
URL: https://github.com/apache/iceberg/pull/13837#discussion_r2283518243
##########
spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteTablePathSparkAction.java:
##########
@@ -304,12 +307,29 @@ private String rebuildMetadata() {
.collect(Collectors.toSet());
rewritePositionDeletes(deleteFiles);
+ ImmutableRewriteTablePath.Result.Builder builder =
+ ImmutableRewriteTablePath.Result.builder()
+ .stagingLocation(stagingDir)
+ .deleteFilesCount(deleteFiles.size())
+ .metadataFilesCount(metaFiles.size())
+ .latestVersion(RewriteTablePathUtil.fileName(endVersionName));
+
+ // skip file list
+ if (skipFileList) {
+ return builder.fileListLocation("").build();
Review Comment:
can you return `builder.build()` instead?
##########
spark/v4.0/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRewriteTablePathProcedure.java:
##########
@@ -171,6 +171,23 @@ public void testProcedureWithInvalidInput() {
"Cannot find provided version file %s in metadata log.",
"v11.metadata.json");
}
+ @TestTemplate
+ public void testRewriteTablePathWithSkipFileList() {
Review Comment:
Let's add another test where we rewritten delete and manifest file count is
no zero and assert on the procedure output?
##########
api/src/main/java/org/apache/iceberg/actions/RewriteTablePath.java:
##########
@@ -112,5 +122,15 @@ interface Result {
/** Name of latest metadata file version */
String latestVersion();
+
+ /** count of rewrite delete files, default value is 0 */
+ default int deleteFilesCount() {
+ return 0;
+ }
+
+ /** count of rewrite metadata files involved, default value is 0 */
+ default int metadataFilesCount() {
+ return 0;
Review Comment:
I think rewrite metadata files can be confused with metadata.json files, if
this is count of rewritten manifest files. let's call it
rewrittenManifestFilesCount? Same wise for rewrittenDeleteFilesCount above.
##########
api/src/main/java/org/apache/iceberg/actions/RewriteTablePath.java:
##########
@@ -112,5 +122,15 @@ interface Result {
/** Name of latest metadata file version */
String latestVersion();
+
+ /** count of rewrite delete files, default value is 0 */
+ default int deleteFilesCount() {
+ return 0;
+ }
+
+ /** count of rewrite metadata files involved, default value is 0 */
+ default int metadataFilesCount() {
+ return 0;
Review Comment:
Also, do we expect other counts might need for path based rewrite? I am
wondering how such information can be useful for programatic access in real
world scenario
##########
spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/actions/RewriteTablePathSparkAction.java:
##########
@@ -263,7 +265,7 @@ private String jobDesc() {
* <li>Get all files needed to move
* </ul>
*/
- private String rebuildMetadata() {
+ private Result rebuildMetadata() {
Review Comment:
I think we shall add a corresponding test in TestRewriteTablePathsAction
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]