slfan1989 commented on PR #13882:
URL: https://github.com/apache/iceberg/pull/13882#issuecomment-3238466586
> @slfan1989 I'm not sure I understand the issue. This PR isn't fixing any
> tests correct? In the #13837 this return is being changed so that the tests
> which use URI.toString() don't break? Couldn't we just change those tests? I'm
> not really opposed to changing this to standardize but it feels like the tests
> shouldn't be relying on URI output?
>
> I'm more of a +0 here. If there was an obvious test this was fixing I'd be
> +1 but it doesn't seem like it has been a problem before?

@RussellSpitzer Thank you very much for your reply. I am indeed hitting a
unit test failure. It appears in the newly submitted PR #13837, which has not
been merged yet. From my perspective, unless we align the behavior of the
REST Catalog with that of the other catalogs, we may not be able to resolve
the issue effectively.
The failure occurs in the
TestRewriteTablePathProcedure#testRewriteTablePathWithoutFileList unit test.
With the REST Catalog, the prefix check in RewriteTablePathUtil makes the test
fail with the following error message:
```
Path file:/var/folders/2k/21gv5vmx6z7dlr1g0_jtm8r80000ks/T/iceberg_warehouse10689209309257803697/iceberg_data/default/table/data/deletes.parquet does not start with /var/folders/2k/21gv5vmx6z7dlr1g0_jtm8r80000ks/T/iceberg_warehouse10689209309257803697/iceberg_data/default/table/
java.lang.IllegalArgumentException: Path file:/var/folders/2k/21gv5vmx6z7dlr1g0_jtm8r80000ks/T/iceberg_warehouse10689209309257803697/iceberg_data/default/table/data/deletes.parquet does not start with /var/folders/2k/21gv5vmx6z7dlr1g0_jtm8r80000ks/T/iceberg_warehouse10689209309257803697/iceberg_data/default/table/
    at org.apache.iceberg.RewriteTablePathUtil.relativize(RewriteTablePathUtil.java:686)
    at org.apache.iceberg.RewriteTablePathUtil.newPath(RewriteTablePathUtil.java:663)
    at org.apache.iceberg.RewriteTablePathUtil.writeDeleteFileEntry(RewriteTablePathUtil.java:492)
    at org.apache.iceberg.RewriteTablePathUtil.lambda$rewriteDeleteManifest$5(RewriteTablePathUtil.java:438)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1845)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
```
The purpose of `RewriteTablePath` is to move or copy an Iceberg table to a
new location, updating the paths in the metadata files so the table can be
fully or incrementally copied to another storage location. As a safeguard,
the original author added a check that requires every path in the metadata
files to start with the original table's location. The problem we're
encountering is that the location returned by the REST Catalog is not in URI
format (it has no `file:` scheme, as the error message above shows), so the
check fails. In the unit test, we tried removing the `file:` prefix from the
other path variables so that `REST Catalog` passes the check, but the
resulting paths are then inconsistent with the paths of the other catalogs,
and the check fails again for them.
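To make the failure mode concrete, here is a minimal sketch of a prefix check of this kind. It is a simplified stand-in written for illustration, not the actual `RewriteTablePathUtil` source, and the paths are shortened placeholders rather than the ones from the failing test.

```java
public class PrefixCheckSketch {

  // Simplified stand-in for the prefix check; NOT the actual Iceberg code.
  public static String relativize(String path, String prefix) {
    // Make sure the prefix ends with a separator before comparing.
    String toRemove = prefix.endsWith("/") ? prefix : prefix + "/";
    if (!path.startsWith(toRemove)) {
      throw new IllegalArgumentException(
          String.format("Path %s does not start with %s", path, toRemove));
    }
    return path.substring(toRemove.length());
  }

  public static void main(String[] args) {
    // Placeholder paths (assumptions for illustration only).
    String deleteFile = "file:/tmp/warehouse/default/table/data/deletes.parquet";

    // Table location carries the file: scheme, so the check passes.
    System.out.println(relativize(deleteFile, "file:/tmp/warehouse/default/table"));

    // Table location has no scheme, so the check throws.
    try {
      relativize(deleteFile, "/tmp/warehouse/default/table");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The comparison is purely textual, so a `file:` scheme on one side and a bare path on the other is enough to trigger the exception even though both refer to the same directory.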
The difficulty is that this is a JUnit 5 parameterized test that runs four
times, once per catalog. If we modify the paths so that `REST Catalog`
passes, the test works for `REST Catalog`, but the other catalogs (`Hive
Catalog`, `Hadoop Catalog`, and `Spark Session Catalog`) hit the same kind of
error. Therefore, to resolve this issue effectively, we need to align the
behavior of the different catalogs.
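As a rough illustration of why aligning the catalogs helps, the sketch below runs the same textual prefix check against four location strings. The location values and the `withFileScheme` helper are hypothetical, chosen only to mirror the situation described above; this is not the change made in this PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CatalogLocationSketch {

  // Hypothetical helper: give bare absolute paths a file: scheme.
  public static String withFileScheme(String location) {
    return location.startsWith("/") ? "file:" + location : location;
  }

  public static void main(String[] args) {
    // Placeholder locations (assumptions): REST returns a bare path,
    // the other catalogs return a location with the file: scheme.
    Map<String, String> locations = new LinkedHashMap<>();
    locations.put("REST Catalog", "/tmp/warehouse/default/table");
    locations.put("Hive Catalog", "file:/tmp/warehouse/default/table");
    locations.put("Hadoop Catalog", "file:/tmp/warehouse/default/table");
    locations.put("Spark Session Catalog", "file:/tmp/warehouse/default/table");

    String deleteFile = "file:/tmp/warehouse/default/table/data/deletes.parquet";

    for (Map.Entry<String, String> entry : locations.entrySet()) {
      // Check against the location exactly as the catalog reports it.
      boolean raw = deleteFile.startsWith(entry.getValue() + "/");
      // Check again after bringing every location to the same form.
      boolean aligned = deleteFile.startsWith(withFileScheme(entry.getValue()) + "/");
      System.out.printf("%s: raw=%b aligned=%b%n", entry.getKey(), raw, aligned);
    }
  }
}
```

With these placeholder values, only the REST entry fails the raw check, while every entry passes once the locations share one form. Stripping `file:` from the test paths instead would flip the result: REST would pass and the other three would fail.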