slfan1989 commented on PR #13882:
URL: https://github.com/apache/iceberg/pull/13882#issuecomment-3209649471

   > > which caused the path prefix validation to fail
   > 
   > can you elaborate which path prefix validation you mean here? I think it 
would be good to have a test here as well to show what was causing the issue 
that required this fix
   > 
   > > while our unit tests consistently used toURI().toString()
   > 
   > I do actually see quite a lot of other places in tests that use 
`getAbsolutePath()`, such as 
https://github.com/apache/iceberg/blob/main/core/src/test/java/org/apache/iceberg/rest/TestRESTCatalog.java#L124.
   > 
   > I'm not saying that this is a good/bad change, I'm just trying to 
understand why this is all of a sudden needed now
   
   @nastra Many thanks for your comment!
   
   In #13837, I added a new unit test to verify that the two newly introduced 
metrics, `rewrittenDeleteFilesCount` and `rewrittenManifestFilesCount`, are 
greater than 0.
   
   This unit test was added to `TestRewriteTablePathProcedure` and is executed 
across multiple catalogs: `HIVE`, `HADOOP`, `SPARK_SESSION`, and `REST`.
   
   The test passes normally under `HIVE`, `HADOOP`, and `SPARK_SESSION`, but it 
fails under the `REST` catalog with the following exception:
   
   ```
   Path 
file:/var/folders/2k/21gv5vmx6z7dlr1g0_jtm8r80000ks/T/iceberg_warehouse10689209309257803697/iceberg_data/default/table/data/deletes.parquet
 does not start with 
/var/folders/2k/21gv5vmx6z7dlr1g0_jtm8r80000ks/T/iceberg_warehouse10689209309257803697/iceberg_data/default/table/
   java.lang.IllegalArgumentException: Path 
file:/var/folders/2k/21gv5vmx6z7dlr1g0_jtm8r80000ks/T/iceberg_warehouse10689209309257803697/iceberg_data/default/table/data/deletes.parquet
 does not start with 
/var/folders/2k/21gv5vmx6z7dlr1g0_jtm8r80000ks/T/iceberg_warehouse10689209309257803697/iceberg_data/default/table/
        at 
org.apache.iceberg.RewriteTablePathUtil.relativize(RewriteTablePathUtil.java:686)
        at 
org.apache.iceberg.RewriteTablePathUtil.newPath(RewriteTablePathUtil.java:663)
        at 
org.apache.iceberg.RewriteTablePathUtil.writeDeleteFileEntry(RewriteTablePathUtil.java:492)
        at 
org.apache.iceberg.RewriteTablePathUtil.lambda$rewriteDeleteManifest$5(RewriteTablePathUtil.java:438)
        at 
java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
        at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
        at 
java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1845)
        at 
java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
   ```
   
   The root cause of this issue lies in the **different ways the base Hive 
warehouse path is generated in the test environment**:
   
   ```
   warehouseLocation = new File(tmp, "iceberg_data").getAbsolutePath();
   ```
   
   This approach produces an absolute path without the file: prefix, for 
example:
   ```
   /var/folders/.../iceberg_data/default/table/
   ```
   
   In contrast, most other unit tests use the following:
   
   ```
   String location = targetTableDir.toFile().toURI().toString();
   ```
   
   This method generates a path with the file: prefix, for example:
   
   ```
   file:/var/folders/.../iceberg_data/default/table/
   ```
   
   <img width="1822" height="707" alt="image" 
src="https://github.com/user-attachments/assets/7cdd0d89-ca29-4446-b764-76f57a11e17e";
 />
   
   From my perspective, I also believe that using .toURI().toString() provides 
better compatibility.
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to