Baunsgaard opened a new issue, #15172:
URL: https://github.com/apache/iceberg/issues/15172
### Apache Iceberg version
1.10.1 (latest release)
### Query engine
None
### Please describe the bug 🐞
`RewriteTablePathUtil.relativize()` throws `IllegalArgumentException` when
the path equals the prefix exactly, rather than being a child path under it.
This breaks `rewrite_table_path` when table properties like
`write.data.path` point to the table root.
### Expected behavior
`relativize("/path/to/table", "/path/to/table")` should return `""` (empty
string representing root).
### Actual behavior
Throws `IllegalArgumentException: Path /path/to/table does not start with
/path/to/table/`
### Root Cause
[`RewriteTablePathUtil.relativize()`](https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java#L758-L765)
appends `/` to the prefix before checking `startsWith()`:
```java
public static String relativize(String path, String prefix) {
  String toRemove = maybeAppendFileSeparator(prefix); // "/table" -> "/table/"
  if (!path.startsWith(toRemove)) {
    throw new IllegalArgumentException(...); // FAILS when path == prefix
  }
  return path.substring(toRemove.length());
}
```
### Steps to reproduce
```java
String prefix = "/path/to/table";
// Works - path is under prefix
RewriteTablePathUtil.relativize("/path/to/table/data/file.parquet", prefix);
// ✓ returns "data/file.parquet"
// Fails - path equals prefix (e.g., write.data.path = table root)
RewriteTablePathUtil.relativize("/path/to/table", prefix); // ✗ throws IllegalArgumentException
```
### Use Case
This affects `rewrite_table_path` for tables where `write.data.path` or
`write.metadata.path` equals the table root:
```sql
-- Table with write.data.path set to table root (valid configuration)
CREATE TABLE catalog.db.events (id BIGINT, data STRING)
USING iceberg
LOCATION 's3://bucket/warehouse/db/events'
TBLPROPERTIES ('write.data.path' = 's3://bucket/warehouse/db/events');
-- Replicating table to DR region:
CALL catalog.system.rewrite_table_path(
  'db.events',                          -- table
  's3://bucket/warehouse/db/events',    -- source_prefix (table location)
  's3://bucket-dr/warehouse/db/events'  -- target_prefix
);
-- Fails when processing write.data.path property in updatePathInProperty():
-- IllegalArgumentException: Path s3://bucket/warehouse/db/events
-- does not start with s3://bucket/warehouse/db/events/
```
**Affected scenarios:**
- **Storage migration**: Moving tables between buckets or storage systems
- **Backup/restore**: Archiving table metadata to different locations
The `location` field itself works (uses `replaceFirst`), but path properties
go through `relativize()` which fails on this edge case.
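
To illustrate the difference between the two code paths, here is a minimal, self-contained sketch. It does not call Iceberg: `PrefixSwapSketch`, `strictRelativize`, and `swapWithReplaceFirst` are hypothetical stand-ins, and the regex quoting is an assumption of this sketch rather than a claim about how Iceberg calls `replaceFirst`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PrefixSwapSketch {

  // Hypothetical stand-in mirroring the strict check quoted under "Root Cause".
  public static String strictRelativize(String path, String prefix) {
    String toRemove = prefix.endsWith("/") ? prefix : prefix + "/";
    if (!path.startsWith(toRemove)) {
      throw new IllegalArgumentException(
          String.format("Path %s does not start with %s", path, toRemove));
    }
    return path.substring(toRemove.length());
  }

  // Prefix swap in the style of replaceFirst. Both arguments are quoted so they
  // are treated as literal text (an assumption of this sketch).
  public static String swapWithReplaceFirst(String path, String source, String target) {
    return path.replaceFirst(Pattern.quote(source), Matcher.quoteReplacement(target));
  }

  public static void main(String[] args) {
    String source = "s3://bucket/warehouse/db/events";
    String target = "s3://bucket-dr/warehouse/db/events";

    // The exact-match path is rewritten without complaint.
    System.out.println(swapWithReplaceFirst(source, source, target));

    // The same input is rejected by the strict relativize-style check.
    try {
      strictRelativize(source, source);
    } catch (IllegalArgumentException e) {
      System.out.println("relativize failed: " + e.getMessage());
    }
  }
}
```

With the exact-match input, the `replaceFirst`-style swap yields the target location, while the strict check throws because the path lacks the trailing separator that was appended to the prefix.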
### Proposed Fix
Handle the case where path equals prefix by checking after normalization:
```java
public static String relativize(String path, String prefix) {
  String toRemove = maybeAppendFileSeparator(prefix);
  if (path.startsWith(toRemove)) {
    return path.substring(toRemove.length());
  }
  // Handle exact match where path equals prefix (without trailing separator)
  if (maybeAppendFileSeparator(path).equals(toRemove)) {
    return "";
  }
  throw new IllegalArgumentException(
      String.format("Path %s does not start with %s", path, toRemove));
}
```
I can submit a PR with this fix and tests.
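
As a rough idea of the cases such tests might cover, the sketch below copies the proposed logic into a standalone class and checks it with plain assertions. `RelativizeFixSketch` and `check` are hypothetical names, and `maybeAppendFileSeparator` is reimplemented here on the assumption that it appends `/` only when the path does not already end with one. The real tests would call `RewriteTablePathUtil` directly.

```java
final class RelativizeFixSketch {
  private static final String FILE_SEPARATOR = "/";

  // Assumed behavior of the helper: append "/" unless already present.
  public static String maybeAppendFileSeparator(String path) {
    return path.endsWith(FILE_SEPARATOR) ? path : path + FILE_SEPARATOR;
  }

  // Same logic as the proposed fix above.
  public static String relativize(String path, String prefix) {
    String toRemove = maybeAppendFileSeparator(prefix);
    if (path.startsWith(toRemove)) {
      return path.substring(toRemove.length());
    }
    // Exact match: path equals prefix once trailing separators are normalized.
    if (maybeAppendFileSeparator(path).equals(toRemove)) {
      return "";
    }
    throw new IllegalArgumentException(
        String.format("Path %s does not start with %s", path, toRemove));
  }

  private static void check(boolean condition, String message) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }

  public static void main(String[] args) {
    String prefix = "/path/to/table";

    // Child path: unchanged behavior.
    check(relativize("/path/to/table/data/file.parquet", prefix)
        .equals("data/file.parquet"), "child path");

    // Exact match, with and without trailing separators on either side.
    check(relativize("/path/to/table", prefix).isEmpty(), "exact match");
    check(relativize("/path/to/table/", prefix).isEmpty(), "path with trailing slash");
    check(relativize("/path/to/table", prefix + "/").isEmpty(), "prefix with trailing slash");

    // A sibling that merely shares a string prefix must still be rejected.
    boolean rejected = false;
    try {
      relativize("/path/to/table2/data", prefix);
    } catch (IllegalArgumentException e) {
      rejected = true;
    }
    check(rejected, "sibling path should be rejected");

    System.out.println("all checks passed");
  }
}
```

The sibling case is included because the exact-match branch compares normalized strings with `equals`, so `/path/to/table2/...` still falls through to the exception.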
### Willingness to contribute
- [x] I can contribute a fix for this bug independently
- [ ] I would be willing to contribute a fix for this bug with guidance from
the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time