[ 
https://issues.apache.org/jira/browse/HDDS-14937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18083273#comment-18083273
 ] 

Chu Cheng Li commented on HDDS-14937:
-------------------------------------

Found some recent improvements to table path handling in Iceberg, FYI.
 * [https://github.com/apache/iceberg/pull/15630]
 * [https://github.com/ClickHouse/ClickHouse/issues/102321]
 * [https://github.com/apache/iceberg/issues/13141]
 * [https://docs.google.com/document/d/1a6tXvbWVbvOxiRexCiaIsIA6NOOqcbxPKjeoJGqgYtg/edit?tab=t.p92mo7cvg08q]

> Ozone native implementation of Iceberg RewriteTablePath
> -------------------------------------------------------
>
>                 Key: HDDS-14937
>                 URL: https://issues.apache.org/jira/browse/HDDS-14937
>             Project: Apache Ozone
>          Issue Type: Epic
>            Reporter: Sreeja
>            Assignee: Sreeja
>            Priority: Major
>
> Iceberg tables stored in Apache Ozone traditionally (that is, tables created 
> via ofs) use absolute paths with the "ofs://" protocol prefix. These 
> absolute paths prevent the table from being accessed via S3, even when a 
> bucket link exists.
> This Epic introduces a native Ozone implementation of Iceberg's 
> [RewriteTablePath|https://github.com/apache/iceberg/blob/1.10.x/api/src/main/java/org/apache/iceberg/actions/RewriteTablePath.java] 
> action to enable seamless protocol migration with zero data file copy. 
> Iceberg also provides core utility methods in 
> [RewriteTablePathUtil|https://github.com/apache/iceberg/blob/1.10.x/core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java] 
> that Ozone can use for the same purpose.
> This approach is particularly useful when integrating with REST-based 
> catalogs such as Apache Polaris, which expect S3-compatible locations.
> We will implement Iceberg's action and use RewriteTablePathUtil to 
> perform a "metadata-only" migration:
>  * *Traverse* the table’s metadata history.
>  * *Rewrite* all internal absolute paths from a sourcePrefix (e.g., ofs://) 
> to a targetPrefix (e.g., s3a:// or s3://).
>  * *Stage* the updated metadata files in a temporary location.
>  * *Perform Zero Data Copy:* The actual data files remain untouched; only the 
> "pointers" in the metadata (metadata version file, manifest list, manifest 
> file, position delete file) are updated.
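
As a rough illustration of the rewrite step listed above, here is a minimal Java sketch of prefix substitution on individual paths. The class and method names are hypothetical and are not part of Iceberg or Ozone; the actual implementation is expected to go through Iceberg's RewriteTablePathUtil, whose behavior may differ. The sketch assumes a prefix should only match on a whole path segment, so that a table named test_table does not also match test_table2. The data file name used in the comment is made up.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not an Iceberg or Ozone API.
final class PathPrefixRewriteSketch {
    private PathPrefixRewriteSketch() {}

    // True when 'path' starts with 'prefix' and the prefix ends on a path-segment boundary.
    static boolean startsWithSegment(String path, String prefix) {
        if (!path.startsWith(prefix)) {
            return false;
        }
        return prefix.endsWith("/")
            || path.length() == prefix.length()
            || path.charAt(prefix.length()) == '/';
    }

    // Swaps sourcePrefix for targetPrefix; paths outside the source prefix are returned unchanged.
    static String rewritePath(String path, String sourcePrefix, String targetPrefix) {
        if (!startsWithSegment(path, sourcePrefix)) {
            return path;
        }
        return targetPrefix + path.substring(sourcePrefix.length());
    }

    // Applies rewritePath to every entry, for example the data file paths listed in a manifest.
    static List<String> rewriteAll(List<String> paths, String sourcePrefix, String targetPrefix) {
        return paths.stream()
            .map(p -> rewritePath(p, sourcePrefix, targetPrefix))
            .collect(Collectors.toList());
    }
}
```

With sourcePrefix "ofs://om:9862/vol1/buck1/my_db/test_table" and targetPrefix "s3://buck1link/my_db/test_table", a path such as "ofs://om:9862/vol1/buck1/my_db/test_table/data/file-a.parquet" would become "s3://buck1link/my_db/test_table/data/file-a.parquet". Only the string changes; the file it points to is not moved.
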
> For example:
> Suppose an Iceberg table is stored in an Ozone volume/bucket under an ofs:// 
> path, say "{*}ofs://om:9862/vol1/buck1/my_db/test_table{*}". All file 
> references stored across the table’s metadata hierarchy are then recorded as 
> absolute ofs:// paths. This includes:
>  * Table metadata files (table location, manifest-list location, previous 
> metadata file locations)
>  * Manifest list files (pointing to manifest files)
>  * Manifest files (pointing to data files)
>  * Position delete files (referencing affected data files)
> Sample metadata file (before rewrite):
> {code:java}
> {
>   "format-version": 2,
>   "table-uuid": "9b791462-d257-45e5-92f8-435302d2c335",
>   "location": "ofs://ozone-om:9862/vol1/buck1/my_db/test_table",
>   ...
>   "snapshots": [
>     {
>       ...
>       "manifest-list": "ofs://ozone-om:9862/vol1/buck1/my_db/test_table/metadata/snap-1753351619419365870-1-5ac51133-8cbf-4327-bbf8-0559b463e1f9.avro",
>       "schema-id": 0
>     },
>     {
>       ...
>       "manifest-list": "ofs://ozone-om:9862/vol1/buck1/my_db/test_table/metadata/snap-176890185746044789-1-5061c816-61b1-43e4-84e8-0ad689c2ea86.avro",
>       "schema-id": 0
>     }
>   ],
>   ...
>   "metadata-log": [
>     {
>       "timestamp-ms": 1774448474465,
>       "metadata-file": "ofs://ozone-om:9862/vol1/buck1/my_db/test_table/metadata/00000-d480d223-a92f-4255-be8c-fef1714bb423.metadata.json"
>     },
>     {
>       "timestamp-ms": 1774448493051,
>       "metadata-file": "ofs://ozone-om:9862/vol1/buck1/my_db/test_table/metadata/00001-3d20e8d6-e151-4442-a0d7-55533f27cf09.metadata.json"
>     }
>   ]
> }
> {code}
> If we now try to access this table via a REST-based catalog such as Apache 
> Polaris, the request fails because Polaris expects s3:// or s3a:// locations:
> {code:java}
> org.apache.iceberg.exceptions.ForbiddenException: Forbidden: Invalid 
> locations '[ofs://om:9862/vol1/buck1/my_db/test_table]' for identifier 
> 'my_db.test_table': ofs://om:9862/vol1/buck1/my_db/test_table is not in the 
> list of allowed locations{code}
> We won't even be able to register the table with the Polaris catalog, because 
> it sees ofs:// paths in the files. Likewise, any engine that tries to access 
> the table via S3 fails because it cannot resolve ofs:// paths.
> h3. Rewriting paths to S3
> To make the table accessible via S3-compatible systems {*}without copying 
> data files{*}, we use Ozone's native implementation of Iceberg's 
> {{RewriteTablePath}} action.
> Steps:
>  # *Create a bucket link* in the Ozone {{/s3v}} volume pointing to the bucket 
> where the table exists.
>  # Provide the *sourcePrefix* 
> ({{{}ofs://om:9862/vol1/buck1/my_db/test_table{}}}) and *targetPrefix* 
> ({{{}s3://buck1link/my_db/test_table{}}}).
>  # Optionally, provide *start/end metadata versions* or a *staging location* 
> for the rewritten metadata files.
>  # Run the rewrite action — this updates all embedded paths in metadata 
> version files, manifest lists, manifests, and position delete files 
> {*}without touching the actual data{*}.
>  # Copy the rewritten metadata from the staging location back to the table’s 
> location (not handled automatically by Ozone's implementation).
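
To make steps 2 to 4 more concrete, the following is a simplified Java sketch for a single metadata JSON file: read it, replace the source prefix in quoted string values, and write the result to a staging directory while leaving the original file untouched. This is an assumption-laden illustration, not Ozone's implementation. The class and method names are hypothetical, plain string substitution stands in for Iceberg's RewriteTablePathUtil, and Avro manifest lists, manifests, and position delete files are not handled here.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch for illustration only; not an Iceberg or Ozone API.
final class StagedMetadataRewriteSketch {
    private StagedMetadataRewriteSketch() {}

    // Naive rewrite: replaces the source prefix wherever it begins a quoted JSON string value.
    // It does not check path-segment boundaries, which a real implementation should.
    static String rewriteJson(String json, String sourcePrefix, String targetPrefix) {
        return json.replace("\"" + sourcePrefix, "\"" + targetPrefix);
    }

    // Writes the rewritten copy into stagingDir under the same file name.
    // The original metadata file is only read, never modified.
    static Path stage(Path metadataFile, Path stagingDir,
                      String sourcePrefix, String targetPrefix) throws IOException {
        String original = new String(Files.readAllBytes(metadataFile), StandardCharsets.UTF_8);
        String rewritten = rewriteJson(original, sourcePrefix, targetPrefix);
        Files.createDirectories(stagingDir);
        Path staged = stagingDir.resolve(metadataFile.getFileName().toString());
        Files.write(staged, rewritten.getBytes(StandardCharsets.UTF_8));
        return staged;
    }
}
```

Copying the staged files back to the table location (step 5) is deliberately left out, matching the note that this is not handled automatically.
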
> Sample metadata file (after rewrite):
> {code:java}
> {
>   "format-version": 2,
>   "table-uuid": "9b791462-d257-45e5-92f8-435302d2c335",
>   "location": "s3://buck1link/my_db/test_table",
>   ...
>   "snapshots": [
>     {
>       ...
>       "manifest-list": "s3://buck1link/my_db/test_table/metadata/snap-1753351619419365870-1-5ac51133-8cbf-4327-bbf8-0559b463e1f9.avro",
>       "schema-id": 0
>     },
>     {
>       ...
>       "manifest-list": "s3://buck1link/my_db/test_table/metadata/snap-176890185746044789-1-5061c816-61b1-43e4-84e8-0ad689c2ea86.avro",
>       "schema-id": 0
>     }
>   ],
>   ...
>   "metadata-log": [
>     {
>       "timestamp-ms": 1774448474465,
>       "metadata-file": "s3://buck1link/my_db/test_table/metadata/00000-d480d223-a92f-4255-be8c-fef1714bb423.metadata.json"
>     },
>     {
>       "timestamp-ms": 1774448493051,
>       "metadata-file": "s3://buck1link/my_db/test_table/metadata/00001-3d20e8d6-e151-4442-a0d7-55533f27cf09.metadata.json"
>     }
>   ]
> }
> {code}
> After this rewrite:
>  * The table is {*}accessible via S3{*}.
>  * It can now be *registered with Polaris* without any path-related errors.
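
One way to sanity-check the outcome described above is to scan a rewritten metadata JSON file for any string value that still starts with the old prefix. The sketch below is a hypothetical helper, not part of Iceberg, Ozone, or Polaris, and it only inspects JSON text; Avro manifests would need a separate check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical verification helper for illustration only.
final class LeftoverPrefixCheckSketch {
    private LeftoverPrefixCheckSketch() {}

    // Returns every quoted string value in the JSON text that still begins with sourcePrefix.
    static List<String> findLeftovers(String metadataJson, String sourcePrefix) {
        Pattern quotedValue = Pattern.compile("\"(" + Pattern.quote(sourcePrefix) + "[^\"]*)\"");
        Matcher matcher = quotedValue.matcher(metadataJson);
        List<String> leftovers = new ArrayList<>();
        while (matcher.find()) {
            leftovers.add(matcher.group(1));
        }
        return leftovers;
    }
}
```

An empty result for the prefix "ofs://" suggests the metadata file no longer carries ofs paths. It does not by itself prove that a catalog will accept the table.
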
> NOTE: The hadoop-ozone/iceberg module should be enabled only when building 
> with JDK ≥ 11.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
