[ 
https://issues.apache.org/jira/browse/HDDS-14937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreeja updated HDDS-14937:
--------------------------
    Description: 
Iceberg tables stored in Apache Ozone traditionally (that is, tables created 
via ofs) use absolute paths with the "ofs://" protocol prefix. These absolute 
paths prevent the table from being accessed via S3, even when a bucket link 
exists.

This Epic introduces a native Ozone implementation of Iceberg's 
[RewriteTablePath|https://github.com/apache/iceberg/blob/1.10.x/api/src/main/java/org/apache/iceberg/actions/RewriteTablePath.java]
action to enable seamless protocol migration with zero data file copy. Iceberg 
also provides core utility methods in 
[RewriteTablePathUtil|https://github.com/apache/iceberg/blob/1.10.x/core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java]
that Ozone can use for the same purpose.

This approach is particularly useful when integrating with REST-based catalogs 
such as Apache Polaris, which expect S3-compatible locations.

We will implement Iceberg's action and use RewriteTablePathUtil to perform 
a "metadata-only" migration in the following steps:
 # *Traverse* the table’s metadata history.

 # *Rewrite* all internal absolute paths from a sourcePrefix (e.g., ofs://) to 
a targetPrefix (e.g., s3a:// or s3://).

 # *Stage* the updated metadata files in a temporary location.

 # *Perform Zero Data Copy:* The actual data files remain untouched; only the 
"pointers" in the metadata are updated.
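
The steps above can be illustrated with a small, self-contained Python sketch. 
It is not the Ozone or Iceberg implementation: the helper names 
(rewrite_prefix, rewrite_node, stage_rewritten_metadata), the example prefixes, 
and the toy JSON layout are all hypothetical, and real Iceberg tables also 
carry paths in Avro manifest lists and manifests, which this sketch does not 
touch. It only shows the idea of rewriting path prefixes in metadata and 
staging the result while leaving the originals and data files alone.

```python
import json
import tempfile
from pathlib import Path


def rewrite_prefix(location, source_prefix, target_prefix):
    """Swap source_prefix for target_prefix; other locations pass through.

    Plain startswith matching is a simplification: a real implementation
    would also need to respect path-separator boundaries.
    """
    if location.startswith(source_prefix):
        return target_prefix + location[len(source_prefix):]
    return location


def rewrite_node(node, source_prefix, target_prefix):
    """Recursively rewrite every string value in a parsed JSON document."""
    if isinstance(node, dict):
        return {k: rewrite_node(v, source_prefix, target_prefix)
                for k, v in node.items()}
    if isinstance(node, list):
        return [rewrite_node(v, source_prefix, target_prefix) for v in node]
    if isinstance(node, str):
        return rewrite_prefix(node, source_prefix, target_prefix)
    return node


def stage_rewritten_metadata(metadata_files, source_prefix, target_prefix,
                             staging_dir):
    """Write rewritten copies into staging_dir; originals stay untouched."""
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)
    staged = []
    for path in map(Path, metadata_files):
        doc = json.loads(path.read_text())
        out = staging / path.name
        out.write_text(json.dumps(
            rewrite_node(doc, source_prefix, target_prefix), indent=2))
        staged.append(out)
    return staged


# Usage with hypothetical locations and a toy metadata file.
with tempfile.TemporaryDirectory() as tmp:
    src = "ofs://ozone1/vol1/bucket1/warehouse/db/tbl"
    dst = "s3a://bucket1-link/warehouse/db/tbl"
    original = Path(tmp) / "v1.metadata.json"
    original.write_text(json.dumps({
        "location": src,
        "snapshots": [{"manifest-list": src + "/metadata/snap-1.avro"}],
    }))
    staged = stage_rewritten_metadata(
        [original], src, dst, Path(tmp) / "staging")
    print(staged[0].read_text())
```

Writing the rewritten copies to a separate staging directory keeps the 
migration reversible: the source metadata is never modified in place, and no 
data file is read or copied.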


> Ozone native implementation of Iceberg RewriteTablePath
> -------------------------------------------------------
>
>                 Key: HDDS-14937
>                 URL: https://issues.apache.org/jira/browse/HDDS-14937
>             Project: Apache Ozone
>          Issue Type: Epic
>            Reporter: Sreeja
>            Assignee: Sreeja
>            Priority: Major


