Ritesh H Shukla created HDDS-7297:
-------------------------------------
Summary: Content sharing across objects.
Key: HDDS-7297
URL: https://issues.apache.org/jira/browse/HDDS-7297
Project: Apache Ozone
Issue Type: Improvement
Components: Ozone Datanode, Ozone Filesystem, Ozone Manager
Reporter: Ritesh H Shukla
Assignee: Ritesh H Shukla
This request/suggestion was brought up by [~omalley] during [ApacheCon
2022|https://www.apachecon.com/acna2022/].
When mutating a large table, applications could get a large performance boost
if they could reference data that is already stored instead of rewriting it.
The source could be another previously stored object, or an older version of
the same object (maintained by Iceberg, or via snapshots or object versions in
Ozone).
To make progress we need to:
# Identify the API surface that needs to be exposed for applications such as
Iceberg or ORC writers to leverage this feature. Should this be done by exposing
the underlying blocks, or by abstracting the blocks away and addressing content
only as ranges in a file that are sourced from other files (and their
corresponding ranges, similar to a scatter-gather list)?
## Look into whether this needs to be an extension of the vectorIO APIs.
## Is there a need to expose the layout of sharable content?
# Backend modeling of the API and how Ozone will make it work. This needs to
be reasoned across EC and Replication.
# How would this be made available as an extension to the S3 APIs, in addition
to OFS?
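To make the "ranges sourced from other files" option in item 1 concrete, here is a minimal sketch of a scatter-gather list in Python. It is not an Ozone API: {{SourceRange}}, {{resolve}}, {{materialize}}, and the in-memory {{store}} dictionary are hypothetical names invented for illustration, and a real backend would presumably share blocks rather than copy bytes.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SourceRange:
    """Hypothetical: bytes [offset, offset + length) of an existing object."""
    key: str
    offset: int
    length: int


def resolve(ranges: List[SourceRange], dest_offset: int) -> Tuple[str, int]:
    """Map an offset in the composed object to (source key, source offset)."""
    if dest_offset < 0:
        raise IndexError("negative offset")
    start = 0  # offset in the composed object where the current range begins
    for r in ranges:
        if dest_offset < start + r.length:
            return r.key, r.offset + (dest_offset - start)
        start += r.length
    raise IndexError("offset past end of composed object")


def materialize(store: Dict[str, bytes], ranges: List[SourceRange]) -> bytes:
    """Gather every range in order; copying here stands in for block sharing."""
    out = bytearray()
    for r in ranges:
        data = store[r.key]
        if r.offset < 0 or r.length < 0 or r.offset + r.length > len(data):
            raise ValueError(f"range outside source object {r.key!r}")
        out += data[r.offset:r.offset + r.length]
    return bytes(out)


# Assumed example data: version 2 of a table reuses the first and last
# 4 bytes of version 1 and takes only the middle 4 bytes from a small patch.
store = {"table/v1": b"AAAABBBBCCCC", "table/patch": b"xxxx"}
v2 = [
    SourceRange("table/v1", 0, 4),
    SourceRange("table/patch", 0, 4),
    SourceRange("table/v1", 8, 4),
]
```

With this layout, only the 4 patch bytes are new data; the other 8 bytes of the new version are references into the old object. Whether such a list is exposed to applications directly or hidden behind a higher-level API is exactly the open question in item 1.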
HDDS-7288 (https://issues.apache.org/jira/browse/HDDS-7288) is a duplicate of
this one. Filing this to capture the full context of the discussion.