Ritesh H Shukla created HDDS-7297:
-------------------------------------

             Summary: Content sharing across objects.
                 Key: HDDS-7297
                 URL: https://issues.apache.org/jira/browse/HDDS-7297
             Project: Apache Ozone
          Issue Type: Improvement
          Components: Ozone Datanode, Ozone Filesystem, Ozone Manager
            Reporter: Ritesh H Shukla
            Assignee: Ritesh H Shukla


This request/suggestion was brought up by [~omalley] during [ApacheCon 
2022|https://www.apachecon.com/acna2022/].

When mutating a large table, applications could see a large performance boost 
if they could reference data already stored in other objects or in other 
versions of the same object, instead of rewriting it. Those sources could be 
older snapshots or earlier object versions (maintained by Iceberg, or via 
snapshots or object versions in Ozone).

To make progress, we need to:
 # Identify the API surface that needs to be exposed for applications such as 
Iceberg or ORC writers to leverage this feature. Should this be done by exposing 
the underlying blocks, or by abstracting the blocks away and addressing the 
content only as ranges in a file to be sourced from other files (and their 
corresponding ranges, similar to a scatter-gather list)?
 ## Investigate whether this needs to be an extension of the vectored IO APIs.
 ## Is there a need to expose the layout of sharable content?
 # Backend modeling of the API and how Ozone will make it work. This needs to 
be reasoned about for both EC and replication.
 # How would this be made available as an extension to S3 APIs in addition to 
OFS.
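
To make the range-based option in item 1 concrete, here is a minimal sketch of a scatter-gather style description of a new object built from byte ranges of existing objects. Everything in it is hypothetical: the class names ({{SourceRange}}, {{ComposedObject}}), the {{append}} and {{resolve}} methods, and the example keys are assumptions for illustration only and are not part of any existing Ozone or Hadoop API. It also says nothing about how blocks, EC, or replication would back such a description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; none of these types exist in Ozone. */
final class ContentSharingSketch {

    /** A byte range borrowed from an object that is already stored. */
    static final class SourceRange {
        final String sourceKey;   // key of the object that already holds the bytes
        final long sourceOffset;  // where the shared bytes start in the source
        final long length;        // number of bytes shared

        SourceRange(String sourceKey, long sourceOffset, long length) {
            if (sourceOffset < 0 || length <= 0) {
                throw new IllegalArgumentException("offset must be >= 0 and length > 0");
            }
            this.sourceKey = sourceKey;
            this.sourceOffset = sourceOffset;
            this.length = length;
        }
    }

    /** Ordered list of ranges that together make up the new object. */
    static final class ComposedObject {
        private final List<SourceRange> ranges = new ArrayList<>();

        ComposedObject append(String sourceKey, long sourceOffset, long length) {
            ranges.add(new SourceRange(sourceKey, sourceOffset, length));
            return this;
        }

        /** Logical size of the new object: the sum of all range lengths. */
        long totalLength() {
            long total = 0;
            for (SourceRange r : ranges) {
                total += r.length;
            }
            return total;
        }

        /** Map a logical offset in the new object to "sourceKey@sourceOffset". */
        String resolve(long logicalOffset) {
            long start = 0;
            for (SourceRange r : ranges) {
                if (logicalOffset >= start && logicalOffset < start + r.length) {
                    return r.sourceKey + "@" + (r.sourceOffset + (logicalOffset - start));
                }
                start += r.length;
            }
            throw new IndexOutOfBoundsException("offset " + logicalOffset);
        }
    }

    public static void main(String[] args) {
        // A new table version that reuses two ranges of v1 around a small delta.
        ComposedObject v2 = new ComposedObject()
            .append("table/v1", 0, 1024)
            .append("table/delta-0001", 0, 256)
            .append("table/v1", 2048, 512);
        System.out.println(v2.totalLength());   // 1792
        System.out.println(v2.resolve(1100));   // table/delta-0001@76
        System.out.println(v2.resolve(1300));   // table/v1@2068
    }
}
```

In this sketch the writer never sees blocks, only (key, offset, length) triples, which corresponds to the "abstract the blocks away" alternative. Exposing blocks directly would instead need the layout question in the sub-items to be answered first.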

https://issues.apache.org/jira/browse/HDDS-7288 is a duplicate of this one. 
Filing this to capture the full context of the discussion. 


