[jira] [Updated] (HDDS-9154) Optimize Snapdiff to compute diff from Delta SST Files in case of DAG based Diff

Wei-Chiu Chuang (Jira) Wed, 08 Apr 2026 18:32:13 -0700


     [ 
https://issues.apache.org/jira/browse/HDDS-9154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Wei-Chiu Chuang updated HDDS-9154:
----------------------------------
    Description: 
The current Snapshot Diff (SnapDiff) mechanism identifies changes between 
snapshots by constructing a reverse object ID mapping, a process that requires 
intensive I/O to read Delta SST files and perform cross-database lookups 
between the source and target snapshot RocksDB instances.

To improve efficiency, the diff engine can be redesigned to compute differences 
by directly scanning the internal contents of the SST files. By evaluating the 
RocksDB sequence numbers and values embedded within the SST entries, the system 
can determine key mutations without the need for exhaustive mapping or external 
lookups.
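
As a rough illustration of the idea above (not the Ozone implementation), the
sketch below classifies keys purely from the entries found in a set of SST
files. Everything in it is hypothetical: the SstEntry record, the
diff_from_entries function, and the from_seq/to_seq snapshot sequence numbers
are names invented for this example, and it assumes the scanned files together
contain every version of a key that is visible to either snapshot. Renames
tracked through object IDs are not handled.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class SstEntry:
    """Hypothetical, simplified view of one raw SST entry."""
    key: str
    seq: int                      # RocksDB sequence number of the write
    is_delete: bool               # True for a tombstone
    value: Optional[bytes] = None


def visible_value(versions: List[SstEntry], snapshot_seq: int) -> Optional[bytes]:
    """Return the value a snapshot taken at snapshot_seq would see, or None."""
    newest = None
    for entry in versions:
        if entry.seq <= snapshot_seq and (newest is None or entry.seq > newest.seq):
            newest = entry
    if newest is None or newest.is_delete:
        return None
    return newest.value


def diff_from_entries(entries: Iterable[SstEntry],
                      from_seq: int, to_seq: int) -> Dict[str, str]:
    """Classify each key as CREATE, DELETE or MODIFY between two snapshots."""
    by_key: Dict[str, List[SstEntry]] = {}
    for entry in entries:
        by_key.setdefault(entry.key, []).append(entry)

    result: Dict[str, str] = {}
    for key in sorted(by_key):
        old = visible_value(by_key[key], from_seq)
        new = visible_value(by_key[key], to_seq)
        if old is None and new is not None:
            result[key] = "CREATE"
        elif old is not None and new is None:
            result[key] = "DELETE"
        elif old != new:
            result[key] = "MODIFY"
        # Keys with identical visible state in both snapshots are skipped.
    return result


# Made-up entries; the source snapshot is at sequence 10, the target at 20.
entries = [
    SstEntry("/vol1/bucket1/k1", 5, False, b"v1"),
    SstEntry("/vol1/bucket1/k1", 12, False, b"v2"),
    SstEntry("/vol1/bucket1/k2", 7, False, b"x"),
    SstEntry("/vol1/bucket1/k2", 15, True),
    SstEntry("/vol1/bucket1/k3", 11, False, b"y"),
    SstEntry("/vol1/bucket1/k4", 3, False, b"z"),
]
print(diff_from_entries(entries, from_seq=10, to_seq=20))
```

With these entries, k1 is reported as MODIFY, k2 as DELETE and k3 as CREATE,
while k4 is omitted because its visible value is the same in both snapshots.
No lookup against either snapshot database is needed in this simplified model.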

This transition from a mapping-centric approach to a direct, metadata-driven 
analysis reduces the number of I/O operations, which in turn is expected to 
lower the latency and resource consumption of the SnapDiff computation.

  was:Currently Snapdiff is computed by getting a reverse object id mapping by 
reading the Delta sst Files and referring to the src and target snapshot 
rocksdb. Instead of computing the diff based on the contents of the sst files 
directly based on the internal rocksdb sequence number and value in the sst 
file which will save lot of io-ops in the diff computation. 


> Optimize Snapdiff to compute diff from Delta SST Files in case of DAG based 
> Diff
> --------------------------------------------------------------------------------
>
>                 Key: HDDS-9154
>                 URL: https://issues.apache.org/jira/browse/HDDS-9154
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Swaminathan Balachandran
>            Assignee: Swaminathan Balachandran
>            Priority: Critical



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
