[
https://issues.apache.org/jira/browse/HDDS-9154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wei-Chiu Chuang updated HDDS-9154:
----------------------------------
Description:
The current Snapshot Diff (SnapDiff) mechanism identifies changes between
snapshots by constructing a reverse object ID mapping, a process that requires
intensive I/O to read Delta SST files and perform cross-database lookups
between the source and target snapshot RocksDB instances.
To improve efficiency, the diff engine can be redesigned to compute differences
by directly scanning the internal contents of the SST files. By evaluating the
RocksDB sequence numbers and values embedded within the SST entries, the system
can determine key mutations without the need for exhaustive mapping or external
lookups.
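
As a rough illustration of the idea above, the sketch below classifies keys
purely from the sequence numbers and values of already-parsed entries. It is a
toy model under stated assumptions, not the Ozone implementation and not the
RocksDB API: SstEntry, diff_from_entries, from_seq and to_seq are hypothetical
names, a value of None stands in for a delete tombstone, and the scanned
entries are assumed to include the newest version of each key at or before the
source snapshot whenever one exists. Rename detection is left out.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class SstEntry:
    """Hypothetical, simplified view of one entry read from a delta SST file."""
    key: str
    seq: int                 # RocksDB sequence number of the write
    value: Optional[bytes]   # None models a delete tombstone (assumption)


def _visible(versions: List[SstEntry], max_seq: int) -> Optional[SstEntry]:
    """Return the newest version with seq <= max_seq, or None if there is none."""
    candidates = [e for e in versions if e.seq <= max_seq]
    return max(candidates, key=lambda e: e.seq) if candidates else None


def diff_from_entries(entries: Iterable[SstEntry],
                      from_seq: int, to_seq: int) -> Dict[str, str]:
    """Classify each key by comparing its state at from_seq and at to_seq."""
    by_key: Dict[str, List[SstEntry]] = {}
    for entry in entries:
        by_key.setdefault(entry.key, []).append(entry)

    report: Dict[str, str] = {}
    for key, versions in by_key.items():
        old = _visible(versions, from_seq)
        new = _visible(versions, to_seq)
        old_val = old.value if old else None
        new_val = new.value if new else None
        if old_val is None and new_val is not None:
            report[key] = "CREATE"
        elif old_val is not None and new_val is None:
            report[key] = "DELETE"
        elif old_val != new_val:
            report[key] = "MODIFY"
        # Equal values (or absent in both snapshots) mean no reported change.
    return report
```

In this model no lookup into either snapshot database is needed: every
decision comes from the entries themselves, which is the saving the
description is after.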
Replacing the mapping-based approach with a direct scan of SST entry metadata
avoids the cross-database lookups, which should cut the number of I/O
operations and, with it, the latency and resource consumption of the SnapDiff
computation.
was: Currently Snapdiff is computed by getting a reverse object id mapping by
reading the Delta sst Files and referring to the src and target snapshot
rocksdb. Instead of computing the diff based on the contents of the sst files
directly based on the internal rocksdb sequence number and value in the sst
file which will save lot of io-ops in the diff computation.
> Optimize Snapdiff to compute diff from Delta SST Files in case of DAG based
> Diff
> --------------------------------------------------------------------------------
>
> Key: HDDS-9154
> URL: https://issues.apache.org/jira/browse/HDDS-9154
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Swaminathan Balachandran
> Assignee: Swaminathan Balachandran
> Priority: Critical
>