[
https://issues.apache.org/jira/browse/HDDS-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703551#comment-17703551
]
Tsz-wo Sze edited comment on HDDS-8128 at 3/22/23 10:09 AM:
------------------------------------------------------------
In this JIRA, we will focus on RDBBatchOperation deduplication, where
RDBBatchOperation is a utility class used everywhere including OM, SCM, DN, etc.
Filed HDDS-8238 for further work specific to OM.
was (Author: szetszwo):
In this JIRA, we will focus on RDBBatchOperation deduplication, where
RDBBatchOperation is a utility class used everywhere including OM, SCM, DN, etc.
Filed HDDS-8238 for some further works specific OM.
> Deduplicate the ops in RDBBatchOperation
> ----------------------------------------
>
> Key: HDDS-8128
> URL: https://issues.apache.org/jira/browse/HDDS-8128
> Project: Apache Ozone
> Issue Type: Improvement
> Components: db
> Reporter: Tsz-wo Sze
> Assignee: Tsz-wo Sze
> Priority: Blocker
> Labels: pull-request-available
>
> In a multipart upload test, the key "testKey" had 1000 parts of 8 KB each.
> The same key was uploaded 10 times sequentially (i.e. each upload overwrote
> the previous one) in a newly formatted cluster. The replication factor was 3,
> so the total raw size of the key is ~24 MB (1000 x 8 KB x 3). After the test
> completed, the OM RocksDB used ~7.5 GB.
> In this JIRA, we add a cache to RDBBatchOperation for deduplication. Within
> a batch, the put-ops and delete-ops of the same key can be safely
> deduplicated. Only the last op has to be applied to the db. All the
> previous ops can be discarded.
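The deduplication idea described above can be illustrated with a minimal sketch. This is not the actual RDBBatchOperation code: the class DedupBatch, its methods, the String keys, and the in-memory Map standing in for the db are all hypothetical and chosen only for illustration. It also assumes the batch contains only single-key put and delete ops, so ops on different keys can be applied in any order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of per-batch deduplication; not the real RDBBatchOperation. */
final class DedupBatch {
  // Cache of the last op per key. A null value marks a delete-op.
  private final Map<String, byte[]> lastOps = new LinkedHashMap<>();
  // Number of earlier ops that were overwritten by a later op on the same key.
  private int discarded = 0;

  void put(String key, byte[] value) {
    if (value == null) {
      throw new IllegalArgumentException("value must not be null");
    }
    record(key, value);
  }

  void delete(String key) {
    record(key, null);
  }

  private void record(String key, byte[] valueOrNull) {
    if (lastOps.containsKey(key)) {
      discarded++; // the previous op on this key will never reach the db
    }
    lastOps.put(key, valueOrNull);
  }

  int pendingOps() {
    return lastOps.size();
  }

  int discardedOps() {
    return discarded;
  }

  /** Applies only the last op of each key to the store (a stand-in for the db). */
  void commit(Map<String, byte[]> store) {
    for (Map.Entry<String, byte[]> e : lastOps.entrySet()) {
      if (e.getValue() == null) {
        store.remove(e.getKey());
      } else {
        store.put(e.getKey(), e.getValue());
      }
    }
    lastOps.clear();
  }

  public static void main(String[] args) {
    DedupBatch batch = new DedupBatch();
    // Ten puts to the same key within one batch, as in the overwrite scenario.
    for (int i = 0; i < 10; i++) {
      batch.put("testKey", new byte[] {(byte) i});
    }
    // prints "1 pending, 9 discarded"
    System.out.println(batch.pendingOps() + " pending, " + batch.discardedOps() + " discarded");
  }
}
```

In this sketch a put followed by a delete of the same key collapses into a single delete, and repeated puts collapse into the last put, so the number of ops written per batch is bounded by the number of distinct keys rather than the number of calls.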
--
This message was sent by Atlassian Jira
(v8.20.10#820010)