[ 
https://issues.apache.org/jira/browse/HDDS-8128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703551#comment-17703551
 ] 

Tsz-wo Sze edited comment on HDDS-8128 at 3/22/23 10:09 AM:
------------------------------------------------------------

In this JIRA, we will focus on RDBBatchOperation deduplication.
RDBBatchOperation is a utility class used everywhere, including OM, SCM, and DN.

Filed HDDS-8238 for further work specific to OM.


was (Author: szetszwo):
In this JIRA, we will focus on RDBBatchOperation deduplication, where 
RDBBatchOperation is a utility class used everywhere including OM, SCM, DN, etc.

Filed HDDS-8238 for some further works specific OM.

> Deduplicate the ops in RDBBatchOperation
> ----------------------------------------
>
>                 Key: HDDS-8128
>                 URL: https://issues.apache.org/jira/browse/HDDS-8128
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: db
>            Reporter: Tsz-wo Sze
>            Assignee: Tsz-wo Sze
>            Priority: Blocker
>              Labels: pull-request-available
>
> In a multipart upload test, the key "testKey" had 1000 parts of 8 KB each.
> The same key was uploaded 10 times sequentially (i.e. each upload overwrote
> the previous one) in a newly formatted cluster.  The replication factor was
> 3, so the total raw size of the key is ~24 MB (1000 x 8 KB x 3).  After the
> test completed, the OM RocksDB used ~7.5 GB.
> In this JIRA, we add a cache to RDBBatchOperation for deduplication.  Within
> a batch, the put-ops and delete-ops on the same key can be safely
> deduplicated: only the last op has to be applied to the db, and all the
> previous ops on that key can be discarded.
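
The deduplication described above can be illustrated with a minimal Java
sketch. This is not the actual RDBBatchOperation code: the class name
DedupBatch, its methods, and the use of String keys and values (RocksDB keys
and values are byte arrays) are all assumptions made for illustration. The
idea is only that the cache keeps one pending op per key, and a later put or
delete on the same key replaces the earlier one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch of per-batch op deduplication; not Ozone code. */
class DedupBatch {
  /** A pending op: a put carries a value, a delete carries null. */
  static final class Op {
    final String key;
    final String value; // null means "delete this key"

    Op(String key, String value) {
      this.key = key;
      this.value = value;
    }

    boolean isDelete() {
      return value == null;
    }
  }

  // The dedup cache: at most one pending op per key.
  private final Map<String, Op> ops = new LinkedHashMap<>();
  // Number of earlier ops that were replaced and never reach the db.
  private int discarded = 0;

  void put(String key, String value) {
    Objects.requireNonNull(value, "value");
    add(new Op(key, value));
  }

  void delete(String key) {
    add(new Op(key, null));
  }

  private void add(Op op) {
    // Map.put returns the previous op for this key, if any.
    if (ops.put(op.key, op) != null) {
      discarded++;
    }
  }

  /** The ops that would actually be written when the batch commits. */
  List<Op> opsToCommit() {
    return new ArrayList<>(ops.values());
  }

  int discardedCount() {
    return discarded;
  }
}
```

In this sketch, calling put("testKey", "v1"), put("testKey", "v2") and then
delete("testKey") on one batch leaves a single delete-op to commit, and the
two puts are counted as discarded. The ops of different keys are independent
within a batch, so the order in which the surviving ops are returned does not
change the final db state.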



