[ 
https://issues.apache.org/jira/browse/HDDS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17733354#comment-17733354
 ] 

ChenXi edited comment on HDDS-8865 at 6/16/23 7:11 AM:
-------------------------------------------------------

For DN delete command processing:
After analysis, we found that the DeleteBlocksCommandHandler thread often fails to 
obtain the Container's write lock because the `rebalancer`, `replication`, 
`BlockDeletingService` and other threads are holding Container locks. We observe 
that DeleteBlocksCommandHandler is sometimes stuck for hours, but most of 
the time it is stuck for only a few minutes. While it is stuck, 
SCM sends many invalid transactions to the blocked DN, and 
other DNs receive fewer transactions.
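
The contention described above can be illustrated with a minimal, self-contained sketch. This is not Ozone code: `ContainerLockSketch`, `tryDeleteBlocks` and `holdReadLock` are hypothetical names, and the per-container lock is assumed to behave like a `java.util.concurrent.locks.ReentrantReadWriteLock`. It shows a delete handler that gives up after a timeout instead of blocking while another thread holds the lock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ContainerLockSketch {
    // Hypothetical stand-in for a per-container lock (assumption, not Ozone's class).
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /**
     * Tries to take the write lock within timeoutMs.
     * Returns false instead of blocking indefinitely when the lock is held elsewhere.
     */
    public boolean tryDeleteBlocks(long timeoutMs) throws InterruptedException {
        if (!lock.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
            return false; // lock is busy; the caller could retry or report the failure
        }
        try {
            // Delete-transaction bookkeeping would happen here.
            return true;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Simulates another service holding the container lock for holdMs. */
    public Thread holdReadLock(long holdMs, CountDownLatch acquired) {
        Thread t = new Thread(() -> {
            lock.readLock().lock();
            try {
                acquired.countDown();   // signal that the lock is now held
                Thread.sleep(holdMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        });
        t.start();
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        ContainerLockSketch container = new ContainerLockSketch();
        CountDownLatch acquired = new CountDownLatch(1);
        Thread holder = container.holdReadLock(500, acquired);
        acquired.await();
        System.out.println("while held: " + container.tryDeleteBlocks(50));
        holder.join();
        System.out.println("after release: " + container.tryDeleteBlocks(50));
    }
}
```

In this sketch the first attempt fails because the holder thread still owns the lock, and the second succeeds once it is released. Whether a timeout, a retry queue, or another strategy suits DeleteBlocksCommandHandler is a separate design question.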

Metrics for the DeleteBlocksCommandHandler thread pool queue size:
!image-2023-06-16-15-09-59-163.png|width=747,height=267!


> Ozone asynchronous deletion performance optimization
> ----------------------------------------------------
>
>                 Key: HDDS-8865
>                 URL: https://issues.apache.org/jira/browse/HDDS-8865
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: ChenXi
>            Assignee: ChenXi
>            Priority: Major
>         Attachments: image-2023-06-16-15-09-59-163.png
>
>
> Background:
> We have a cluster that writes many small keys. The user merges these 
> small keys into one big key, so the small keys are not stored for a 
> long time. As a result, delete requests have almost the same QPS as writes. 
> The current key deletion performance cannot meet this requirement: the 
> available disk capacity keeps shrinking, but in fact, only a small portion 
> of the user's valid data



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
