[jira] [Updated] (HDDS-15577) Improve getBlockDNCache logic and cleanup

Ivan Andika (Jira) Tue, 16 Jun 2026 01:13:04 -0700


     [ 
https://issues.apache.org/jira/browse/HDDS-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Ivan Andika updated HDDS-15577:
-------------------------------
    Description: 
This is an observation about getBlockDNcache in XceiverClientGrpc.
getBlockDNcache caches the DN that successfully served the GetBlock command so
that the following ReadChunk command can be sent to the same DN. The idea is
that if GetBlock for BCSID N succeeds on a DN, the subsequent ReadChunk on that
DN will also succeed, whereas other datanodes might not have replicated the
data yet.

Two issues have been identified.

First, getBlockDNcache entries do not appear to be cleaned up regularly. The
XceiverClientManager evicts an XceiverClientGrpc only once it has been idle for
scm.container.client.idle.threshold (default 10s). If a particular
XceiverClientGrpc is accessed frequently (a hot client), it may never be
evicted, so its getBlockDNcache can keep growing and cause real memory
overhead. A possible solution is to use a Guava cache for getBlockDNcache so
that entries are evicted. Additionally, we could clear the map in close() so
that the cached objects can be garbage collected without waiting for the
XceiverClientGrpc itself to be collected.
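
As a rough illustration of the first point, the sketch below bounds the cache
and clears it in close(). It is not the XceiverClientGrpc code: the class name
BoundedBlockDnCache, the String keys and values, and the maxEntries limit are
all hypothetical, and a LinkedHashMap in LRU mode stands in for the proposed
Guava cache only to keep the example dependency-free.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical size-bounded stand-in for getBlockDNcache.
 * Keys are block IDs and values are datanode IDs, both simplified to String.
 */
final class BoundedBlockDnCache implements AutoCloseable {
  private final Map<String, String> cache;

  BoundedBlockDnCache(final int maxEntries) {
    // accessOrder=true keeps entries ordered from least to most recently used.
    this.cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
      @Override
      protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        // Called after each insertion; drop the LRU entry once over the limit.
        return size() > maxEntries;
      }
    };
  }

  /** Remember which datanode served GetBlock for this block. */
  synchronized void put(String blockId, String datanodeId) {
    cache.put(blockId, datanodeId);
  }

  /** Returns the cached datanode ID, or null if there is no entry. */
  synchronized String get(String blockId) {
    return cache.get(blockId);
  }

  synchronized int size() {
    return cache.size();
  }

  /** Drop all entries so they can be collected before the client itself. */
  @Override
  public synchronized void close() {
    cache.clear();
  }
}
```

With a limit of 2, caching a third block evicts whichever of the first two was
used least recently, and close() leaves the cache empty. A Guava cache would
additionally allow time-based expiry, which this stand-in does not attempt.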

Second, HDDS-10593 added another datanode sorting step that prioritizes
IN_SERVICE datanodes over datanodes in maintenance or being decommissioned.
However, this reorders the datanodeList again, so the cached DN for that block
ID might no longer be the first datanode. We could set a flag, hasCachedDN,
whenever there is a hit in getBlockDNcache and skip the sorting in that case.
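
A minimal sketch of the hasCachedDN idea, under stated assumptions: the
CachedDnOrdering class, its Dn type, the simplified State enum, and the order()
method are hypothetical and only model the ordering decision, not the actual
XceiverClientGrpc or HDDS-10593 code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical model of datanode ordering with a cached-DN shortcut. */
final class CachedDnOrdering {
  /** Simplified operational states, for illustration only. */
  enum State { IN_SERVICE, IN_MAINTENANCE, DECOMMISSIONING }

  static final class Dn {
    final String id;
    final State state;

    Dn(String id, State state) {
      this.id = id;
      this.state = state;
    }
  }

  /**
   * Returns a new list in the order the datanodes would be tried.
   * cachedDnId is the getBlockDNcache lookup result, or null on a miss.
   */
  static List<Dn> order(List<Dn> datanodes, String cachedDnId) {
    List<Dn> ordered = new ArrayList<>(datanodes);
    boolean hasCachedDN = false;
    if (cachedDnId != null) {
      for (int i = 0; i < ordered.size(); i++) {
        if (ordered.get(i).id.equals(cachedDnId)) {
          // Cache hit: move the cached DN to the front of the list.
          ordered.add(0, ordered.remove(i));
          hasCachedDN = true;
          break;
        }
      }
    }
    if (!hasCachedDN) {
      // No hit: IN_SERVICE nodes first. List.sort is stable, so the
      // relative order inside each group is preserved.
      ordered.sort(Comparator.comparingInt(
          (Dn dn) -> dn.state == State.IN_SERVICE ? 0 : 1));
    }
    return ordered;
  }
}
```

On a cache hit the cached DN stays first even if it is not IN_SERVICE; whether
that trade-off is acceptable is part of what this issue needs to decide.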

  was:
This is simply an observation in XceiverClientGrpc getBlockDNcache. 
getBlockDNCache is used to cache the DN which returned the GetBlock command so 
that the ReadChunk command can be sent to the same DN. The idea is that if the 
GetBlock for BCSID x returns successfully, the subsequence ReadChunk will also 
return successfully, whereas for other datanodes, the data might not have been 
replicated yet.

There are some identified issues. 

First, it seems that getBlockDNcache entries are not cleaned up regularly. 
Although the XceiverClientManager will evict the XceiverClientGrpc every 
scm.container.client.idle.threshold (default 10s). If the particular 
XceiverClientGrpc is accessed a lot of time (hot client), the size of the 
getBlockDNcache might increase and can cause real memory overhead. The possible 
solution is to use Guava cache for getBlockDNcache to ensure that the entries 
are evicted. Additionally, we might also clear the map in close() so that the 
objects GC cleanup do not need to wait for XceiverClientGrpc to be GC collected.

Secondly, in HDDS-10593 we added another sort datanodes logic that will 
prioritize IN_SERVICE datanodes over datanodes in maintenance / decommission. 
However, this will reorder the datanodeList again causing the cached DN for 
that block ID might not be the first datanode. We can have a flag hasCachedDN 
whenever there is a hit in the getBlockDNcache and skip the sorting.


> Improve getBlockDNCache logic and cleanup
> -----------------------------------------
>
>                 Key: HDDS-15577
>                 URL: https://issues.apache.org/jira/browse/HDDS-15577
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Minor
>
> This is an observation about getBlockDNcache in XceiverClientGrpc.
> getBlockDNcache caches the DN that successfully served the GetBlock command
> so that the following ReadChunk command can be sent to the same DN. The idea
> is that if GetBlock for BCSID N succeeds on a DN, the subsequent ReadChunk on
> that DN will also succeed, whereas other datanodes might not have replicated
> the data yet.
> Two issues have been identified.
> First, getBlockDNcache entries do not appear to be cleaned up regularly. The
> XceiverClientManager evicts an XceiverClientGrpc only once it has been idle
> for scm.container.client.idle.threshold (default 10s). If a particular
> XceiverClientGrpc is accessed frequently (a hot client), it may never be
> evicted, so its getBlockDNcache can keep growing and cause real memory
> overhead. A possible solution is to use a Guava cache for getBlockDNcache so
> that entries are evicted. Additionally, we could clear the map in close() so
> that the cached objects can be garbage collected without waiting for the
> XceiverClientGrpc itself to be collected.
> Second, HDDS-10593 added another datanode sorting step that prioritizes
> IN_SERVICE datanodes over datanodes in maintenance or being decommissioned.
> However, this reorders the datanodeList again, so the cached DN for that
> block ID might no longer be the first datanode. We could set a flag,
> hasCachedDN, whenever there is a hit in getBlockDNcache and skip the sorting
> in that case.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
