[
https://issues.apache.org/jira/browse/RATIS-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Duong reassigned RATIS-2093:
----------------------------
Assignee: Duong
> Decouple metadata and configuration entries from appendEntries buffer for
> stateMachineCache
> -------------------------------------------------------------------------------------------
>
> Key: RATIS-2093
> URL: https://issues.apache.org/jira/browse/RATIS-2093
> Project: Ratis
> Issue Type: Sub-task
> Reporter: Duong
> Assignee: Duong
> Priority: Major
> Attachments: cache_reference.png,
> direct_mem_util_before_and_after.png, unreleased_messages_before_and_after.png
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> When testing zero-copy in Ozone (stateMachineCache enabled), we saw hundreds
> of thousands of ServerProtocol messages left unreleased, although the number
> of entries cached in the Ozone StateMachine was small (<500). In addition,
> Netty's direct memory utilization stayed high and did not go down after the
> test run finished.
>
> It turns out that an appendEntries request can contain multiple log entries.
> Some of them can be metadata or configuration entries, which are small
> (~10-20 bytes). Others can be StateMachine entries, which are much bigger
> (4 MB).
> Today, when stateMachineCache is enabled, the StateMachine entries stored in
> LogCache don't hold a counted reference to the original appendEntries
> request, but metadata and configuration entries do. Because metadata and
> configuration entries are small, they will almost never fill up the LogCache
> to trigger a cacheEvict. Their references to the original appendEntries
> request prevent the request buffer from being released when the StateMachine
> cache evicts the StateMachine entries.
> !cache_reference.png|width=496,height=310!
>
> When stateMachineCache is enabled, the metadata and configuration entries
> should not hold a reference to the original appendEntries request.
> I did a quick test and compared the direct memory utilization and the number
> of unreleased messages before and after making the change.
> !direct_mem_util_before_and_after.png|width=545,height=251!
> !unreleased_messages_before_and_after.png|width=548,height=254!
>
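The retention problem described above can be illustrated with a minimal Java
sketch. It is not Ratis code: RequestBuffer, bufferReleasedAfterEviction and
the decoupleSmallEntries flag are hypothetical names, and the reference
counting is reduced to a plain counter. It only assumes what the description
states: the StateMachine cache releases its reference on eviction, while
metadata and configuration entries in LogCache keep theirs.

```java
class AppendEntriesRefSketch {
  /** Hypothetical stand-in for the buffer backing one appendEntries request. */
  static final class RequestBuffer {
    private int refCount = 1; // initial reference owned by the RPC handler

    void retain() {
      refCount++;
    }

    void release() {
      if (refCount == 0) {
        throw new IllegalStateException("buffer already released");
      }
      refCount--;
    }

    boolean isReleased() {
      return refCount == 0;
    }
  }

  /**
   * Simulates one request carrying a large StateMachine entry and a small
   * metadata entry, then evicts only the StateMachine entry.
   *
   * @param decoupleSmallEntries if true, the metadata entry is assumed to be
   *        copied out of the request, so it holds no reference to the buffer
   * @return whether the request buffer was released after the eviction
   */
  static boolean bufferReleasedAfterEviction(boolean decoupleSmallEntries) {
    RequestBuffer buffer = new RequestBuffer();

    // The StateMachine cache keeps the large entry and pins the buffer.
    buffer.retain();

    // The log cache keeps the small metadata entry.
    if (!decoupleSmallEntries) {
      buffer.retain(); // current behavior: the small entry pins the whole buffer
    }
    // Otherwise the few bytes are copied and no reference is taken.

    // The RPC handler is done with the request.
    buffer.release();

    // The StateMachine cache evicts the large entry and drops its reference.
    buffer.release();

    // The log cache never evicts here, because small entries rarely fill it.
    return buffer.isReleased();
  }

  public static void main(String[] args) {
    System.out.println("small entries hold a reference: released = "
        + bufferReleasedAfterEviction(false));
    System.out.println("small entries decoupled:        released = "
        + bufferReleasedAfterEviction(true));
  }
}
```

In this sketch the buffer stays pinned in the first case because one reference
from the small entry is still outstanding, and it is released in the second
case once the StateMachine cache evicts its entry.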
--
This message was sent by Atlassian Jira
(v8.20.10#820010)