[jira] [Comment Edited] (RATIS-1979) Correct LogEntryProto cache size calculation for stateMachineCachingEnabled=true

Duong (Jira) Thu, 21 Dec 2023 18:29:03 -0800


    [ 
https://issues.apache.org/jira/browse/RATIS-1979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17798335#comment-17798335
 ]


Duong edited comment on RATIS-1979 at 12/22/23 2:28 AM:
--------------------------------------------------------

Makes sense. If zero copy and stateMachineCaching are enabled, the zero-copy 
buffers can be released once the applied index and the follower indices have 
passed the entry.

Actually, with stateMachineData trimmed, we don't need to copy the 
LogEntryProto before putting it in the cache, because the remaining fields 
are all primitive data (no ByteString fields).


was (Author: JIRAUSER290990):
Makes sense. If zero copy and stateMachineCaching are enabled, the zero-copy 
buffers can be released once the applied index and the follower indices have 
passed the entry.

> Correct LogEntryProto cache size calculation for 
> stateMachineCachingEnabled=true
> --------------------------------------------------------------------------------
>
>                 Key: RATIS-1979
>                 URL: https://issues.apache.org/jira/browse/RATIS-1979
>             Project: Ratis
>          Issue Type: Sub-task
>            Reporter: Duong
>            Priority: Major
>
> With zero-copy (https://github.com/apache/ratis/pull/990), we rely on RaftLog 
> cache eviction to release the zero-copy input streams.
> There is a problem: with stateMachineCachingEnabled, the cache size of a 
> LogEntryProto is calculated with StateMachineData trimmed. That leaves the 
> computed cache size of a log entry far too small compared to the amount of 
> direct memory backing its StateMachineData. 
>  
> {code:java}
> static long getEntrySize(LogEntryProto entry, Op op) {
>   LogEntryProto e = entry;
>   if (op == Op.CHECK_SEGMENT_FILE_FULL) {
>     e = LogProtoUtils.removeStateMachineData(entry);
>   } else if (op == Op.LOAD_SEGMENT_FILE
>       || op == Op.WRITE_CACHE_WITH_STATE_MACHINE_CACHE) {
>     Preconditions.assertTrue(entry == LogProtoUtils.removeStateMachineData(entry),
>         () -> "Unexpected LogEntryProto with StateMachine data: op=" + op + ", entry=" + entry);
>   } else {
>     Preconditions.assertTrue(
>         op == Op.WRITE_CACHE_WITHOUT_STATE_MACHINE_CACHE || op == Op.REMOVE_CACHE,
>         () -> "Unexpected op " + op + ", entry=" + entry);
>   }
>   final int serialized = e.getSerializedSize();
>   return serialized + CodedOutputStream.computeUInt32SizeNoTag(serialized) + 4L;
> }
> {code}
>  
> With the default 200MB limit for the raft log cache, cache eviction likely 
> never happens before the server runs out of direct memory for zero-copy.
>  
> Maybe for cache calculation, we should take the StateMachineData size into 
> account.
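
To make the mismatch described above concrete, here is a minimal, 
self-contained sketch. It is not Ratis code: the class EntrySizeSketch, the 
helper names, the 50-byte trimmed entry and the 4MB StateMachineData payload 
are all assumptions chosen for illustration. It mirrors the framing arithmetic 
of the quoted getEntrySize (payload, varint length prefix, 4 extra bytes) and 
compares how many entries fit under a 200MB limit when the StateMachineData 
size is ignored versus counted.

```java
// Hypothetical illustration only; none of these names exist in Ratis.
class EntrySizeSketch {
  // Bytes needed to varint-encode a non-negative 32-bit value
  // (a local stand-in for protobuf's computeUInt32SizeNoTag).
  static int varintSize(int value) {
    int size = 1;
    while ((value & ~0x7F) != 0) {
      value >>>= 7;
      size++;
    }
    return size;
  }

  // Same arithmetic as the quoted getEntrySize:
  // serialized bytes + varint length prefix + 4 extra bytes.
  static long framedSize(int serialized) {
    return serialized + varintSize(serialized) + 4L;
  }

  // Assumed alternative accounting: the trimmed entry size plus the size
  // of the StateMachineData that is still held in direct memory.
  static long cacheSizeWithData(int trimmedSerialized, int stateMachineDataSize) {
    return framedSize(trimmedSerialized) + stateMachineDataSize;
  }

  public static void main(String[] args) {
    final long cacheLimit = 200L << 20; // 200MB limit from the description
    final int trimmed = 50;             // assumed trimmed entry size in bytes
    final int data = 4 << 20;           // assumed 4MB of StateMachineData

    final long entriesIgnoringData = cacheLimit / framedSize(trimmed);
    final long entriesCountingData = cacheLimit / cacheSizeWithData(trimmed, data);

    System.out.println("entries before eviction, data ignored: " + entriesIgnoringData);
    System.out.println("entries before eviction, data counted: " + entriesCountingData);
    System.out.println("direct memory pinned when data is ignored (bytes): "
        + entriesIgnoringData * data);
  }
}
```

Under these assumed numbers, the trimmed accounting admits millions of 
entries before the limit is reached, each potentially pinning megabytes of 
direct memory, while the data-aware accounting reaches the limit after a few 
dozen entries. Whether the real fix should add the data size in exactly this 
way is a design question for the issue, not something this sketch settles.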



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
