Duong created RATIS-1979:
----------------------------

             Summary: Correct LogEntryProto cache size calculation for 
stateMachineCachingEnabled=true
                 Key: RATIS-1979
                 URL: https://issues.apache.org/jira/browse/RATIS-1979
             Project: Ratis
          Issue Type: Sub-task
            Reporter: Duong


With zero-copy (https://github.com/apache/ratis/pull/990), we rely on RaftLog 
cache eviction to release the zero-copy input streams.

There's a problems, with stateMachineCachingEnabled, the cache-size of 
LogEntryproto is calculated with StateMachineData trimmed. That leave the 
actual cached size if a log entry is too small comparing to the amount of 
direct memory backed for the StateMachineData. 

 
{code:java}
static long getEntrySize(LogEntryProto entry, Op op) {
  LogEntryProto e = entry;
  if (op == Op.CHECK_SEGMENT_FILE_FULL) {
    e = LogProtoUtils.removeStateMachineData(entry);
  } else if (op == Op.LOAD_SEGMENT_FILE || op == 
Op.WRITE_CACHE_WITH_STATE_MACHINE_CACHE) {
    Preconditions.assertTrue(entry == 
LogProtoUtils.removeStateMachineData(entry),
        () -> "Unexpected LogEntryProto with StateMachine data: op=" + op + ", 
entry=" + entry);
  } else {
    Preconditions.assertTrue(op == Op.WRITE_CACHE_WITHOUT_STATE_MACHINE_CACHE 
|| op == Op.REMOVE_CACHE,
        () -> "Unexpected op " + op + ", entry=" + entry);
  }
  final int serialized = e.getSerializedSize();
  return serialized + CodedOutputStream.computeUInt32SizeNoTag(serialized) + 4L;
} {code}
 

With the default 200MB limit for raft log cache, cache eviction is likely never 
happens until the server run out of direct memory for zero-copy.

 

Maybe for cache calculation, we should take the StateMachineData size into 
account.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to