ankitsinghal commented on a change in pull request #18: RATIS-523 RATIS-524 RATIS-525 RATIS-526 RATIS-527 RATIS-533 Lots of cleanup on the LogService
URL: https://github.com/apache/incubator-ratis/pull/18#discussion_r280302619
##########
File path: ratis-logservice/TUNING.md
##########
@@ -0,0 +1,42 @@
+# Tuning for the Log Service
+
+This is a list of Ratis configuration properties which have been
+found to be relevant/important to control how Ratis operates for
+the purposes of the LogService.
+
+## RAFT Log
+
+The default RAFT log implementation uses "segments" on disk to avoid
+a single file growing to be very large. By default, each segment is
+`8MB` in size and can be set by the API `RaftServerConfigKeys.Log.setSegmentSizeMax()`.
+When a new segment is created, Ratis will "preallocate" that segment by writing
+data into the file to reduce the risk of latency when we first try to append
+entries to the RAFT log. By default, the segment is preallocated with `4MB`
+and can be changed via `RaftServerConfigKeys.Log.setPreallocatedSize()`.
+
+Up to 2 log segments are cached in memory (including the segment actively being
+written to). This is controlled by `RaftServerConfigKeys.Log.setMaxCachedSegmentNum()`.
+Increasing this configuration would use more memory but should reduce the latency
+of reading entries from the RAFT log.
+
+Writes to the RAFT log are buffered using a Java Direct ByteBuffer (offheap). By default,
+this buffer is `64KB` in size and can be changed via `RaftServerConfigKeys.Log.setWriteBufferSize`.
+Beware that when one LogServer is hosting multiple RAFT groups (multiple "LogService Logs"), each
+LogServer will have its own buffer. Thus, high concurrency will result in multiple buffers.
+
+## RAFT Server
+
+Every RAFT server maintains a queue of I/O actions that it needs to execute. As with
+much of Ratis, these actions are executed asynchronously and the client can block on
+completion of these tasks as necessary. To prevent saturating memory, this queue of
+items can be limited in size by both number of entries and size of the elements in the queue.
+The former defaults to 4096 elements and is controlled by `RaftServerConfigKeys.Log.setElementLimit()`,
+while the latter defaults to `64MB` and is controlled by `RaftServerConfigKeys.Log.setByteLimit()`.
+
+## Do Not Set
+
+Running a snapshot indicates that we can truncate part of the RAFT log, as the expectation is that
+a snapshot is an equivalent representation of all of the updates from the log. However, the LogService
+is written to expect that we maintain these records. As such, we must not allow snapshots to automatically
+happen as we may lose records from the RAFT log. `RaftServerConfigKeys.Snapshot.setAutoTriggerEnabled()`
+defaults to `false` and should not be set to `true`.

Review comment:
This is a good tuning guide. However, Snapshot.setAutoTriggerEnabled() doesn't look like a tuning option, since it affects how the LogService behaves. Shouldn't we throw an exception if it is set to true, or explicitly set it to false in code (just to avoid human error)?
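
To make the suggestion in the review comment concrete, here is a rough sketch of what such a guard might look like. It is not taken from the PR: `SnapshotGuard`, `enforceAutoTriggerDisabled`, and the property key string are hypothetical names, and `java.util.Properties` stands in for Ratis's `RaftProperties` so the example has no external dependencies. A real change would presumably read and write the flag through `RaftServerConfigKeys.Snapshot` instead.

```java
import java.util.Properties;

/**
 * Hypothetical guard, assuming the LogService validates its configuration
 * before starting a RAFT server. Not part of the PR under review.
 */
final class SnapshotGuard {
  // Assumed key name for illustration; the authoritative key lives in
  // RaftServerConfigKeys.Snapshot.
  static final String AUTO_TRIGGER_KEY = "raft.server.snapshot.auto.trigger.enabled";

  private SnapshotGuard() {}

  /**
   * Rejects a configuration that enables automatic snapshots, and otherwise
   * pins the flag to false so a later default change cannot turn it on.
   */
  static void enforceAutoTriggerDisabled(Properties props) {
    boolean enabled = Boolean.parseBoolean(props.getProperty(AUTO_TRIGGER_KEY, "false"));
    if (enabled) {
      throw new IllegalStateException(
          AUTO_TRIGGER_KEY + " must be false: the LogService relies on the full RAFT log");
    }
    props.setProperty(AUTO_TRIGGER_KEY, "false");
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    enforceAutoTriggerDisabled(props);
    System.out.println(AUTO_TRIGGER_KEY + "=" + props.getProperty(AUTO_TRIGGER_KEY));
  }
}
```

This sketch does both things the comment proposes: it throws on an explicit `true` and writes an explicit `false` otherwise. Whether failing fast or silently overriding is preferable is a design choice for the PR author.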
