Hi William,

Thanks a lot for filing RATIS-1626 and working on it!

Tsz-Wo

On Mon, Jul 18, 2022 at 8:18 AM William Song <[email protected]> wrote:

> Hi Tsz-Wo,
>
> Thanks for helping me understand the mechanism of PurgeLog. The suggested
> index and purgeGap are sufficient for most scenarios. Still, a new conf
> that keeps the latest n log entries from being purged can help
> applications balance the cost of preserving and transferring lagging logs
> against the cost of transferring a snapshot. Intuitively, applications can
> set n so that n average log entries are about the size of a snapshot.
>
> I created an issue (https://issues.apache.org/jira/browse/RATIS-1626) and
> I'm willing to work on it.
>
> William
>
> > On Jul 17, 2022, at 01:23, Tsz Wo Sze <[email protected]> wrote:
> >
> > Hi William,
> >
> > //RaftLog.java
> >
> > /**
> >  * Purge asynchronously the log transactions.
> >  * The implementation may choose to purge an index other than the
> >  * suggested index.
> >  *
> >  * @param suggestedIndex the suggested index (inclusive) to be purged.
> >  * @return the future of the actual purged log index.
> >  */
> > CompletableFuture<Long> purge(long suggestedIndex);
> >
> > First of all, the index parameter passed to purge(..) is a suggested
> > index. The implementation may choose to purge a different index. For
> > SegmentedRaftLog, it only purges the logs in closed segments. Logs in
> > the open segment won't be purged.
> >
> > Also, there is a raft.server.log.purge.gap so that purge operations
> > won't happen too often. E.g. when the last purge index is 1000 and the
> > purge gap is 1024, the next purge won't happen until the index is 2024.
> >
> > Neither mechanism above directly solves your problem, although they may
> > make the problem less serious. How about we add another conf so that
> > purge won't purge the latest n log entries?
> >
> > Tsz-Wo
> >
> > On Sat, Jul 16, 2022 at 12:02 AM William Song <[email protected]> wrote:
> >
> >> Hi,
> >>
> >> When purgeUpToSnapshotIndex = false, a particularly slow follower (sync
> >> rate < write rate) can cause the RaftLog to accumulate on the leader
> >> and the other group members. Even if they take a snapshot, they still
> >> have to keep the RaftLog for the slow follower.
> >>
> >> When purgeUpToSnapshotIndex = true, the leader deletes its RaftLog as
> >> soon as it takes the latest snapshot. Even if a follower is only 100
> >> entries behind, the leader still has to send the whole snapshot to this
> >> follower. The situation worsens when the snapshot contains GBs of data.
> >>
> >> Is it possible that, when the server takes a snapshot, we can ask it to
> >> preserve the latest, say, 1000 logs and purge the other logs? In this
> >> way, we don't have to worry about RaftLog accumulation or unnecessary
> >> snapshot transfers.
> >>
> >> Regards,
> >> William Song
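
The idea discussed in this thread (keep the latest n entries when purging, and skip purges that fall inside the purge gap) can be sketched roughly as below. This is only an illustration under stated assumptions, not Ratis code: the class PurgeSketch, the methods purgeIndex and gapReached, and the parameter preserveCount are all hypothetical names, and the gap check simply mirrors the "1000 + 1024 = 2024" example quoted above.

```java
// Hypothetical sketch; none of these names come from Ratis.
final class PurgeSketch {

    /**
     * Returns the highest index (inclusive) that may be purged so that the
     * latest preserveCount entries are kept and nothing newer than the
     * snapshot is removed. A negative result means nothing should be purged.
     */
    static long purgeIndex(long snapshotIndex, long lastLogIndex, long preserveCount) {
        // Entries after this boundary are the latest preserveCount entries.
        long preserveBoundary = lastLogIndex - preserveCount;
        return Math.min(snapshotIndex, preserveBoundary);
    }

    /**
     * Returns true once the candidate index is at least purgeGap entries
     * past the last purged index (assumed reading of the purge gap).
     */
    static boolean gapReached(long lastPurgedIndex, long candidateIndex, long purgeGap) {
        return candidateIndex - lastPurgedIndex >= purgeGap;
    }

    public static void main(String[] args) {
        // Snapshot at 5000, last log entry at 5100, keep the latest 1000 entries.
        long candidate = purgeIndex(5000L, 5100L, 1000L);
        System.out.println("candidate purge index: " + candidate);
        System.out.println("gap reached: " + gapReached(1000L, candidate, 1024L));
    }
}
```

With these numbers the candidate stays 1000 entries behind the last log index, so a follower that is only 100 entries behind could still catch up from the log instead of needing the whole snapshot.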
