> Snapshot metadata are currently stored in memory / they are cached so we do > not need to go to disk every single time we want to list them, the more > snapshots we have, the worse it is. Are we enumerating our snapshots somewhere on the hot path, or is this performance concern misplaced?
On Wed, Aug 7, 2024, at 7:44 AM, Štefan Miklošovič wrote: > Snapshot metadata are currently stored in memory / they are cached so we do > not need to go to disk every single time we want to list them, the more > snapshots we have, the worse it is. > > When a snapshot is _manually_ removed from disk, not from nodetool > clearsnapshot, just by rm -rf on a respective snapshot directory, then such > snapshot will be still visible in nodetool listsnapshots. Manual removal of a > snapshot might be done e.g. by accident or by some "impatient" operator who > just goes to disk and removes it there instead of using nodetool or > respective JMX method. > > To improve UX here, what I came up with is that we might use Java's > WatchService where each snapshot dir would be registered. WatchService is > part of Java, it uses inotify subsystem which is what Linux kernel offers. > The result of doing it is that once a snapshot dir is registered to be > watched and when it is removed then we are notified about that via inotify / > WatchService so we can react on it and remove the in-memory representation of > that so it will not be visible in the output anymore. > > While this works, there are some questions / concerns > > 1) What do people think about inotify in general? I tested this on 10k > snapshots and it seems to work just fine, nevertheless there is in general no > strong guarantee that every single event will come through, there is also a > family of kernel parameters around this where more tuning can be done etc. It > is also questionable how this will behave on other systems from Linux (Mac > etc). While JRE running on different platforms also implements this, I am not > completely sure these implementations are quality-wise the same as for Linux > etc. There is a history of not-so-quality implementations for other systems > (events not coming through on Macs etc) and while I think we are safe on > Linux, I am not sure we want to go with this elsewhere. > > 2) inotify brings more entropy into the codebase, it is another thing we need > to take care of etc (however, it is all concentrated in one class and pretty > much "isolated" from everything else) > > I made this feature optional and it is turned off by default so people need > to explicitly opt-in into this so we are not forcing it on anybody. > > If we do not want to go with inotify, another option would be to have a > background thread which would periodically check if a manifest exists on a > disk, if it does not, then a snapshot does not either. While this works, what > I do not like about this is that the primary reason we moved it to memory was > to bypass IO as much as possible yet here we would introduce another check > which would go to disk, and this would be done periodically, which beats the > whole purpose. If an operator lists snapshots once a week and there is a > background check running every 10 minutes (for example), then the cummulative > number of IO operations migth be bigger than us just doing nothing at all. > For this reason, if we do not want to go with inotify, I would also not > implement any automatic background check. Instead of that, there would be > SnapshotManagerMbean#refresh() method which would just forcibly reload all > snapshots from scratch. I think that manual deletion of snapshots is not > something a user would do regularly, snapshots are meant to be removed via > nodetool or JMX. If manual removal ever happens then in order to make it > synchronized again, the refreshing of these snapshots would be required. > There might be an additional flag in nodetool listsnapshots, once specified, > refreshing would be done, otherwise not. > > How does this all sound to people? > > Regards