[cc Jos]

Hi Alexander,

I have a few questions/concerns about the list below:

On Fri, Mar 21, 2025 at 11:09 AM Alexander Patrakov <patra...@gmail.com> wrote:
>
> Hello Vladimir,
>
> Please contact croit via https://www.croit.io/contact for unofficial
> (not yet fully reviewed) patches and mention me. They are currently
> working on a similar problem and have solved at least the following:
>
> * Crashes when two directories are being mirrored in parallel

Are you talking about bi-directional mirroring of the same directory
between two Ceph clusters?

> * Data corruption (snapshots on the destination having missing files
> or files "from the future") due to incorrect cache maintenance in
> libcephfs

Bugs have been identified in the mirror daemon and the snapdiff
library. These have been fixed, and the fixes are either merged into
the main branch or under review, waiting to be backported to releases.

> * Confusion that results in the failed mirroring when a directory is
> replaced by a symlink

This is a bug in the mirror daemon when the sync mechanism uses
snapdiff -- the fix is under review.

> * Useless work done to mirror old snapshots that nobody needs, while
> the latest snapshot has the highest business value

OK. So this is more of an enhancement (that you proposed) than a bug:
avoid having the mirror daemon start synchronizing from the very first
snapshot, and possibly synchronize the latest snapshot first, since
that snapshot has the most business value.

> * Useless stat() calls slowing down the mirroring significantly

There are some (not many) unnecessary stat calls in the scan code
path; however, most of those stat requests should be satisfied from
the client's cache. If you have details on the slowness, please create
a tracker with the details. Those unnecessary stats should be removed
regardless of whether the requests are satisfied from the client's
cache... Jos?

> * They also introduced multi-threaded mirroring so that more than one
> file is copied at a time and more than one subdirectory is scanned at
> a time - this helps with directories containing many small files

Again, this is an enhancement being developed by Croit. A change has
been posted for review.

>
> Due to the above, I absolutely cannot recommend using the unpatched
> cephfs-mirror and would rather (if you don't want to contact croit for
> patches) ask you to switch to rsync with some wrapper script that
> runs, in parallel, multiple copies operating on different
> subdirectories.

The *real* slowness in the mirror daemon is the data sync for regular
files over a slow link between clusters. The mirror daemon transfers
the whole file even if only a few bytes have been modified, which is
inefficient and scales badly with large files. For the Tentacle
release, the MDS is introducing blockdiff operation support -- the
ability to get a list of changed blocks for a file between snapshots --
and the mirror daemon would make use of this. This enhancement will
also be backported along with the other enhancements by Croit, so the
latest Ceph releases should see a vast improvement in synchronization
time.

>
> On Fri, Mar 21, 2025 at 12:48 AM Vladimir Cvetkovic
> <vladimir.cvetko...@hostpapa.com> wrote:
> >
> > Hi everyone,
> >
> > We have two remote clusters and are trying to set up snapshot mirroring.
> > Our directories are in a syncing state but never seem to actually sync the
> > snapshots.
> > Snapshot mirroring seems way too slow for large file systems - does anyone
> > have a working setup?
> >
> > Thanks!
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
> --
> Alexander Patrakov

--
Cheers,
Venky
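
P.S. To make the whole-file versus changed-blocks point concrete, here
is a minimal Python sketch. It is not the MDS blockdiff operation or
any cephfs-mirror code: `changed_blocks`, `apply_changes`, and the tiny
`BLOCK_SIZE` are hypothetical names chosen for illustration, and the
"snapshots" are plain in-memory byte strings compared block by block.
It only shows why shipping the changed blocks can move far less data
than re-copying the whole file.

```python
BLOCK_SIZE = 4  # hypothetical, deliberately tiny so the example is easy to read


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE):
    """Return (offset, length) for each block that differs between old and new.

    Toy stand-in for a "list of changed blocks between two snapshots";
    a real implementation would not need to read both copies in full.
    """
    changes = []
    for offset in range(0, max(len(old), len(new)), block_size):
        old_block = old[offset:offset + block_size]
        new_block = new[offset:offset + block_size]
        if old_block != new_block:
            changes.append((offset, len(new_block)))
    return changes


def apply_changes(dest: bytearray, new: bytes, changes) -> int:
    """Copy only the changed blocks into dest; return the bytes transferred."""
    transferred = 0
    for offset, length in changes:
        dest[offset:offset + length] = new[offset:offset + length]
        transferred += length
    del dest[len(new):]  # truncate if the file shrank between snapshots
    return transferred


# Two "snapshots" of the same 16-byte file; only the second block changed.
snap1 = b"aaaabbbbccccdddd"
snap2 = b"aaaaBBBBccccdddd"

remote_copy = bytearray(snap1)           # destination already holds snap1
delta = changed_blocks(snap1, snap2)     # which blocks need to be sent
sent = apply_changes(remote_copy, snap2, delta)

print(f"whole-file copy: {len(snap2)} bytes, block delta: {sent} bytes")
```

In this toy case the destination ends up identical to the second
snapshot after 4 bytes are sent instead of 16; the real saving depends
on the block size and on how scattered the modifications are.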