> On May 13, 2025, at 4:38 PM, David Cantrell <da...@cantrell.org.uk> wrote:
>
> On 05/05/2025 17:23, Scott Baker wrote:
>
>> High speed database-grade cloud storage is not cheap. Whatever we can do to
>> decrease the amount of raw storage we need the better. Lower storage usage
>> means faster replication and quicker backups. Have you ever tried backing up
>> 1TB of data in the cloud? Spoiler alert: it's not easy.
>
> The initial sync is a pain, but after that it's tolerable, especially if you
> can efficiently just send diffs eg with zfs send/recv.

Yeah, that's how Zach Dysktra set up our MySQL database backups: the primary MySQL wrote its binlogs to a ZFS volume, and the replica would receive them and then use a ZFS snapshot to mark the backup point. Then ZFS itself stored the diff between the snapshot point and the current data.

Long-term, I suspect that's what will end up happening with the Collector system. ZFS really seems like the ideal solution for this: it has the compression we want, plus snapshotting, replication, and send/recv.

If anyone has some expertise to lend on setting up OpenZFS, that'd be awesome, because I'm not that up-to-date on it at the moment. (I was still thinking it was pretty unstable on Linux, but a cursory web search I just did makes it seem like that is no longer the case.)

Doug Bell
d...@preaction.me
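
For anyone who hasn't used the snapshot-plus-send/recv workflow discussed above, here is a rough sketch of the commands involved. It is not the setup described in this thread: the dataset names, the backup host, and the snapshot labels are all made up for illustration, and the script only builds and prints the commands rather than running them.

```python
import shlex


def snapshot_cmd(dataset, label):
    """Build the command that marks a backup point with a ZFS snapshot."""
    return ["zfs", "snapshot", f"{dataset}@{label}"]


def compression_cmd(dataset, algorithm="lz4"):
    """Build the command that enables compression on a dataset."""
    return ["zfs", "set", f"compression={algorithm}", dataset]


def incremental_send_pipeline(dataset, prev, curr, remote_host, remote_dataset):
    """Build a shell pipeline that ships only the diff between two snapshots.

    `zfs send -i` emits the changes between the `prev` and `curr` snapshots;
    `zfs recv` on the remote side applies them to its copy of the dataset.
    """
    send = ["zfs", "send", "-i", f"{dataset}@{prev}", f"{dataset}@{curr}"]
    recv = ["ssh", remote_host, "zfs", "recv", remote_dataset]
    return shlex.join(send) + " | " + shlex.join(recv)


if __name__ == "__main__":
    # Hypothetical names: tank/collector, backuphost, backup/collector.
    print(shlex.join(compression_cmd("tank/collector")))
    print(shlex.join(snapshot_cmd("tank/collector", "2025-05-14")))
    print(
        incremental_send_pipeline(
            "tank/collector", "2025-05-13", "2025-05-14",
            "backuphost", "backup/collector",
        )
    )
```

The first full `zfs send` (without `-i`) is the painful initial sync; after that, each run only transfers what changed since the previous snapshot.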