Re: Backup methods
Andrew,

> For a small system with a few hundred mailboxes, a simple unix filesystem backup is sufficient. You can dump the Cyrus mailboxes.db to a flat file every hour with cron (keep a few days' worth). Back up everything with your regular backup system (tar, rsync, etc).
>
> If you suffer a complete loss of the system and have to restore from the backup, you won't care much about a few database file inconsistencies, which can be repaired with Cyrus' reconstruct tool. You would recover the whole backup, recover mailboxes.db from the most recent flat file export, and then run reconstruct on every mailbox.

Yep, this is how I was (and still am) doing it (hourly), so if one backup has something unrecoverable, I can check the previous backup (-1hr) and with luck it will be in better shape. So on the one hand this is something that "works", yes.

On the other hand, I have recently started using the Cyrus xDAV functionality, which allows storing files, calendars and contacts (BTW, some minor issues aside, it works great!). All this information, if inconsistent, is more difficult to deal with. It's more fragile than mail. Also, changes to this data matter more and happen more frequently (I have an accounting client where 4 users make a couple of hundred changes per day to a single xls file over Cyrus WebDAV).

It's in a pre-production state in my deployments right now, but I suspect that tolerating some inconsistencies or restoring a -1hr backup would not be an acceptable policy for this type of data.

Regards,
Anatoli

*From:* Andrew Morgan
*Sent:* Friday, May 11, 2018 02:05
*To:* Anatoli
*Cc:* Info-cyrus
*Subject:* Re: Backup methods

Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe: https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
Re: Backup methods
On Fri, 11 May 2018, Anatoli wrote:

>> There may be an argument that could be made for 2 backup strategies
>
> That's the point. In the context of SME environments (Small and Medium-sized Enterprises, i.e. normally from 5 to 50 employees, up to 250 in some countries) that we were talking about, replication is overkill, IMO. But for large enterprises like MNCs, large universities and public mail providers (Fastmail), of course multiple masters and backups via replication are the way to go. For large deployments there are good backup solutions in Cyrus, but for small-business admins I don't know any to recommend.

Anatoli, I think you're making this harder than it needs to be...

For a small system with a few hundred mailboxes, a simple unix filesystem backup is sufficient. You can dump the Cyrus mailboxes.db to a flat file every hour with cron (keep a few days' worth). Back up everything with your regular backup system (tar, rsync, etc).

If you suffer a complete loss of the system and have to restore from the backup, you won't care much about a few database file inconsistencies, which can be repaired with Cyrus' reconstruct tool. You would recover the whole backup, recover mailboxes.db from the most recent flat file export, and then run reconstruct on every mailbox.

If you need to recover some messages or mailboxes that were deleted by a user, just recover those individual files or directories from your backup, then run reconstruct -rf on the mailbox.

Naturally, delayed expunge and delayed delete are fantastic ways to avoid all this work. Purge them only after a few weeks or a month has passed. It is much easier to restore using those delayed delete/expunge features.

Thanks,
Andy
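A minimal sketch of the hourly export described above, assuming the flat file is produced with `ctl_mboxlist -d` run as the cyrus user. The directory, the file naming and the three-day retention are illustrative assumptions, not something stated in this thread.

```shell
#!/bin/sh
# Sketch only: hourly flat-file export of mailboxes.db with simple pruning.
# DUMP_DIR, KEEP_DAYS and the file naming are assumptions for illustration.
DUMP_DIR=${DUMP_DIR:-/var/backups/cyrus}         # hypothetical location
KEEP_DAYS=${KEEP_DAYS:-3}                        # "keep a few days' worth"
MBOXLIST_DUMP=${MBOXLIST_DUMP:-"ctl_mboxlist -d"} # prints the mailbox list on stdout

dump_mboxlist() {
    mkdir -p "$DUMP_DIR" || return 1
    out="$DUMP_DIR/mailboxes_$(date +%y%m%d_%H%M).txt"
    # Write under a temporary name so a failed dump never looks like a good export.
    if $MBOXLIST_DUMP > "$out.part"; then
        mv "$out.part" "$out"
    else
        rm -f "$out.part"
        return 1
    fi
    # Remove exports older than KEEP_DAYS days.
    find "$DUMP_DIR" -type f -name 'mailboxes_*.txt' -mtime +"$KEEP_DAYS" -exec rm -f {} +
}

# When run from cron, the script would end with a call to dump_mboxlist.
```

A script like this could be called hourly from the cyrus user's crontab. On a full restore, the newest export would be loaded back with `ctl_mboxlist -u` before running reconstruct on the mailboxes.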
Re: Backup methods
> There may be an argument that could be made for 2 backup strategies

That's the point. In the context of SME environments (Small and Medium-sized Enterprises, i.e. normally from 5 to 50 employees, up to 250 in some countries) that we were talking about, replication is overkill, IMO. But for large enterprises like MNCs, large universities and public mail providers (Fastmail), of course multiple masters and backups via replication are the way to go. For large deployments there are good backup solutions in Cyrus, but for small-business admins I don't know any to recommend.

> If the mailboxes are on something like EXT4 you can do an LVM snapshot because of the built-in auto checkpointing, and for something like xfs there is freeze.

Yep, this is the idea. But the data on disk should be in a consistent state. Something like "FLUSH TABLES WITH READ LOCK" is what is actually needed, i.e. consistently write the data of the running instance to disk and lock.

> The trouble is that read operations can alter a file's state, so it may not be just a simple matter of a write lock.

If you mean things like the SEEN state, which changes when the user reads something, implementation v1 could block it and v2 could allow it by queuing such pending state-change operations in memory. Anyway, these are just implementation details that don't change the general logic and, taking into account the expected duration of the lock, IMO don't even matter much. The significant work that blocks this feature is the global lock across the entire Cyrus. Once it's implemented, it would be much easier to introduce specific improvements here and there.

> Cyrus has multiple databases that would also need to be frozen and flushed before the snapshot is taken.
> If you spread your mail storage and metadata storage over multiple file systems, trying to coordinate snapshots becomes more complex.

Sure, Cyrus would have to lock all the databases, and where to store them would be up to the admin. Actually, fs snapshots are just the most obvious use case. There could be others... up to the admin to decide.

*From:* Alvin Starr
*Sent:* Thursday, May 10, 2018 23:55
*To:* Info-cyrus
*Subject:* Re: Backup methods
Re: Backup methods
On 05/10/2018 06:29 PM, Anatoli wrote:

> Actually, mysqldump performs a lock on the records it's dumping. If it's for a MyISAM db, the entire table is locked. If it's for InnoDB and similar, an internal snapshot is created and only the records the dump is reading are unavailable for writing.

MySQL provides table consistency by locking, but that is only table consistency. Multiple updates across multiple tables could easily result in an inconsistent database even though the tables are individually consistent. With some tables taking hours to dump, locking the tables is problematic. This is why some people use LVM snapshots combined with "FLUSH TABLES WITH READ LOCK".

> Cyrus could also implement a per-user lock, but in reality it doesn't need such complex synchronization mechanisms; a simple global write lock would be enough (reading would not be affected, so only I, not I/O, and not to stop it but just to suspend). After all, the *write* lock would last only a second or so; fs snapshots are almost instantaneous. If you can't tolerate a 1 second delay for writing in Cyrus, you are probably not an SME.

The trouble is that read operations can alter a file's state, so it may not be just a simple matter of a write lock.

If the mailboxes are on something like EXT4 you can do an LVM snapshot because of the built-in auto checkpointing, and for something like xfs there is freeze.

Cyrus has multiple databases that would also need to be frozen and flushed before the snapshot is taken.

If you spread your mail storage and metadata storage over multiple file systems, trying to coordinate snapshots becomes more complex.

> And you don't need to hold the data to transfer it. You can dump it directly to an nfs share or pass it as stdout to ssh: mysqldump | xz -9 | ssh remote_server "cat > /bkp/`date +%y%m%d_%H%M`.sql.xz". With a couple more pipes you can encrypt the data on the fly so it's secure to store it in a cheap VPS overseas... or you could upload it to dropbox.

There may be an argument that could be made for 2 backup strategies:

1) where the mail storage and metadata can exist on a single volume and flushing the various databases is a short-duration event; then an LVM snapshot could be used
2) for distributed large-scale mail systems, where only an online live backup system can be used.

Backup for 100 users has different requirements than backup for 10 users, so why not support a few different backup strategies?

--
Alvin Starr || land: (905)513-7688
Netvel Inc. || Cell: (416)806-0133
al...@netvel.net ||
Re: Backup methods
Actually, mysqldump performs a lock on the records it's dumping. If it's for a MyISAM db, the entire table is locked. If it's for InnoDB and similar, an internal snapshot is created and only the records the dump is reading are unavailable for writing.

Cyrus could also implement a per-user lock, but in reality it doesn't need such complex synchronization mechanisms; a simple global write lock would be enough (reading would not be affected, so only I, not I/O, and not to stop it but just to suspend). After all, the *write* lock would last only a second or so; fs snapshots are almost instantaneous. If you can't tolerate a 1 second delay for writing in Cyrus, you are probably not an SME.

And you don't need to hold the data to transfer it. You can dump it directly to an nfs share or pass it as stdout to ssh: mysqldump | xz -9 | ssh remote_server "cat > /bkp/`date +%y%m%d_%H%M`.sql.xz". With a couple more pipes you can encrypt the data on the fly so it's secure to store it in a cheap VPS overseas... or you could upload it to dropbox.

*From:* Jason L Tibbitts Iii
*Sent:* Thursday, May 10, 2018 18:41
*To:* Anatoli
*Cc:* Info-cyrus
*Subject:* Re: Backup methods
Re: Backup methods
> "A" == Anatoli <m...@anatoli.ws> writes:

A> What about mysqldump > dump.sql, then mysql < dump.sql? Also a wrong
A> way and didn't have to be implemented?

No, that's exactly my point. Thanks for making it for me!

The analog to the way you indicated that you would like it to work would be having the mysql server stop IO so that you can take a filesystem snapshot while the database is in a consistent state. But instead, the database (like cyrus) implements a backup method which you can use to extract the data. And it also requires disk space to hold the backup until you can transfer it to your backup medium.

 - J<
Re: Backup methods
> Well, sort of. It is a method that is actually focused around doing backups. It happens to make use of the replication protocol because that is actually the smart way to do it. I did detail the differences in my message.

I suggest you try using it in your deployments and then share your real-world experience with us: how reliable it is, how well the compression works, how easy it is to recover something if both the master and the backup instances become inaccessible (disk failure in both, or both servers are stolen; this is an SME office, not a tier 4 datacenter) and the backups have to be brought in from an external location, what data is missing (if any) after a backup recovery, how incremental backups are done, etc.

I tried it in a /real deployment/ a year ago, when it had just been released, and my conclusion was that it was not well suited for an SME environment (at least at that moment).

> Honestly I believe that's the wrong way to go about it

What about mysqldump > dump.sql, then mysql < dump.sql? Also a wrong way and didn't have to be implemented? I bet this is the most deployed method for DB backups in the real SME world (like cron mysqldump --routines --all-databases | xz -9 > /bu/`date +%y%m%d_%H%M`_full.sql.xz), though there are replication solutions available too.

The Unix way is about minimalist, modular software.

*From:* Jason L Tibbitts Iii
*Sent:* Thursday, May 10, 2018 16:38
*To:* Anatoli
*Cc:* Info-cyrus
*Subject:* Re: Backup methods
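One caveat with the cron one-liner quoted above: a pipeline's exit status is that of its last command, so a failed mysqldump still leaves a plausible-looking .xz file behind. A hedged sketch of one way around that in plain POSIX sh follows; the function name, the `COMPRESS` variable and the `.part` suffix are made up for this illustration.

```shell
#!/bin/sh
# Sketch only: keep a compressed dump only if the dump command itself succeeded.
# COMPRESS and the ".part" naming are assumptions for illustration.
COMPRESS=${COMPRESS:-"xz -9"}

# dump_to_file DUMP_CMD OUTFILE
dump_to_file() {
    dump_cmd=$1
    out=$2
    status_file=$(mktemp) || return 1
    # POSIX sh has no "pipefail", so the dump's exit status is passed out via a file.
    { sh -c "$dump_cmd"; echo "$?" > "$status_file"; } | $COMPRESS > "$out.part"
    compress_status=$?
    dump_status=$(cat "$status_file")
    rm -f "$status_file"
    if [ "$dump_status" = 0 ] && [ "$compress_status" = 0 ]; then
        mv "$out.part" "$out"      # publish only a complete dump
    else
        rm -f "$out.part"          # discard the truncated output
        return 1
    fi
}
```

It could be called as `dump_to_file "mysqldump --routines --all-databases" "/bu/$(date +%y%m%d_%H%M)_full.sql.xz"`. If the `date` format is written directly on a crontab line instead, each `%` has to be escaped as `\%`, because cron treats an unescaped `%` as a newline.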
Re: Backup methods
> For me, if I put a replica in place it gets the role of backup. Meaning I will put two replicas and not make another backup.

A replica is not a replacement for a backup. You may have your specific needs, but a replica per se mostly serves to cover for the master's hardware failures. A replica does not protect you from accidental or intentional deletions/changes of the data. If a user deletes some of his/her mails and discovers it after the expunge period, you won't be able to recover them, as the replica would also have them deleted.

> Using ZFS, no need to do that

Sure, if you're using ZFS :) The solution I've described works for any *nix OS and fs.

> So if I stop the postfix on the cyrus_server

You just don't need to stop it. If you expect to stop Cyrus frequently, just configure the cyrus_server Postfix retry interval to something like 1 min.

*From:* Albert Shih
*Sent:* Thursday, May 10, 2018 17:32
*To:* Anatoli
*Cc:* Info-cyrus
*Subject:* Re: Backup methods
Re: Backup methods
On 10/05/2018 at 16:08:32-0300, Anatoli wrote:

Hi.

> In both cases, a copy of the master data is made, which requires twice the space of real usage (Cyrus Backups tries to apply compression on stored data, not sure how well it works).

On ZFS with lz4 (the standard compression on ZFS) you get a 1.18 ratio (3.57 To on disk for 4.05 To of data), so not very good. I use lz4 because it performs the same as no compression. I didn't try gzip on mail; gzip can give a very impressive ratio but eats a lot of CPU.

> What is really needed, IMO, for SME environments is the ability for Cyrus to sync to disk all data, so one can take a hot copy of that data with standard UNIX tools and then handle it accordingly. Once a recovery is needed, one just copies a backup to the Cyrus dir and starts the service. The data would be in the exact same state as when the backup took place. This is discussed in the github issue mentioned in the previous mail.

I fully agree. In fact, 7 years ago when we renewed our mail server I already tried cyrus and dovecot (we came from courier-imap), and we chose dovecot because it's very easy to back up (and manage) for an old Unix admin. Just put some rsync in the crontab and that's all; one mail = one file, etc.

Now we have chosen cyrus-imap over dovecot (so for the next 7 years) because of all the features cyrus has. But yes, if cyrus had something like mysqldump or pg_dumpall, that would be super nice.

Regards.

--
Albert SHIH
DIO bâtiment 15
Observatoire de Paris
xmpp: j...@obspm.fr
Heure local/Local time:
Thu May 10 22:43:47 CEST 2018
Re: Backup methods
On 10/05/2018 at 10:38:28-0300, Anatoli wrote:

>> Not very sure I understand that. It's always true, isn't it? If you have X To of data and you want n backups you will need X*(n+1) To?
>
> Replication as it is designed means that you create an additional (replica) instance of Cyrus that will be in sync with the master instance, so when you need to make a backup, you turn off the replica, take a backup of its data, then turn it on again so it comes back in sync with the master. In this case there's no interruption to the service, you just stop a replica. But the replica will use the same amount of space as your master, so without even making a backup, you'll use 2x space. Plus you have to understand how the replication works, then set it up, check that the sync process is always working and that the replica has the same information as the master... That's a great solution for ISP-level or public mail service operations, but IMO an absolute overkill for small deployments.

For me, if I put a replica in place it takes on the role of backup. Meaning I will set up two replicas and not make another backup.

> When it comes to making a backup, the best policy IMO is to make incremental backups. In this case you only store the new mails + binary indexes. Once in a while (e.g. every month) you make a full backup, then, say, once a week a level 1 backup (that stores changes from the previous week, reset at the lower-level backup, i.e. every month), then daily level 2 backups and hourly level 3. This way you can restore up to hourly changes without using an excessive amount of space. Of course you can compress them too (xz -9 gives a pretty good ratio).

Using ZFS, there is no need to do that. Just use zfs snapshot and it will keep the differential at block level (much better than file level). Same for compression: you just need to activate compression on the dataset.

> Uhh don't do that. Your Postfix has no problem in retaining mails if Cyrus is not reachable, then attempt their delivery again. I was referring to that, depending on the configuration of your incoming MTA, the next delivery attempt may be in, say, 15 minutes, so you postpone incoming mail for that time if you turn off Cyrus to take a backup. If you turn off your incoming MTA, the source MTA may have issues with delivery at all (you don't control it, you don't know how it's configured, when the next delivery attempt will occur, etc.), never turn off your incoming MTA.

That won't be a problem: I've got 2 public incoming MTAs, 4 private ones and the postfix on the cyrus-server. So incoming mail, let's say from gmail.com, goes from the gmail.com MX to our MX and is then sent to the cyrus-server. So if I stop the postfix on the cyrus-server, the incoming mail is going to stay on our MX.

--
Albert SHIH
DIO bâtiment 15
Observatoire de Paris
xmpp: j...@obspm.fr
Heure local/Local time:
Thu May 10 22:27:22 CEST 2018
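The multi-level scheme quoted above (monthly full, weekly level 1, daily level 2, hourly level 3) could be sketched with GNU tar's `--listed-incremental` snapshot files. This is an illustration only: the thread does not name a tool, the function name and file layout are made up here, and it does nothing about the consistency of the data being copied.

```shell
#!/bin/sh
# Sketch only: multi-level incremental backups with GNU tar snapshot files.
# The function name, paths and naming scheme are assumptions for illustration.

# backup_level SRC_DIR DEST_DIR LEVEL
backup_level() {
    src=$1
    dest=$2
    level=$3
    snar="$dest/level$level.snar"
    if [ "$level" -eq 0 ]; then
        # No snapshot file: tar writes a full backup and records the state.
        rm -f "$snar"
    else
        # Start from the state recorded by the most recent lower-level backup,
        # so only changes made since then are archived.
        prev="$dest/level$((level - 1)).snar"
        [ -f "$prev" ] || { echo "missing $prev, run a lower level first" >&2; return 1; }
        cp "$prev" "$snar" || return 1
    fi
    archive="$dest/$(date +%y%m%d_%H%M)_L$level.tar"
    tar --create --listed-incremental="$snar" --file="$archive" -C "$src" .
}
```

For example, `backup_level /var/spool/cyrus /bu 0` would run monthly and `backup_level /var/spool/cyrus /bu 3` hourly (both paths hypothetical), and adding `--xz` to the tar call would compress each archive. A restore extracts the level 0 archive and then each higher level in order, passing `--listed-incremental=/dev/null` to tar.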
Re: Backup methods
> "A" == Anatoli <m...@anatoli.ws> writes:

A> What you mention is highly related to the replication backup
A> we were talking about in the previous mails.

Well, sort of. It is a method that is actually focused around doing backups. It happens to make use of the replication protocol because that is actually the smart way to do it. I did detail the differences in my message.

A> In both cases, a copy of the master data is made, which requires
A> twice the space of real usage (Cyrus Backups tries to apply
A> compression on stored data, not sure how well it works).

As I mentioned, the documentation discusses this.

A> What is really needed, IMO, for SME environments is the ability for
A> Cyrus to sync to disk all data, so one can take a hot copy of that
A> data with standard UNIX tools and then handle it accordingly. Once a
A> recovery is needed, one just copies a backup to the Cyrus dir and
A> starts the service.

Honestly I believe that's the wrong way to go about it, but it's certainly one way to do things if you have no backup solution integrated into the software. But hey, it's your data. I only wanted to mention that there really is an existing backup solution which wasn't being discussed.

 - J<
Re: Backup methods
Jason,

What you mention is highly related to the replication backup we were talking about in the previous mails. The idea is the same: they replicate from the master. Then, in a pure replica solution, the replica is stopped, a copy of its files (in the native format) is made and it's started again to continue replication.

Cyrus Backups also replicates all data from the master, but stores it in a different format and has some additional functionality to assist in backing up the files. AFAIK it's not yet complete, though: e.g. I'm not sure it's possible to make incremental backups with it, I'm not sure about the SEEN state, the recovery process does not appear trivial, etc.

In both cases, a copy of the master data is made, which requires twice the space of real usage (Cyrus Backups tries to apply compression on stored data, not sure how well it works).

What is really needed, IMO, for SME environments is the ability for Cyrus to sync to disk all data, so one can take a hot copy of that data with standard UNIX tools and then handle it accordingly. Once a recovery is needed, one just copies a backup to the Cyrus dir and starts the service. The data would be in the exact same state as when the backup took place. This is discussed in the github issue mentioned in the previous mail.

*From:* Jason L Tibbitts Iii
*Sent:* Thursday, May 10, 2018 14:10
*To:* Arnaldo Viegas De Lima
*Cc:* Info-cyrus
*Subject:* Re: Backup methods
Re: Backup methods
Cyrus does have an integrated backup system (see https://cyrusimap.org/imap/reference/admin/backups.html) which I'm not sure has been mentioned in this thread. But you still have to have enough space to keep the compressed backups on disk in order to move them to tape or whatever archival storage you're using. There is discussion of the storage requirements in the documentation. I don't think any of it is particularly unreasonable, but I haven't actually tried it myself.

Technically I don't think you need a separate machine (though that's simpler); it may just be possible to have a second cyrus server listening on different ports to act as the replication target. I probably wouldn't do it that way anyway; old hardware with some cheap disk would suffice to stage the backups until they're sent to tape or wherever.

As for it all being marked "experimental", I'm sure that if bugs were found (and reported), they would be fixed. It probably just needs more testing and back and forth with the devs to flesh out the documentation and add any missing functionality.

 - J<
Re: Backup methods
One alternative, if you are running VMs (such as VMware or Hyper-V), is to resort to a VM backup program that uses native change tracking (VMware CBT and Hyper-V RCT). This allows for a consistent and incremental backup of the VMs. It's quite fast and saves lots of space. The backup is incremental (based on disk clusters) and is performed on a temporary VM snapshot.

I use this approach, but my spool is nowhere close to a TB (or To for the francophones). It works fine and I've performed a few disaster recovery tests without problems.

Arnaldo
Re: Backup methods
> Not very sure to understand that. It's always true, isn't it? If you have X To of data and you want n backups you will need X*(n+1) To?

Replication, as it is designed, means that you create an additional (replica) instance of Cyrus that stays in sync with the master instance. When you need to make a backup, you turn off the replica, take a backup of /its data/, then turn it on again so it comes back in sync with the master. In this case there's no interruption to the service, you just stop a replica. But the replica uses the same amount of space as your master, so without even making a backup you are already using 2x the space. On top of that, you have to understand how replication works, set it up, and check that the sync process is always working and that the replica has the same information as the master... That's a great solution for ISP-level or public mail service operations, but IMO absolute overkill for small deployments.

> I don't see how you can avoid that, of course you can activate heavy compression on the backup but beside of that

When it comes to making a backup, the best policy IMO is to make incremental backups. In this case you only store the new mails + binary indexes. Once in a while (e.g. every month) you make a full backup, then, say, once a week a level 1 backup (which stores the changes from the previous week and is reset at the lower-level backup, i.e. every month), then daily level 2 backups and hourly level 3 backups. This way you can restore up to hourly changes without using an excessive amount of space. Of course you can compress them too (xz -9 gives a pretty good ratio).

> Well, that is easy to avoid: you just have to stop postfix before stopping the VM. When postfix is stopped, all incoming messages will stay on the parent smtp server, so no losing incoming mail.

Uhh, don't do that. Your Postfix has no problem holding mail while Cyrus is not reachable and then attempting delivery again. What I was referring to is that, depending on the configuration of your incoming MTA, the next delivery attempt may be in, say, 15 minutes, so you delay incoming mail by that much if you turn off Cyrus to take a backup. If you turn off your /incoming MTA/, the source MTA may have trouble delivering at all (you don't control it, you don't know how it's configured, when the next delivery attempt will occur, etc.). Never turn off your incoming MTA.

*From:* Albert Shih
*Sent:* Thursday, May 10, 2018 04:14
*To:* Anatoli
*Cc:* Info-cyrus
*Subject:* Re: Backup methods
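The multi-level scheme described above (monthly full, weekly level 1, daily level 2, hourly level 3) can be sketched as follows. This is only an illustration: the exact schedule (midnight on the 1st, Mondays, other days) and the function names `backup_level` and `restore_chain` are assumptions, and it assumes the classic dump-level convention where a level N backup stores everything changed since the most recent backup of a lower level.

```python
from datetime import datetime


def backup_level(now: datetime) -> int:
    """Return the backup level for a run starting at `now`.

    Assumed schedule (hypothetical, chosen only for illustration):
      level 0 (full): 00:00 on the 1st of the month
      level 1:        00:00 on Mondays
      level 2:        00:00 on all other days
      level 3:        every other full hour
    """
    if now.hour != 0:
        return 3
    if now.day == 1:
        return 0
    if now.weekday() == 0:  # Monday
        return 1
    return 2


def restore_chain(levels: list[int]) -> list[int]:
    """Given the levels of all backups taken so far (oldest first), return
    the indexes of the backups needed to restore the most recent state.

    Assumes a level N backup holds all changes since the latest backup
    with a level lower than N, so we walk backwards and keep a backup
    only when its level is lower than the last one we kept.
    """
    chain: list[int] = []
    wanted = float("inf")
    for i in range(len(levels) - 1, -1, -1):
        if levels[i] < wanted:
            chain.append(i)
            wanted = levels[i]
            if wanted == 0:  # reached a full backup, chain is complete
                break
    return chain[::-1]
```

With this convention a restore never needs more than four archives (one per level), which is why the scheme keeps hourly granularity without storing a full copy every hour.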
Re: Backup methods
On 10/05/2018 at 02:44:18-0300, Anatoli wrote:

Hi,

> The replication is reasonable only if you have more than one server in your
> deployment (and both servers with the same level of security, if not you risk
> to compromise the user data) or "spool size/available disk space" is low,
> otherwise you'd need to dedicate 2 times more space than needed to store user
> data, only to take a periodic backup (+ the space needed to store the backup
> itself).

Not very sure to understand that. It's always true, isn't it? If you have X To of data and you want n backups you will need X*(n+1) To?

I don't see how you can avoid that, of course you can activate heavy compression on the backup but beside of that

> I suggest you take a look at this issue:
> https://github.com/cyrusimap/cyrus-imapd/issues/1763, where backups for
> small deployments were already

Thanks for the link, I will read that.

> Answering the OP's question, I'm using Cyrus for 4 years now and I don't know
> about any reliable and reasonable strategy for backups of Cyrus data in SME
> environments. Summing it up:
>
> • FS snapshots without stopping the server: a possibility of a corrupted
> backup.
> • FS snapshots after stopping the server: service downtime, breaking open
> connections, delivery issues for incoming MTAs, etc. - reasonable for
> daily

Well, that is easy to avoid: you just have to stop postfix before stopping the VM. When postfix is stopped, all incoming messages will stay on the parent smtp server, so no losing incoming mail.

> backups in a 8/5 office, unreasonable for 24/7 deployments (e.g. users
> distributed in different time zones) or for intra-day backups.

I checked: stopping postfix, stopping the VM, taking a snapshot and starting the VM takes about 10-15 seconds. So I agree with you that it's not a very good solution, because users can still lose their connection, but I think that without replication it's acceptable.

> • Replication: unreasonable requirements for disk space, setup overkill.

For the setup, the overkill is for me a small price compared with losing data, and as for the disk space, that's not an issue at all for me. Currently I run dovecot and have 2 backups, so when I say to my boss « we have X To of mail » I already have 3 * X To of disk. Put another way, if I can afford X To of disk, I will say I can give you X/3 To of mail.

Regards.

--
Albert SHIH
DIO bâtiment 15
Observatoire de Paris
xmpp: j...@obspm.fr
Local time: Thu May 10 09:03:31 CEST 2018
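The arithmetic in this message (X*(n+1) To of disk for X To of mail plus n full backups, and conversely X/3 To of mail from X To of disk with 2 backups) can be written down as a small sketch. The function names are hypothetical, and it assumes every backup is a full, uncompressed copy of the spool, which is the worst case discussed here.

```python
def disk_needed(mail_size: float, full_copies: int) -> float:
    """Disk needed for the live spool plus `full_copies` full,
    uncompressed copies of it: X * (n + 1)."""
    return mail_size * (full_copies + 1)


def mail_quota(disk_available: float, full_copies: int) -> float:
    """Largest spool that fits if `full_copies` full copies must also
    fit on the same budget: X / (n + 1)."""
    return disk_available / (full_copies + 1)


# Albert's case: 2 backups, so 3 * X To of disk for X To of mail.
print(disk_needed(2.0, 2))   # disk for 2 To of mail
print(mail_quota(6.0, 2))    # mail that fits in 6 To of disk
```

Incremental or compressed backups would lower the per-copy cost, so these figures are an upper bound rather than a sizing rule.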
Re: Backup methods
Hi,

> On larger systems with VMs i take a ZFS or LVM snapshot and mount it externally to "fetch" a full (incremental) filesystem backup of the mail spool and imap spool and cyrus db on a daily base. After the backup run i destroy the snapshot.

The problem with this is that you can't be sure the data on disk is in sync. Depending on how heavy the load is during the "backup" (+ your luck), you may find unpleasant surprises when you have to restore it. And you'd only learn that the backup is corrupted when you try to restore it.

> Beside this and depending from your needs you may take a look at cyrus replication features to build a "backup" or just use standard filesystem backup tools like tar, dumpfs etc.

Replication is reasonable only if you have more than one server in your deployment (and both servers have the same level of security, otherwise you risk compromising the user data) or the "spool size/available disk space" ratio is low. Otherwise you'd need to dedicate twice the space needed to store user data, only to take a periodic backup (+ the space needed to store the backup itself).

I suggest you take a look at this issue: https://github.com/cyrusimap/cyrus-imapd/issues/1763, where backups for small deployments were already discussed in detail. No idea if there are plans to implement it, though.

Answering the OP's question: I've been using Cyrus for 4 years now and I don't know of any reliable and reasonable strategy for backups of Cyrus data in SME environments. Summing it up:

* FS snapshots without stopping the server: a possibility of a corrupted backup.
* FS snapshots after stopping the server: service downtime, broken open connections, delivery issues for incoming MTAs, etc. Reasonable for daily backups in an 8/5 office, unreasonable for 24/7 deployments (e.g. users distributed across different time zones) or for intra-day backups.
* Replication: unreasonable requirements for disk space, setup overkill.

Regards,
Anatoli

*From:* Niels Dettenbach Via Info-cyrus
*Sent:* Wednesday, May 09, 2018 06:42
*To:* Info-cyrus
*Subject:* Re: Backup methods
Re: Backup methods
On Wednesday, 9 May 2018, 11:19:54 CEST, Albert Shih wrote:

> I would like to know what kind of backup methods are recommended for
> cyrus-imapd.
>
> My cyrus-imapd host (only one currently) is running in a FreeBSD jail
> (something like systemd-nspawn, lxc) on ZFS, so I intend to use this
> method:
>
> stop the vm
> take a zfs snapshot
> start the vm
>
> send the zfs snapshot to a backup server.

This is relatively inefficient, but a working option if everything from the cyrus data is on that VM, i.e. the complete mail spool and the database files (possibly plus sieve files). We do something similar on relatively small systems, or to get "intraday backups" only.

On larger systems with VMs I take a ZFS or LVM snapshot and mount it externally to "fetch" a full (incremental) filesystem backup of the mail spool, the imap spool and the cyrus db on a daily basis. After the backup run I destroy the snapshot.

Besides this, and depending on your needs, you may take a look at the cyrus replication features to build a "backup", or just use standard filesystem backup tools like tar, dumpfs etc.

On a file basis you have to back up the mail spool and the cyrus database files. If you use SIEVE, back up the SIEVE file pool too. You can restore by just replacing the files and starting cyrus.

To make the common database files "interoperable", it may make sense to dump them into a machine-independent format for backup if they are in a machine-dependent format.

If you restore such a filesystem-based backup to a new system which has other hardware / arch specs or a newer / incompatible DB subsystem (instead of skiplist), you may have to "recreate" the indexes and database data. reconstruct -f may be your friend to "clean" up the transfer / restore.

There are several strategies for backing up cyrus; these are just a few.

hth a bit.
good luck,

Niels.
--
---
Niels Dettenbach
Syndicat IT & Internet
http://www.syndicat.com
PGP: https://syndicat.com/pub_key.asc
---
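The stop / snapshot / start / send sequence quoted in this message can be sketched as a dry-run plan that only builds the command lines, so they can be reviewed before anything is run. The jail name, dataset names and backup host below are hypothetical, and the use of `service jail stop` / `service jail start` to stop and start the jail is an assumption about the FreeBSD setup; adapt it to however the jail is actually managed.

```python
import shlex


def snapshot_plan(jail: str, dataset: str, tag: str,
                  backup_host: str, backup_dataset: str) -> list[str]:
    """Return the shell commands for one backup run, in order.

    Nothing is executed here; the caller decides how to run them.
    All names passed in are site-specific (hypothetical in the example).
    """
    q = shlex.quote
    snap = f"{dataset}@{tag}"
    return [
        # 1. stop the jail so Cyrus files are quiescent on disk
        f"service jail stop {q(jail)}",
        # 2. take the snapshot (near-instant on ZFS)
        f"zfs snapshot {q(snap)}",
        # 3. bring the service back before the slow transfer starts
        f"service jail start {q(jail)}",
        # 4. ship the snapshot to the backup server
        f"zfs send {q(snap)} | ssh {q(backup_host)} zfs receive {q(backup_dataset)}",
    ]


if __name__ == "__main__":
    for cmd in snapshot_plan("cyrus", "zroot/jails/cyrus", "2018-05-09",
                             "backup.example.org", "backup/cyrus"):
        print(cmd)
```

Restarting the jail before the send step keeps the downtime limited to the snapshot itself, which matches the 10-15 seconds reported elsewhere in the thread.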