Re: automated refresh of dev from prod
On Feb 28, 2019, at 8:04 AM, Stephen Frost wrote: > > Greetings, > > * Julie Nishimura (juliez...@hotmail.com) wrote: >> Hello everybody, I am new to postgresql environment, but trying to get up to >> speed. >> Can you please share your experience on how you can automate refreshment of >> dev environment on regular basis (desirably weekly), taking for >> consideration some of prod dbs can be very large (like 20+ TB >> >> Any suggestions? > > The approach that I like to recommend is to have your backup/restore > solution be involved in this refreshing process, so that you're also > testing that your backup/restore process works correctly. For dealing > with larger databases, using a backup tool which has parallel backup, > parallel restore, and is able to restore just the files which are > different from the backup can make the restore take much less time (this > is what the 'delta-restore' option in pgbackrest does, and it was > specifically written to support exactly this kind of prod->dev periodic > refresh, though other tools may also support that these days). > > As mentioned elsewhere on this thread, using snapshots can also be a > good approach though you have to be sure that the snapshot is completely > atomic across all filesystems that PostgreSQL is using, or you have to > deal with running pg_start/stop_backup and putting a backup_label into > place for the restored snapshot and a recovery.conf to provide a way for > PG to get at any WAL which was generated while the snapshot (or > snapshots) was being taken. Very much yes to everything Stephen says. Regularly refreshing nonprod via your normal backup/restore process is an efficient way to test your backups, and snapshots are a great way to do backups when your data volume is greater than your churn between backups. (And at 20+ TB, I hope that's the case for you.)
Re: automated refresh of dev from prod
Greetings, * Julie Nishimura (juliez...@hotmail.com) wrote: > Hello everybody, I am new to postgresql environment, but trying to get up to > speed. > Can you please share your experience on how you can automate refreshment of > dev environment on regular basis (desirably weekly), taking for consideration > some of prod dbs can be very large (like 20+ TB > > Any suggestions? The approach that I like to recommend is to have your backup/restore solution be involved in this refreshing process, so that you're also testing that your backup/restore process works correctly. For dealing with larger databases, using a backup tool which has parallel backup, parallel restore, and is able to restore just the files which are different from the backup can make the restore take much less time (this is what the 'delta-restore' option in pgbackrest does, and it was specifically written to support exactly this kind of prod->dev periodic refresh, though other tools may also support that these days). As mentioned elsewhere on this thread, using snapshots can also be a good approach though you have to be sure that the snapshot is completely atomic across all filesystems that PostgreSQL is using, or you have to deal with running pg_start/stop_backup and putting a backup_label into place for the restored snapshot and a recovery.conf to provide a way for PG to get at any WAL which was generated while the snapshot (or snapshots) was being taken. Thanks! Stephen signature.asc Description: PGP signature
Re: automated refresh of dev from prod
On 2/27/19 3:15 PM, Julie Nishimura wrote: Hello everybody, I am new to postgresql environment, but trying to get up to speed. Can you please share your experience on how you can automate refreshment of dev environment on regular basis (desirably weekly), taking for consideration some of prod dbs can be very large (like 20+ TB Any suggestions? Weekly refreshes of 20TB from prod to dev seems a bit excessive. -- Angular momentum makes the world go 'round.
RE: automated refresh of dev from prod
My method is complex and not so good for newbies, but it is incredibly fast and should scale to almost any size database. Mine are not nearly as large though. I use two methods... the normal backup/restore for longer lived development environments, and for shorter lived environments I use postgres native mirroring from a secondary prod server to all my dev environments, then use LVM snapshots to take a snapshot of the postgres mount. Mount the snapshot and startup a second postgres instance in it and you have a mirror of production ready for use. That only lasts for a finite amount of time though (until you fill the space dedicated to the snapshot) before it becomes unusable, that's the downside... it can't be long lived (hours, 1-2 days maybe). The upside is that the refresh from production in my environment for a 400G database is 3 seconds. It is a trade-off and not fit for every use, that's why we also use the traditional backup/restore in some cases. Scot Kreienkamp |Senior Systems Engineer | La-Z-Boy Corporate One La-Z-Boy Drive| Monroe, Michigan 48162 | Office: 734-384-6403 | | Mobile: 7349151444 | Email: scot.kreienk...@la-z-boy.com From: Julie Nishimura [mailto:juliez...@hotmail.com] Sent: Wednesday, February 27, 2019 4:16 PM To: pgsql-gene...@postgresql.org Subject: automated refresh of dev from prod ATTENTION: This email was sent to La-Z-Boy from an external source. Be vigilant when opening attachments or clicking links. Hello everybody, I am new to postgresql environment, but trying to get up to speed. Can you please share your experience on how you can automate refreshment of dev environment on regular basis (desirably weekly), taking for consideration some of prod dbs can be very large (like 20+ TB Any suggestions? Thank you! This message is intended only for the individual or entity to which it is addressed. It may contain privileged, confidential information which is exempt from disclosure under applicable laws. If you are not the intended recipient, you are strictly prohibited from disseminating or distributing this information (other than to the intended recipient) or copying this information. If you have received this communication in error, please notify us immediately by e-mail or by telephone at the above number. Thank you.