Re: automated refresh of dev from prod

2019-02-28 Thread Ben Chobot
On Feb 28, 2019, at 8:04 AM, Stephen Frost  wrote:
> 
> Greetings,
> 
> * Julie Nishimura (juliez...@hotmail.com) wrote:
>> Hello everybody, I am new to postgresql environment, but trying to get up to 
>> speed.
>> Can you please share your experience on how you can automate refreshment of 
>> dev environment on regular basis (desirably weekly), taking for 
>> consideration some of prod dbs can be very large (like 20+ TB
>> 
>> Any suggestions?
> 
> The approach that I like to recommend is to have your backup/restore
> solution be involved in this refreshing process, so that you're also
> testing that your backup/restore process works correctly.  For dealing
> with larger databases, using a backup tool which has parallel backup,
> parallel restore, and is able to restore just the files which are
> different from the backup can make the restore take much less time (this
> is what the 'delta-restore' option in pgbackrest does, and it was
> specifically written to support exactly this kind of prod->dev periodic
> refresh, though other tools may also support that these days).
> 
> As mentioned elsewhere on this thread, using snapshots can also be a
> good approach though you have to be sure that the snapshot is completely
> atomic across all filesystems that PostgreSQL is using, or you have to
> deal with running pg_start/stop_backup and putting a backup_label into
> place for the restored snapshot and a recovery.conf to provide a way for
> PG to get at any WAL which was generated while the snapshot (or
> snapshots) was being taken.

Very much yes to everything Stephen says. Regularly refreshing nonprod via your 
normal backup/restore process is an efficient way to test your backups, and 
snapshots are a great way to do backups when your data volume is greater than 
your churn between backups. (And at 20+ TB, I hope that's the case for you.)


Re: automated refresh of dev from prod

2019-02-28 Thread Stephen Frost
Greetings,

* Julie Nishimura (juliez...@hotmail.com) wrote:
> Hello everybody, I am new to postgresql environment, but trying to get up to 
> speed.
> Can you please share your experience on how you can automate refreshment of 
> dev environment on regular basis (desirably weekly), taking for consideration 
> some of prod dbs can be very large (like 20+ TB
> 
> Any suggestions?

The approach that I like to recommend is to have your backup/restore
solution be involved in this refreshing process, so that you're also
testing that your backup/restore process works correctly.  For dealing
with larger databases, using a backup tool which has parallel backup,
parallel restore, and is able to restore just the files which are
different from the backup can make the restore take much less time (this
is what the 'delta-restore' option in pgbackrest does, and it was
specifically written to support exactly this kind of prod->dev periodic
refresh, though other tools may also support that these days).

As mentioned elsewhere on this thread, using snapshots can also be a
good approach though you have to be sure that the snapshot is completely
atomic across all filesystems that PostgreSQL is using, or you have to
deal with running pg_start/stop_backup and putting a backup_label into
place for the restored snapshot and a recovery.conf to provide a way for
PG to get at any WAL which was generated while the snapshot (or
snapshots) was being taken.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: automated refresh of dev from prod

2019-02-27 Thread Ron

On 2/27/19 3:15 PM, Julie Nishimura wrote:
Hello everybody, I am new to postgresql environment, but trying to get up 
to speed.
Can you please share your experience on how you can automate refreshment 
of dev environment on regular basis (desirably weekly), taking for 
consideration some of prod dbs can be very large (like 20+ TB


Any suggestions?


Weekly refreshes of 20TB from prod to dev seems a bit excessive.


--
Angular momentum makes the world go 'round.


RE: automated refresh of dev from prod

2019-02-27 Thread Scot Kreienkamp
My method is complex and not so good for newbies, but it is incredibly fast and 
should scale to almost any size database.  Mine are not nearly as large though.

I use two methods... the normal backup/restore for longer lived development 
environments, and for shorter lived environments I use postgres native 
mirroring from a secondary prod server to all my dev environments, then use LVM 
snapshots to take a snapshot of the postgres mount.  Mount the snapshot and 
startup a second postgres instance in it and you have a mirror of production 
ready for use.  That only lasts for a finite amount of time though (until you 
fill the space dedicated to the snapshot) before it becomes unusable,
that's the downside... it can't be long lived (hours, 1-2 days maybe).  The 
upside is that the refresh from production in my environment for a 400G 
database is 3 seconds.  It is a trade-off and not fit for every use, that's why 
we also use the traditional backup/restore in some cases.



Scot Kreienkamp |Senior Systems Engineer | La-Z-Boy Corporate
One La-Z-Boy Drive| Monroe, Michigan 48162 | Office: 734-384-6403 | | Mobile: 
7349151444 | Email: scot.kreienk...@la-z-boy.com
From: Julie Nishimura [mailto:juliez...@hotmail.com]
Sent: Wednesday, February 27, 2019 4:16 PM
To: pgsql-gene...@postgresql.org
Subject: automated refresh of dev from prod


ATTENTION:   This email was sent to La-Z-Boy from an external source. Be 
vigilant when opening attachments or clicking links.
Hello everybody, I am new to postgresql environment, but trying to get up to 
speed.
Can you please share your experience on how you can automate refreshment of dev 
environment on regular basis (desirably weekly), taking for consideration some 
of prod dbs can be very large (like 20+ TB

Any suggestions?

Thank you!

This message is intended only for the individual or entity to which it is 
addressed.  It may contain privileged, confidential information which is exempt 
from disclosure under applicable laws.  If you are not the intended recipient, 
you are strictly prohibited from disseminating or distributing this information 
(other than to the intended recipient) or copying this information.  If you 
have received this communication in error, please notify us immediately by 
e-mail or by telephone at the above number. Thank you.