Re: Are ZFS snapshots unsafe when PGSQL is spreading through multiple zpools?

Ron Sun, 15 Jan 2023 14:26:22 -0800

We regularly use "history" databases. Put them on slow media, and only takea backup when data is added to them (monthly, quarterly, etc).


On 1/15/23 15:57, HECTOR INGERTO wrote:

> But you cannot and should not rely on snapshots alone

That’s only for non atomic (multiple pools) snapshots. Isn’t?
If I need to rely only on ZFS (automated) snapshots, then the best optionwould be to have two DB? Each one in each own pool. One HDD DB and one SSDDB. Then, the backend code should know on which DB the requested data is.
*De: *Magnus Hagander <mailto:[email protected]>
*Enviado: *domingo, 15 de enero de 2023 20:36
*Para: *HECTOR INGERTO <mailto:[email protected]>
*CC: *[email protected]
*Asunto: *Re: Are ZFS snapshots unsafe when PGSQL is spreading throughmultiple zpools?
On Sun, Jan 15, 2023 at 8:18 PM HECTOR INGERTO <[email protected]> wrote:

    Hello everybody,

    I’m using PostgreSQL on openZFS. I use ZFS snapshots as a backup +
    hotspare method.

    From man zfs-snapshot: “Snapshots are taken atomically, so that all
    snapshots correspond to the same moment in time.” So if a PSQL
    instance is started from a zfs snapshot, it will start to replay the
    WAL from the last checkpoint, in the same way it would do in a crash
    or power loss scenario. So from my knowledge, ZFS snapshots can be
    used to rollback to a previous point in time. Also, sending those
    snapshots to other computers will allow you to have hotspares and
    remote backups. If I’m wrong here, I would appreciate being told about
    it because I’m basing the whole question on this premise.

    On the other hand, we have the tablespace PGSQL feature, which is
    great because it allows “unimportant” big data to be written into
    cheap HDD and frequently used data into fast NVMe.

    So far, so good. The problem is when both ideas are merged. Then,
    snapshots from different pools are NOT atomical, snapshot on the HDD
    pool isn’t going to be done at the same exact time as the one on the
    SSD pool, and I don’t know enough about PGSQL internals to know how
    dangerous this is. So here is where I would like to ask for your help
    with the following questions:

    First of all, what kind of problem can this lead to? Are we talking
    about potential whole DB corruption or only the loss of a few of the
    latest transactions?

Silent data corruption. *not* just losing your latest transaction.

    In second place, if I’m initializing a corrupted PGSQL instance
    because ZFS snapshots are from different pools and slightly different
    times, am I going to notice it somehow or is it going to fail silently?

Silent. You might notice at the application level. Might.

    In third and last place, is there some way to quantify the amount of
    risk taken when snapshotting a PGSQL instance spread across two (or
    more) different pools?

"Don't do it".

If you can't get atomic snapshots, don't do it, period.
You can use them together with a regular online backup. That ispg_start_backup() // <snapshot multiple volumes> // pg_stop_backup()together with log archiving. That's a perfectly valid method. But youcannot and should not rely on snapshots alone.
--

 Magnus Hagander
Me: https://www.hagander.net/<https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.hagander.net%2F&data=05%7C01%7C%7C4860509b67ea484420fb08daf72fddd4%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638094082195595508%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=LRa%2BFTXpoZNsMLMrNLbL6xmgo9I3Mxx2CcCAh6nmguU%3D&reserved=0> Work: https://www.redpill-linpro.com/<https://emea01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.redpill-linpro.com%2F&data=05%7C01%7C%7C4860509b67ea484420fb08daf72fddd4%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C638094082195752157%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=ziYhcTa5YOvHZEr2xk2nEKvSjLICE75zRhhCehvzIMs%3D&reserved=0>


--
Born in Arizona, moved to Babylonia.

Re: Are ZFS snapshots unsafe when PGSQL is spreading through multiple zpools?

Reply via email to