On 5/19/25 09:14, Moreno Andreo wrote:
On 16/05/25 21:33, Achilleas Mantzios wrote:
On 16/5/25 18:45, Moreno Andreo wrote:
Hi,
we are moving away from our old binary-data approach, migrating the data
from bytea fields in a table to external storage (making the database
smaller and related operations faster and easier to manage).
In short, we have a job that runs in the background, copies the data from
the table to an external file and then sets the bytea field to NULL.
(UPDATE tbl SET blob = NULL, ref = 'path/to/file' WHERE id = <uuid>)
At the end of the operation, this results in a table that is less than
one tenth of its original size.
We have a multi-tenant architecture (hundreds of schemas with identical
structure, all inheriting from public) and we are performing the task on
one table per schema.
So? TOASTed data are kept in separate TOAST tables; unless those bytea
columns are actually selected, you won't even touch them. I cannot
understand what you are trying to achieve here.
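For instance, something like this (the table name is just a placeholder)
shows how much of a per-schema table's footprint actually lives in its
TOAST relation, i.e. how little of it ordinary queries ever read:

    -- 'tenant1.tbl' is a placeholder table name.
    -- pg_relation_size = main heap only; pg_table_size = heap + TOAST (no indexes).
    SELECT pg_size_pretty(pg_relation_size('tenant1.tbl')) AS heap_only,
           pg_size_pretty(pg_table_size('tenant1.tbl'))    AS heap_plus_toast;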
Years ago, when I made the mistake of going for a coffee and letting my
developers "improvise", the result was a design similar to what you are
trying to achieve. Years later, I am seriously considering moving those
data back to PostgreSQL.
The "related operations" I was talking about are backups and database
maintenance when needed, cluster/replica management, etc. With a
smaller database size they would be easier in timing and effort, right?
Ok, but you'll lose replica functionality for those blobs, which means
you don't care about them; correct me if I am wrong.
We are mostly talking about costs here. To call things by their name:
I'm moving bytea contents (85% of our total data) into files in Google
Cloud Storage buckets, which cost a fraction of the disks holding my
database (on GCE, to be clear).
May I ask the size of the bytea data (uncompressed)?
This data is not accessed frequently (just by the owner, when he needs
it), so there is no need to keep it on expensive storage.
Over the years I've read that keeping many big bytea fields in a database
is not recommended, but I might have misunderstood that.
Ok, I assume those are unimportant data, but let me ask: what is the
longevity / expected retention of those? I haven't worked with those,
I've just been reading:
https://cloud.google.com/storage/pricing?_gl=1*1b25r8o*_up*MQ..&gclid=CjwKCAjwravBBhBjEiwAIr30VKfaOJytxmk7J29vjG4rBBkk2EUimPU5zPibST73nm3XRL2h0O9SxRoCaogQAvD_BwE&gclsrc=aw.ds#storage-pricing
would you choose e.g. "Anywhere Cache" storage?
Another way would have been to move these tables to a different
tablespace on cheaper storage, but that would still have cost about 3
times as much as the buckets.
Can you actually mount those Cloud Storage buckets under a supported FS
in Linux and just move those tables to tablespaces backed by that storage?
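I am thinking of something along these lines, purely as a sketch; all
names are placeholders, and I have no idea whether a FUSE-backed
tablespace is reliable or even supported:

    -- Shell: mount the bucket first, e.g. with gcsfuse (bucket and path are made up):
    --   gcsfuse my-blob-bucket /mnt/gcs_blobs
    -- The directory must exist and be owned by the postgres OS user.
    CREATE TABLESPACE cheap_blobs LOCATION '/mnt/gcs_blobs/pgdata';
    -- Then move the blob tables there (placeholder names):
    ALTER TABLE tenant1.tbl SET TABLESPACE cheap_blobs;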
Why are you considering moving those data back to database tables?
Because if we now need to migrate from cloud to on-premise, or just
upgrade or move the specific server which holds those data, I will have
an extra headache. It is also a single point of failure, or at best a
source of fragmented technology introduced just for the sake of keeping
things out of the DB.
The problem is that this generates BIG table bloat, as you may imagine.
Running a VACUUM FULL on a formerly 22 GB table on a standalone test
server is almost immediate.
If I had only one server, I'd process one table at a time with a nightly
script and issue a VACUUM FULL on the tables that have already been
processed.
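Roughly something like this, run from psql ('tbl' is just a placeholder
for the real table name):

    -- Generates one VACUUM FULL per schema for the already-processed table;
    -- \gexec is a psql feature that executes each row of the result.
    SELECT format('VACUUM FULL %I.%I', schemaname, tablename)
    FROM pg_tables
    WHERE tablename = 'tbl'
      AND schemaname NOT IN ('pg_catalog', 'information_schema')
    \gexec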
But I'm in a logical replication architecture (we are using a
multi-master system called pgEdge, but I don't think that makes a big
difference, since it's based on logical replication), and I'm building a
test cluster.
So you use pgEdge, but you want to lose all the benefits of multi-master,
since your binary data won't be replicated ...
I don't think I need it to be replicated, since this data cannot be
"edited": either it's there or it has been deleted. Buckets have
protections against data deletion and events like ransomware attacks and
such.
Also, multi-master was an absolute requirement a year ago because of a
project we were building, but that project has been abandoned and now
simple logical replication would be enough. But let's do one thing at a
time.
Multi-master is cool, you can configure your pooler / clients to take
advantage of it for a fully load-balanced architecture, but if it is not
a strict requirement you can live without it, as so many of us do, and
employ other means of load balancing the reads.
I've been instructed to issue VACUUM FULL on both nodes, nightly, but
before proceeding I read in the docs that VACUUM FULL can disrupt
logical replication, so I'm a bit concerned about how to proceed. Rows
are cleared one at a time (one transaction, one row, to confine errors
to the record that raised them).
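As a rough way to watch how much bloat each schema accumulates in the
meantime, something like this could be used ('tbl' is a placeholder for
the real table name):

    -- Dead vs live tuples for the blob table in every tenant schema.
    -- Plain VACUUM only makes this space reusable; it does not shrink
    -- the file on disk the way VACUUM FULL does.
    SELECT schemaname, relname, n_live_tup, n_dead_tup,
           last_vacuum, last_autovacuum
    FROM pg_stat_user_tables
    WHERE relname = 'tbl'
    ORDER BY n_dead_tup DESC;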
Mind sharing the specific doc?
pgEdge is based on pglogical, the old 2ndQuadrant extension, not the
native logical replication we have had since PostgreSQL 10. But I might
be mistaken.
I don't know about this; it keeps running on the latest PG versions (we
are about to upgrade to 17.4, if I'm not wrong), but I'll ask.
I've read about extensions like pg_squeeze, but I wonder whether they,
too, might be dangerous for replication.
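For reference, the kind of usage I have in mind is roughly this
(placeholder names; the exact setup and function signature may differ
between pg_squeeze versions, so check its documentation):

    -- pg_squeeze rewrites the table in the background and only takes a
    -- short exclusive lock at the end, unlike VACUUM FULL.
    -- It needs wal_level = logical and pg_squeeze in shared_preload_libraries.
    CREATE EXTENSION pg_squeeze;
    -- Ad-hoc run for one table (placeholder names; signature from memory,
    -- it may differ between versions):
    SELECT squeeze.squeeze_table('tenant1', 'tbl', NULL);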
What's pgEdge's take on that, I mean on the bytea thing you are trying
to achieve here?
They are positive; they are the ones who suggested running VACUUM FULL
on both nodes... I'm quite new to replication, so I'm looking for some
advice here.
As I told you, pgEdge logical replication (the old 2ndQuadrant BDR) !=
native logical replication. You may look here:
https://github.com/pgEdge/spock
If multi-master is not a must, you could convert to vanilla PostgreSQL
and focus on standard physical and logical replication.
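i.e. on vanilla PostgreSQL something as simple as this (all names and
the connection string are placeholders):

    -- On the publisher (requires wal_level = logical):
    CREATE PUBLICATION app_pub FOR ALL TABLES;

    -- On the subscriber:
    CREATE SUBSCRIPTION app_sub
        CONNECTION 'host=primary.example dbname=app user=repl'
        PUBLICATION app_pub;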
Thanks for your help.
Moreno.-