You might have index fragmentation, and on PG12 you can rebuild those indexes
concurrently; you should do that. As everyone agreed, most of the space will be
marked for re-use later for the table segments, but the indexes could be the
problem in your case. On previous versions you can use pg_repack to do the
index rebuilds, which will help reduce the space consumed.
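Roughly something like this - the index, table and database names here are just
placeholders:

    -- PostgreSQL 12+: rebuild without holding a long exclusive lock
    REINDEX INDEX CONCURRENTLY my_index;
    REINDEX TABLE CONCURRENTLY my_table;  -- all indexes on one table

    # pre-12: pg_repack can rebuild only the indexes of a table online
    pg_repack --only-indexes --table=my_table mydb

Note pg_repack needs its extension created in the target database first
(CREATE EXTENSION pg_repack).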
Regards,
Virendra
On Thursday, May 14, 2020, 1:20:41 PM PDT, Eduard Rozenberg
<[email protected]> wrote:
@Adrian thanks again.
I read the Postgres docs the same way - that previously used space is marked as
available, so there should be no need for VACUUM FULL. Previously used = now
available space, which gets re-used, in theory.
And yet this same DB is growing at 30-50 GB weekly, despite the fact that 2.4
TB of previously used space is clearly available ("clearly available" as proven
by vacuum full shrinking the DB space usage by 2.4 TB). I did verify
postgresql.conf has always been properly configured re: autovacuum:
'autovacuum = on' and 'track_counts = on'.
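A rough way to check whether autovacuum is actually keeping up with the churn,
for what it's worth (just a sketch against the stats views):

    SELECT relname, n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC
    LIMIT 20;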
I'm not planning on running VACUUM FULL regularly, just "this one time". And I
was trying to parallelize VACUUM FULL and minimize downtime of the
production DB caused by table locking. And then I found the option of using
"vacuumdb --jobs" which sounded like the perfect solution except for "well you
can't actually use --jobs because you'll run into a deadlock and everybody
knows that and nobody has a (good) solution for it" :).
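The workaround I keep coming back to is generating the -t list rather than
typing it out by hand - a rough sketch, with "mydb" as a placeholder database
name:

    # build "-t schema.table" flags for all user tables, then hand them to
    # vacuumdb so the parallel jobs never touch the shared catalogs
    psql -At -d mydb -c "
      SELECT format('-t %I.%I', schemaname, tablename)
      FROM pg_tables
      WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
    " | xargs vacuumdb --full --jobs=4 -d mydb

That way only user tables are processed in parallel, which is supposed to avoid
the catalog deadlock.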
--Ed
> On May 14, 2020, at 11:46, Adrian Klaver <[email protected]> wrote:
>
> On 5/14/20 11:27 AM, Eduard Rozenberg wrote:
>> @Adrian thanks.
>> I did a test vacuum full the "normal" non-parallel way (VACUUM FULL sql)
>> previously on a test db copy and saw the DB size (postgres 9.6) shrink from
>> 6.8 TB to 4.4 TB of actual, real disk space usage ($ df -d 1).
>> I don't know the reason so much space was "locked up" (other than there is a
>> lot of data "churn" from daily updates). But I definitely do need to do the
>> vac full on the production db to get down to the smaller size - cannot
>> afford the 2.4 TB of "wasted" space on an ongoing basis.
>
> It may not be wasted space. A regular VACUUM marks space within a table
> available for reuse (and not returned to the OS) when it removes unneeded tuples.
> It then fills that space up with new tuples, roughly speaking. So if the
> vacuuming is working properly you will reach a steady state where space is
> reused and the database on disk size grows slowly as reusable space is
> occupied. I would monitor the database size on a regular basis. My guess is
> that the VACUUM FULL is dropping the OS used space and then it fills up again
> as the database does those updates.
>
>> Based on your email it sounds like the vacuuming of those pg_catalog.*
>> tables is done regardless, as a normal part of doing vac full on my own
>> database.
>> Unfortunately I still don't see an ideal way to run vacuum full in parallel
>> via vacuumdb without running into the expected and documented deadlock. The only
>> method I'm aware of is to list each table individually with "-t table1 -t
>> table2..." for "vacuumdb --jobs", which is not pleasant and not exceedingly
>> beautiful.
>> Thanks.
>
>
> --
> Adrian Klaver
> [email protected]