On 09/15/2016 06:40 PM, Robert Haas wrote:
> On Thu, Sep 15, 2016 at 12:22 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> Tomas Vondra <tomas.von...@2ndquadrant.com> writes:
>>> On 09/14/2016 07:57 PM, Tom Lane wrote:
>>>> People who are vacuuming because they are out of disk space will be
>>>> very, very unhappy with that solution.
>>>
>>> Those people are usually running out of space for data, while these
>>> files would be temporary files placed wherever temp_tablespaces points
>>> to. I'd argue that if this is a source of problems, they are already
>>> in deep trouble due to sorts, CREATE INDEX, ... as those commands may
>>> also generate a lot of temporary files.
>>
>> Except that if you are trying to recover disk space, VACUUM is what you
>> are doing, not CREATE INDEX. Requiring extra disk space to perform a
>> vacuum successfully is exactly the wrong direction to be going in.
>> See for example this current commitfest entry:
>>
>> Regardless of what you think of the merits of that patch, it's trying
>> to solve a real-world problem. And as Robert has already pointed out,
>> making this aspect of VACUUM more complicated is not solving any
>> pressing problem. "But we made it faster" is going to be a poor answer
>> for the next person who finds themselves up against the wall with no
>> disk space left.
>
> I very much agree.
How does VACUUM alone help with recovering disk space? AFAIK it only
makes the space available for reuse by new data; it does not return the
disk space to the operating system at all. Sure, we truncate empty pages
at the end of the last segment, but how likely is that in practice? What
I do see people doing is usually either VACUUM FULL (which is, however,
doomed for obvious reasons) or VACUUM plus reindexing to get rid of
index bloat (which, however, leads to CREATE INDEX using temporary
files).
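To make the point concrete, here's a toy model of the truncation behavior
(plain Python, not PostgreSQL code; a heap is simplified to a list of
live-tuple counts per page): only a contiguous run of empty pages at the
very end of the file can be truncated away, so free space earlier in the
file is merely made reusable.

```python
# Toy model: each entry is the number of live tuples on a heap page, in
# file order. VACUUM can only truncate the trailing run of empty pages;
# dead space anywhere else just becomes reusable for new data.

def truncatable_pages(pages):
    n = 0
    for live in reversed(pages):
        if live != 0:
            break
        n += 1
    return n

# One live tuple on the last page pins the file at its full size:
print(truncatable_pages([10, 0, 0, 0, 1]))  # -> 0
# Only the trailing empty run is reclaimable:
print(truncatable_pages([10, 1, 0, 0, 0]))  # -> 3
```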
I'm not sure I agree with your claim that there's no pressing problem.
We do see quite a few people having to do VACUUM with multiple index
scans (because the TIDs don't fit into m_w_m), which certainly has a
significant impact on production systems, both in terms of performance
and in how quickly the space is reclaimed. Sure, being able to set m_w_m
above 1GB is an improvement, but perhaps a more efficient TID storage
would improve the situation further. Writing the TIDs to a temporary
file may not be the right approach, but I don't see why that would make
the original problem less severe?
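As a rough illustration of why the m_w_m limit forces multiple index
scans: lazy VACUUM stores one 6-byte TID (sizeof(ItemPointerData)) per
dead tuple, and each time the array fills up it must scan every index.
The workload numbers below are made up for the sake of the arithmetic.

```python
import math

TID_BYTES = 6  # sizeof(ItemPointerData)

def index_scan_passes(dead_tuples, m_w_m_bytes):
    """Each pass scans all indexes once the TID array fills up."""
    capacity = m_w_m_bytes // TID_BYTES  # TIDs that fit in one pass
    return math.ceil(dead_tuples / capacity)

# 2 billion dead tuples against a 1GB cap on the TID array:
print(index_scan_passes(2_000_000_000, 1 << 30))  # -> 12 passes
```

At ~179 million TIDs per gigabyte, even a modest m_w_m turns a big bulk
delete into a dozen full scans of every index on the table.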
For example, we always allocate the TID array as large as will fit into
m_w_m, but maybe we don't need to wait until the whole array is full
before switching to the bitmap. We could switch while the bitmap still
fits into the remaining (unused) part of the array, build it there, and
then copy it to the beginning (and use the bitmap from that point on).
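A minimal sketch of that idea (plain Python, with a set standing in for
the bitmap; the 37 bytes/page figure assumes one bit per line pointer,
291 per 8KB page, and both constants here are illustrative assumptions,
not measured values):

```python
TID_BYTES = 6               # sizeof(ItemPointerData)
BITMAP_BYTES_PER_PAGE = 37  # assumed: 291 line pointers/page, 1 bit each

def collect_dead_tids(dead_tids, budget_bytes):
    """Collect (page, offset) TIDs into an array, switching to a per-page
    bitmap while that bitmap still fits into the unused tail of the same
    allocation (the bitmap is modelled with a set here)."""
    tids, bitmap, max_page = [], None, -1
    for tid in dead_tids:
        max_page = max(max_page, tid[0])
        if bitmap is not None:
            bitmap.add(tid)  # already switched; keep using the bitmap
            continue
        array_bytes = (len(tids) + 1) * TID_BYTES
        bitmap_bytes = (max_page + 1) * BITMAP_BYTES_PER_PAGE
        if array_bytes + bitmap_bytes > budget_bytes:
            # The bitmap is about to stop fitting behind the array: build
            # it in the remaining space now and "copy" it to the front.
            bitmap = set(tids)
            bitmap.add(tid)
            tids = []
        else:
            tids.append(tid)
    return ('array', tids) if bitmap is None else ('bitmap', bitmap)
```

With a generous budget the TIDs simply stay in the array; with a tight
one the collector converts in place and keeps going, without ever having
to allocate a second m_w_m-sized chunk.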
Sent via pgsql-hackers mailing list (email@example.com)