On 11/21/2013 04:15 PM, Arnt Gulbrandsen wrote:
Stephen R. van den Berg wrote:
Arnt Gulbrandsen wrote:
Wow. I disagree with almost everything you write.

Well, maybe that indicates we start with wildly differing
assumptions.

That's what I think too.

Running the delete does not free up space instantly, only after a
vacuum full.

What? Doesn't the autovacuumer start to free up space after a few
seconds and then the space will gradually become available?

No, that's not how autovacuum works. See http://www.postgresql.org/docs/9.3/static/sql-vacuum.html.

Regarding "plain" vacuum: "However, extra space is not returned to the operating system (in most cases); it's just kept available for re-use within the same table."

So while postgres will re-use that reclaimed space, the OS will never
see it. For that you'd need to do a VACUUM FULL, which locks the table
while doing it.


Valid question. I asked my production db, about 5 minutes ago, the
query is still running.  I'm not certain I can leave it running to
completion without disrupting service too much.

It locks nothing.

I think he means a performance drop due to the long running query, not specifically a lock.

Watch this space...

- A postgresql vacuum will finish sooner (under normal
conditions).

Doesn't vacuum depend mostly on the number of rows? A few very
large rows won't make much difference, then. If there are many
attachments involved, there are also many messages being harmed
and we're not talking about a few unimportant attachments any
more.

If you actually want to free up space, a vacuum needs to copy big
amounts of data to compact.  The copying is not free.

- A postgresql vacuum will actually finish (if the db has
reached sizes of >300GB, a postgresql vacuum is a problem by
itself).

Reducing the size to 280GB makes little difference.

I was considering reducing it to 20GB actually, that *will* make a
dent.

Indeed.

at some later point in time.  Most of the repository is going
to be coherent and accessible soon.  Restoring the attachments
(which first) can then be prioritised according to urgency.

Unfortunately, while the database is coherent as far as SQL goes,
it's not coherent in the IMAP sense, and fixing that up woud be
real work.

It would mean that imap would return a message with (forced) empty
attachments, you think that would be difficult to get right?

Either tricky to get right, or hard on the database, or both. I
don't feel good about that when alternatives like this exist:

delete from mailbox_messages where message in (select id from
messages order by rfc822size desc limit 20);

Then run aox vacuum and vacuumdb. Not at all fast, but oh, how
simple.

If you want to see who uses the space, you can ask for the 20
biggest messages:

select u.login, mb.name, mm.uid, m.rfc822size from users u join
mailboxes mb on (u.id=mb.owner) join mailbox_messages mm on
(mb.id=mm.mailbox) join messages m on (mm.message=m.id) order by
m.rfc822size desc limit 20;

I know they're slow, but I like these queries _so_ much better than
a 200-line block of C++ that modifies the database.

Arnt


Greetings,
Lee Garrett

Reply via email to