Stephen R. van den Berg wrote:
> Arnt Gulbrandsen wrote:
> >Wow. I disagree with almost everything you write.
> 
> Well, maybe that indicates we start with wildly differing assumptions.

That's what I think too.

> Running the delete does not free up space instantly, only after
> a vacuum full.

What? Doesn't the autovacuumer start to free up space after a few
seconds and then the space will gradually become available?

> Valid question.
> I asked my production db, about 5 minutes ago, the query is still
> running.  I'm not certain I can leave it running to completion
> without disrupting service too much.

It locks nothing.

> Watch this space...
> 
> >> - A postgresql vacuum will finish sooner (under normal conditions).
> 
> >Doesn't vacuum depend mostly on the number of rows? A few very large
> >rows won't make much difference, then. If there are many attachments
> >involved, there are also many messages being harmed and we're not
> >talking about a few unimportant attachments any more.
> 
> If you actually want to free up space, a vacuum needs to copy
> big amounts of data to compact.  The copying is not free.
> 
> >> - A postgresql vacuum will actually finish (if the db has reached
> >>   sizes of >300GB, a postgresql vacuum is a problem by itself).
> 
> >Reducing the size to 280GB makes little difference.
> 
> I was considering reducing it to 20GB actually, that *will* make a dent.

Indeed.

> >>   at some later point in time.  Most of the repository is going
> >>   to be coherent and accessible soon.  Restoring the attachments
> >>   (which first) can then be prioritised according to urgency.
> 
> >Unfortunately, while the database is coherent as far as SQL goes, it's
> >not coherent in the IMAP sense, and fixing that up woud be real work.
> 
> It would mean that imap would return a message with (forced) empty
> attachments, you think that would be difficult to get right?

Either tricky to get right, or hard on the database, or both. I don't
feel good about that when alternatives like this exist:

   delete from mailbox_messages where message in
     (select id from messages order by rfc822size desc limit 20);

Then run aox vacuum and vacuumdb. Not at all fast, but oh, how simple.

If you want to see who uses the space, you can ask for the 20 biggest
messages:

   select u.login, mb.name, mm.uid, m.rfc822size
     from users u
     join mailboxes mb on (u.id=mb.owner)
     join mailbox_messages mm on (mb.id=mm.mailbox)
     join messages m on (mm.message=m.id)
     order by m.rfc822size desc limit 20;

I know they're slow, but I like these queries _so_ much better than a
200-line block of C++ that modifies the database.

Arnt

Reply via email to