Re: [HACKERS] vacuum, performance, and MVCC

Mark Woodward Fri, 23 Jun 2006 07:13:47 -0700

> Ãhel kenal pÃ¤eval, N, 2006-06-22 kell 12:41, kirjutas Mark Woodward:
>
>> > Depending on exact details and optimisations done, this can be either
>> > slower or faster than postgresql's way, but they still need to do
>> > something to get transactional visibility rules implemented.
>>
>> I think they have a different strategy. I think they maintain the notion
>> of "current version" of a row, and hunt for previous versions when
>> needed,
>> at least that's how I suspect Oracle does it with redo logs.
>
> Not "current" but "last" :)


True
>
> And one side effect of redo logs is that it is practically impossible to
> do large deletes on production databases. So you design around that,
> like you have to design around limitations of MVCC.

Think that's bad, try doing an update in PostgreSQL on a table with 20
million rows and a few indexes. I had to write a script to chunk up the
block update into segments and vacuum between each.

>
>> >> > > There has to be a more linear way of handling this scenario.
>> >> >
>> >> > So vacuum the table often.
>> >>
>> >> It's easy to say VACUUM often... but I'd bet that vacuuming is going
>> >> to lessen the throughput in his tests even more; no matter how it's
>> >> tuned.
>> >
>> > Running VACUUM often/continuously will likely keep his update rate
>> > fluctuatons within a corridor of maybe 5-10%, at the cost of 1-2%
>> extra
>> > load. At least if vacuum is configured right and the server is not
>> > already running at 100% IO saturation, in which case it will be worse.
>>
>> Assuming the table is a reasonable size, the I/O required for vacuum
>> doesn't kill everything else!
>
> I have solved the problem of unneccessary IO by keeping active and
> finished rows in separate tables, with the finish() function moving the
> row between tables.

Sorry, an RDBMS is a "relational database management system," if you are
doing the "database management," it isn't a very good RDBMS.

>
> In case of the number of actively modified rows being in only tens or
> low hundreds of thousands of rows, (i.e. the modified set fits in
> memory) the continuous vacuum process shows up as just another backend,
> not really taking order of magnitude more resources. It mainly generates
> WAL traffic, as modified pages are already in memory/cache and are
> mostly synced by background writer and/or checkpoint.
>
> Of course you have to adjust vacuum_cost_* variables so as to not
> saturate IO.

These sort of solutions, IMHO, don't show how good PostgreSQL is, but show
where it is very lacking.

>
>> > The max throughput figure is not something you actually need very
>> often
>> > in production.
>>
>> No, but you need to have some degree of certainty and predictability in
>> the system you are developing.
>
> Yup. You have to design it so it has.

I was refereing to the system as a whole and the individual components.
PostgreSQL's performance under some pathalogical condictions is not very
predictable or reliable.

>
>> > What is interesting is setting up the server so that you
>> > can service your loads comfortably. Running the server at 100% lead is
>> > not anything you want to do on production server. There will be things
>> > you need to do anyway and you need some headroom for that.
>>
>> Of course, you design it so peaks are easily managed, but unless you run
>> vacuum continuously, and that has its own set of problems, you run into
>> this problem, and it can get really really bad.
>
> Usually it gets really bad if you *don't* run vacuum continuously, maybe
> hopeing to do it in slower times at night. For high-update db you have
> to run it continuously, maybe having some 5-15 sec pauses between runs.

And how much I/O does this take?

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

Re: [HACKERS] vacuum, performance, and MVCC

Reply via email to