Bruce Momjian wrote:

Jan Wieck wrote:
If the system is write-bound, the checkpointer will find so many dirty blocks that it has no time to nap and will burst them out as fast as possible anyway. Well, at least that's the theory.
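
To make the "no time to nap" argument concrete, here is a rough sketch in Python. The function name, the fixed per-block write cost and the fixed interval are all assumptions made up for illustration; none of this is taken from the PostgreSQL source.

```python
def nap_seconds(dirty_blocks, interval_s, write_cost_s):
    """Return how long a hypothetical checkpointer could nap in one interval.

    Assumption: every dirty block costs a fixed write_cost_s seconds to
    write, and the checkpointer tries to finish its work within interval_s.
    """
    busy = dirty_blocks * write_cost_s
    # Once the writes alone fill the interval, the nap shrinks to zero and
    # the checkpointer effectively writes back-to-back (the "burst" case).
    return max(0.0, interval_s - busy)
```

With few dirty blocks most of the interval is spent napping; on a write-bound system the result drops to zero and the pacing disappears by itself.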

PostgreSQL with the non-overwriting storage concept can never have hot-written pages for a long time anyway, can it? They fill up and cool down until vacuum.

Another idea on removing sync() --- if we are going to use fsync() on each file during checkpoint (open, fsync, close), seems we could keep a hash of written block dbid/relfilenode pairs and cycle through that on checkpoint. We could keep the hash in shared memory, and dump it to a backing store when it gets full, or just have it exist in buffer writer process memory (so it can grow) and have backends that do their own buffer writes all open a single file in append mode and write the pairs to the file, or something like that, and the checkpoint process can read from there.
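
A minimal sketch of the bookkeeping described above, in Python for brevity. It assumes a single process doing all the writes, an in-memory set instead of a shared-memory hash, and a made-up file layout of <data_dir>/<dbid>/<relfilenode>; the class and method names are hypothetical. It also takes for granted that fsync() on a freshly opened descriptor flushes everything written to that file, which may not hold on every platform.

```python
import os


class WrittenFileTracker:
    """Remember which (dbid, relfilenode) pairs were written since the
    last checkpoint, and fsync exactly those files at checkpoint time."""

    def __init__(self, data_dir):
        self.data_dir = data_dir
        self.pending = set()  # stand-in for the hash of dbid/relfilenode pairs

    def path_for(self, dbid, relfilenode):
        # Hypothetical layout, chosen only for this example.
        return os.path.join(self.data_dir, str(dbid), str(relfilenode))

    def write_block(self, dbid, relfilenode, offset, data):
        path = self.path_for(dbid, relfilenode)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.pwrite(fd, data, offset)
        finally:
            os.close(fd)
        # Record the pair; writing the same file again adds nothing new.
        self.pending.add((dbid, relfilenode))

    def checkpoint(self):
        """open, fsync, close each written file; return the pairs synced."""
        synced = []
        for dbid, relfilenode in sorted(self.pending):
            fd = os.open(self.path_for(dbid, relfilenode), os.O_RDWR)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
            synced.append((dbid, relfilenode))
        self.pending.clear()
        return synced
```

The set only grows with the number of distinct files touched, not with the number of block writes, which is why a hash keyed on the pair stays small between checkpoints.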


I am not really aiming at removing sync() altogether. We already know that open, fsync, close is not guaranteed to flush dirty OS buffers for which another process has so far only done open, write. And you really don't want to remove all the vfd logic, or fsync on every write done by a backend.


What frequent fdatasync/fsync calls during a constant, ongoing checkpoint will do is significantly reduce the physical write storm that happens at the sync(), which is causing huge problems right now.
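
Roughly what I mean, sketched in Python rather than backend C. The function name and the sync_every parameter are invented for this example, and it only looks at one file descriptor; it is an illustration of the idea, not a proposed implementation.

```python
import os

# os.fdatasync is not available on every platform; fall back to os.fsync.
_sync = getattr(os, "fdatasync", os.fsync)


def checkpoint_writes(fd, blocks, sync_every=4):
    """Write (offset, data) blocks to fd, forcing them to disk after every
    sync_every writes instead of leaving everything for one final sync().

    Returns the number of sync calls issued.
    """
    syncs = 0
    unsynced = 0
    for offset, data in blocks:
        os.pwrite(fd, data, offset)
        unsynced += 1
        if unsynced >= sync_every:
            _sync(fd)  # small, frequent flushes keep the dirty backlog short
            syncs += 1
            unsynced = 0
    if unsynced:
        _sync(fd)  # flush the remainder of the last partial batch
        syncs += 1
    return syncs
```

Each flush here covers at most sync_every blocks, so the amount of dirty data the kernel still holds for this file when sync() finally arrives is bounded by that batch size.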

The reason people blame vacuum so much is that it not only replaces the buffer cache with useless garbage, it also leaves that garbage to be flushed by other backends or the checkpointer, and it rapidly fills WAL, causing exactly the checkpoint we don't have the IO bandwidth for right now! They only see that vacuum is running, and that if they kill it the system returns to a healthy state after a while ... easy enough, but only half the story.


Jan


--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #

