Greg Smith <[EMAIL PROTECTED]> wrote:

> On Mon, 11 Jun 2007, ITAGAKI Takahiro wrote:
> > If the kernel can treat sequential writes better than random writes, is 
> > it worth sorting dirty buffers in block order per file at the start of 
> > checkpoints?

I wrote and tested the attached sorted-writes patch, based on Heikki's
ldc-justwrites-1.patch. There was an obvious performance win on OLTP workloads.

  tests                    | pgbench | DBT-2 response time (avg/90%/max)
 LDC only                  | 181 tps | 1.12 / 4.38 / 12.13 s
 + BM_CHECKPOINT_NEEDED(*) | 187 tps | 0.83 / 2.68 /  9.26 s
 + Sorted writes           | 224 tps | 0.36 / 0.80 /  8.11 s

(*) Don't write buffers that were dirtied after starting the checkpoint.

machine : 2GB-ram, SCSI*4 RAID-5
pgbench : -s400 -t40000 -c10  (about 5GB of database)
DBT-2   : 60WH (about 6GB of database)
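
As a rough illustration of the two steps measured above (flag the buffers
that are dirty when the checkpoint starts, then write them in block order
per file), here is a minimal sketch. It is not the code in the attached
patch: SketchBuffer, WriteItem, and both function names are hypothetical
stand-ins, and the real buffer headers and locking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a buffer header (not PostgreSQL's BufferDesc). */
typedef struct SketchBuffer
{
    uint32_t file_id;            /* which file the page belongs to */
    uint32_t block_num;          /* block number within that file */
    bool     dirty;              /* page has unwritten changes */
    bool     checkpoint_needed;  /* was dirty when the checkpoint started */
} SketchBuffer;

/* One pending write, remembered by its position in the buffer array. */
typedef struct WriteItem
{
    uint32_t file_id;
    uint32_t block_num;
    size_t   buf_index;
} WriteItem;

/*
 * Step 1: at checkpoint start, flag every buffer that is dirty right now.
 * Buffers dirtied later keep checkpoint_needed == false and are skipped.
 * Returns the number of buffers flagged.
 */
size_t
mark_checkpoint_buffers(SketchBuffer *bufs, size_t n)
{
    size_t marked = 0;

    assert(bufs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        bufs[i].checkpoint_needed = bufs[i].dirty;
        if (bufs[i].dirty)
            marked++;
    }
    return marked;
}

/* Order by file first, then by block number within the file. */
int
compare_file_block(const void *a, const void *b)
{
    const WriteItem *x = (const WriteItem *) a;
    const WriteItem *y = (const WriteItem *) b;

    if (x->file_id != y->file_id)
        return (x->file_id < y->file_id) ? -1 : 1;
    if (x->block_num != y->block_num)
        return (x->block_num < y->block_num) ? -1 : 1;
    return 0;
}

/*
 * Step 2: collect the flagged, still-dirty buffers into out[] and sort
 * them. The caller must provide room for at least n items in out[].
 * Returns the number of items to write.
 */
size_t
collect_sorted_writes(const SketchBuffer *bufs, size_t n, WriteItem *out)
{
    size_t count = 0;

    assert(out != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (bufs[i].checkpoint_needed && bufs[i].dirty)
        {
            out[count].file_id = bufs[i].file_id;
            out[count].block_num = bufs[i].block_num;
            out[count].buf_index = i;
            count++;
        }
    }
    qsort(out, count, sizeof(WriteItem), compare_file_block);
    return count;
}
```

The checkpoint would then walk out[] from first to last and write each
buffer, so that the kernel sees ascending block numbers within each file
instead of writes in buffer-pool order.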

> I think it has the potential to improve things.  There are three obvious 
> and one subtle argument against it I can think of:
> 1) Extra complexity for something that may not help.  This would need some 
> good, robust benchmarking improvements to justify its use.

Exactly. I think we need a discussion board for I/O performance issues.
Can I use the Developers Wiki for this purpose?  Performance graphs and
result tables are important for the discussion, so it might work better
than the mailing lists, which are text-based.

> 2) Block number ordering may not reflect actual order on disk.  While 
> true, it's got to be better correlated with it than writing at random.
> 3) The OS disk elevator should be dealing with this issue, particularly 
> because it may really know the actual disk ordering.

Yes, both are true. However, I think there is a pretty high correlation
between those orderings. In addition, we should use the filesystem to ensure
that those orderings correspond to each other. For example, pre-allocation
of files might help us, as has often been discussed.

> Here's the subtle thing:  by writing in the same order the LRU scan occurs 
> in, you are writing dirty buffers in the optimal fashion to eliminate 
> client backend writes during BufferAlloc.  This makes the checkpoint a 
> really effective LRU clearing mechanism.  Writing in block order will 
> change that.

The issue will probably go away after we have LDC, because it writes LRU
buffers during checkpoints.

ITAGAKI Takahiro
NTT Open Source Software Center

Attachment: sorted-ckpt.patch
Description: Binary data
