This thread has been saved for the 8.1 release:

        http://momjian.postgresql.org/cgi-bin/pgpatches2

---------------------------------------------------------------------------

ITAGAKI Takahiro wrote:
> Hello, all.
> 
> I think that there is room for improvement in WAL. 
> Here is a patch for it.
>   - Multiple pages are written in one write() if they are contiguous.
>   - Add 'open_direct' to wal_sync_method.
> 
> The WAL writer writes one page per write(). This is not efficient
> when wal_sync_method is 'open_sync', because the writer waits for
> I/O completion on each write(). A multipage writer can reduce syscalls
> and improve I/O throughput.
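
A minimal sketch of the coalescing idea described above, for readers who
want it in code. This is not the attached patch: WAL_PAGE_SIZE,
write_fully() and flush_pages() are hypothetical names, and the page ring
is assumed to be one contiguous allocation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define WAL_PAGE_SIZE 8192      /* hypothetical stand-in for the WAL block size */

/* Write exactly nbytes, retrying on short writes and EINTR. 0 on success, -1 on error. */
static int
write_fully(int fd, const char *buf, size_t nbytes)
{
    while (nbytes > 0)
    {
        ssize_t n = write(fd, buf, nbytes);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        nbytes -= (size_t) n;
    }
    return 0;
}

/*
 * Flush npages pages starting at slot `first` of a ring of `nslots` pages.
 * Each contiguous run of pages is handed to write_fully() as one request,
 * so there is one run normally and two when the range wraps around.
 * Returns the number of runs written, or -1 on error.
 */
int
flush_pages(int fd, const char *ring, size_t nslots, size_t first, size_t npages)
{
    int runs = 0;

    assert(first < nslots && npages <= nslots);

    while (npages > 0)
    {
        size_t run = nslots - first;    /* pages left before the ring wraps */

        if (run > npages)
            run = npages;
        if (write_fully(fd, ring + first * WAL_PAGE_SIZE, run * WAL_PAGE_SIZE) < 0)
            return -1;
        runs++;
        npages -= run;
        first = (first + run) % nslots;
    }
    return runs;
}
```

With a synchronous file descriptor, each run costs roughly one wait for
I/O completion instead of one wait per page.
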
> 
> 'open_direct' uses O_DIRECT instead of O_SYNC. O_DIRECT implies synchronous
> writing, so it may behave much like open_sync. But it may also reduce
> memcpy() calls and save memory in the OS's disk cache.
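
O_DIRECT generally requires the user buffer address and the transfer
length to be aligned. The sketch below shows one way to obtain and check
such a buffer. It is only an illustration under assumptions: DIRECT_ALIGN
is a hypothetical compile-time value (the real requirement depends on the
filesystem and device), and alloc_direct_buffer(), is_direct_aligned() and
open_wal_direct() are made-up helper names, not functions from the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#endif
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DIRECT_ALIGN 4096       /* hypothetical; the real value is system dependent */

/* Allocate a buffer whose address and size are both multiples of DIRECT_ALIGN. */
void *
alloc_direct_buffer(size_t nbytes)
{
    void   *p = NULL;
    size_t  rounded = (nbytes + DIRECT_ALIGN - 1) / DIRECT_ALIGN * DIRECT_ALIGN;

    if (posix_memalign(&p, DIRECT_ALIGN, rounded) != 0)
        return NULL;
    return p;
}

/* Nonzero if both the address and the length satisfy the assumed alignment. */
int
is_direct_aligned(const void *p, size_t nbytes)
{
    return ((uintptr_t) p % DIRECT_ALIGN == 0) && (nbytes % DIRECT_ALIGN == 0);
}

/* Open a file for direct writes; returns -1 where O_DIRECT is not available. */
int
open_wal_direct(const char *path)
{
#ifdef O_DIRECT
    return open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
#else
    (void) path;
    return -1;
#endif
}
```

Some filesystems reject O_DIRECT at open() time, so a caller would
probably want to fall back to another sync method when the open fails.
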
> 
> I benchmarked this patch with pgbench. It works well and
> improved tps by more than 50% on my machine. WAL seems to be the
> bottleneck on machines with slow disks.
> 
> This patch has not yet been tested enough. I would like it to be reviewed
> thoroughly and taken into PostgreSQL.
> 
> There are still many TODOs:
>   - Is this logic really correct?
>   - O_DIRECT_BUFFER_ALIGN should be determined at run time, not compile time.
>   - Consider using writev() instead of write().
>     Buffers are noncontiguous when the WAL ring buffer wraps around.
>   - If wal_sync_method is not open_direct, XLOG_EXTRA_BUFFERS can be 0.
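
On the writev() item: a rough sketch of how a wrapped range could go out
in one syscall. It assumes the ring is a single allocation of fixed-size
pages; WAL_PAGE_SIZE, build_iov() and flush_pages_gather() are
hypothetical names for illustration, not code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

#define WAL_PAGE_SIZE 8192      /* hypothetical stand-in for the WAL block size */

/*
 * Describe npages pages starting at slot `first` of a ring of `nslots`
 * pages as at most two iovec entries: the tail of the ring, then the
 * wrapped part at the start. Returns the number of entries filled in.
 */
int
build_iov(char *ring, size_t nslots, size_t first, size_t npages,
          struct iovec iov[2])
{
    size_t tail;

    assert(first < nslots && npages <= nslots);

    if (npages == 0)
        return 0;

    tail = nslots - first;      /* pages available before the wrap point */
    if (tail >= npages)
    {
        iov[0].iov_base = ring + first * WAL_PAGE_SIZE;
        iov[0].iov_len = npages * WAL_PAGE_SIZE;
        return 1;
    }

    iov[0].iov_base = ring + first * WAL_PAGE_SIZE;
    iov[0].iov_len = tail * WAL_PAGE_SIZE;
    iov[1].iov_base = ring;
    iov[1].iov_len = (npages - tail) * WAL_PAGE_SIZE;
    return 2;
}

/* Issue a single writev() for the range; returns bytes written or -1. */
ssize_t
flush_pages_gather(int fd, char *ring, size_t nslots, size_t first, size_t npages)
{
    struct iovec iov[2];
    int          n = build_iov(ring, nslots, first, npages, iov);

    if (n == 0)
        return 0;
    return writev(fd, iov, n);
}
```

A real caller would still need to handle short writes, which this sketch
leaves out.
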
> 
> 
> Sincerely,
> ITAGAKI Takahiro
> 
> 
> 
> -- pgbench result --
> 
> $ ./pgbench -s 100 -c 50 -t 400
> 
> - 8.0.0 default + fsync:
>     tps = 20.630632 (including connections establishing)
>     tps = 20.636768 (excluding connections establishing)
> - multipage-writer + open_direct:
>     tps = 33.761917 (including connections establishing)
>     tps = 33.778320 (excluding connections establishing)
> 
> Environment:
>   OS     : Linux kernel 2.6.9
>   CPU    : Pentium 4 3GHz
>   disk   : ATA 5400rpm (Data and WAL are placed on same partition.)
>   memory : 1GB
>   config : shared_buffers=10000, wal_buffers=256,
>            XLOG_SEG_SIZE=256MB, checkpoint_segments=4
> 
> ---
> ITAGAKI Takahiro <[EMAIL PROTECTED]>
> NTT Cyber Space Laboratories
> Nippon Telegraph and Telephone Corporation.

[ Attachment, skipping... ]

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
