Hi,

I am submitting a patch for review that does the following:

1. For a small relation (smaller than 60% of the buffer pool), use the current logic.
2. For a big relation:
   - use a ring buffer in the heap scan
   - pin the first 12 pages when the scan starts
   - after every 4 pages consumed, read and pin the next 4 pages
   - invalidate the pages already used by the scan so they do not force out other useful pages
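To make the big-relation case concrete, here is a minimal standalone sketch of the ring bookkeeping described above. It is not code from the patch: RingScan, ring_begin, ring_fill and ring_next are hypothetical names, pages are plain integers, and "pin" and "invalidate" are only simulated by setting and clearing a slot. It assumes the ring has exactly 12 slots and that pages are consumed in order.

```c
#include <assert.h>

#define RING_SIZE  12   /* pages pinned when the scan starts (from the post) */
#define BATCH_SIZE  4   /* pages consumed before the next read (from the post) */

/* Hypothetical scan state; not the struct used by the patch. */
typedef struct RingScan
{
    int ring[RING_SIZE];    /* page number held in each slot, -1 if empty */
    int total_pages;        /* relation size in pages */
    int next_to_read;       /* next page number to read and pin */
    int next_to_consume;    /* next page number the scan will return */
    int pinned;             /* slots currently holding a pinned page */
    int max_pinned;         /* high-water mark of pinned */
} RingScan;

/* Read and "pin" up to count further pages into free ring slots. */
static void
ring_fill(RingScan *s, int count)
{
    while (count-- > 0 && s->next_to_read < s->total_pages)
    {
        int slot = s->next_to_read % RING_SIZE;

        assert(s->ring[slot] == -1);    /* slot must already be released */
        s->ring[slot] = s->next_to_read++;
        if (++s->pinned > s->max_pinned)
            s->max_pinned = s->pinned;
    }
}

/* Start the scan: empty the ring, then pin the first RING_SIZE pages. */
static void
ring_begin(RingScan *s, int total_pages)
{
    int i;

    for (i = 0; i < RING_SIZE; i++)
        s->ring[i] = -1;
    s->total_pages = total_pages;
    s->next_to_read = 0;
    s->next_to_consume = 0;
    s->pinned = 0;
    s->max_pinned = 0;
    ring_fill(s, RING_SIZE);
}

/* Return the next page number, or -1 when the scan is finished. */
static int
ring_next(RingScan *s)
{
    int slot;
    int page;

    if (s->next_to_consume >= s->total_pages)
        return -1;

    slot = s->next_to_consume % RING_SIZE;
    page = s->ring[slot];
    s->ring[slot] = -1;     /* simulated unpin + invalidate of the used page */
    s->pinned--;
    s->next_to_consume++;

    /* After every BATCH_SIZE pages consumed, read and pin the next batch. */
    if (s->next_to_consume % BATCH_SIZE == 0)
        ring_fill(s, BATCH_SIZE);

    return page;
}
```

A caller would run ring_begin() once and then call ring_next() until it returns -1. In this sketch the scan never holds more than 12 pages at a time, whatever the size of the relation.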
4 files changed: bufmgr.c, bufmgr.h, heapam.c, relscan.h

If there is interest, I can submit another scan patch that returns N tuples at a time, instead of the current one-at-a-time interface. This improves code locality and improves performance by a further 10-20%.
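For illustration only, the difference between the two interface shapes could look roughly like the sketch below. Everything here is an assumption: ArrayScan, scan_getnext and scan_getnext_batch are made-up names, and an int array stands in for heap tuples. It does not show the proposed patch or any existing PostgreSQL function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scan over an in-memory array standing in for heap tuples. */
typedef struct ArrayScan
{
    const int *tuples;  /* stand-in for tuples in the relation */
    size_t     ntuples; /* total number of tuples */
    size_t     pos;     /* index of the next tuple to return */
} ArrayScan;

/* One-at-a-time shape: stores one tuple in *out; returns 0 at end of scan. */
static int
scan_getnext(ArrayScan *s, int *out)
{
    if (s->pos >= s->ntuples)
        return 0;
    *out = s->tuples[s->pos++];
    return 1;
}

/*
 * N-at-a-time shape: copies up to max tuples into out and returns how many
 * were copied (0 at end of scan). The caller pays one call per batch, and
 * the copy loop stays in one small piece of code.
 */
static size_t
scan_getnext_batch(ArrayScan *s, int *out, size_t max)
{
    size_t n = 0;

    assert(max > 0);
    while (n < max && s->pos < s->ntuples)
        out[n++] = s->tuples[s->pos++];
    return n;
}
```

With 10 tuples and a batch size of 4, the batch function returns 4, 4, 2 and then 0, so the caller makes 4 calls instead of 11.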
For TPC-H 1 GB tables, we are seeing more than a 20% improvement in scans on the same hardware.
-------------------------------------------------------------------------
----- PATCHED VERSION
-------------------------------------------------------------------------
gptest=# select count(*) from lineitem;
  count
---------
 6001215
(1 row)

Time: 2117.025 ms

-------------------------------------------------------------------------
----- ORIGINAL CVS HEAD VERSION
-------------------------------------------------------------------------
gptest=# select count(*) from lineitem;
  count
---------
 6001215
(1 row)

Time: 2722.441 ms

Suggestions for improvement are welcome.

Regards,
-cktan
Greenplum, Inc.