Re: [HACKERS] Controlling Load Distributed Checkpoints

PFC Wed, 13 Jun 2007 15:13:17 -0700

>If we extended relations by more than 8k at a time, we would know a lot
>more about disk layout, at least on filesystems with a decent amount of
>free space.


I doubt it makes that much difference. If there was a significant amount
of fragmentation, we'd hear more complaints about seq scan performance.

The issue here is that we don't know which relations are on which drives
and controllers, how they're striped, mirrored etc.


Actually, isn't pre-allocation one of the tricks that Greenplum uses to
get it's seqscan performance?

My tests here show that, at least on reiserfs, after a few hours ofbenchmark torture (this represents several million write queries), tablefiles become significantly fragmented. I believe the table and index filesget extended more or less simultaneously and end up somehow a bit mixed upon disk. Seq scan perf suffers. reiserfs doesn't have an excellentfragmentation behaviour... NTFS is worse than hell in this respect. So,pre-alloc could be a good idea. Brutal Defrag (cp /var/lib/postgresql tosomewhere and back) gets seq scan perf back to disk throughput.

Also, by the way, InnoDB uses a BTree organized table. The advantage isthat data is always clustered on the primary key (which means you have touse something as your primary key that isn't necessary "natural", you haveto choose it to get good clustering, and you can't always do it right, soit somehow, in the end, sucks rather badly). Anyway, seq-scan on InnoDB isvery slow because, as the btree grows (just like postgres indexes) pagesare split and scanning the pages in btree order becomes a mess of seeks.So, seq scan in InnoDB is very very slow unless periodic OPTIMIZE TABLE isapplied. (caveat to the postgres TODO item "implement automatic tableclustering"...)


---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

Re: [HACKERS] Controlling Load Distributed Checkpoints

Reply via email to