On Thursday 05 February 2004 12:13, Damien Dougan wrote: > Hi All, > > I've been seeing very slow read performance on a database of 1 million > indexed subscribers, which I believe is nothing to do with the data > itself, but delays on processing the index. > > If I make a random jump into the index (say X), it can take about 50ms > to read the subscriber. If I then make a "close by" lookup (say X+10), > it takes only about 0.5ms to read the subscriber. Making another lookup > to a "far away" (say X+1000), it again takes about 50ms to read.
The first time, it has to fetch a block from disk. The second time that disk block is already in RAM so it's much faster. The third time it needs a different disk block. > Am I correct in my analysis? Is there anything I can do to improve the > performance of the index lookups? Make sure you have enough RAM to buffer your disks. Buy faster disks. > I've tried increasing the index memory and making a number of queries > around the index range, but a stray of several hundred indexes from a > cached entry always results in a major lookup delay. Yep, that'll be your disks. > I've also increased the shared memory available to Postgres to 80MB > incase this is a paging of the index, but it hasn't seemed to have any > effect. Probably the wrong thing to do (although you don't mention what hardware you've got). Read the tuning document at: http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php > Sample analyze output for an initial query: > > hydradb=# explain analyze select * from pvsubscriber where actorid = > 'b3432-asdas-232-Subscriber793500'; ... > -> Index Scan using mc_actor_key on mc_actor > (cost=0.00..4.08 rows=1 width=69) (actual time=39.497..39.499 rows=1 > loops=1) ... > Total runtime: 49.845 ms > And the analyze output for a "nearby" subscriber (10 indexes away): > > hydradb=# explain analyze select * from pvsubscriber where actorid = > 'b3432-asdas-232-Subscriber793510'; > ... > -> Index Scan using mc_actor_key on mc_actor > (cost=0.00..4.08 rows=1 width=69) (actual time=0.220..0.221 rows=1 > loops=1) > Total runtime: 0.428 ms > (15 rows) That certainly seems to be the big change - the only way to consistently get 1ms timings is going to be to make sure all your data is cached. Try the tuning guide above and see what difference that makes. If that's no good, post again with details of your config settings, hardware, number of clients etc... -- Richard Huxton Archonet Ltd ---------------------------(end of broadcast)--------------------------- TIP 6: Have you searched our list archives? http://archives.postgresql.org