Thanks for your help. I created indexes on channel_id and data_id as you suggested:
-- Index: idx_d_channel_id2
-- DROP INDEX idx_d_channel_id2;
CREATE INDEX idx_d_channel_id2 ON sym_data USING btree (channel_id);

-- Index: idx_d_channel_id3
-- DROP INDEX idx_d_channel_id3;
CREATE INDEX idx_d_channel_id3 ON sym_data USING btree (data_id)
  WHERE channel_id::text = 'sale_transaction'::text;

-- Index: idx_d_channel_id4
-- DROP INDEX idx_d_channel_id4;
CREATE INDEX idx_d_channel_id4 ON sym_data USING btree (data_id)
  WHERE channel_id::text = 'item'::text;

Here is the new EXPLAIN ANALYZE:

explain analyze
select d.data_id, d.table_name, d.event_type, d.row_data, d.pk_data,
       d.old_data, d.create_time, d.trigger_hist_id, d.channel_id,
       d.transaction_id, d.source_node_id, d.external_data, ''
from sym_data d
inner join sym_data_gap g
        on g.status = 'GP'
       and d.data_id between g.start_id and g.end_id
where d.channel_id = 'sale_transaction'
order by d.data_id asc;

Nested Loop  (cost=0.00..1512979014.35 rows=26268463088 width=1401) (actual time=25741.704..7650979.311 rows=2764140 loops=1)
  ->  Index Scan using idx_d_channel_id3 on sym_data d  (cost=0.00..1781979.40 rows=3117384 width=1401) (actual time=83.718..55126.002 rows=3124631 loops=1)
  ->  Index Scan using sym_data_gap_pkey on sym_data_gap g  (cost=0.00..358.37 rows=8426 width=8) (actual time=2.428..2.429 rows=1 loops=3124631)
        Index Cond: ((d.data_id >= g.start_id) AND (d.data_id <= g.end_id))
        Filter: (g.status = 'GP'::bpchar)
Total runtime: 7651803.073 ms

But query performance hasn't changed. Please help me.

Tuan Hoang Anh

On Sat, Aug 6, 2011 at 12:20 AM, Kevin Grittner <kevin.gritt...@wicourts.gov> wrote:

> tuanhoanganh <hatua...@gmail.com> wrote:
>
> > I have postgresql 9.0.1
>
> http://www.postgresql.org/support/versioning
>
> > 6GB ram
>
> > work_mem = 2097151
>
> I think that has the potential to push you into swapping:
>
> cc=> set work_mem = 2097151;
> SET
> cc=> show work_mem;
>  work_mem
> -----------
>  2097151kB
> (1 row)
>
> That's 2GB, and that much can be allocated, potentially several
> times, per connection.
>
> >  ->  Index Scan using sym_data_pkey on sym_data d
> >        (cost=0.00..637148.72 rows=3129103 width=1403)
> >        (actual time=71.989..55643.665 rows=3124631 loops=1)
> >      Filter: ((channel_id)::text = 'sale_transaction'::text)
>
> This index scan is going to randomly access all tuples in the
> table's heap. That is probably going to be much slower than a
> sequential scan. It is apparently choosing this index to avoid a
> sort, because of the mis-estimation on the number of rows. Is it
> critical that the rows be returned in that order? If not, you might
> see much faster performance by leaving off the ORDER BY clause so
> that it can use the seqscan.
>
> You could potentially make queries like this much faster by indexing
> on channel_id, or by indexing on data_id WHERE channel_id =
> 'sale_transaction'.
>
> You could also set up optimization barriers with clever use of a CTE
> or an OFFSET 0 to force it to use a seqscan followed by a sort, but
> I would look at the other options first.
>
> -Kevin
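
P.S. Is this roughly what you meant by the CTE rewrite? I have not tested
it yet. My understanding is that on 9.0 a WITH query is always
materialized, so it should act as an optimization barrier and let the
planner pick a seqscan plus sort instead of the index scan. The CTE name
sale_data is just my own:

with sale_data as (
    -- materialized on 9.0: planned on its own, so the channel_id filter
    -- can be satisfied by a seqscan (or the partial index) without
    -- being forced into data_id order
    select data_id, table_name, event_type, row_data, pk_data, old_data,
           create_time, trigger_hist_id, channel_id, transaction_id,
           source_node_id, external_data
    from sym_data
    where channel_id = 'sale_transaction'
)
select d.*, ''
from sale_data d
inner join sym_data_gap g
        on g.status = 'GP'
       and d.data_id between g.start_id and g.end_id
order by d.data_id asc;

Or would putting OFFSET 0 at the end of a subselect in the FROM clause do
the same thing?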