Thanks for your help.
I created indexes on channel_id and data_id as you suggested.

- Index: idx_d_channel_id2

-- DROP INDEX idx_d_channel_id2;

CREATE INDEX idx_d_channel_id2
  ON sym_data
  USING btree
  (channel_id);

-- Index: idx_d_channel_id3

-- DROP INDEX idx_d_channel_id3;

CREATE INDEX idx_d_channel_id3
  ON sym_data
  USING btree
  (data_id)
  WHERE channel_id::text = 'sale_transaction'::text;

-- Index: idx_d_channel_id4

-- DROP INDEX idx_d_channel_id4;

CREATE INDEX idx_d_channel_id4
  ON sym_data
  USING btree
  (data_id)
  WHERE channel_id::text = 'item'::text;

Here is the new EXPLAIN ANALYZE output:

explain analyze
select d.data_id, d.table_name, d.event_type, d.row_data,
       d.pk_data, d.old_data, d.create_time, d.trigger_hist_id, d.channel_id,
       d.transaction_id, d.source_node_id, d.external_data, ''
from sym_data d
inner join sym_data_gap g
   on g.status = 'GP'
  and d.data_id between g.start_id and g.end_id
where d.channel_id = 'sale_transaction'
order by d.data_id asc;

Nested Loop  (cost=0.00..1512979014.35 rows=26268463088 width=1401) (actual time=25741.704..7650979.311 rows=2764140 loops=1)
  ->  Index Scan using idx_d_channel_id3 on sym_data d  (cost=0.00..1781979.40 rows=3117384 width=1401) (actual time=83.718..55126.002 rows=3124631 loops=1)
  ->  Index Scan using sym_data_gap_pkey on sym_data_gap g  (cost=0.00..358.37 rows=8426 width=8) (actual time=2.428..2.429 rows=1 loops=3124631)
        Index Cond: ((d.data_id >= g.start_id) AND (d.data_id <= g.end_id))
        Filter: (g.status = 'GP'::bpchar)
Total runtime: 7651803.073 ms

But query performance doesn't change.
Please help me.

Tuan Hoang Anh

On Sat, Aug 6, 2011 at 12:20 AM, Kevin Grittner <kevin.gritt...@wicourts.gov> wrote:

> tuanhoanganh <hatua...@gmail.com> wrote:
>
> > I have postgresql 9.0.1
>
> http://www.postgresql.org/support/versioning
>
> > 6GB ram
>
> > work_mem = 2097151
>
> I think that has the potential to push you into swapping:
>
> cc=> set work_mem = 2097151;
> SET
> cc=> show work_mem;
>  work_mem
> -----------
>  2097151kB
> (1 row)
>
> That's 2GB, and that much can be allocated, potentially several
> times, per connection.
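> For example, a smaller per-operation limit could be tried for one
> session before changing postgresql.conf (64MB here is only an
> illustrative value, not a tuned recommendation):
>
> SET work_mem = '64MB';
> SHOW work_mem;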
>
> >   ->  Index Scan using sym_data_pkey on sym_data d
> >           (cost=0.00..637148.72 rows=3129103 width=1403)
> >           (actual time=71.989..55643.665 rows=3124631 loops=1)
> >         Filter: ((channel_id)::text = 'sale_transaction'::text)
>
> This index scan is going to randomly access all tuples in the
> table's heap. That is probably going to be much slower than a
> sequential scan. It is apparently choosing this index to avoid a
> sort, because of the mis-estimation on the number of rows.  Is it
> critical that the rows be returned in that order?  If not, you might
> see much faster performance by leaving off the ORDER BY clause so
> that it can use the seqscan.
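> For example, a sketch of the same query with the ORDER BY removed
> (column list abbreviated here):
>
> explain analyze
> select d.data_id, d.table_name, d.event_type, d.row_data
> from sym_data d
> inner join sym_data_gap g
>    on g.status = 'GP'
>   and d.data_id between g.start_id and g.end_id
> where d.channel_id = 'sale_transaction';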
>
> You could potentially make queries like this much faster by indexing
> on channel_id, or by indexing on data_id WHERE channel_id =
> 'sale_transaction'.
>
> You could also set up optimization barriers with clever use of a CTE
> or an OFFSET 0 to force it to use a seqscan followed by a sort, but
> I would look at the other options first.
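> As a sketch, the OFFSET 0 variant wraps the join in a subquery that
> the planner cannot flatten, so it can plan the join independently
> (e.g. as a seqscan) and sort on top; column list abbreviated here:
>
> select * from (
>   select d.data_id, d.table_name, d.event_type, d.row_data
>   from sym_data d
>   inner join sym_data_gap g
>      on g.status = 'GP'
>     and d.data_id between g.start_id and g.end_id
>   where d.channel_id = 'sale_transaction'
>   offset 0
> ) sub
> order by sub.data_id;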
>
> -Kevin
>
