Re: [PERFORM] OFFSET impact on Performance???
Although larger offsets have some effect, your real problem is the sort (of 42693 rows). Try:

SELECT r.id_order
FROM report r
WHERE r.id_order IN
      (SELECT id FROM orders
       WHERE id_status = 6
       ORDER BY 1 LIMIT 10 OFFSET 1000)
ORDER BY 1

The subquery doesn't *have* to sort because the table is already ordered on the primary key. You can still add a join to orders outside the subselect without significant cost. Incidentally, I don't know how you got the first plan - it should include a sort as well.

Andrei Bintintan wrote:
> explain analyze
> SELECT o.id
> FROM report r
> INNER JOIN orders o ON o.id=r.id_order AND o.id_status=6
> ORDER BY 1 LIMIT 10 OFFSET 10
>
> Limit  (cost=44.37..88.75 rows=10 width=4) (actual time=0.160..0.275 rows=10 loops=1)
>   ->  Merge Join  (cost=0.00..182150.17 rows=41049 width=4) (actual time=0.041..0.260 rows=20 loops=1)
>         Merge Cond: ("outer".id_order = "inner".id)
>         ->  Index Scan using report_id_order_idx on report r  (cost=0.00..157550.90 rows=42862 width=4) (actual time=0.018..0.075 rows=20 loops=1)
>         ->  Index Scan using orders_pkey on orders o  (cost=0.00..24127.04 rows=42501 width=4) (actual time=0.013..0.078 rows=20 loops=1)
>               Filter: (id_status = 6)
> Total runtime: 0.373 ms
>
> explain analyze
> SELECT o.id
> FROM report r
> INNER JOIN orders o ON o.id=r.id_order AND o.id_status=6
> ORDER BY 1 LIMIT 10 OFFSET 100
>
> Limit  (cost=31216.85..31216.85 rows=1 width=4) (actual time=1168.152..1168.152 rows=0 loops=1)
>   ->  Sort  (cost=31114.23..31216.85 rows=41049 width=4) (actual time=1121.769..1152.246 rows=42693 loops=1)
>         Sort Key: o.id
>         ->  Hash Join  (cost=2329.99..27684.03 rows=41049 width=4) (actual time=441.879..925.498 rows=42693 loops=1)
>               Hash Cond: ("outer".id_order = "inner".id)
>               ->  Seq Scan on report r  (cost=0.00..23860.62 rows=42862 width=4) (actual time=38.634..366.035 rows=42864 loops=1)
>               ->  Hash  (cost=2077.74..2077.74 rows=42501 width=4) (actual time=140.200..140.200 rows=0 loops=1)
>                     ->  Seq Scan on orders o  (cost=0.00..2077.74 rows=42501 width=4) (actual time=0.059..96.890 rows=42693 loops=1)
>                           Filter: (id_status = 6)
> Total runtime: 1170.586 ms
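[Editor's note: the rewrite above can be checked end-to-end. A minimal sketch, using Python's sqlite3 as a stand-in for PostgreSQL and made-up toy data; the planner behavior differs, but the row-for-row equivalence of the two query shapes holds.]

```python
import sqlite3

# Toy versions of the thread's tables: orders(id, id_status), report(id_order).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, id_status INTEGER);
    CREATE TABLE report (id INTEGER PRIMARY KEY, id_order INTEGER REFERENCES orders(id));
""")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, 6 if i % 2 == 0 else 3) for i in range(1, 2001)])
conn.executemany("INSERT INTO report (id_order) VALUES (?)",
                 [(i,) for i in range(1, 2001)])

# Original shape: sort the whole join, then discard OFFSET rows.
slow = conn.execute("""
    SELECT o.id
    FROM report r
    JOIN orders o ON o.id = r.id_order AND o.id_status = 6
    ORDER BY 1 LIMIT 10 OFFSET 100
""").fetchall()

# Suggested rewrite: LIMIT/OFFSET inside a subquery on the already-ordered PK.
fast = conn.execute("""
    SELECT r.id_order
    FROM report r
    WHERE r.id_order IN (SELECT id FROM orders
                         WHERE id_status = 6
                         ORDER BY 1 LIMIT 10 OFFSET 100)
    ORDER BY 1
""").fetchall()

assert slow == fast  # same page of rows either way
```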
Re: [PERFORM] OFFSET impact on Performance???
Andrei:
Hi to all,

I have the following 2 examples. Now, regarding the offset: whether it is small (10) or big (>5), what is the impact on the performance of the query? I noticed that if I return more data (columns) or add more joins, the query runs even slower when the OFFSET is bigger. How can I improve the performance of this?

Merlin:
Offset is not suitable for traversal of large data sets. Better not to use it at all! There are many ways to deal with this problem; the two most direct are the cursor approach and the view approach.

Cursor approach:

declare report_order cursor with hold for
    select * from report r, orders o [...]

Remember to close the cursor when you're done. Now fetch time is proportional to the number of rows fetched, and should be very fast. The major drawback to this approach is that cursors in postgres (currently) are always insensitive, so record changes made by other users after you declare the cursor are not visible to you. If this is a big deal, try the view approach.

View approach:

create view report_order as
    select * from report r, orders o [...]

and this:

prepare fetch_from_report_order(numeric, numeric, int4) as
    select * from report_order
    where order_id >= $1 and (order_id > $1 or report_id > $2)
    order by order_id, report_id
    limit $3;

To fetch the next 1000 records from report_order:

execute fetch_from_report_order(o, f, 1000);

o and f being the last key values you fetched (pass in zeroes to start it off). This is not quite as fast as the cursor approach (but it will be when we get a proper row constructor, heh), but it is more flexible in that it is sensitive to changes from other users. This is more of a 'permanent' binding, whereas a cursor is a binding around a particular task.

Good luck!
Merlin
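[Editor's note: Merlin's two-column keyset predicate can be sketched as runnable code. sqlite3 stands in for PostgreSQL's PREPARE/EXECUTE, and the report_order view is replaced by a made-up table with the assumed key columns (order_id, report_id).]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE report_order (order_id INTEGER, report_id INTEGER)")
# Toy data: 5 orders x 3 reports = 15 rows.
conn.executemany("INSERT INTO report_order VALUES (?, ?)",
                 [(o, r) for o in range(1, 6) for r in range(1, 4)])

# Same predicate as the prepared statement above, with ?-placeholders
# in place of $1/$2/$3.
PAGE = """
    SELECT order_id, report_id FROM report_order
    WHERE order_id >= ? AND (order_id > ? OR report_id > ?)
    ORDER BY order_id, report_id
    LIMIT ?
"""

def fetch_page(last_o, last_r, n=4):
    # Pass the last key values of the previous page; zeroes start it off.
    return conn.execute(PAGE, (last_o, last_o, last_r, n)).fetchall()

pages, last = [], (0, 0)
while True:
    rows = fetch_page(*last)
    if not rows:
        break
    pages.append(rows)
    last = rows[-1]          # resume after the last key seen

flat = [row for page in pages for row in page]
```

Each page costs one indexed range scan regardless of how deep into the result set you are, which is exactly what OFFSET cannot give you.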
[PERFORM] OFFSET impact on Performance???
Hi to all,

I have the following 2 examples. Now, regarding the offset: whether it is small (10) or big (>5), what is the impact on the performance of the query? I noticed that if I return more data (columns) or add more joins, the query runs even slower when the OFFSET is bigger. How can I improve the performance of this?

Best regards,
Andy.

explain analyze
SELECT o.id
FROM report r
INNER JOIN orders o ON o.id=r.id_order AND o.id_status=6
ORDER BY 1 LIMIT 10 OFFSET 10

Limit  (cost=44.37..88.75 rows=10 width=4) (actual time=0.160..0.275 rows=10 loops=1)
  ->  Merge Join  (cost=0.00..182150.17 rows=41049 width=4) (actual time=0.041..0.260 rows=20 loops=1)
        Merge Cond: ("outer".id_order = "inner".id)
        ->  Index Scan using report_id_order_idx on report r  (cost=0.00..157550.90 rows=42862 width=4) (actual time=0.018..0.075 rows=20 loops=1)
        ->  Index Scan using orders_pkey on orders o  (cost=0.00..24127.04 rows=42501 width=4) (actual time=0.013..0.078 rows=20 loops=1)
              Filter: (id_status = 6)
Total runtime: 0.373 ms

explain analyze
SELECT o.id
FROM report r
INNER JOIN orders o ON o.id=r.id_order AND o.id_status=6
ORDER BY 1 LIMIT 10 OFFSET 100

Limit  (cost=31216.85..31216.85 rows=1 width=4) (actual time=1168.152..1168.152 rows=0 loops=1)
  ->  Sort  (cost=31114.23..31216.85 rows=41049 width=4) (actual time=1121.769..1152.246 rows=42693 loops=1)
        Sort Key: o.id
        ->  Hash Join  (cost=2329.99..27684.03 rows=41049 width=4) (actual time=441.879..925.498 rows=42693 loops=1)
              Hash Cond: ("outer".id_order = "inner".id)
              ->  Seq Scan on report r  (cost=0.00..23860.62 rows=42862 width=4) (actual time=38.634..366.035 rows=42864 loops=1)
              ->  Seq Scan on orders o  (cost=0.00..2077.74 rows=42501 width=4) (actual time=0.059..96.890 rows=42693 loops=1)
                    Filter: (id_status = 6)
Total runtime: 1170.586 ms
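[Editor's note: the cursor approach suggested earlier in the thread, run the query once and fetch in batches, can be sketched in client code. sqlite3 stands in for a PostgreSQL driver such as psycopg2, where a named server-side cursor plays the same role; table and data are made up.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, id_status INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, 6) for i in range(1, 101)])

# The query runs once; no re-sort and no skipped rows per page.
cur = conn.execute("SELECT id FROM orders WHERE id_status = 6 ORDER BY id")

batches = []
while True:
    batch = cur.fetchmany(10)   # each page costs only the rows fetched
    if not batch:
        break
    batches.append(batch)
```

With growing OFFSETs, by contrast, page N re-executes the query and throws away the first N*10 rows every time.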