Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Lists
Josh Berkus wrote: Tom, Right, because they do. If you think otherwise, demonstrate it. (bonnie tests approximating a reverse seqscan are not relevant to the performance of indexscans.) Working on it. I *think* I've seen this issue in the field, which is why I brought it up in the first

Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Grzegorz Jaśkiewicz
create index foobar on table(row desc); -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance

[PERFORM] Really dumb planner decision

2009-04-16 Thread Matthew Wakeling
I have a query that is executed really badly by Postgres. It is a nine table join, where two of the tables are represented in a view. If I remove one of the tables from the query, then the query runs very quickly using a completely different plan. Here is the view:

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Grzegorz Jaśkiewicz
On Thu, Apr 16, 2009 at 11:37 AM, Matthew Wakeling matt...@flymine.org wrote: I have a query that is executed really badly by Postgres. It is a nine table join, where two of the tables are represented in a view. If I remove one of the tables from the query, then the query runs very quickly

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Grzegorz Jaśkiewicz wrote: On a second look, it looks like you are are joining that view twice, at this point, I have no idea myself what it might be. But I guess it has to search over 5M rows for each of 105 in other query. I wonder what more experienced guys here will

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Grzegorz Jaśkiewicz
On Thu, Apr 16, 2009 at 11:37 AM, Matthew Wakeling matt...@flymine.org wrote: SELECT * FROM    gene AS a1_,    intergenicregion AS a2_,    regulatoryregion AS a3_,    chromosome AS a4_,    location AS a5_,    dataset AS a6_,    LocatedSequenceFeatureOverlappingFeatures AS indirect0,    

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Grzegorz Jaśkiewicz wrote: On Thu, Apr 16, 2009 at 11:37 AM, Matthew Wakeling matt...@flymine.org wrote: I have a query that is executed really badly by Postgres. It is a nine table join, where two of the tables are represented in a view. If I remove one of the tables from

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Robert Haas wrote: What happens if you change join_collapse_limit and from_collapse_limit to some huge number? http://www.postgresql.org/docs/current/static/runtime-config-query.html#GUC-FROM-COLLAPSE-LIMIT That solves the problem. So, a view is treated as a subquery

Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Merlin Moncure
On Thu, Apr 16, 2009 at 2:02 AM, Lists li...@on-track.ca wrote: Right, because they do.  If you think otherwise, demonstrate it. (bonnie tests approximating a reverse seqscan are not relevant to the performance of indexscans.) Working on it.  I *think* I've seen this issue in the field,

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Merlin Moncure
On Thu, Apr 16, 2009 at 8:05 AM, Matthew Wakeling matt...@flymine.org wrote: On Thu, 16 Apr 2009, Robert Haas wrote: What happens if you change join_collapse_limit and from_collapse_limit to some huge number?

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Robert Haas
2009/4/16 Matthew Wakeling matt...@flymine.org: On Thu, 16 Apr 2009, Grzegorz Jaśkiewicz wrote: On Thu, Apr 16, 2009 at 11:37 AM, Matthew Wakeling matt...@flymine.org wrote: I have a query that is executed really badly by Postgres. It is a nine table join, where two of the tables are

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes: On Thu, Apr 16, 2009 at 8:05 AM, Matthew Wakeling matt...@flymine.org wrote: That solves the problem. So, a view is treated as a subquery then? no...the view is simply inlined into the query (think C macro) using the rules. You just bumped into an

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote: Bear in mind that those limits exist to keep you from running into exponentially increasing planning time when the size of a planning problem gets big. Raise 'em to the moon isn't really a sane strategy. It might be that we could get away with raising them

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Merlin Moncure
On Thu, Apr 16, 2009 at 10:11 AM, Kevin Grittner kevin.gritt...@wicourts.gov wrote: Tom Lane t...@sss.pgh.pa.us wrote: Bear in mind that those limits exist to keep you from running into exponentially increasing planning time when the size of a planning problem gets big.  Raise 'em to the moon

Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes: On Thu, Apr 16, 2009 at 2:02 AM, Lists li...@on-track.ca wrote: select comment_date from user_comments where user_comments.uid=1 order by comment_date desc limit 1 try this: create index comment_data_uid_idx on user_comments(uid,

Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Tom Lane
Lists li...@on-track.ca writes: The query select comment_date from user_comments where user_comments.uid=1 order by comment_date desc limit 1 Explain: Limit (cost=0.00..2699.07 rows=1 width=8) (actual time=52848.785..52848.787 rows=1 loops=1)

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Robert Haas
On Thu, Apr 16, 2009 at 9:49 AM, Tom Lane t...@sss.pgh.pa.us wrote: Merlin Moncure mmonc...@gmail.com writes: On Thu, Apr 16, 2009 at 8:05 AM, Matthew Wakeling matt...@flymine.org wrote: That solves the problem. So, a view is treated as a subquery then? no...the view is simply inlined into

Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Lists
Tom Lane wrote: Lists li...@on-track.ca writes: The query select comment_date from user_comments where user_comments.uid=1 order by comment_date desc limit 1 Explain: Limit (cost=0.00..2699.07 rows=1 width=8) (actual

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Robert Haas wrote: I hasten to point out that I only suggested raising them to the moon as a DEBUGGING strategy, not a production configuration. The problem is that we have created a view that by itself a very time-consuming query to answer, relying on it being

Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Tom Lane
Lists li...@on-track.ca writes: I already had a separate index on uid CREATE INDEX idx_user_comments_uid ON user_comments USING btree (uid); Under the circumstances, shouldn't a bitmap of those 2 indexes be far faster than using just the date index (compared to the

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Tom Lane
Matthew Wakeling matt...@flymine.org writes: On Thu, 16 Apr 2009, Robert Haas wrote: I hasten to point out that I only suggested raising them to the moon as a DEBUGGING strategy, not a production configuration. The problem is that we have created a view that by itself a very time-consuming

[PERFORM] GiST index performance

2009-04-16 Thread Matthew Wakeling
I have been doing some queries that are best answered with GiST indexes, however I have found that their performance is a little lacking. I thought I would do a direct comparison on a level playing field. Here are two EXPLAIN ANALYSE results for the same query, with two different indexes.

Re: [PERFORM] GiST index performance

2009-04-16 Thread Kevin Grittner
Matthew Wakeling matt...@flymine.org wrote: I have been doing some queries that are best answered with GiST indexes For what definition of best answered? Since an index is only a performance tuning feature (unless declared UNIQUE), and should never alter the results (beyond possibly

Re: [PERFORM] GiST index performance

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Kevin Grittner wrote: Matthew Wakeling matt...@flymine.org wrote: I have been doing some queries that are best answered with GiST indexes For what definition of best answered? Since an index is only a performance tuning feature (unless declared UNIQUE), and should never

Re: [PERFORM] GiST index performance

2009-04-16 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Matthew Wakeling matt...@flymine.org wrote: I have been doing some queries that are best answered with GiST indexes For what definition of best answered? Since an index is only a performance tuning feature (unless declared UNIQUE), and

Re: [PERFORM] GiST index performance

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, dforum wrote: there is other performance problem on this request. If you analyse query plan, you see that most of the time are lost during sequencial scan, and you have 2 seq scan. Nonsense. Sequential scans account for all of one or two seconds of processing in these

Re: [PERFORM] GiST index performance

2009-04-16 Thread dforum
hello, there is other performance problem on this request. If you analyse query plan, you see that most of the time are lost during sequencial scan, and you have 2 seq scan. You have to create other indexes to match the request. Postgresq is totally dependant on index to reach is

Re: [PERFORM] GiST index performance

2009-04-16 Thread Tom Lane
dforum dfor...@vieonet.com writes: If you analyse query plan, you see that most of the time are lost during sequencial scan, and you have 2 seq scan. I think you missed the loops count. - Index Scan using location_object_start_gist on location l1 (cost=0.00..4.16

Re: [PERFORM] GiST index performance

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Tom Lane wrote: Matthew, can you put together a self-contained test case with a similar slowdown? It isn't the smoking gun I thought it would be, but: CREATE TABLE a AS SELECT a FROM generate_series(1,100) AS a(a); CREATE TABLE b AS SELECT b FROM

Re: [PERFORM] GiST index performance

2009-04-16 Thread Tom Lane
Matthew Wakeling matt...@flymine.org writes: On Thu, 16 Apr 2009, Tom Lane wrote: Also, what are the physical sizes of the two indexes? location_object_start_gist | 193 MB location_object_start | 75 MB I notice that the inner nestloop join gets slower too, when it's not changed at

Re: [PERFORM] GiST index performance

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Tom Lane wrote: Hmm, and what is shared_buffers set to? How big are the tables and other indexes used in the query? We still have to explain why the inner nestloop got slower, and it's hard to see that unless something stopped fitting in cache. I just noticed that

[PERFORM] No hash join across partitioned tables?

2009-04-16 Thread Kris Jurka
PG (8.3.7) doesn't seem to want to do a hash join across two partitioned tables. I have two partition hierarchies: impounds (with different impound sources) and liens (with vehicle liens from different companies). Trying to match those up gives: EXPLAIN SELECT COUNT(*) FROM impounds i

Re: [PERFORM] No hash join across partitioned tables?

2009-04-16 Thread Tom Lane
Kris Jurka bo...@ejurka.com writes: PG (8.3.7) doesn't seem to want to do a hash join across two partitioned tables. Could we see the whole declaration of these tables? (pg_dump -s output would be convenient) regards, tom lane -- Sent via pgsql-performance mailing

Re: [PERFORM] No hash join across partitioned tables?

2009-04-16 Thread Kris Jurka
On Thu, 16 Apr 2009, Tom Lane wrote: Kris Jurka bo...@ejurka.com writes: PG (8.3.7) doesn't seem to want to do a hash join across two partitioned tables. Could we see the whole declaration of these tables? (pg_dump -s output would be convenient) The attached table definition with no

Re: [PERFORM] No hash join across partitioned tables?

2009-04-16 Thread Kris Jurka
On Thu, 16 Apr 2009, Kris Jurka wrote: Perhaps the cost estimates for the real data are so high because of this bogus row count that the fudge factor to disable mergejoin isn't enough? Indeed, I get these cost estimates on 8.4b1 with an increased disable_cost value: nestloop:

[PERFORM] Optimizer's issue

2009-04-16 Thread Vlad Arkhipov
I have a problem with a part of big query because of incorrect estimation. It's easy to emulate the case: create table a (id bigint, id2 bigint); create table b (id bigint, id2 bigint); insert into a (id, id2) select random() * 10, random() * 100 from generate_series(1, 10); insert