[PERFORM] Optimizer's issue

2009-04-16 Thread Vlad Arkhipov
I have a problem with a part of big query because of incorrect estimation. It's easy to emulate the case: create table a (id bigint, id2 bigint); create table b (id bigint, id2 bigint); insert into a (id, id2) select random() * 10, random() * 100 from generate_series(1, 10); insert int

Re: [PERFORM] GiST index performance

2009-04-16 Thread Craig Ringer
dforum wrote: > hello, > > there is other performance problem on this request. > > If you analyse query plan, you see that most of the time are lost during > sequencial scan, and you have 2 seq scan. > > You have to create other indexes to match the request. > > Postgresq is totally dependant o

Re: [PERFORM] No hash join across partitioned tables?

2009-04-16 Thread Kris Jurka
On Thu, 16 Apr 2009, Kris Jurka wrote: Perhaps the cost estimates for the real data are so high because of this bogus row count that the fudge factor to disable mergejoin isn't enough? Indeed, I get these cost estimates on 8.4b1 with an increased disable_cost value: nestloop: 11171206.

Re: [PERFORM] No hash join across partitioned tables?

2009-04-16 Thread Kris Jurka
On Thu, 16 Apr 2009, Tom Lane wrote: Kris Jurka writes: PG (8.3.7) doesn't seem to want to do a hash join across two partitioned tables. Could we see the whole declaration of these tables? (pg_dump -s output would be convenient) The attached table definition with no data wants to merge

Re: [PERFORM] No hash join across partitioned tables?

2009-04-16 Thread Tom Lane
Kris Jurka writes: > PG (8.3.7) doesn't seem to want to do a hash join across two partitioned > tables. Could we see the whole declaration of these tables? (pg_dump -s output would be convenient) regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-perf

[PERFORM] No hash join across partitioned tables?

2009-04-16 Thread Kris Jurka
PG (8.3.7) doesn't seem to want to do a hash join across two partitioned tables. I have two partition hierarchies: impounds (with different impound sources) and liens (with vehicle liens from different companies). Trying to match those up gives: EXPLAIN SELECT COUNT(*) FROM impounds i

Re: [PERFORM] GiST index performance

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Tom Lane wrote: Hmm, and what is shared_buffers set to? How big are the tables and other indexes used in the query? We still have to explain why the inner nestloop got slower, and it's hard to see that unless something stopped fitting in cache. I just noticed that someone

Re: [PERFORM] GiST index performance

2009-04-16 Thread Tom Lane
Matthew Wakeling writes: > On Thu, 16 Apr 2009, Tom Lane wrote: >> Also, what are the physical sizes of the two indexes? > location_object_start_gist | 193 MB > location_object_start | 75 MB >> I notice that the inner nestloop join gets slower too, when it's not >> changed at all --- th

Re: [PERFORM] GiST index performance

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Tom Lane wrote: Matthew, can you put together a self-contained test case with a similar slowdown? It isn't the smoking gun I thought it would be, but: CREATE TABLE a AS SELECT a FROM generate_series(1,100) AS a(a); CREATE TABLE b AS SELECT b FROM generate_series(1,1000

Re: [PERFORM] GiST index performance

2009-04-16 Thread Tom Lane
dforum writes: > If you analyse query plan, you see that most of the time are lost during > sequencial scan, and you have 2 seq scan. I think you missed the loops count. >> -> Index Scan using location_object_start_gist on location l1 >>(cost=0.00..4.16 rows=150 width=

Re: [PERFORM] GiST index performance

2009-04-16 Thread dforum
hello, there is other performance problem on this request. If you analyse query plan, you see that most of the time are lost during sequencial scan, and you have 2 seq scan. You have to create other indexes to match the request. Postgresq is totally dependant on index to reach is performance

Re: [PERFORM] GiST index performance

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, dforum wrote: there is other performance problem on this request. If you analyse query plan, you see that most of the time are lost during sequencial scan, and you have 2 seq scan. Nonsense. Sequential scans account for all of one or two seconds of processing in these qu

Re: [PERFORM] GiST index performance

2009-04-16 Thread Tom Lane
"Kevin Grittner" writes: > Matthew Wakeling wrote: >> I have been doing some queries that are best answered with GiST >> indexes > For what definition of "best answered"? > Since an index is only a performance tuning feature (unless declared > UNIQUE), and should never alter the results (beyo

Re: [PERFORM] GiST index performance

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Kevin Grittner wrote: Matthew Wakeling wrote: I have been doing some queries that are best answered with GiST indexes For what definition of "best answered"? Since an index is only a performance tuning feature (unless declared UNIQUE), and should never alter the results

Re: [PERFORM] GiST index performance

2009-04-16 Thread Kevin Grittner
Matthew Wakeling wrote: > I have been doing some queries that are best answered with GiST > indexes For what definition of "best answered"? Since an index is only a performance tuning feature (unless declared UNIQUE), and should never alter the results (beyond possibly affecting row order if t

[PERFORM] GiST index performance

2009-04-16 Thread Matthew Wakeling
I have been doing some queries that are best answered with GiST indexes, however I have found that their performance is a little lacking. I thought I would do a direct comparison on a level playing field. Here are two EXPLAIN ANALYSE results for the same query, with two different indexes. The

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Tom Lane
Matthew Wakeling writes: > On Thu, 16 Apr 2009, Robert Haas wrote: >> I hasten to point out that I only suggested raising them to the moon >> as a DEBUGGING strategy, not a production configuration. > The problem is that we have created a view that by itself a very > time-consuming query to answ

Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Tom Lane
Lists writes: > I already had a separate index on uid > CREATE INDEX idx_user_comments_uid > ON user_comments > USING btree > (uid); > Under the circumstances, shouldn't a bitmap of those 2 indexes be far > faster than using just the date index (compared to the old plan, n

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Robert Haas wrote: I hasten to point out that I only suggested raising them to the moon as a DEBUGGING strategy, not a production configuration. The problem is that we have created a view that by itself a very time-consuming query to answer, relying on it being incorporate

Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Lists
Tom Lane wrote: Lists writes: The query select comment_date from user_comments where user_comments.uid=1 order by comment_date desc limit 1 Explain: "Limit (cost=0.00..2699.07 rows=1 width=8) (actual time=52848.785..52848.787 row

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Robert Haas
On Thu, Apr 16, 2009 at 9:49 AM, Tom Lane wrote: > Merlin Moncure writes: >> On Thu, Apr 16, 2009 at 8:05 AM, Matthew Wakeling >> wrote: >>> That solves the problem. So, a view is treated as a subquery then? > >> no...the view is simply inlined into the query (think C macro) using >> the rules.

Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Tom Lane
Lists writes: > The query > select comment_date > from user_comments > where user_comments.uid=1 > order by comment_date desc limit 1 > Explain: > "Limit (cost=0.00..2699.07 rows=1 width=8) (actual > time=52848.785..52848.787 rows=1 loops=1)" > " ->

Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Tom Lane
Merlin Moncure writes: > On Thu, Apr 16, 2009 at 2:02 AM, Lists wrote: >> select comment_date >> from user_comments >> where user_comments.uid=1 >> order by comment_date desc limit 1 > try this: > create index comment_data_uid_idx on user_comments(uid, comment_date); > select * from

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Merlin Moncure
On Thu, Apr 16, 2009 at 10:11 AM, Kevin Grittner wrote: > Tom Lane wrote: >> Bear in mind that those limits exist to keep you from running into >> exponentially increasing planning time when the size of a planning >> problem gets big.  "Raise 'em to the moon" isn't really a sane > strategy. >> It

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Kevin Grittner
Tom Lane wrote: > Bear in mind that those limits exist to keep you from running into > exponentially increasing planning time when the size of a planning > problem gets big. "Raise 'em to the moon" isn't really a sane strategy. > It might be that we could get away with raising them by one or two

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Tom Lane
Merlin Moncure writes: > On Thu, Apr 16, 2009 at 8:05 AM, Matthew Wakeling wrote: >> That solves the problem. So, a view is treated as a subquery then? > no...the view is simply inlined into the query (think C macro) using > the rules. You just bumped into an arbitrary (and probably too low) >

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Merlin Moncure
On Thu, Apr 16, 2009 at 8:05 AM, Matthew Wakeling wrote: > On Thu, 16 Apr 2009, Robert Haas wrote: >> >> What happens if you change join_collapse_limit and from_collapse_limit >> to some huge number? >> >> >> http://www.postgresql.org/docs/current/static/runtime-config-query.html#GUC-FROM-COLLAPSE

Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Merlin Moncure
On Thu, Apr 16, 2009 at 2:02 AM, Lists wrote: > > Right, because they do.  If you think otherwise, demonstrate it. > (bonnie tests approximating a reverse seqscan are not relevant > to the performance of indexscans.) > > Working on it.  I *think* I've seen this issue in the field, which is why I >

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Robert Haas wrote: What happens if you change join_collapse_limit and from_collapse_limit to some huge number? http://www.postgresql.org/docs/current/static/runtime-config-query.html#GUC-FROM-COLLAPSE-LIMIT That solves the problem. So, a view is treated as a subquery then?

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Robert Haas
2009/4/16 Matthew Wakeling : > On Thu, 16 Apr 2009, Grzegorz Jaśkiewicz wrote: >> >> On Thu, Apr 16, 2009 at 11:37 AM, Matthew Wakeling >> wrote: >>> >>> I have a query that is executed really badly by Postgres. It is a nine >>> table >>> join, where two of the tables are represented in a view. If

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Grzegorz Jaśkiewicz wrote: On Thu, Apr 16, 2009 at 11:37 AM, Matthew Wakeling wrote: I have a query that is executed really badly by Postgres. It is a nine table join, where two of the tables are represented in a view. If I remove one of the tables from the query, then the

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Matthew Wakeling
On Thu, 16 Apr 2009, Grzegorz Jaśkiewicz wrote: On a second look, it looks like you are are joining that view twice, at this point, I have no idea myself what it might be. But I guess it has to search over 5M rows for each of 105 in other query. I wonder what more experienced guys here will have

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Grzegorz Jaśkiewicz
On Thu, Apr 16, 2009 at 11:37 AM, Matthew Wakeling wrote: > SELECT * > FROM >    gene AS a1_, >    intergenicregion AS a2_, >    regulatoryregion AS a3_, >    chromosome AS a4_, >    location AS a5_, >    dataset AS a6_, >    LocatedSequenceFeatureOverlappingFeatures AS indirect0, >    BioEntities

Re: [PERFORM] Really dumb planner decision

2009-04-16 Thread Grzegorz Jaśkiewicz
On Thu, Apr 16, 2009 at 11:37 AM, Matthew Wakeling wrote: > > I have a query that is executed really badly by Postgres. It is a nine table > join, where two of the tables are represented in a view. If I remove one of > the tables from the query, then the query runs very quickly using a > completel

[PERFORM] Really dumb planner decision

2009-04-16 Thread Matthew Wakeling
I have a query that is executed really badly by Postgres. It is a nine table join, where two of the tables are represented in a view. If I remove one of the tables from the query, then the query runs very quickly using a completely different plan. Here is the view: release-16.0-preview-09-a

Re: [PERFORM] Shouldn't the planner have a higher cost for reverse index scans?

2009-04-16 Thread Grzegorz Jaśkiewicz
create index foobar on table(row desc); -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance