Re: [PERFORM] A question on the query planner

2003-12-03 Thread Bruce Momjian
Tom Lane wrote:
 Greg Stark [EMAIL PROTECTED] writes:
  Tom Lane [EMAIL PROTECTED] writes:
  Define no longer works well.
 
  Well it seems to completely bar the use of a straight merge join between two
  index scans:
 
 Hmmm ... [squints] ... it's not supposed to do that ... [digs] ... yeah,
 there's something busted here.  Will get back to you ...

LOL, but I am not sure why.  :-)

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] A question on the query planner

2003-12-02 Thread Robert Treat
On Mon, 2003-12-01 at 16:44, Jared Carr wrote:
 I am currently working on optimizing some fairly time consuming queries 
 on a decently large
 dataset.
 
 The Following is the query in question.
 
 SELECT z.lat, z.lon, z.city, z.state, q.date_time, c.make, c.model, c.year
 FROM quotes AS q, zips AS z, cars AS c
 WHERE
 z.zip = q.zip AND
 c.car_id = q.car_id AND
 z.state != 'AA' AND
 z.state != 'AE' AND
 z.state != 'AP' AND
 z.state = 'WA'
  ORDER BY date_time;
 

This wont completely solve your problem, but z.state = 'WA' would seem
to be mutually exclusive of the != AA|AE|AP.  While it's not much, it is
extra overhead there doesn't seem to be any need for...

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] A question on the query planner

2003-12-02 Thread Greg Stark
Jared Carr [EMAIL PROTECTED] writes:

 Furthermore noticed that in the following query plan it is doing the
 sequential scan on quotes first, and then doing the sequential on zips. IMHO
 this should be the other way around, since the result set for zips is
 considerably smaller especially give that we are using a where clause to
 limit the number of items returned from zips, so it would seem that it would
 be faster to scan zips then join onto quotes, but perhaps it needs to do the
 sequential scan on both regardless.

-  Hash Join  (cost=1088.19..56382.58 rows=4058 width=62) (actual 
time=86.111..1834.682 rows=10193 loops=1)
  Hash Cond: ((outer.zip)::text = (inner.zip)::text)
  -  Seq Scan on quotes q  (cost=0.00..10664.25 rows=336525 width=27) (actual 
 time=0.098..658.905 rows=336963 loops=1)
  -  Hash  (cost=1086.90..1086.90 rows=516 width=52) (actual time=85.798..85.798 
 rows=0 loops=1)
-  Seq Scan on zips z  (cost=0.00..1086.90 rows=516 width=52) (actual 
 time=79.532..84.151 rows=718 loops=1)
  Filter: ((state)::text = 'WA'::text)

You're misreading it. Hash join is done by reading in one table into a hash
table, then reading the other table looking up entries in the hash table. The
zips are being read into the hash table which is appropriate if it's the
smaller table.


 Of course still there is the holy grail of getting it to actually use 
 the indexes. :P

  Merge Cond: (outer.?column7? = inner.?column5?)

Well it looks like you have something strange going on. What data type is
car_id in each table? 


-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] A question on the query planner

2003-12-02 Thread Jared Carr
Greg Stark wrote:


Merge Cond: (outer.?column7? = inner.?column5?)
   

Well it looks like you have something strange going on. What data type is
car_id in each table? 

 

car_id is a varchar(10) in both tables.

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] A question on the query planner

2003-12-02 Thread Greg Stark
Jared Carr [EMAIL PROTECTED] writes:

 Greg Stark wrote:
 
 
  Merge Cond: (outer.?column7? = inner.?column5?)
 
 
 Well it looks like you have something strange going on. What data type is
  car_id in each table?
 car_id is a varchar(10) in both tables.

Well for some reason it's being cast to a text to do the merge.

What version of postgres is this btw? The analyzes look like 7.4?

-- 
greg


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] A question on the query planner

2003-12-02 Thread Jared Carr
Greg Stark wrote:

Jared Carr [EMAIL PROTECTED] writes:

 

Greg Stark wrote:

   

   Merge Cond: (outer.?column7? = inner.?column5?)

   

Well it looks like you have something strange going on. What data type is
car_id in each table?
 

car_id is a varchar(10) in both tables.
   

Well for some reason it's being cast to a text to do the merge.

What version of postgres is this btw? The analyzes look like 7.4?

 

Yes, this is 7.4.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] A question on the query planner

2003-12-02 Thread Greg Stark

Jared Carr [EMAIL PROTECTED] writes:

 Greg Stark wrote:
 
  Well it looks like you have something strange going on. What data type is
  car_id in each table?
 
 car_id is a varchar(10) in both tables.

Huh. The following shows something strange. It seems joining on two varchars
no longer works well. Instead the optimizer has to convert both columns to
text.

I know some inter-type comparisons were removed a while ago, but I would not
have thought that would effect varchar-varchar comparisons. I think this is
pretty bad.


test=# create table a (x varchar primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index a_pkey for table a
CREATE TABLE
test=# create table b (x varchar primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index b_pkey for table b
CREATE TABLE
test=# select * from a,b where a.x=b.x;
 x | x 
---+---
(0 rows)

test=# explain select * from a,b where a.x=b.x;
QUERY PLAN
--
 Merge Join  (cost=139.66..159.67 rows=1001 width=64)
   Merge Cond: (outer.?column2? = inner.?column2?)
   -  Sort  (cost=69.83..72.33 rows=1000 width=32)
 Sort Key: (a.x)::text
 -  Seq Scan on a  (cost=0.00..20.00 rows=1000 width=32)
   -  Sort  (cost=69.83..72.33 rows=1000 width=32)
 Sort Key: (b.x)::text
 -  Seq Scan on b  (cost=0.00..20.00 rows=1000 width=32)
(8 rows)

test=# create table a2 (x text primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index a2_pkey for table a2
CREATE TABLE
test=# create table b2 (x text primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index b2_pkey for table b2
CREATE TABLE
test=# explain select * from a2,b2 where a2.x=b2.x;
QUERY PLAN 
---
 Hash Join  (cost=22.50..57.51 rows=1001 width=64)
   Hash Cond: (outer.x = inner.x)
   -  Seq Scan on a2  (cost=0.00..20.00 rows=1000 width=32)
   -  Hash  (cost=20.00..20.00 rows=1000 width=32)
 -  Seq Scan on b2  (cost=0.00..20.00 rows=1000 width=32)
(5 rows)


-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] A question on the query planner

2003-12-02 Thread Greg Stark

Greg Stark [EMAIL PROTECTED] writes:

 Huh. The following shows something strange. 

Worse, with enable_hashjoin off it's even more obvious something's broken:


test=# set enable_hashjoin = off;
SET
test=# explain select * from a,b where a.x=b.x;
QUERY PLAN
--
 Merge Join  (cost=139.66..159.67 rows=1001 width=64)
   Merge Cond: (outer.?column2? = inner.?column2?)
   -  Sort  (cost=69.83..72.33 rows=1000 width=32)
 Sort Key: (a.x)::text
 -  Seq Scan on a  (cost=0.00..20.00 rows=1000 width=32)
   -  Sort  (cost=69.83..72.33 rows=1000 width=32)
 Sort Key: (b.x)::text
 -  Seq Scan on b  (cost=0.00..20.00 rows=1000 width=32)
(8 rows)

test=# explain select * from a2,b2 where a2.x=b2.x;
 QUERY PLAN  
-
 Merge Join  (cost=0.00..63.04 rows=1001 width=64)
   Merge Cond: (outer.x = inner.x)
   -  Index Scan using a2_pkey on a2  (cost=0.00..24.00 rows=1000 width=32)
   -  Index Scan using b2_pkey on b2  (cost=0.00..24.00 rows=1000 width=32)
(4 rows)

-- 
greg


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] A question on the query planner

2003-12-02 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Huh. The following shows something strange. It seems joining on two varchars
 no longer works well. Instead the optimizer has to convert both columns to
 text.

Define no longer works well.  varchar doesn't have its own comparison
operators anymore, but AFAIK that makes no difference.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] A question on the query planner

2003-12-02 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 Define no longer works well.

 Well it seems to completely bar the use of a straight merge join between two
 index scans:

Hmmm ... [squints] ... it's not supposed to do that ... [digs] ... yeah,
there's something busted here.  Will get back to you ...

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] A question on the query planner

2003-12-01 Thread Tom Lane
Jared Carr [EMAIL PROTECTED] writes:
 I am currently working on optimizing some fairly time consuming queries 
 on a decently large dataset.

It doesn't look that large from here ;-).  I'd suggest experimenting
with reducing random_page_cost, since at least for your test query
it sure looks like everything is in RAM.  In theory random_page_cost = 1.0
is the correct setting for all-in-RAM cases.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend