[PERFORM] Optimisation of INTERSECT expressions

2004-03-23 Thread Phil Endecott
Dear PostgreSQL Experts,

I am trying to get to the bottom of some efficiency problems and hope that
you can help.  The difficulty seems to be with INTERSECT expressions.

I have a query of the form
 select A from T where C1 intersect select A from T where C2;
It runs in about 100 ms.

But it is equivalent to this query
 select A from T where C1 and C2;
which runs in less than 10 ms.

Looking at the output of explain analyse on the first query, it seems
that PostgreSQL always computes the two sub-expressions and then computes
an explicit intersection on the results.  I had hoped that it would notice
that both subexpressions are scanning the same input table T and convert
the expression to the second form.

Is there a reason why it can't do this transformation?

(Of course, I could just re-write my code to use the second form, but my
application is generating these bits of SQL programmatically, and it is not
trivial: in some cases the two tables are not the same, so an intersection
really is needed; if PostgreSQL can do it for me, that would be much
better.  I don't want to write an SQL parser!)


While studying the same code I found another case where my INTERSECT
expressions don't seem to be optimised as much as I'd like.  In this case,
one of the subexpressions being intersected is empty much of the time.  But
even when it is empty, PostgreSQL computes the other (expensive)
subexpression and does the intersection.  Could PostgreSQL do something like this:

- guess which subexpression is likely to produce fewest rows
- compute this subexpression
- if empty, return now with an empty result
- compute other subexpression
- compute intersection
- return intersection

Alternatively, it could be defined that the left subexpression is always
computed first and the right one skipped if the left turns out to be empty,
like the short-circuit behaviour of the logical AND and OR operators in C.
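
For what it's worth, I can approximate this from the application today by
issuing a cheap probe query first; this is only a sketch, with T, C1 and C2
standing in for the real names as above:

 -- probe the subexpression that is usually empty; LIMIT 1 keeps it cheap
 select 1 from T where C2 limit 1;
 -- only if the probe returned a row, run the full (expensive) intersection
 select A from T where C1 intersect select A from T where C2;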

Thanks in advance for any suggestions.

--Phil.




Re: [PERFORM] Optimisation of INTERSECT expressions

2004-03-23 Thread Stephan Szabo
On Tue, 23 Mar 2004, Phil Endecott wrote:

 Dear PostgreSQL Experts,

 I am trying to get to the bottom of some efficiency problems and hope that
 you can help.  The difficulty seems to be with INTERSECT expressions.

 I have a query of the form
  select A from T where C1 intersect select A from T where C2;
 It runs in about 100 ms.

 But it is equivalent to this query
  select A from T where C1 and C2;
 which runs in less than 10 ms.

 Looking at the output of explain analyse on the first query, it seems
 that PostgreSQL always computes the two sub-expressions and then computes
 an explicit intersection on the results.  I had hoped that it would notice
 that both subexpressions are scanning the same input table T and convert
 the expression to the second form.

 Is there a reason why it can't do this transformation?

Probably because no one's bothered to try to prove under what conditions
it's the same.

For example, given a non-unique A, the two queries can give different
answers (if, say, the same two A values match both C1 and C2 in different
rows, how many output rows does each give? *), and given a non-stable A
(for example, random()) the two queries are not necessarily equivalent.



Re: [PERFORM] Optimisation of INTERSECT expressions

2004-03-23 Thread Stephan Szabo

On Tue, 23 Mar 2004, Stephan Szabo wrote:

 On Tue, 23 Mar 2004, Phil Endecott wrote:

  Dear PostgreSQL Experts,
 
  I am trying to get to the bottom of some efficiency problems and hope that
  you can help.  The difficulty seems to be with INTERSECT expressions.
 
  I have a query of the form
   select A from T where C1 intersect select A from T where C2;
  It runs in about 100 ms.
 
  But it is equivalent to this query
   select A from T where C1 and C2;
  which runs in less than 10 ms.
 
  Looking at the output of explain analyse on the first query, it seems
  that PostgreSQL always computes the two sub-expressions and then computes
  an explicit intersection on the results.  I had hoped that it would notice
  that both subexpressions are scanning the same input table T and convert
  the expression to the second form.
 
  Is there a reason why it can't do this transformation?

 Probably because noone's bothered to try to prove under what conditions
 it's the same.

 For example, given a non-unique A, the two queries can give different
 answers (if say the same two A values match both C1 and C2 in different
 rows how many output rows does each give? *), also given a non-stable A
 (for example random) the two queries are not necessarily equivalent.

Ugh, the example for the * got trimmed out.

Given a non-unique A, C1 as B>5, C2 as C>5, and the data:
A | B | C
1 | 6 | 1
1 | 1 | 6

The INTERSECT gives 1 row, the AND query gives 0, AFAICS.
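
A throwaway script that reproduces it (names are purely illustrative):

 create temp table t (a int, b int, c int);
 insert into t values (1, 6, 1);
 insert into t values (1, 1, 6);
 select a from t where b > 5 intersect select a from t where c > 5;  -- 1 row: a = 1
 select a from t where b > 5 and c > 5;                              -- 0 rows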




Re: [PERFORM] atrocious update performance

2004-03-23 Thread Rosser Schwarz
Greg Spiegelberg wrote:

 I've been following this thread closely as I have the same problem
 with an UPDATE.  Everything is identical here right down to the
 strace output.

 Has anyone found a workaround or resolved the problem?  If not,
 I have test systems here which I can use to help up test and explore.

I'm still gathering data.  The explain analyze I'd expected to finish
Thursday afternoon hasn't yet.  I'm going to kill it and try a few
smaller runs, increasing in size, until the behavior manifests.

Will advise.

/rls




Re: [PERFORM] Optimisation of INTERSECT expressions

2004-03-23 Thread Phil Endecott
I asked:
 select A from T where C1 intersect select A from T where C2;
 select A from T where C1 and C2;
 [why isn't the first optimised into the second?]

Stephan Szabo answered:
 Given a non-unique A, C1 as B>5, C2 as C>5, and the data:
 A | B | C
 1 | 6 | 1
 1 | 1 | 6
 The intersect gives 1 row, the and query gives 0 AFAICS.

Tom Lane answered: 
 Another way that the queries are not equivalent is that INTERSECT is
 defined to remove duplicate output rows (much like DISTINCT) whereas
 the AND form of course won't do that.

Thanks!  In my case the attribute A is unique - it is the primary key - and
I hadn't considered the more general case properly.

So I suppose I'll have to find a more sophisticated way to generate my
queries.  Imagine a user interface for a search facility with various
buttons and text entry fields.  At the moment, for each part of the search
that the user has enabled I create a string of SQL.  I then compose them
into a single statement using INTERSECT.  Each sub-query always returns the
same attribute, but to make things complicated they may come from different
tables.  It now seems that I'll have to merge the queries more thoroughly.
 Does anyone have any suggestions about how to do this?  I'd like a nice
general technique that works for all possible subqueries, as my current
composition with INTERSECT does.


Any thoughts on my other question about empty intersections?

Thanks again for the feedback.

--Phil.



Re: [PERFORM] Optimisation of INTERSECT expressions

2004-03-23 Thread Bruno Wolff III
On Tue, Mar 23, 2004 at 11:21:39 -0500,
  Phil Endecott [EMAIL PROTECTED] wrote:
  Does anyone have any suggestions about how to do this?  I'd like a nice
 general technique that works for all possible subqueries, as my current
 composition with INTERSECT does.

One adjustment you might make is using INTERSECT ALL if you know there
can't be duplicates. Then time won't be wasted trying to remove duplicates.
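
For example, when each arm selects a primary key (so duplicates are
impossible), just add ALL:

 select A from T1 where C1
 intersect all
 select A from T2 where C2;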



Re: [PERFORM] Optimisation of INTERSECT expressions

2004-03-23 Thread Tom Lane
Bruno Wolff III [EMAIL PROTECTED] writes:
 On Tue, Mar 23, 2004 at 11:21:39 -0500,
   Phil Endecott [EMAIL PROTECTED] wrote:
 Does anyone have any suggestions about how to do this?  I'd like a nice
 general technique that works for all possible subqueries, as my current
 composition with INTERSECT does.

 One adjustment you might make is using INTERSECT ALL if you know there
 can't be duplicates. Then time won't be wasted trying to remove duplicates.

Actually, I don't think that will help.  UNION ALL is much faster than
UNION, because it doesn't have to match up duplicates, but INTERSECT
and EXCEPT still have to match rows from the inputs in order to find
out if they should emit a row at all.  IIRC there will not be any
noticeable speed difference with or without ALL.

AFAICS, what Phil will want to do is

SELECT a FROM table1 WHERE cond11 AND cond12 AND ...
INTERSECT
SELECT a FROM table2 WHERE cond21 AND cond22 AND ...
INTERSECT
...

which is more painful to assemble than his current approach, but it
shouldn't be *that* bad --- you just need to tag each condition with the
table it applies to, and bring together matching tags when you build the
SQL string.
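
For instance, if the user's selections produced conditions tagged
(table1, cond11), (table1, cond12) and (table2, cond21), the generated
string would come out something like this (the condition text is invented
purely for illustration):

SELECT a FROM table1 WHERE name ILIKE '%smith%' AND created >= '2004-01-01'
INTERSECT
SELECT a FROM table2 WHERE owner_id = 42;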

regards, tom lane



Re: [PERFORM] Optimisation of INTERSECT expressions

2004-03-23 Thread Josh Berkus
Phil,

 So I suppose I'll have to find a more sophisticated way to generate my
 queries.  Imagine a user interface for a search facility with various
 buttons and text entry fields.  At the moment, for each part of the search
 that the user has enabled I create a string of SQL.  I then compose them
 into a single statement using INTERSECT.  Each sub-query always returns the
 same attribute, but to make things complicated they may come from different
 tables.  It now seems that I'll have to merge the queries more thoroughly.
  Does anyone have any suggestions about how to do this?  I'd like a nice
 general technique that works for all possible subqueries, as my current
 composition with INTERSECT does.

I've done this, but it involves a choice between a lot of infrastructure for 
fully configurable queries, or limiting user choice.  The former option 
requires that you construct reference tables holding what search fields are 
available, what kind of values they hold, and what operators to use while 
querying, as well as a table storing the joins used for the various tables 
that can be queried.

Based on that, you can dynamically construct a query on any field or combination of 
fields listed in your reference tables.

If search options are more constrained, you can simply take the easier path of 
hard-coding the query building blocks into a set-returning function.   I do 
this all the time for Web search interfaces, where the user only has about 9 
things to search on.
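
A rough sketch of that second approach (table, column, and parameter names
are invented just to show the shape; pre-8.0 quoting rules, hence the
doubled quotes inside the function body):

 create or replace function find_items(text, text) returns setof integer as '
 declare
     q text := ''select id from items where true'';
     r record;
 begin
     -- each search field the user filled in appends one hard-coded block
     if $1 is not null then
         q := q || '' and name ilike '' || quote_literal(''%'' || $1 || ''%'');
     end if;
     if $2 is not null then
         q := q || '' and region = '' || quote_literal($2);
     end if;
     for r in execute q loop
         return next r.id;
     end loop;
     return;
 end;
 ' language plpgsql;

 -- e.g.:  select * from find_items('smith', null);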

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Andrew Sullivan
On Mon, Mar 22, 2004 at 04:05:45PM -0800, Subbiah, Stalin wrote:
 being the key performance booster for postgres.  what is the preferred OS
 for postgres deployment if given an option between linux and solaris. As

One thing this very much depends on is what you're trying to do. 
Suns have a reputation for greater reliability.  While my own
experience with Sun hardware has been rather shy of sterling, I _can_
say that it stands head and shoulders above a lot of the x86 gear you
can get.

If you're planning to use Solaris on x86, don't bother.  Solaris is a
slow, bloated pig compared to Linux, at least when it comes to
managing the largish number of processes that Postgres requires.

If pure speed is what you're after, I have found that 2-way, 32 bit
Linux on P-IIIs compares very favourably to 4 way 64 bit Ultra SPARC
IIs.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The fact that technology doesn't work is no bar to success in the marketplace.
--Philip Greenspun



Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Subbiah, Stalin
We are looking into a Sun V210 (2 x 1 GHz CPU, 2 GB RAM, Solaris 8) vs. a Dell 1750
(2 x 2.4 GHz Xeon, 2 GB RAM, RH 3.0).  The database will mostly be
write-intensive and the disks will be on RAID 10.  Wondering if 64-bit at 1 GHz vs.
32-bit at 2.4 GHz makes a big difference here.

Thanks!

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Andrew
Sullivan
Sent: Tuesday, March 23, 2004 9:37 AM
To: '[EMAIL PROTECTED]'
Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux


On Mon, Mar 22, 2004 at 04:05:45PM -0800, Subbiah, Stalin wrote:
 being the key performance booster for postgres.  what is the preferred OS
 for postgres deployment if given an option between linux and solaris. As

One thing this very much depends on is what you're trying to do. 
Suns have a reputation for greater reliability.  While my own
experience with Sun hardware has been rather shy of sterling, I _can_
say that it stands head and shoulders above a lot of the x86 gear you
can get.

If you're planning to use Solaris on x86, don't bother.  Solaris is a
slow, bloated pig compared to Linux, at least when it comes to
managing the largish number of processes that Postgres requires.

If pure speed is what you're after, I have found that 2-way, 32 bit
Linux on P-IIIs compares very favourably to 4 way 64 bit Ultra SPARC
IIs.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
The fact that technology doesn't work is no bar to success in the
marketplace.
--Philip Greenspun



Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Anjan Dave
What bus speeds?
 
533MHz on the 32-bit Intel will give you about 4.2Gbps of IO throughput...
 
I think the Sun will be 150MHz, 64bit is 2.4Gbps of IO. Correct me if I am wrong.
 
Thanks,
Anjan

-Original Message- 
From: Subbiah, Stalin [mailto:[EMAIL PROTECTED] 
Sent: Tue 3/23/2004 1:40 PM 
To: 'Andrew Sullivan'; '[EMAIL PROTECTED]' 
Cc: 
Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux



We are looking into Sun V210 (2 x 1 GHz cpu, 2 gig ram, 5.8Os) vs. Dell 1750
(2 x 2.4 GHz xeon, 2 gig ram, RH3.0). database will mostly be
write intensive and disks will be on raid 10. Wondering if 64bit 1 GHz to
32bit 2.4 GHz make a big difference here.

Thanks!

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] Behalf Of Andrew
Sullivan
Sent: Tuesday, March 23, 2004 9:37 AM
To: '[EMAIL PROTECTED]'
Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux


On Mon, Mar 22, 2004 at 04:05:45PM -0800, Subbiah, Stalin wrote:
 being the key performance booster for postgres.  what is the preferred OS
 for postgres deployment if given an option between linux and solaris. As

One thing this very much depends on is what you're trying to do.
Suns have a reputation for greater reliability.  While my own
experience with Sun hardware has been rather shy of sterling, I _can_
say that it stands head and shoulders above a lot of the x86 gear you
can get.

If you're planning to use Solaris on x86, don't bother.  Solaris is a
slow, bloated pig compared to Linux, at least when it comes to
managing the largish number of processes that Postgres requires.

If pure speed is what you're after, I have found that 2-way, 32 bit
Linux on P-IIIs compares very favourably to 4 way 64 bit Ultra SPARC
IIs.

A

--
Andrew Sullivan  | [EMAIL PROTECTED]
The fact that technology doesn't work is no bar to success in the
marketplace.
--Philip Greenspun



Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Matt Clark
If it's going to be write-intensive then the RAID controller will be the most
important thing.  A dual P3/500 with a write-back cache will smoke either of
the boxes you mention using software RAID on write performance.

As for the compute-intensive side (complex joins & sorts etc), the Dell will
most likely beat the Sun by some distance, although what the Sun lacks in CPU
power it may make up a bit in memory bandwidth/latency.

Matt

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] Behalf Of Subbiah,
 Stalin
 Sent: 23 March 2004 18:41
 To: 'Andrew Sullivan'; '[EMAIL PROTECTED]'
 Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux


 We are looking into Sun V210 (2 x 1 GHz cpu, 2 gig ram, 5.8Os) vs. Dell 1750
 (2 x 2.4 GHz xeon, 2 gig ram, RH3.0). database will mostly be
 write intensive and disks will be on raid 10. Wondering if 64bit 1 GHz to
 32bit 2.4 GHz make a big difference here.

 Thanks!

 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] Behalf Of Andrew
 Sullivan
 Sent: Tuesday, March 23, 2004 9:37 AM
 To: '[EMAIL PROTECTED]'
 Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux


 On Mon, Mar 22, 2004 at 04:05:45PM -0800, Subbiah, Stalin wrote:
  being the key performance booster for postgres.  what is the preferred OS
  for postgres deployment if given an option between linux and solaris. As

 One thing this very much depends on is what you're trying to do.
 Suns have a reputation for greater reliability.  While my own
 experience with Sun hardware has been rather shy of sterling, I _can_
 say that it stands head and shoulders above a lot of the x86 gear you
 can get.

 If you're planning to use Solaris on x86, don't bother.  Solaris is a
 slow, bloated pig compared to Linux, at least when it comes to
 managing the largish number of processes that Postgres requires.

 If pure speed is what you're after, I have found that 2-way, 32 bit
 Linux on P-IIIs compares very favourably to 4 way 64 bit Ultra SPARC
 IIs.

 A

 --
 Andrew Sullivan  | [EMAIL PROTECTED]
 The fact that technology doesn't work is no bar to success in the
 marketplace.
   --Philip Greenspun



[PERFORM] Help with query plan inconsistencies

2004-03-23 Thread Woody Woodring
Hello,

I am using postgres 7.4.2 as a backend for geocode data for a mapping
application.  My question is why I can't get consistent use of my indexes
during a query; I tend to get a lot of seq scan results.

I use a standard query:

SELECT lat, long, mac, status FROM (
   SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN
1 ELSE -1 END
  as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) WHERE
boxtype='d'
) AS foo WHERE (long >= X1) AND (long <= X2) AND (lat >= Y1) AND (lat <= Y2)

Where X1,X2,Y1,Y2 are the coordinates for the rectangle of the map viewing
area.

QUERY PLAN #1 & #2 are from when I get a view from 10 miles out; sometimes
it uses the index (#1) and most of the time not (#2).  I do run into plans
that seq scan both sides of the join.

QUERY PLAN #3 is when I view from 5 miles out, and I have much greater
chance of getting index scans ( about 90% of the time).

I have listed information about the database below.

Cable_billing: ~500,000 rows, updated once per day
Davic: ~500,000 rows, about 100 rows updated per minute

Any info or suggestions would be appreciated.

Woody


twc-ral-overview=# \d cable_billing;
 Table public.cable_billing
 Column  |  Type  | Modifiers 
-++---
 cable_billingid | integer| not null
 mac | macaddr| not null
 account | integer| 
 number  | character varying(10)  | 
 address | character varying(200) | 
 region  | character varying(30)  | 
 division| integer| 
 franchise   | integer| 
 node| character varying(10)  | 
 lat | numeric| 
 long| numeric| 
 trunk   | character varying(5)   | 
 ps  | character varying(5)   | 
 fd  | character varying(5)   | 
 le  | character varying(5)   | 
 update  | integer| 
 boxtype | character(1)   | 
Indexes: cable_billing_pkey primary key btree (mac),
 cable_billing_account_index btree (account),
 cable_billing_lat_long_idx btree (lat, long),
 cable_billing_node_index btree (node),
 cable_billing_region_index btree (region)

twc-ral-overview=# \d davic
Table public.davic
 Column  | Type  | Modifiers 
-+---+---
 davicid | integer   | not null
 mac | macaddr   | not null
 source  | character varying(20) | 
 status  | smallint  | 
 updtime | integer   | 
 type| character varying(10) | 
 avail1  | integer   | 
Indexes: davic_pkey primary key btree (mac)



twc-ral-overview=# vacuum analyze;
VACUUM
twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM
(SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1
ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac)
WHERE boxtype='d') AS foo  WHERE (long>=-78.70723462816063) AND
(long<=-78.53096764204116) AND (lat>=35.5741118787) AND
(lat<=35.66366331376857);
QUERY PLAN #1
--------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..23433.18 rows=1871 width=34) (actual
time=0.555..5095.434 rows=3224 loops=1)
   ->  Index Scan using cable_billing_lat_long_idx on cable_billing
(cost=0.00..12145.85 rows=1871 width=32) (actual time=0.431..249.931
rows=3224 loops=1)
 Index Cond: ((lat >= 35.5741118787) AND (lat <=
35.66366331376857) AND (long >= -78.70723462816063) AND (long <=
-78.53096764204116))
 Filter: (boxtype = 'd'::bpchar)
   ->  Index Scan using davic_pkey on davic  (cost=0.00..6.01 rows=1
width=8) (actual time=1.476..1.480 rows=1 loops=3224)
 Index Cond: (outer.mac = davic.mac)
 Total runtime: 5100.028 ms
(7 rows)



twc-ral-overview=# vacuum analyze;
VACUUM
twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM
(SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1
ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac)
WHERE boxtype='d') AS foo  WHERE (long>=-78.87878592206046) AND
(long<=-78.70220280717479) AND (lat>=35.71703190638861) AND
(lat<=35.80658335998006);
QUERY PLAN #2
--------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..76468.90 rows=9223 width=34) (actual
time=0.559..17387.427 rows=19997 loops=1)
   ->  Seq Scan on cable_billing  (cost=0.00..20837.76 rows=9223 width=32)
(actual time=0.290..7117.799 rows=19997 loops=1)
 Filter: ((boxtype = 'd'::bpchar) AND (long >= -78.87878592206046)
AND (long <= 

Re: [PERFORM] Help with query plan inconsistencies

2004-03-23 Thread Joseph Shraibman
I'm going to ask because someone else surely will:

Do you regularly vacuum/analyze the database?

Woody Woodring wrote:
Hello,

I am using postgres 7.4.2 as a backend for geocode data for a mapping
application.  My question is why can't I get a consistent use of my indexes
during a query, I tend to get a lot of seq scan results.
I use a standard query:

SELECT lat, long, mac, status FROM (
   SELECT text(mac) as mac, lat long, CASE status WHEN 0 THEN 0 WHEN 1 THEN
1 ELSE -1 END 
  as status FROM cable_billing LEFT OUTER JOIN davic USING(mac) WHERE
boxtype='d'
)AS FOO WHERE (long=X1) AND (long=X2) AND (lat=Y1) AND (lat=Y2)

Where X1,X2,Y1,Y2 are the coordinates for the rectangle of the map viewing
area.
QUERY PLAN #1  #2 are from when I get a view from 10 miles out, sometimes
it uses the index(#1) and most of the time not(#2).  I do run into plans
that seq scan both sides of the join.
QUERY PLAN #3 is when I view from 5 miles out, and I have much greater
chance of getting index scans ( about 90% of the time).
I have listed information about the database below.

Cable_billing ~500,000 rows updated once per day
Davic  ~500,000 rows, about 100 rows update per minute
Any info or suggestions would be appreciated.

Woody

twc-ral-overview=# \d cable_billing;
 Table public.cable_billing
 Column  |  Type  | Modifiers 
-++---
 cable_billingid | integer| not null
 mac | macaddr| not null
 account | integer| 
 number  | character varying(10)  | 
 address | character varying(200) | 
 region  | character varying(30)  | 
 division| integer| 
 franchise   | integer| 
 node| character varying(10)  | 
 lat | numeric| 
 long| numeric| 
 trunk   | character varying(5)   | 
 ps  | character varying(5)   | 
 fd  | character varying(5)   | 
 le  | character varying(5)   | 
 update  | integer| 
 boxtype | character(1)   | 
Indexes: cable_billing_pkey primary key btree (mac),
 cable_billing_account_index btree (account),
 cable_billing_lat_long_idx btree (lat, long),
 cable_billing_node_index btree (node),
 cable_billing_region_index btree (region)

twc-ral-overview=# \d davic
Table public.davic
 Column  | Type  | Modifiers 
-+---+---
 davicid | integer   | not null
 mac | macaddr   | not null
 source  | character varying(20) | 
 status  | smallint  | 
 updtime | integer   | 
 type| character varying(10) | 
 avail1  | integer   | 
Indexes: davic_pkey primary key btree (mac)



twc-ral-overview=# vacuum analyze;
VACUUM
twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM
(SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1
ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac)
WHERE boxtype='d') AS foo  WHERE (long=-78.70723462816063) AND
(long=-78.53096764204116) AND (lat=35.5741118787) AND
(lat=35.66366331376857);
QUERY PLAN #1


-
 Nested Loop Left Join  (cost=0.00..23433.18 rows=1871 width=34) (actual
time=0.555..5095.434 rows=3224 loops=1)
   -  Index Scan using cable_billing_lat_long_idx on cable_billing
(cost=0.00..12145.85 rows=1871 width=32) (actual time=0.431..249.931
rows=3224 loops=1)
 Index Cond: ((lat = 35.5741118787) AND (lat =
35.66366331376857) AND (long = -78.70723462816063) AND (long =
-78.53096764204116))
 Filter: (boxtype = 'd'::bpchar)
   -  Index Scan using davic_pkey on davic  (cost=0.00..6.01 rows=1
width=8) (actual time=1.476..1.480 rows=1 loops=3224)
 Index Cond: (outer.mac = davic.mac)
 Total runtime: 5100.028 ms
(7 rows)


twc-ral-overview=# vacuum analyze;
VACUUM
twc-ral-overview=# explain analyze SELECT lat, long, mac, status FROM
(SELECT text(mac) as mac, lat, long, CASE status WHEN 0 THEN 0 WHEN 1 THEN 1
ELSE -1 END as status FROM cable_billing LEFT OUTER JOIN davic USING(mac)
WHERE boxtype='d') AS foo  WHERE (long=-78.87878592206046) AND
(long=-78.70220280717479) AND (lat=35.71703190638861) AND
(lat=35.80658335998006);
QUERY PLAN #2


---
 Nested Loop Left Join  (cost=0.00..76468.90 rows=9223 width=34) (actual
time=0.559..17387.427 rows=19997 loops=1)
   -  Seq Scan on cable_billing  (cost=0.00..20837.76 rows=9223 width=32)
(actual time=0.290..7117.799 

Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Josh Berkus
Matt, Stalin,

 As for the compute intensive side (complex joins  sorts etc), the Dell will 
most likely beat the Sun by some distance, although
 what the Sun lacks in CPU power it may make up a bit in memory bandwidth/
latency.

Personally, I've been unimpressed by Dell/Xeon; I think the Sun might do 
better than you think, comparatively.  On all the Dell servers I've used so 
far, I've not seen performance that comes even close to the hardware specs.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [PERFORM] SLOW query with aggregates

2004-03-23 Thread Tom Lane
A Palmblad [EMAIL PROTECTED] writes:
  GroupAggregate  (cost=0.00..338300.34 rows=884 width=345) (actual
 time=86943.272..382718.104 rows=3117 loops=1)
   ->  Merge Join  (cost=0.00..93642.52 rows=1135610 width=345) (actual
 time=0.148..24006.748 rows=1120974 loops=1)

You do not have a planning problem here, and trying to change the plan
is a waste of time.  The slowness is in the actual computation of the
aggregate functions; ergo the only way to speed it up is to change what
you're computing.  What aggregates are you computing exactly, and over
what datatypes?

regards, tom lane



Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread scott.marlowe
On Tue, 23 Mar 2004, Josh Berkus wrote:

 Matt, Stalin,
 
  As for the compute intensive side (complex joins  sorts etc), the Dell will 
 most likely beat the Sun by some distance, although
  what the Sun lacks in CPU power it may make up a bit in memory bandwidth/
 latency.
 
 Personally, I've been unimpressed by Dell/Xeon; I think the Sun might do 
 better than you think, comparitively.On all the Dell servers I've used so 
 far, I've not seen performance that comes even close to the hardware specs.

We use a 2600 at work (dual 2.8GHz) with the LSI/MegaRAID-based
battery-backed caching controller, and it flies.  Truly flies.

It's not Dell that's so slow, it's the default Adaptec RAID controller or
IDE drives that are slow.  Ours has a 533 MHz memory bus, by the way.




[PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!

2004-03-23 Thread Darcy Buskermolen


--  Forwarded Message  --

Subject: FreeBSD, PostgreSQL, semwait and sbwait!
Date: March 23, 2004 12:02 pm
From: Jason Coene [EMAIL PROTECTED]
To: [EMAIL PROTECTED]

Hello all,

We're having a substantial problem with our FreeBSD 5.2 database server
running PostgreSQL - it's getting a lot of traffic (figure about 3,000
queries per second), but queries are slow, and it's seemingly waiting on
other things than CPU time.

The database server is a dual P4-2.8 w/ HT enabled (kernel finds 4
processors), 2GB RAM, 4 disk Serial ATA on 3ware RAID, gigabit Ethernet
connection to web servers.  It's running FreeBSD 5.2 and PostgreSQL 7.4.1.

The server is taking a while to respond, both to connections and then to
queries (between 1-3 seconds, on a query that should execute in 0.05 seconds
or less).

The CPU usage for the server never goes above 30% (70% idle), and the CPU
time that's in use is nearly always split equally between user and system.
The system is using

Doing a top, this is what we see:

last pid: 51833;  load averages: 13.72, 11.74, 10.01 up 0+01:55:45 15:00:03
116 processes: 1 running, 99 sleeping, 16 lock
CPU states: 14.6% user, 0.0% nice, 23.7% system, 0.2% interrupt, 61.5% idle
Mem: 91M Active, 1043M Inact, 160M Wired, 52K Cache, 112M Buf, 644M Free
Swap: 4096M Total, 4096M Free

20354 pgsql1310 80728K  5352K select 0   0:24  1.71%  1.71% postgres
36415 pgsql  40 81656K 67468K sbwait 2   0:00  3.23%  0.59% postgres
36442 pgsql1280 82360K 15868K select 2   0:00  1.75%  0.24% postgres
36447 pgsql -40 82544K 10616K semwai 0   0:00  2.05%  0.20% postgres
36461 pgsql -40 81612K  6844K semwai 2   0:00  2.05%  0.20% postgres
36368 pgsql  40 82416K 20780K sbwait 3   0:00  0.50%  0.15% postgres
36459 pgsql -40 81840K  7816K semwai 0   0:00  1.54%  0.15% postgres
36469 pgsql -40 81840K  7964K semwai 2   0:00  1.54%  0.15% postgres
36466 pgsql1290 81840K  7976K *Giant 2   0:00  1.54%  0.15% postgres
36479 pgsql -40 81528K  6648K semwai 0   0:00  3.00%  0.15% postgres
36457 pgsql -40 81840K  8040K semwai 1   0:00  1.03%  0.10% postgres
36450 pgsql1290 82352K  8188K *Giant 2   0:00  1.03%  0.10% postgres
36472 pgsql -40 81824K  7416K semwai 2   0:00  1.03%  0.10% postgres
36478 pgsql1310 81840K  7936K select 0   0:00  2.00%  0.10% postgres
36454 pgsql  40 82416K 16300K sbwait 3   0:00  0.51%  0.05% postgres
36414 pgsql  40 82416K 15872K sbwait 2   0:00  0.27%  0.05% postgres

Our kernel is GENERIC plus:

maxusers512
options SYSVSHM
options SHMMAXPGS=262144
options SHMSEG=512
options SHMMNI=512
options SYSVSEM
options SEMMNI=512
options SEMMNS=1024
options SEMMNU=512
options SEMMAP=512
options NMBCLUSTERS=32768

Interesting bits from postgresql.conf:

max_connections = 512
shared_buffers = 8192
sort_mem = 16384
vacuum_mem = 8192
fsync = false

It seems that queries are executing fine once they start, but it's taking a
while for them to get going, while the postgres process sits in semwait,
sbwait or select.  This problem doesn't happen when there's little load on
the server, it's only when we open it for public consumption that it
exhibits these problems.

Anyone have this type of problem before?  Am I missing something?

Thanks, Jason

___
[EMAIL PROTECTED] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to
 [EMAIL PROTECTED]

---

-- 
Darcy Buskermolen
Wavefire Technologies Corp.
ph: 250.717.0200
fx:  250.763.1759
http://www.wavefire.com




Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Subbiah, Stalin
Has anyone done performance benchmark testing with Solaris SPARC vs. Intel Linux?
I once read a post here which had benchmarking test results for using
different filesystems like xfs, ext3, ext2, ufs etc.  I couldn't find that
link anymore and Google is failing me, so does anyone have the link handy?

Thanks!

-Original Message-
From: Josh Berkus [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 23, 2004 12:13 PM
To: Matt Clark; Subbiah, Stalin; 'Andrew Sullivan';
[EMAIL PROTECTED]
Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux


Matt, Stalin,

 As for the compute intensive side (complex joins  sorts etc), the Dell
will 
most likely beat the Sun by some distance, although
 what the Sun lacks in CPU power it may make up a bit in memory bandwidth/
latency.

Personally, I've been unimpressed by Dell/Xeon; I think the Sun might do 
better than you think, comparitively.On all the Dell servers I've used
so 
far, I've not seen performance that comes even close to the hardware specs.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco



Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Andrew Sullivan
On Tue, Mar 23, 2004 at 08:53:42PM -, [EMAIL PROTECTED] wrote:

 is way down the priority list compared with IO throughput, stability,
 manageability, support, etc etc.

Indeed, if our Suns actually disabled the broken hardware when they
died, fell over, and rebooted themselves, I'd certainly praise them
to heaven.  But I have to say that the really very good reporting of
failing memory has saved me some headaches.  

 environment, I'd take the Sun every day of the week, assuming that those
 compile option changes have sorted out the oddly slow PG performance at
 last.

I seem to have hit a bad batch of Dell hardware recently, which makes
me second this opinion.

I should say, also, that my initial experience of AIX has been
extremely good.  I can't comment on the fun it might involve in the
long haul, of course.

A

-- 
Andrew Sullivan  | [EMAIL PROTECTED]
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well. 
--Dennis Ritchie



Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Bill Moran
Subbiah, Stalin wrote:
As anyone done performance benchmark testing with solaris sparc/intel linux.
I once read a post here, which had benchmarking test results for using
different filesystem like xfs, ext3, ext2, ufs etc. i couldn't find that
link anymore and google is failing on me, so anyone have the link handy.
If you're talking about the work I did, it's here:
http://www.potentialtech.com/wmoran/ (then follow the link)
Anyway, that should be easily portable to any platform that will run Postgres,
but I don't know how useful it is in comparing two different platforms.  See
the information in the document.  It was intended only to test disk access speed,
and attempts to flood the HDD system with database work to do.
Thanks!

-Original Message-
From: Josh Berkus [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 23, 2004 12:13 PM
To: Matt Clark; Subbiah, Stalin; 'Andrew Sullivan';
[EMAIL PROTECTED]
Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
Matt, Stalin,


As for the compute intensive side (complex joins  sorts etc), the Dell
will 
most likely beat the Sun by some distance, although

what the Sun lacks in CPU power it may make up a bit in memory bandwidth/
latency.

Personally, I've been unimpressed by Dell/Xeon; I think the Sun might do 
better than you think, comparitively.On all the Dell servers I've used
so 
far, I've not seen performance that comes even close to the hardware specs.



--
Bill Moran
Potential Technologies
http://www.potentialtech.com


Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!

2004-03-23 Thread Tom Lane
Darcy Buskermolen [EMAIL PROTECTED] writes:
 The database server is a dual P4-2.8 w/ HT enabled (kernel finds 4
 processors), 2GB RAM, 4 disk Serial ATA on 3ware RAID, gigabit Ethernet
 connection to web servers.  It's running FreeBSD 5.2 and PostgreSQL 7.4.1.

Hm.  What happens if you turn off the hyperthreading?

We have seen a number of reports recently that suggest that our
spinlocking code behaves inefficiently on hyperthreaded machines.
This hasn't got to the point where we have any substantiated evidence,
mind you, but maybe you can help provide some.

Also it might be interesting to put one of these into the outer loop in
s_lock():

__asm__ __volatile__(
        " rep; nop                      \n"
        : : : "memory");

(This suggestion is a quick-and-dirty backport of a change that's
already in CVS tip.)

regards, tom lane



Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!

2004-03-23 Thread Josh Berkus
Darcy,

I suggest getting this person over here instead.  They have a *lot* to learn 
about tuning PostgreSQL.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [PERFORM] atrocious update performance

2004-03-23 Thread Greg Spiegelberg
Rosser Schwarz wrote:
Greg Spiegelberg wrote:


I've been following this thread closely as I have the same problem
with an UPDATE.  Everything is identical here right down to the
strace output.


Has anyone found a workaround or resolved the problem?  If not,
I have test systems here which I can use to help up test and explore.


I'm still gathering data.  The explain analyze I'd expected to finish
Thursday afternoon hasn't yet.  I'm going to kill it and try a few
smaller runs, increasing in size, until the behavior manifests.
Will advise.
I've replaced my atrocious UPDATE with the following.

begin;
-- Drop all constraints
alter table ORIG drop constraint ...;
-- Drop all indexes
drop index ...;
-- Update
update ORIG set column=... where...;
commit;
The problem is that when I recreate the indexes and add the constraints back
on ORIG I end up with the same long-running process.  The original
UPDATE runs for about 30 minutes on a table of 400,000 rows with the
WHERE matching about 70% of the rows.  The above runs for about 2
minutes without adding the constraints or indexes; however, adding the
constraints and creating the dropped indexes negates any gain.
RedHat 7.3 + Kernel 2.4.24 + ext3 + PostgreSQL 7.3.5
Dual PIII 1.3'ishGHz, 2GB Memory
U160 OS drives and a 1Gbps test SAN on a Hitachi 9910
Greg

--
Greg Spiegelberg
 Sr. Product Development Engineer
 Cranel, Incorporated.
 Phone: 614.318.4314
 Fax:   614.431.8388
 Email: [EMAIL PROTECTED]
Cranel. Technology. Integrity. Focus.




Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Aaron Werman
Are you talking about
http://www.potentialtech.com/wmoran/postgresql.php#conclusion
- Original Message - 
From: Subbiah, Stalin [EMAIL PROTECTED]
To: [EMAIL PROTECTED]; Matt Clark [EMAIL PROTECTED]; Subbiah, Stalin
[EMAIL PROTECTED]; 'Andrew Sullivan' [EMAIL PROTECTED];
[EMAIL PROTECTED]
Sent: Tuesday, March 23, 2004 3:42 PM
Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux


 As anyone done performance benchmark testing with solaris sparc/intel
linux.
 I once read a post here, which had benchmarking test results for using
 different filesystem like xfs, ext3, ext2, ufs etc. i couldn't find that
 link anymore and google is failing on me, so anyone have the link handy.

 Thanks!

 -Original Message-
 From: Josh Berkus [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, March 23, 2004 12:13 PM
 To: Matt Clark; Subbiah, Stalin; 'Andrew Sullivan';
 [EMAIL PROTECTED]
 Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux


 Matt, Stalin,

  As for the compute intensive side (complex joins  sorts etc), the Dell
 will
 most likely beat the Sun by some distance, although
  what the Sun lacks in CPU power it may make up a bit in memory
bandwidth/
 latency.

 Personally, I've been unimpressed by Dell/Xeon; I think the Sun might do
 better than you think, comparitively.On all the Dell servers I've used
 so
 far, I've not seen performance that comes even close to the hardware
specs.

 -- 
 -Josh Berkus
  Aglio Database Solutions
  San Francisco



Re: [PERFORM] atrocious update performance

2004-03-23 Thread Tom Lane
Greg Spiegelberg [EMAIL PROTECTED] writes:
 RedHat 7.3 + Kernel 2.4.24 + ext3 + PostgreSQL 7.3.5
  

Please try 7.4.

regards, tom lane



Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!

2004-03-23 Thread Josh Berkus
Tom,

 Hm.  What happens if you turn off the hyperthreading?

Forget hyperthreading.   Look at their postgresql.conf settings.   8mb shared 
mem, 16mb sort mem per connection for 512 connections, default 
effective_cache_size.
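
For a 2GB box I'd expect something more like this as a starting point
(ballpark numbers only, to be tuned against the real workload):

shared_buffers = 16384  # ~128MB, ballpark
sort_mem = 4096 # 4MB; remember this is per sort, per backend
vacuum_mem = 32768
effective_cache_size = 131072   # ~1GB in 8K pages, roughly what the OS caches
max_connections = 256   # and pool connections in the application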

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [PERFORM] Fwd: FreeBSD, PostgreSQL, semwait and sbwait!

2004-03-23 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 Forget hyperthreading.   Look at their postgresql.conf settings.   8mb shared
 mem, 16mb sort mem per connection for 512 connections, default 
 effective_cache_size.

They could well be going into swap hell due to the oversized sort_mem,
but that didn't quite seem to explain the reported behavior.  I'd want
to see vmstat or similar output to confirm whether the disks are busy,
though.  Amazing how many people forget that a database is normally
I/O-bound rather than CPU-bound.

regards, tom lane



Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Subbiah, Stalin
Yep. Thanks Bill.

-Original Message-
From: Bill Moran [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 23, 2004 2:10 PM
To: Subbiah, Stalin
Cc: [EMAIL PROTECTED]
Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux


Subbiah, Stalin wrote:
 As anyone done performance benchmark testing with solaris sparc/intel
linux.
 I once read a post here, which had benchmarking test results for using
 different filesystem like xfs, ext3, ext2, ufs etc. i couldn't find that
 link anymore and google is failing on me, so anyone have the link handy.

If you're talking about the work I did, it's here:
http://www.potentialtech.com/wmoran/ (then follow the link)

Anyway, that should be easily portable to any platform that will run
Postgres,
but I don't know how useful it is in comparing two different platforms.  See
the information in the document.  It was intended only to test disk access
speed,
and attempts to flood the HDD system with database work to do.

 
 Thanks!
 
 -Original Message-
 From: Josh Berkus [mailto:[EMAIL PROTECTED]
 Sent: Tuesday, March 23, 2004 12:13 PM
 To: Matt Clark; Subbiah, Stalin; 'Andrew Sullivan';
 [EMAIL PROTECTED]
 Subject: Re: [PERFORM] [ADMIN] Benchmarking postgres on Solaris/Linux
 
 
 Matt, Stalin,
 
 
As for the compute intensive side (complex joins  sorts etc), the Dell
 
 will 
 most likely beat the Sun by some distance, although
 
what the Sun lacks in CPU power it may make up a bit in memory bandwidth/
 
 latency.
 
 Personally, I've been unimpressed by Dell/Xeon; I think the Sun might do 
 better than you think, comparitively.On all the Dell servers I've used
 so 
 far, I've not seen performance that comes even close to the hardware
specs.
 


-- 
Bill Moran
Potential Technologies
http://www.potentialtech.com



Re: [PERFORM] SLOW query with aggregates

2004-03-23 Thread Greg Stark

A Palmblad [EMAIL PROTECTED] writes:

  GroupAggregate  (cost=0.00..338300.34 rows=884 width=345) (actual
 time=86943.272..382718.104 rows=3117 loops=1)
-  Merge Join  (cost=0.00..93642.52 rows=1135610 width=345) (actual
 time=0.148..24006.748 rows=1120974 loops=1)

I think the reason you're getting a GroupAggregate here instead of a
HashAggregate is that the MergeJoin is already producing the records in the
desired order, so the GroupAggregate doesn't require an extra sort, ie, it's
effectively free.

You might be able to verify this by running the query with 

enable_indexscan = off  and/or  enable_mergejoin = off

some combination of which might get the planner to do a seqscan of the large
table with a hash join to the small table and then a HashAggregate.

If you're reading a lot of the large table the seqscan could be a little
faster, not much though. And given the accurate statistics guesses here the
planner may well have gotten this one right and the seqscan is slower. Can't
hurt to verify it though.
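
In a psql session the test would look something like this (with the real
query in place of the placeholder):

set enable_mergejoin to off;
explain analyze <your aggregate query here>;
reset enable_mergejoin;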

-- 
greg




Re: [PERFORM] Benchmarking postgres on Solaris/Linux

2004-03-23 Thread Josh Berkus
Mark,

 It might be worth considering Apple if you want a 64-bit chip that has a
 clock speed comparable to Intel's - the Xserv is similarly priced to Sun
 V210 (both dual cpu 1U's).

Personally I'd stay *far* away from the XServs until Apple learns to build 
some real server hardware.  The current XServs have internal parts more 
appropriate to a Dell desktop (Promise controller, low-speed commodity IDE 
drives) than a server.

If Apple has priced these 1U desktop machines similarly to Sun, then I sense 
doom ahead for the Apple Server Division.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] [ADMIN] Databases Vs. Schemas

2004-03-23 Thread Subbiah, Stalin
And we also created rules to allow update, delete, and insert on those 
views so that they looked like tables.  The reason we did this is 
because we ran into issues with too many open files during pg_dump when 
we had thousands of tables instead of about 1 hundred tables and 
thousands of views.

Is it because you had a smaller value set for the maximum allowable number of
open file descriptors?  What was ulimit -a set to?

We, however, did have a need to periodically select data from 2 schemas 
at a time, and it was simpler logic than if we needed 2 database 
connections.

Adam Ruth

On Mar 22, 2004, at 2:30 PM, Subbiah, Stalin wrote:

 --sorry to repost, just subscribed to the list. hopefully it gets to 
 the
 list this time --

 Hi All,

 We are evaluating the options for having multiple databases vs. 
 schemas on a
 single database cluster for a custom grown app that we developed. Each 
 app
 installs same set of tables for each service. And the service could 
 easily
 be in thousands. so Is it better to have 1000 databases vs 1000 
 schemas in a
 database cluster. What are the performance overhead of having multiple
 databases vs. schemas (if any). I'm leaning towards having schemas 
 rather
 than databases but i would like to get others opinion on this. 
 Appreciate
 your reply.

 Thanks,
 Stalin
