Re: [PERFORM] Problem with grouping, uses Sort and GroupAggregate, HashAggregate is better(?)

2007-01-15 Thread HA/EXA
 

 -Original Message-
 From: Dave Dutcher [mailto:[EMAIL PROTECTED] 
 Sent: Sunday, January 14, 2007 5:12 PM
 To: Rolf Østvik (HA/EXA); pgsql-performance@postgresql.org
 Subject: RE: [PERFORM] Problem with grouping, uses Sort and 
 GroupAggregate, HashAggregate is better(?)
 
  -Original Message-
  From: [EMAIL PROTECTED] 
  [mailto:[EMAIL PROTECTED] On Behalf Of 
  Rolf Østvik (HA/EXA)
 
 Have you tried set enable_sort=off with 8.1.2?  I'm not 
 sure if that will
 change anything because it has to do at least one sort.  Its 
 just a lots
 faster to do a hashagg + small sort than one big sort in this 
 case.  (I
 wonder if there should be enable_groupagg?)

Did you mean enable_sort = 'off' for 8.2.1?


I tried to set enable_sort = 'off' for both the
8.1.4 server and the 8.2.1 server.
Both servers used the same plan as Run 4 and Run 3 respectively.
There were of course some changes in the planner cost for the sort 
steps, but the execution times was of course the same.

Regards
Rolf Østvik

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PERFORM] pg_trgm performance

2007-01-15 Thread Florian Weimer
I've got a table with a few million rows, consisting of a single text
column.  The average length is about 17 characters.  For the sake of
an experiment, I put a trigram index on that table.  Unfortunately, %
queries without smallish LIMITs are ridiculously slow (they take
longer than an hour).  A full table scan with a WHERE similarity(...)
= 0.4 clause completes in just a couple of minutes.  The queries
only select a few hundred rows, so an index scan has got a real chance
to be faster than a sequential scan.

Am I missing something?  Or are trigrams just a poor match for my data
set?  Are the individual strings too long, maybe?

(This is with PostgreSQL 8.2.0, BTW.)

-- 
Florian Weimer[EMAIL PROTECTED]
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] max() versus order/limit (WAS: High update activity, PostgreSQL vs BigDBMS)

2007-01-15 Thread Alvaro Herrera
Adam Rich wrote:
 
 Did anybody get a chance to look at this?  Is it expected behavior?
 Everyone seemed so incredulous, I hoped maybe this exposed a bug
 that would be fixed in a near release.

Actually, the planner is only able to do the min()/max() transformation
into order by/limit in the case of a single table being scanned.  Since
you have a join here, the optimization is obviously not used:

 select max(item_id)
 from events e, receipts r, receipt_items ri
 where e.event_id=r.event_id and r.receipt_id=ri.receipt_id

plan/planagg.c says

/*
 * We also restrict the query to reference exactly one table, since join
 * conditions can't be handled reasonably.  (We could perhaps handle a
 * query containing cartesian-product joins, but it hardly seems worth the
 * trouble.)
 */

so you should keep using your hand-written order by/limit query.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] max() versus order/limit (WAS: High update

2007-01-15 Thread Alvaro Herrera
Luke Lonergan wrote:
 Adam,
 
 This optimization would require teaching the planner to use an index for
 MAX/MIN when available.  It seems like an OK thing to do to me.

This optimization already exists, albeit for queries that use a single
table.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] pg_trgm performance

2007-01-15 Thread Steinar H. Gunderson
On Mon, Jan 15, 2007 at 11:16:36AM +0100, Florian Weimer wrote:
 Am I missing something?  Or are trigrams just a poor match for my data
 set?  Are the individual strings too long, maybe?

FWIW, I've seen the same results with 8.1.x.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] max() versus order/limit (WAS: High update

2007-01-15 Thread Joshua D. Drake
Luke Lonergan wrote:
 Adam,
 
 This optimization would require teaching the planner to use an index for
 MAX/MIN when available.  It seems like an OK thing to do to me.

Uhmmm I thought we did that already in 8.1?

Joshua D. Drake


 
 - Luke
 
 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Adam Rich
 Sent: Sunday, January 14, 2007 8:52 PM
 To: 'Joshua D. Drake'; 'Tom Lane'
 Cc: 'Craig A. James'; 'PostgreSQL Performance'
 Subject: Re: [PERFORM] max() versus order/limit (WAS: High 
 update activity, PostgreSQL vs BigDBMS)


 Did anybody get a chance to look at this?  Is it expected behavior?
 Everyone seemed so incredulous, I hoped maybe this exposed a 
 bug that would be fixed in a near release.


 -Original Message-
 From: Adam Rich [mailto:[EMAIL PROTECTED]
 Sent: Sunday, January 07, 2007 11:53 PM
 To: 'Joshua D. Drake'; 'Tom Lane'
 Cc: 'Craig A. James'; 'PostgreSQL Performance'
 Subject: RE: [PERFORM] High update activity, PostgreSQL vs BigDBMS



 Here's another, more drastic example... Here the order by / limit
 version
 runs in less than 1/7000 the time of the MAX() version.


 select max(item_id)
 from events e, receipts r, receipt_items ri
 where e.event_id=r.event_id and r.receipt_id=ri.receipt_id

 Aggregate  (cost=10850.84..10850.85 rows=1 width=4) (actual
 time=816.382..816.383 rows=1 loops=1)
   -  Hash Join  (cost=2072.12..10503.30 rows=139019 width=4) (actual
 time=155.177..675.870 rows=147383 loops=1)
 Hash Cond: (ri.receipt_id = r.receipt_id)
 -  Seq Scan on receipt_items ri  (cost=0.00..4097.56
 rows=168196 width=8) (actual time=0.009..176.894 rows=168196 loops=1)
 -  Hash  (cost=2010.69..2010.69 rows=24571 width=4) (actual
 time=155.146..155.146 rows=24571 loops=1)
   -  Hash Join  (cost=506.84..2010.69 rows=24571 width=4)
 (actual time=34.803..126.452 rows=24571 loops=1)
 Hash Cond: (r.event_id = e.event_id)
 -  Seq Scan on receipts r  (cost=0.00..663.58
 rows=29728 width=8) (actual time=0.006..30.870 rows=29728 loops=1)
 -  Hash  (cost=469.73..469.73 rows=14843 width=4)
 (actual time=34.780..34.780 rows=14843 loops=1)
   -  Seq Scan on events e  (cost=0.00..469.73
 rows=14843 width=4) (actual time=0.007..17.603 rows=14843 loops=1)
 Total runtime: 816.645 ms

 select item_id
 from events e, receipts r, receipt_items ri
 where e.event_id=r.event_id and r.receipt_id=ri.receipt_id
 order by item_id desc limit 1


 Limit  (cost=0.00..0.16 rows=1 width=4) (actual 
 time=0.047..0.048 rows=1
 loops=1)
   -  Nested Loop  (cost=0.00..22131.43 rows=139019 width=4) (actual
 time=0.044..0.044 rows=1 loops=1)
 -  Nested Loop  (cost=0.00..12987.42 rows=168196 width=8)
 (actual time=0.032..0.032 rows=1 loops=1)
   -  Index Scan Backward using receipt_items_pkey on
 receipt_items ri  (cost=0.00..6885.50 rows=168196 width=8) (actual
 time=0.016..0.016 rows=1 loops=1)
   -  Index Scan using receipts_pkey on receipts r
 (cost=0.00..0.02 rows=1 width=8) (actual time=0.010..0.010 rows=1
 loops=1)
 Index Cond: (r.receipt_id = ri.receipt_id)
 -  Index Scan using events_pkey on events e  (cost=0.00..0.04
 rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=1)
   Index Cond: (e.event_id = r.event_id)
 Total runtime: 0.112 ms





 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Joshua D.
 Drake
 Sent: Sunday, January 07, 2007 9:10 PM
 To: Adam Rich
 Cc: 'Craig A. James'; 'Guy Rouillier'; 'PostgreSQL Performance'
 Subject: Re: [PERFORM] High update activity, PostgreSQL vs BigDBMS


 On Sun, 2007-01-07 at 20:26 -0600, Adam Rich wrote:
 I'm using 8.2 and using order by  limit is still faster than MAX()
 even though MAX() now seems to rewrite to an almost identical plan
 internally.

 Gonna need you to back that up :) Can we get an explain analyze?


 Count(*) still seems to use a full table scan rather than an index
 scan.
 There is a TODO out there to help this. Don't know if it will 
 get done.

 Joshua D. Drake

 -- 

   === The PostgreSQL Company: Command Prompt, Inc. ===
 Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
 Providing the most comprehensive  PostgreSQL solutions since 1997
  http://www.commandprompt.com/

 Donate to the PostgreSQL Project: 
 http://www.postgresql.org/about/donate




 ---(end of 
 broadcast)---
 TIP 4: Have you searched our list archives?

http://archives.postgresql.org


 ---(end of 
 broadcast)---
 TIP 7: You can help support the PostgreSQL project by donating at

 http://www.postgresql.org/about/donate


 
 
 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will 

[PERFORM] FiberChannel cards for FreeBSD on AMD64

2007-01-15 Thread Andrew Hammond
Does anyone here have positive experiences to relate running
fiberchannel cards on FreeBSD on AMD64? The last time I tried it was
with FreeBSD 4 about 2 years ago and none of the cards I tried could
cross the 32bit memory barrier (since they were all actually 32bit
cards despite plugging into a 64bit PCI bus).

Andrew


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster