Re: [PERFORM] Possible to improve query plan?

2011-01-17 Thread Jayadevan M
Hello,

 
 The distribution of the data is that all but 120,000 rows have null 
 values in the _revision_expired column.
 

A shot in the dark - will a partial index on the above column help?
http://www.postgresql.org/docs/current/interactive/indexes-partial.html
http://en.wikipedia.org/wiki/Partial_index

One link with discussion about it...
http://www.devheads.net/database/postgresql/general/when-can-postgresql-use-partial-not-null-index-seems-depend-size-clause-even-enable-seqscan.htm

Regards,
Jayadevan





DISCLAIMER: 

The information in this e-mail and any attachment is intended only for 
the person to whom it is addressed and may contain confidential and/or 
privileged material. If you have received this e-mail in error, kindly 
contact the sender and destroy all copies of the original communication. 
IBS makes no warranty, express or implied, nor guarantees the accuracy, 
adequacy or completeness of the information contained in this email or any 
attachment and is not liable for any errors, defects, omissions, viruses 
or for resultant loss or damage, if any, direct or indirect.






Re: [PERFORM] Possible to improve query plan?

2011-01-17 Thread Ing. Marcos Ortiz Valmaseda
Which is the type of your application? You can see it on the Performance 
Whackamole Presentation from Josh Berkus on the 
PgCon 2009:
- Web application
- Online Transaction Processing (OLTP)
- Data WareHousing (DW)

And based on the type of your application, you can configure the 
postgresql.conf to gain a better performance of your PostgreSQL server.
PostgreSQL postgresql.conf baseline:
 shared_buffers = 25% RAM
 work_mem = 512K[W] 2 MB[O] 128 MB[D]
 - but no more that RAM/no_connections
 maintenance_work_mem = 1/16 RAM
 checkpoint_segments = 8 [W], 16-64 [O], [D]
 wal_buffer = 1 MB [W], 8 MB [O], [D]
 effective_cache_size = 2/3 RAM

Regards
 

Ing. Marcos Luís Ortíz Valmaseda
Linux User # 418229  PostgreSQL DBA
Centro de Tecnologías Gestión de Datos (DATEC)
http://postgresql.uci.cu
http://www.postgresql.org
http://it.toolbox.com/blogs/sql-apprentice

- Mensaje original -
De: Jeremy Palmer jpal...@linz.govt.nz
Para: Andy Colson a...@squeakycode.net
CC: pgsql-performance@postgresql.org
Enviados: Lunes, 17 de Enero 2011 0:13:25 GMT -05:00 Región oriental EE. 
UU./Canadá
Asunto: Re: [PERFORM] Possible to improve query plan?

Hi Andy,

Yes important omissions:

Server version: 8.4.6
OS Windows Server 2003 Standard Ed :(
The work mem is 50mb.

I tried setting the work_mem to 500mb, but it didn't make a huge difference in 
query execution time. But then again the OS disk caching is probably taking 
over here.

Ok here's the new plan with work_mem = 50mb:

http://explain.depesz.com/s/xwv

And here another plan with work_mem = 500mb:

http://explain.depesz.com/s/VmO

Thanks,
Jeremy

-Original Message-
From: Andy Colson [mailto:a...@squeakycode.net] 
Sent: Monday, 17 January 2011 5:57 p.m.
To: Jeremy Palmer
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Possible to improve query plan?


Hum.. yeah it looks like it takes no time at all to pull data from the 
individual indexes, and them bitmap them.  I'm not sure what the bitmap heap 
scan is, or why its slow.  Hopefully someone smarter will come along.

Also its weird that explain.depesz.com didnt parse and show your entire plan.  
Hum.. you seem to have ending quotes on some of the lines?

One other though: quicksort  Memory: 23960kB
It needs 20Meg to sort... It could be your sort is swapping to disk.

What sort of PG version is this?
What are you using for work_mem?  (you could try to bump it up a little (its 
possible to set for session only, no need for server restart) and see if that'd 
help.

And sorry, but its my bedtime, good luck though.

-Andy

__

This message contains information, which is confidential and may be subject to 
legal privilege. 
If you are not the intended recipient, you must not peruse, use, disseminate, 
distribute or copy this message.
If you have received this message in error, please notify us immediately (Phone 
0800 665 463 or i...@linz.govt.nz) and destroy the original message.
LINZ accepts no responsibility for changes to this email, or for any 
attachments, after its transmission from LINZ.

Thank you.
__

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] NOT IN substantially slower in 9.0.2 than 8.3.13 - NOT EXISTS runs fast in both 8.3.13 and 9.0.2

2011-01-17 Thread Achilleas Mantzios
Hello,
just coming from this thread : 
http://archives.postgresql.org/pgsql-admin/2011-01/msg00050.php
It started as an admin question and turned out to be a performance question.
You may look at it for a history of this issue. I will repost all data here.

Description of the machines involved:

1) Prod machine (thereafter called LINUX_PROD) :
System: Linux Suse 2.6.16.46-0.12-smp, 16 x Intel Xeon(R) X7350 @ 2.93GHz, 64GB 
memory
DB: PostgreSQL 8.3.13, shared_buffers=16GB, work_mem=512MB, db size=94GB
2) Dev machine (therafter called FBSD_DEV) :
System : FreeBSD 6.3, Intel(R) Core(TM)2 Duo CPU @ 2.80GHz, 2GB memory
DB: PostgreSQL 8.3.13, shared_buffers=512MB, work_mem=1MB, db size=76GB
3) Test machine (thereafter called FBSD_TEST) :
System: FreeBSD 8.1, 4 x AMD Phenom(tm) 965 @ 3.4 GHz, 8GB memory
DB: PostgreSQL 9.0.2, shared_buffers=5GB, work_mem=512MB, db size=7GB
4) Linux Test machine (thereafter called LINUX_TEST) :
System : Debian GNU/Linux 5.0, 2x AMD athlon @2.2GZ, 4GB Mem
DB: PostgreSQL 9.0.2, shared_buffers=2GB, work_mem=512MB, db size=7GB

(all DBs in the last three systems are identical, originating from FBSD_DEV)
(additiinally no paging or thrashing were observed during the tests)

Query is :
SELECT distinct m.id,coalesce(m.givenname,''),coalesce(m.midname,''),m.surname 
from marinerstates ms,vessels vsl,mariner m 
where m.id=ms.marinerid and ms.vslid=vsl.id and ms.state='Active' and 
coalesce(ms.endtime,now())::date = '2006-07-15' and 
ms.starttime::date = '2007-01-11'  and m.marinertype='Mariner'  and m.id not 
in 
(SELECT distinct mold.id from marinerstates msold,vessels vslold,mariner mold  
where mold.id=msold.marinerid and msold.vslid=vslold.id and 
msold.state='Active' and coalesce(msold.endtime,now())::date = '2006-07-15' 
and msold.starttime::date = '2007-01-11' and exists 
(select 1 from marinerstates msold2 where msold2.marinerid=msold.marinerid and 
msold2.state='Active' and msold2.id  msold.id and 
msold2.starttimemsold.starttime AND (msold.starttime-msold2.endtime)='18 
months')  and mold.marinertype='Mariner' ) 
order by m.surname,coalesce(m.givenname,''),coalesce(m.midname,'');   

i get the following execution times: (with \timing) 
FBSD_DEV : query : 240.419 ms
LINUX_PROD : query : 219.568 ms
FBSD_TEST : query :  2285.509 ms
LINUX_TEST : query : 5788.988 ms

Re writing the query in the NOT EXIST variation like:

SELECT distinct m.id,coalesce(m.givenname,''),coalesce(m.midname,''),m.surname 
from marinerstates ms,vessels vsl,mariner m where 
m.id=ms.marinerid and ms.vslid=vsl.id and ms.state='Active' and 
coalesce(ms.endtime,now())::date = '2006-07-15' and 
ms.starttime::date = '2007-01-11'  and m.marinertype='Mariner'  and NOT EXISTS 
   (SELECT distinct mold.id from marinerstates msold,vessels vslold,mariner 
mold  where mold.id=msold.marinerid and msold.vslid=vslold.id and 
   msold.state='Active' and coalesce(msold.endtime,now())::date = '2006-07-15' 
and msold.starttime::date = '2007-01-11' and 
   exists (select 1 from marinerstates msold2 where 
msold2.marinerid=msold.marinerid and msold2.state='Active' and msold2.id  
msold.id and 
              msold2.starttimemsold.starttime AND 
(msold.starttime-msold2.endtime)='18 months')  
   and mold.marinertype='Mariner' AND mold.id=m.id) 
order by m.surname,coalesce(m.givenname,''),coalesce(m.midname,'');
gives:

FBSD_DEV : query : 154.000 ms
LINUX_PROD : query : 153.408 ms
FBSD_TEST : query :  137.000 ms
LINUX_TEST : query : 404.000 ms

I found this query, since i observed that running the calling program was 
actually the first case that i 
encountered FBSD_TEST (while running a bigger database, a recent dump from 
LINUX_PROD) to be actually
slower than LINUX_PROD.
From the whole set of the tests involved, it seems like the NOT IN version 
of the query runs slow
in any postgresql 9.0.2 tested.
-- 
Achilleas Mantzios

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] NOT IN substantially slower in 9.0.2 than 8.3.13 - NOT EXISTS runs fast in both 8.3.13 and 9.0.2

2011-01-17 Thread Ing. Marcos Ortiz Valmaseda
Query is :
SELECT distinct m.id,coalesce(m.givenname,''),
   coalesce(m.midname,''),
   m.surname from marinerstates ms,vessels vsl,mariner m 
WHERE  m.id=ms.marinerid and ms.vslid=vsl.id 
ANDms.state='Active' and coalesce(ms.endtime,now())::date = '2006-07-15'
ANDms.starttime::date = '2007-01-11'  AND
   m.marinertype='Mariner'  and m.id 
NOT IN (SELECT distinct mold.id
FROM marinerstates msold,
vessels vslold,
mariner mold  
WHERE mold.id=msold.marinerid 
AND  msold.vslid=vslold.id
AND msold.state='Active' 
AND coalesce(msold.endtime,now())::date = '2006-07-15' 
AND msold.starttime::date = '2007-01-11' 
AND EXISTS (SELECT 1 
FROM marinerstates msold2 
WHERE msold2.marinerid=msold.marinerid
AND msold2.state='Active' 
AND msold2.id  msold.id 
AND msold2.starttimemsold.starttime
AND (msold.starttime-msold2.endtime)='18 months')
AND mold.marinertype='Mariner' ) 
   ORDER BY m.surname,coalesce(m.givenname,'')
  ,coalesce(m.midname,'');   

i get the following execution times: (with \timing) 
FBSD_DEV : query : 240.419 ms
LINUX_PROD : query : 219.568 ms
FBSD_TEST : query :  2285.509 ms
LINUX_TEST : query : 5788.988 ms

Re writing the query in the NOT EXIST variation like:

SELECT distinct m.id,coalesce(m.givenname,''),coalesce(m.midname,''),m.surname 
from marinerstates ms,vessels vsl,mariner m where 
m.id=ms.marinerid and ms.vslid=vsl.id and ms.state='Active' and 
coalesce(ms.endtime,now())::date = '2006-07-15' and 
ms.starttime::date = '2007-01-11'  and m.marinertype='Mariner'  and NOT EXISTS 
   (SELECT distinct mold.id from marinerstates msold,vessels vslold,mariner 
mold  where mold.id=msold.marinerid and msold.vslid=vslold.id and 
   msold.state='Active' and coalesce(msold.endtime,now())::date = '2006-07-15' 
and msold.starttime::date = '2007-01-11' and 
   exists (select 1 from marinerstates msold2 where 
msold2.marinerid=msold.marinerid and msold2.state='Active' and msold2.id  
msold.id and 
  msold2.starttimemsold.starttime AND 
(msold.starttime-msold2.endtime)='18 months')  
   and mold.marinertype='Mariner' AND mold.id=m.id) 
order by m.surname,coalesce(m.givenname,''),coalesce(m.midname,'');
gives:

FBSD_DEV : query : 154.000 ms
LINUX_PROD : query : 153.408 ms
FBSD_TEST : query :  137.000 ms
LINUX_TEST : query : 404.000 ms


Well, on the Release Notes on the PostgreSQL-8.4 Documentation, the developers 
recommend to use NOT EXISTS 
instead NOT IN, because the first clause has a better performance. So, you can 
use it on that way.

Other questions?
- Do you have a partial index on marinerstates.marinerid where this condition 
is accomplished?
- Do you have a index on mariner.id?
- Can you provide a explain of these queries on the PostgreSQL-9.0 machines?

Regards


Ing. Marcos Luís Ortíz Valmaseda
Linux User # 418229  PostgreSQL DBA
Centro de Tecnologías Gestión de Datos (DATEC)
http://postgresql.uci.cu
http://www.postgresql.org
http://it.toolbox.com/blogs/sql-apprentice

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] NOT IN substantially slower in 9.0.2 than 8.3.13 - NOT EXISTS runs fast in both 8.3.13 and 9.0.2

2011-01-17 Thread Mladen Gogala

Achilleas Mantzios wrote:

From the whole set of the tests involved, it seems like the NOT IN version of 
the query runs slow
in any postgresql 9.0.2 tested.
  

Not only that, it will run slower even using Oracle 11.2 or MySQL 5.5.

--

Mladen Gogala 
Sr. Oracle DBA

1500 Broadway
New York, NY 10036
(212) 329-5251
http://www.vmsinfo.com 
The Leader in Integrated Media Intelligence Solutions





--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Possible to improve query plan?

2011-01-17 Thread Kevin Grittner
Jeremy Palmer  wrote:
 
   WHERE (
   (_revision_created = 16
AND _revision_expired  16
AND _revision_expired = 40)
OR (_revision_created  16
AND _revision_created = 40))
 
 - Bitmap Heap Scan on version_crs_coordinate_revision
  (actual time=70.925..13531.720 rows=149557 loops=1)
 
 - BitmapOr (actual time=53.650..53.650 rows=0 loops=1)
 
This plan actually looks pretty good for what you're doing.  The
Bitmap Index Scans and BitmapOr determine which tuples in the heap
need to be visited.  The Bitmap Heap Scan then visits the heap pages
in physical order (to avoid repeated fetches of the same page and to
possibly edge toward sequential access speeds).  You don't seem to
have a lot of bloat, which could be a killer on this type of query,
since the rowcounts from the index scans aren't that much higher than
the counts after you check the heap.
 
The only thing I can think of which might help is to CLUSTER the
table on whichever of the two indexes used in the plan which is
typically more selective for such queries.  (In the example query
that seems to be idx_crs_coordinate_revision_created.)  That might
reduce the number of heap pages which need to be accessed and/or put
place them close enough that you'll get some sequential readahead.
 
I guess you could also try adjusting effective_io_concurrency upward
to see if that helps.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Bad plan when join on function

2011-01-17 Thread Zotov

It`s just a sample.

select c.id from OneRow c join abstract a on a.id=AsInteger(c.id)

Nested Loop  (cost=0.00..786642.96 rows=1 width=4) (actual 
time=91021.167..119601.344 rows=1 loops=1)

  Join Filter: ((a.id)::integer = asinteger((c.id)::integer))
  -  Seq Scan on onerow c  (cost=0.00..1.01 rows=1 width=4) (actual 
time=0.007..0.008 rows=1 loops=1)
  -  Seq Scan on abstract a  (cost=0.00..442339.78 rows=22953478 
width=4) (actual time=0.003..115193.283 rows=22953478 loops=1)

Total runtime: 119601.428 ms


select c.id from OneRow c join abstract a on a.id=c.id

Nested Loop  (cost=0.00..13.85 rows=1 width=4) (actual 
time=254.579..254.585 rows=1 loops=1)
  -  Seq Scan on onerow c  (cost=0.00..1.01 rows=1 width=4) (actual 
time=0.006..0.007 rows=1 loops=1)
  -  Index Scan using integ_1197 on abstract a  (cost=0.00..12.83 
rows=1 width=4) (actual time=254.559..254.563 rows=1 loops=1)

Index Cond: ((a.id)::integer = (c.id)::integer)
Total runtime: 254.648 ms


OneRow Contains only one row,
abstract contains 22 953 500 rows

AsInteger is simple function on Delphi
it just return input value

CREATE OR REPLACE FUNCTION asinteger(integer)
  RETURNS integer AS
'oeudfpg.dll', 'AsInteger'
  LANGUAGE c VOLATILE
  COST 1;


Why SeqScan???

this query is simple sample to show SLOW seq scan plan
I have a real query what i don`t know when it will be done... but at 
firebird this query with full fetch 1-2 minutes
I can`t give you this real query and database (database size is more, 
than 20 GB)

as i see that query have same problem as this sample
It`s so sad, because I spend so much time to support posgtresql in my 
project and now i see what more queries is slower more than 10 times...

Please HELP!

PostgreSQL version 9.0.2

--
С уважением,
Зотов Роман Владимирович
руководитель Отдела инструментария
ЗАО НПО Консультант
г.Иваново, ул. Палехская, д. 10
тел./факс: (4932) 41-01-21
mailto: zo...@oe-it.ru



Re: [PERFORM] Bad plan when join on function

2011-01-17 Thread Pavel Stehule
2011/1/17 Zotov zo...@oe-it.ru:
 It`s just a sample.

 select c.id from OneRow c join abstract a on a.id=AsInteger(c.id)

 Nested Loop  (cost=0.00..786642.96 rows=1 width=4) (actual
 time=91021.167..119601.344 rows=1 loops=1)
   Join Filter: ((a.id)::integer = asinteger((c.id)::integer))
   -  Seq Scan on onerow c  (cost=0.00..1.01 rows=1 width=4) (actual
 time=0.007..0.008 rows=1 loops=1)
   -  Seq Scan on abstract a  (cost=0.00..442339.78 rows=22953478 width=4)
 (actual time=0.003..115193.283 rows=22953478 loops=1)
 Total runtime: 119601.428 ms


 select c.id from OneRow c join abstract a on a.id=c.id

 Nested Loop  (cost=0.00..13.85 rows=1 width=4) (actual
 time=254.579..254.585 rows=1 loops=1)
   -  Seq Scan on onerow c  (cost=0.00..1.01 rows=1 width=4) (actual
 time=0.006..0.007 rows=1 loops=1)
   -  Index Scan using integ_1197 on abstract a  (cost=0.00..12.83 rows=1
 width=4) (actual time=254.559..254.563 rows=1 loops=1)
     Index Cond: ((a.id)::integer = (c.id)::integer)
 Total runtime: 254.648 ms


 OneRow Contains only one row,
 abstract contains 22 953 500 rows

 AsInteger is simple function on Delphi
 it just return input value

 CREATE OR REPLACE FUNCTION asinteger(integer)
   RETURNS integer AS
 'oeudfpg.dll', 'AsInteger'
   LANGUAGE c VOLATILE
   COST 1;

are you sure so your function needs a VOLATILE flag?

Regards

Pavel Stehule



 Why SeqScan???

 this query is simple sample to show SLOW seq scan plan
 I have a real query what i don`t know when it will be done... but at
 firebird this query with full fetch 1-2 minutes
 I can`t give you this real query and database (database size is more, than
 20 GB)
 as i see that query have same problem as this sample
 It`s so sad, because I spend so much time to support posgtresql in my
 project and now i see what more queries is slower more than 10 times...
 Please HELP!

 PostgreSQL version 9.0.2

 --
 С уважением,
 Зотов Роман Владимирович
 руководитель Отдела инструментария
 ЗАО НПО Консультант
 г.Иваново, ул. Палехская, д. 10
 тел./факс: (4932) 41-01-21
 mailto: zo...@oe-it.ru

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Possible to improve query plan?

2011-01-17 Thread Tom Lane
Jeremy Palmer jpal...@linz.govt.nz writes:
 I've come to a dead end in trying to get a commonly used query to
 perform better.

 EXPLAIN
 SELECT * FROM (
 SELECT
 row_number() OVER (PARTITION BY id ORDER BY _revision_created DESC) 
 as row_number,
 * 
 FROM
 version_crs_coordinate_revision
 WHERE (
 (_revision_created = 16 AND _revision_expired  16 AND 
 _revision_expired = 40) OR 
 (_revision_created  16 AND _revision_created = 40)
 )
 ) AS T 
 WHERE row_number = 1;

If I'm not mistaken, that's a DB2-ish locution for a query with DISTINCT
ON, ie, you're looking for the row with highest _revision_created for
each value of id.  It might perform well on DB2, but it's going to
mostly suck on Postgres --- we don't optimize window-function queries
very much at all at the moment.  Try writing it with DISTINCT ON instead
of a window function, like so:

SELECT DISTINCT ON (id)
* 
FROM
version_crs_coordinate_revision
WHERE (
(_revision_created = 16 AND _revision_expired  16 AND 
_revision_expired = 40) OR 
(_revision_created  16 AND _revision_created = 40)
)
ORDER BY id, _revision_created DESC;

You could also experiment with various forms of GROUP BY if you're loath
to use any Postgres-specific syntax.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Bad plan when join on function

2011-01-17 Thread Kevin Grittner
Zotov  wrote:
 
 select c.id from OneRow c join abstract a on a.id=AsInteger(c.id)
 
 Why SeqScan???
 
Because you don't have an index on AsInteger(c.id).
 
If your function is IMMUTABLE (each possible combination of input
values always yields the same result), and you declare it such, then
you can index on the function, and it will perform at a speed similar
to the other example.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Possible to improve query plan?

2011-01-17 Thread Jeremy Palmer
It fits a Data Warehousing type application. 

Apart from work_mem, my other parameters are pretty close to these numbers. I 
had the work_mem down a little because a noticed some clients were getting out 
of memory errors with large queries which involved lots of sorting.

Thanks
Jeremy

-Original Message-
From: Ing. Marcos Ortiz Valmaseda [mailto:mlor...@uci.cu] 
Sent: Tuesday, 18 January 2011 2:38 a.m.
To: Jeremy Palmer
Cc: pgsql-performance@postgresql.org; Andy Colson
Subject: Re: [PERFORM] Possible to improve query plan?

Which is the type of your application? You can see it on the Performance 
Whackamole Presentation from Josh Berkus on the 
PgCon 2009:
- Web application
- Online Transaction Processing (OLTP)
- Data WareHousing (DW)

And based on the type of your application, you can configure the 
postgresql.conf to gain a better performance of your PostgreSQL server.
PostgreSQL postgresql.conf baseline:
 shared_buffers = 25% RAM
 work_mem = 512K[W] 2 MB[O] 128 MB[D]
 - but no more that RAM/no_connections
 maintenance_work_mem = 1/16 RAM
 checkpoint_segments = 8 [W], 16-64 [O], [D]
 wal_buffer = 1 MB [W], 8 MB [O], [D]
 effective_cache_size = 2/3 RAM

Regards
 

Ing. Marcos Luís Ortíz Valmaseda
Linux User # 418229  PostgreSQL DBA
Centro de Tecnologías Gestión de Datos (DATEC)
http://postgresql.uci.cu
http://www.postgresql.org
http://it.toolbox.com/blogs/sql-apprentice

__

This message contains information, which is confidential and may be subject to 
legal privilege. 
If you are not the intended recipient, you must not peruse, use, disseminate, 
distribute or copy this message.
If you have received this message in error, please notify us immediately (Phone 
0800 665 463 or i...@linz.govt.nz) and destroy the original message.
LINZ accepts no responsibility for changes to this email, or for any 
attachments, after its transmission from LINZ.

Thank you.
__

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Possible to improve query plan?

2011-01-17 Thread Jeremy Palmer
Thanks that seems to make the query 10-15% faster :)

Cheers
jeremy

-Original Message-
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 
Sent: Tuesday, 18 January 2011 9:24 a.m.
To: Jeremy Palmer
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Possible to improve query plan? 

Jeremy Palmer jpal...@linz.govt.nz writes:
 I've come to a dead end in trying to get a commonly used query to
 perform better.

 EXPLAIN
 SELECT * FROM (
 SELECT
 row_number() OVER (PARTITION BY id ORDER BY _revision_created DESC) 
 as row_number,
 * 
 FROM
 version_crs_coordinate_revision
 WHERE (
 (_revision_created = 16 AND _revision_expired  16 AND 
 _revision_expired = 40) OR 
 (_revision_created  16 AND _revision_created = 40)
 )
 ) AS T 
 WHERE row_number = 1;

If I'm not mistaken, that's a DB2-ish locution for a query with DISTINCT
ON, ie, you're looking for the row with highest _revision_created for
each value of id.  It might perform well on DB2, but it's going to
mostly suck on Postgres --- we don't optimize window-function queries
very much at all at the moment.  Try writing it with DISTINCT ON instead
of a window function, like so:

SELECT DISTINCT ON (id)
* 
FROM
version_crs_coordinate_revision
WHERE (
(_revision_created = 16 AND _revision_expired  16 AND 
_revision_expired = 40) OR 
(_revision_created  16 AND _revision_created = 40)
)
ORDER BY id, _revision_created DESC;

You could also experiment with various forms of GROUP BY if you're loath
to use any Postgres-specific syntax.

regards, tom lane
__

This message contains information, which is confidential and may be subject to 
legal privilege. 
If you are not the intended recipient, you must not peruse, use, disseminate, 
distribute or copy this message.
If you have received this message in error, please notify us immediately (Phone 
0800 665 463 or i...@linz.govt.nz) and destroy the original message.
LINZ accepts no responsibility for changes to this email, or for any 
attachments, after its transmission from LINZ.

Thank you.
__

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Bad plan when join on function

2011-01-17 Thread Pavel Stehule
2011/1/17 Kevin Grittner kevin.gritt...@wicourts.gov:
 Zotov  wrote:

 select c.id from OneRow c join abstract a on a.id=AsInteger(c.id)

 Why SeqScan???

 Because you don't have an index on AsInteger(c.id).

 If your function is IMMUTABLE (each possible combination of input
 values always yields the same result), and you declare it such, then
 you can index on the function, and it will perform at a speed similar
 to the other example.

it should to work without functional index - but not sure about effectivity

postgres=# explain select 1 from a join b on a.f = sin(b.f);
 QUERY PLAN
-
 Merge Join  (cost=809.39..1352.64 rows=1 width=0)
   Merge Cond: (a.f = (sin(b.f)))
   -  Index Scan using a_f_idx on a  (cost=0.00..318.25 rows=1 width=8)
   -  Sort  (cost=809.39..834.39 rows=1 width=8)
 Sort Key: (sin(b.f))
 -  Seq Scan on b  (cost=0.00..145.00 rows=1 width=8)
(6 rows)

but functional index always helps

postgres=# create index on b((sin(f)));
CREATE INDEX
postgres=# explain select 1 from a join b on a.f = sin(b.f);
  QUERY PLAN
---
 Merge Join  (cost=0.00..968.50 rows=1 width=0)
   Merge Cond: (a.f = sin(b.f))
   -  Index Scan using a_f_idx on a  (cost=0.00..318.25 rows=1 width=8)
   -  Index Scan using b_sin_idx on b  (cost=0.00..450.25 rows=1 width=8)
(4 rows)

regards

Pavel Stehule

 -Kevin

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Bad plan when join on function

2011-01-17 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 it should to work without functional index - but not sure about effectivity

As long as the function is VOLATILE, the planner can't use any
intelligent query plan.  Merge or hash join both require at least
stable join keys.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Bad plan when join on function

2011-01-17 Thread Pavel Stehule
2011/1/17 Tom Lane t...@sss.pgh.pa.us:
 Pavel Stehule pavel.steh...@gmail.com writes:
 it should to work without functional index - but not sure about effectivity

 As long as the function is VOLATILE, the planner can't use any
 intelligent query plan.  Merge or hash join both require at least
 stable join keys.

sure, my first advice was a question about function volatility - and
my sentence was related to using immutable function.

regards

Pavel Stehule


                        regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Possible to improve query plan?

2011-01-17 Thread Mladen Gogala

Tom Lane wrote:
If I'm not mistaken, that's a DB2-ish locution 


It could also be a part of the Oracle vernacular. I've seen queries like 
that running against Oracle RDBMS, too.



for a query with DISTINCT
ON, ie, you're looking for the row with highest _revision_created for
each value of id.  It might perform well on DB2, but it's going to
mostly suck on Postgres --- we don't optimize window-function queries
very much at all at the moment.  
Hmmm, what optimizations do you have in mind? I thought that window 
functions are just clever tricks with memory? Anything that can be 
expected for 9.0x?




Try writing it with DISTINCT ON instead
of a window function, like so:
  

Wouldn't distinct necessarily bring about the sort/merge?

--

Mladen Gogala 
Sr. Oracle DBA

1500 Broadway
New York, NY 10036
(212) 329-5251
http://www.vmsinfo.com 
The Leader in Integrated Media Intelligence Solutions





--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance