Re: [PERFORM] Poor performance using CTE

2012-11-23 Thread Gavin Flower

On 22/11/12 04:32, Andres Freund wrote:

On 2012-11-21 10:21:16 -0500, Andrew Dunstan wrote:

On 11/21/2012 09:59 AM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

If we're going to do it can we please come up with something more
intuitive and much, much more documented than OFFSET 0? And if/when we
do this we'll need to have big red warnings all over then release notes,
since a lot of people I know will need to do some extensive remediation
before moving to such a release.

The probability that we would actually *remove* that behavior of OFFSET
0 is not distinguishable from zero.  I'm not terribly excited about
having an alternate syntax to specify an optimization fence, but even
if we do create such a thing, there's no need to break the historical
usage.


I wasn't talking about removing it. My point was that if the optimization
fence around CTEs is removed a lot of people will need to rework apps where
they have used them for that purpose. And I continue to think that spelling
it OFFSET 0 is horribly obscure.

+1

WITH foo AS (SELECT ...) (barrier=on|off)?

9.3 introduces the syntax, defaulting to on
9.4 switches the default to off.

Greetings,

Andres Freund




WITH foo AS (SELECT ...) (fence=on|off)?


WITH foo AS (SELECT ...) (optimisation_fence=on|off)?




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-23 Thread Gavin Flower

On 22/11/12 08:42, Andrew Dunstan wrote:


On 11/21/2012 02:30 PM, Gavin Flower wrote:

WITH FENCE foo AS (SELECT ...)
default?


WITHOUT FENCE foo AS (SELECT ...) :-)
Nah!
I prefer this, but it is too specific to 'WITH',
and very unSQL standardish!

Alternatively one of the following

 1. WITH UNFENCED foo AS (SELECT ...)
 2. WITH NO FENCE foo AS (SELECT ...)
 3. WITH NOT FENCE foo AS (SELECT ...)

I loke the firsat variant, but the 3rd is
most SQL standardish!



As Tom (I think) pointed out, we should not have a syntax tied to CTEs.

cheers

andrew

If other SQL constructs have a optimisation fence, then the FENCE  NOT 
FENCE syntax can be used theire as well.


So what am I missing? (obviously WITHOUT FENCE would not make sense 
elsewhere, but I wasn't really being serious when I suggested it!)




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-23 Thread Bruce Momjian
On Tue, Nov 20, 2012 at 02:24:01PM -0600, Merlin Moncure wrote:
 On Tue, Nov 20, 2012 at 1:53 PM, Jon Nelson jnelson+pg...@jamponi.net wrote:
  As can be seen by the current conversation, not everyone is convinced
 that CTEs ought to be an explicit optimization barrier
 
 On Tue, Nov 20, 2012 at 1:26 PM, Claudio Freire klaussfre...@gmail.com 
 wrote:
  It *could* just be a lack of imagination on my part. But if it were
  not, then it'd be nice for it to be done automatically (since this
  particular CTE behavior bites enough people already).
 
 Sure.  I just find it personally hard to find a good demarcation line
 between A: queries where pushing quals through are universally
 beneficial and wanted and B: queries where we are inserting an
 explicit materialization step to avoid planner issues, particularly
 where there is substantial overlap with between A and C: queries that
 are written with a CTE and arguably shouldn't be.
 
 Put another way, I find CTE to express: 'this then that' where joins
 express 'this with that'.  So current behavior is not surprising at
 all. All that said, there could be a narrow class of low hanging cases
 (such as the OP's) that could be sniped...I'm just skeptical.

Is thi
-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-22 Thread Jon Nelson
On Thu, Nov 22, 2012 at 7:42 AM, Jeremy Harris j...@wizmail.org wrote:
 On 22/11/2012 00:08, Craig Ringer wrote:

 WITH
FENCE foo AS (SELECT ...),
bar AS (SELECT ...)
 SELECT * FROM bar;

 Are we fencing just foo? Or all expressions?


 WITH foo AS (FENCED SELECT ...),
  bar AS (SELECT ...),
 SELECT ... ;

I would much rather see 'MATERIALIZE' instead of 'FENCED', unless the
by the latter you mean to forbid *all* optimizations, whereas with the
latter the meaning is pretty clear.

-- 
Jon


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-22 Thread Vitalii Tymchyshyn
I'd also add ANALYZED/NOT ANALYZED. This should force it behave like
'create table, analyze, select' with statistics used in second query plan.

P.S. defaults can be configurable.
20 лист. 2012 02:22, Gavin Flower gavinflo...@archidevsys.co.nz напис.

 On 15/11/12 15:03, Peter Geoghegan wrote:

 On 15 November 2012 01:46, Andrew Dunstan and...@dunslane.net wrote:

 It cuts both ways. I have used CTEs a LOT precisely because this
 behaviour
 lets me get better plans. Without that I'll be back to using the offset
 0
 hack.

 Is the OFFSET 0 hack really so bad? We've been telling people to do
 that for years, so it's already something that we've effectively
 committed to.

  How about adding the keywords FENCED and NOT FENCED to the SQL
 definition of CTE's - with FENCED being the default?


 Cheers,
 Gavin



 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.**
 org pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/**mailpref/pgsql-performancehttp://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Heikki Linnakangas

On 21.11.2012 01:53, Tom Lane wrote:

I think the more interesting question is what cases wouldn't be covered
by such a rule.  Typically you need to use OFFSET 0 in situations where
the planner has guessed wrong about costs or rowcounts, and I think
people are likely using WITH for that as well.  Should we be telling
people that they ought to insert OFFSET 0 in WITH queries if they want
to be sure there's an optimization fence?


Yes, I strongly feel that we should. Writing a query using WITH often 
makes it more readable. It would be a shame if people have to refrain 
from using it, because the planner treats it as an optimization fence.


- Heikki


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Peter Geoghegan
On 21 November 2012 13:04, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 Yes, I strongly feel that we should. Writing a query using WITH often makes
 it more readable. It would be a shame if people have to refrain from using
 it, because the planner treats it as an optimization fence.

+1

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Andrew Dunstan


On 11/21/2012 08:04 AM, Heikki Linnakangas wrote:

On 21.11.2012 01:53, Tom Lane wrote:

I think the more interesting question is what cases wouldn't be covered
by such a rule.  Typically you need to use OFFSET 0 in situations where
the planner has guessed wrong about costs or rowcounts, and I think
people are likely using WITH for that as well.  Should we be telling
people that they ought to insert OFFSET 0 in WITH queries if they want
to be sure there's an optimization fence?


Yes, I strongly feel that we should. Writing a query using WITH often 
makes it more readable. It would be a shame if people have to refrain 
from using it, because the planner treats it as an optimization fence.







If we're going to do it can we please come up with something more 
intuitive and much, much more documented than OFFSET 0? And if/when we 
do this we'll need to have big red warnings all over then release notes, 
since a lot of people I know will need to do some extensive remediation 
before moving to such a release.


cheers

andrew



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 If we're going to do it can we please come up with something more 
 intuitive and much, much more documented than OFFSET 0? And if/when we 
 do this we'll need to have big red warnings all over then release notes, 
 since a lot of people I know will need to do some extensive remediation 
 before moving to such a release.

The probability that we would actually *remove* that behavior of OFFSET
0 is not distinguishable from zero.  I'm not terribly excited about
having an alternate syntax to specify an optimization fence, but even
if we do create such a thing, there's no need to break the historical
usage.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Andrew Dunstan


On 11/21/2012 09:59 AM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

If we're going to do it can we please come up with something more
intuitive and much, much more documented than OFFSET 0? And if/when we
do this we'll need to have big red warnings all over then release notes,
since a lot of people I know will need to do some extensive remediation
before moving to such a release.

The probability that we would actually *remove* that behavior of OFFSET
0 is not distinguishable from zero.  I'm not terribly excited about
having an alternate syntax to specify an optimization fence, but even
if we do create such a thing, there's no need to break the historical
usage.



I wasn't talking about removing it. My point was that if the 
optimization fence around CTEs is removed a lot of people will need to 
rework apps where they have used them for that purpose. And I continue 
to think that spelling it OFFSET 0 is horribly obscure.


cheers

andrew


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Andres Freund
On 2012-11-21 10:21:16 -0500, Andrew Dunstan wrote:
 
 On 11/21/2012 09:59 AM, Tom Lane wrote:
 Andrew Dunstan and...@dunslane.net writes:
 If we're going to do it can we please come up with something more
 intuitive and much, much more documented than OFFSET 0? And if/when we
 do this we'll need to have big red warnings all over then release notes,
 since a lot of people I know will need to do some extensive remediation
 before moving to such a release.
 The probability that we would actually *remove* that behavior of OFFSET
 0 is not distinguishable from zero.  I'm not terribly excited about
 having an alternate syntax to specify an optimization fence, but even
 if we do create such a thing, there's no need to break the historical
 usage.
 
 
 I wasn't talking about removing it. My point was that if the optimization
 fence around CTEs is removed a lot of people will need to rework apps where
 they have used them for that purpose. And I continue to think that spelling
 it OFFSET 0 is horribly obscure.

+1

WITH foo AS (SELECT ...) (barrier=on|off)?

9.3 introduces the syntax, defaulting to on
9.4 switches the default to off.

Greetings,

Andres Freund


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Peter Geoghegan
On 21 November 2012 15:21, Andrew Dunstan and...@dunslane.net wrote:
 And I continue to think that spelling it OFFSET 0 is horribly obscure.

I'm not sure that it's any more obscure than the very idea of an
optimisation fence.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Heikki Linnakangas

On 21.11.2012 17:42, Gavin Flower wrote:

On 22/11/12 04:32, Andres Freund wrote:

On 2012-11-21 10:21:16 -0500, Andrew Dunstan wrote:

I wasn't talking about removing it. My point was that if the
optimization
fence around CTEs is removed a lot of people will need to rework apps
where
they have used them for that purpose. And I continue to think that
spelling
it OFFSET 0 is horribly obscure.

+1


FWIW, I'm happy with OFFSET 0. Granted, it's pretty obscure, but 
that's what we've historically recommended, and it's pretty ugly to have 
to specify a fence like that in the first place. Whenever you have to 
resort to it, you ought have a comment in the query explaining why you 
need to force the planner like that, anyway.



WITH foo AS (SELECT ...) (barrier=on|off)?

9.3 introduces the syntax, defaulting to on
9.4 switches the default to off.


WITH foo AS (SELECT ...) (fence=on|off)?

WITH foo AS (SELECT ...) (optimisation_fence=on|off)?


If we are to invent a new syntax for this, can we please come up with 
something that's more widely applicable than just the WITH syntax. 
Something that you could use to replace OFFSET 0 in a subquery, too.


- Heikki


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Claudio Freire
On Wed, Nov 21, 2012 at 12:32 PM, Andres Freund and...@anarazel.de wrote:
 +1

 WITH foo AS (SELECT ...) (barrier=on|off)?

 9.3 introduces the syntax, defaulting to on
 9.4 switches the default to off.

Why syntax? What about a guc?

collapse_cte_limit?


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Andres Freund
On 2012-11-21 13:16:25 -0300, Claudio Freire wrote:
 On Wed, Nov 21, 2012 at 12:32 PM, Andres Freund and...@anarazel.de wrote:
  +1
 
  WITH foo AS (SELECT ...) (barrier=on|off)?
 
  9.3 introduces the syntax, defaulting to on
  9.4 switches the default to off.

 Why syntax? What about a guc?

 collapse_cte_limit?

Because there are very good reasons to want to current behaviour. A guc
is a global either/or so I don't see it helping much.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Claudio Freire
On Wed, Nov 21, 2012 at 1:24 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2012-11-21 13:16:25 -0300, Claudio Freire wrote:
 On Wed, Nov 21, 2012 at 12:32 PM, Andres Freund and...@anarazel.de wrote:
  +1
 
  WITH foo AS (SELECT ...) (barrier=on|off)?
 
  9.3 introduces the syntax, defaulting to on
  9.4 switches the default to off.

 Why syntax? What about a guc?

 collapse_cte_limit?

 Because there are very good reasons to want to current behaviour. A guc
 is a global either/or so I don't see it helping much.

set collapse_cte_limit=8;
with blah as (blah) select blah;

Not global at all.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Andres Freund
On 2012-11-21 13:32:45 -0300, Claudio Freire wrote:
 On Wed, Nov 21, 2012 at 1:24 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2012-11-21 13:16:25 -0300, Claudio Freire wrote:
  On Wed, Nov 21, 2012 at 12:32 PM, Andres Freund and...@anarazel.de wrote:
   +1
  
   WITH foo AS (SELECT ...) (barrier=on|off)?
  
   9.3 introduces the syntax, defaulting to on
   9.4 switches the default to off.
 
  Why syntax? What about a guc?
 
  collapse_cte_limit?
 
  Because there are very good reasons to want to current behaviour. A guc
  is a global either/or so I don't see it helping much.
 
 set collapse_cte_limit=8;
 with blah as (blah) select blah;
 
 Not global at all.

Not very manageable though. And it doesn't help if you need both in a
query which isn't actually that unlikely.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Andrew Dunstan


On 11/21/2012 11:32 AM, Claudio Freire wrote:

On Wed, Nov 21, 2012 at 1:24 PM, Andres Freund and...@2ndquadrant.com wrote:

On 2012-11-21 13:16:25 -0300, Claudio Freire wrote:

On Wed, Nov 21, 2012 at 12:32 PM, Andres Freund and...@anarazel.de wrote:

+1

WITH foo AS (SELECT ...) (barrier=on|off)?

9.3 introduces the syntax, defaulting to on
9.4 switches the default to off.

Why syntax? What about a guc?

collapse_cte_limit?

Because there are very good reasons to want to current behaviour. A guc
is a global either/or so I don't see it helping much.

set collapse_cte_limit=8;
with blah as (blah) select blah;

Not global at all.



Then you have to unset it again, which is ugly. You might even want it 
applying to *part* of a query, not the whole thing, so this strikes me 
as a dead end.


cheers

andrew


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Claudio Freire
On Wed, Nov 21, 2012 at 1:35 PM, Andrew Dunstan and...@dunslane.net wrote:
 Why syntax? What about a guc?

 collapse_cte_limit?

 Because there are very good reasons to want to current behaviour. A guc
 is a global either/or so I don't see it helping much.

 set collapse_cte_limit=8;
 with blah as (blah) select blah;

 Not global at all.


 Then you have to unset it again, which is ugly. You might even want it
 applying to *part* of a query, not the whole thing, so this strikes me as a
 dead end.

Really?

Because I've seen here people that want it generally (because
Oracle/MSSQL/your favourite db does it), and people that don't want it
(generally because they need it). I haven't seen any mention to mixing
fenced and unfenced usage.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Tom Lane
Claudio Freire klaussfre...@gmail.com writes:
 collapse_cte_limit?

The join collapse limits address a completely different problem (ie,
explosion of planning time with too many relations), and are pretty much
useless as a model for this.  As multiple people told you already,
optimization fences are typically wanted for only specific subqueries.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Andrew Dunstan


On 11/21/2012 02:30 PM, Gavin Flower wrote:

WITH FENCE foo AS (SELECT ...)
default?


WITHOUT FENCE foo AS (SELECT ...) :-)
Nah!
I prefer this, but it is too specific to 'WITH',
and very unSQL standardish!

Alternatively one of the following

 1. WITH UNFENCED foo AS (SELECT ...)
 2. WITH NO FENCE foo AS (SELECT ...)
 3. WITH NOT FENCE foo AS (SELECT ...)

I loke the firsat variant, but the 3rd is
most SQL standardish!



As Tom (I think) pointed out, we should not have a syntax tied to CTEs.

cheers

andrew



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Craig Ringer
On 11/22/2012 03:30 AM, Gavin Flower wrote:
 On 22/11/12 04:56, Heikki Linnakangas wrote:
 On 21.11.2012 17:42, Gavin Flower wrote:
 On 22/11/12 04:32, Andres Freund wrote:
 On 2012-11-21 10:21:16 -0500, Andrew Dunstan wrote:
 I wasn't talking about removing it. My point was that if the
 optimization
 fence around CTEs is removed a lot of people will need to rework apps
 where
 they have used them for that purpose. And I continue to think that
 spelling
 it OFFSET 0 is horribly obscure.
 +1

 FWIW, I'm happy with OFFSET 0. Granted, it's pretty obscure, but
 that's what we've historically recommended, and it's pretty ugly to
 have to specify a fence like that in the first place. Whenever you
 have to resort to it, you ought have a comment in the query
 explaining why you need to force the planner like that, anyway.

 WITH foo AS (SELECT ...) (barrier=on|off)?

 9.3 introduces the syntax, defaulting to on
 9.4 switches the default to off.

 WITH foo AS (SELECT ...) (fence=on|off)?

 WITH foo AS (SELECT ...) (optimisation_fence=on|off)?

 If we are to invent a new syntax for this, can we please come up with
 something that's more widely applicable than just the WITH syntax.
 Something that you could use to replace OFFSET 0 in a subquery, too.

 - Heikki
 WITH FENCE foo AS (SELECT ...)
 default?
That doesn't bind tightly enough to a specific CTE term. Consider:

WITH
  FENCE foo AS (SELECT ...),
  bar AS (SELECT ...)
SELECT * FROM bar;

Are we fencing just foo? Or all expressions?


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



Re: [PERFORM] Poor performance using CTE

2012-11-21 Thread Craig Ringer
On 11/22/2012 08:38 AM, Gavin Flower wrote:
 I suspect most people are blissfully unaware of CTE's being fenced, or
 at least not really sure what it means. So I suspect NOT FENCE would
 be the better default.
It's also probably more standard, and a better fit with what other DBs do.

Pg would still need to detect conditions like the use of functions with
side effects or (obviously) INSERT/UPDATE/DELETE wCTEs and not push
conditions down into them / pull conditions up from them, etc. That's
how I read the standard, though; must have the same effect as if the
queries were executed as written, so Pg is free to transform them so
long as it doesn't change the results.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-20 Thread Merlin Moncure
On Tue, Nov 13, 2012 at 2:57 PM, David Greco
david_gr...@harte-hanks.com wrote:
 Have a query using a CTE that is performing very poorly. The equivalent
 query against the same data in an Oracle database runs in under 1 second, in
 Postgres  it takes 2000 seconds.



 The smp_pkg.get_invoice_charges queries fedexinvoices for some data and
 normalizes it into a SETOF some record type. It is declared to be STABLE.
 Fedexinvoices consists of about 1.3M rows of medium width. Fedexinvoices.id
 is the primary key on that table, and trim(fedexinvoices.trackno) is indexed
 via the function trim.



 The plan for the equivalent query in Oracle is much smaller and simpler. No
 sequential (or full table) scans on fedexinvoices.







 WITH charges as (

 SELECT fi2.id, smp_pkg.get_invoice_charges(fi2.id)
 charge_info from fedexinvoices fi2

 )

 select fedexinvoices.* from

 fedexinvoices

 inner join charges on charges.id = fedexinvoices.id AND
 (charges.charge_info).charge_name IN ('ADDRESS CORRECTION CHARGE','ADDRESS
 CORRECTION')

 where

 trim(fedexinvoices.trackno)='799159791643'

 ;



 Explain Analyze output, I abbreviated some of the column lists for brevity:



 Nested Loop  (cost=457380.38..487940.77 rows=1 width=1024) (actual
 time=1978019.858..1978019.858 rows=0 loops=1)

   Output: fedexinvoices.id, ………

   Join Filter: (fedexinvoices.id = charges.id)

   Buffers: shared hit=20387611, temp written=94071

   CTE charges

 -  Seq Scan on hits.fedexinvoices fi2  (cost=0.00..457380.38
 rows=1350513 width=8) (actual time=0.613..1964632.763 rows=9007863 loops=1)

   Output: fi2.id, smp_pkg.get_invoice_charges(fi2.id,
 NULL::character varying)

   Buffers: shared hit=20387606

   -  Index Scan using fedexinvoices_trim_track_idx on hits.fedexinvoices
 (cost=0.00..5.46 rows=1 width=1024) (actual time=0.024..0.026 rows=1
 loops=1)

 Output: fedexinvoices.id, ………

 Index Cond: (btrim((fedexinvoices.trackno)::text) =
 '799159791643'::text)

 Buffers: shared hit=5

   -  CTE Scan on charges  (cost=0.00..30386.54 rows=13471 width=8) (actual
 time=1978019.827..1978019.827 rows=0 loops=1)

 Output: charges.id, charges.charge_info

 Filter: (((charges.charge_info).charge_name)::text = ANY ('{ADDRESS
 CORRECTION CHARGE,ADDRESS CORRECTION}'::text[]))

 Buffers: shared hit=20387606, temp written=94071

 Total runtime: 1978214.743 ms

The problem here is very clear.  Oracle is optimizing through the CTE.
 PostgreSQL does not do this by design -- CTE's are used as a forced
materialization step.

merlin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-20 Thread Claudio Freire
On Tue, Nov 20, 2012 at 12:04 PM, Merlin Moncure mmonc...@gmail.com wrote:
 The problem here is very clear.  Oracle is optimizing through the CTE.
  PostgreSQL does not do this by design -- CTE's are used as a forced
 materialization step.

While I love that design (it lets me solve lots of problems for huge
queries), wouldn't pushing constraints into the CTE be a rather safe
optimization?


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-20 Thread Merlin Moncure
On Tue, Nov 20, 2012 at 9:10 AM, Claudio Freire klaussfre...@gmail.com wrote:
 On Tue, Nov 20, 2012 at 12:04 PM, Merlin Moncure mmonc...@gmail.com wrote:
 The problem here is very clear.  Oracle is optimizing through the CTE.
  PostgreSQL does not do this by design -- CTE's are used as a forced
 materialization step.

 While I love that design (it lets me solve lots of problems for huge
 queries), wouldn't pushing constraints into the CTE be a rather safe
 optimization?

sure, or rewrite query as classic join.

merlin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-20 Thread Claudio Freire
On Tue, Nov 20, 2012 at 12:23 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Nov 20, 2012 at 9:10 AM, Claudio Freire klaussfre...@gmail.com 
 wrote:
 On Tue, Nov 20, 2012 at 12:04 PM, Merlin Moncure mmonc...@gmail.com wrote:
 The problem here is very clear.  Oracle is optimizing through the CTE.
  PostgreSQL does not do this by design -- CTE's are used as a forced
 materialization step.

 While I love that design (it lets me solve lots of problems for huge
 queries), wouldn't pushing constraints into the CTE be a rather safe
 optimization?

 sure, or rewrite query as classic join.

I meant for postgres to do automatically. Rewriting as a join wouldn't
work as an optimization fence the way we're used to, but pushing
constraints upwards can only help (especially if highly selective).


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-20 Thread Merlin Moncure
On Wed, Nov 14, 2012 at 8:03 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 On 15 November 2012 01:46, Andrew Dunstan and...@dunslane.net wrote:
 It cuts both ways. I have used CTEs a LOT precisely because this behaviour
 lets me get better plans. Without that I'll be back to using the offset 0
 hack.

 Is the OFFSET 0 hack really so bad? We've been telling people to do
 that for years, so it's already something that we've effectively
 committed to.

IMSNHO, 'OFFSET 0' is completely unreadable black magic.  I agree with
Andrew: CTEs allow for manual composition of queries and can be the
best tool when the planner is outsmarting itself.  In the old days,
we'd extract data to a temp table and join against that: CTE are
essentially a formalization of that technique.  I like things the way
they are; if CTE are hurting your plan, that's an indication you're
using them inappropriately.

merlin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-20 Thread Claudio Freire
On Tue, Nov 20, 2012 at 4:22 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, Nov 14, 2012 at 8:03 PM, Peter Geoghegan pe...@2ndquadrant.com 
 wrote:
 On 15 November 2012 01:46, Andrew Dunstan and...@dunslane.net wrote:
 It cuts both ways. I have used CTEs a LOT precisely because this behaviour
 lets me get better plans. Without that I'll be back to using the offset 0
 hack.

 Is the OFFSET 0 hack really so bad? We've been telling people to do
 that for years, so it's already something that we've effectively
 committed to.

 IMSNHO, 'OFFSET 0' is completely unreadable black magic.  I agree with
 Andrew: CTEs allow for manual composition of queries and can be the
 best tool when the planner is outsmarting itself.  In the old days,
 we'd extract data to a temp table and join against that: CTE are
 essentially a formalization of that technique.  I like things the way
 they are; if CTE are hurting your plan, that's an indication you're
 using them inappropriately.

I agree, **BUT**, I cannot imagine how pushing constraints to the CTE
(under adequate conditions) could be anything but beneficial.

It *could* just be a lack of imagination on my part. But if it were
not, then it'd be nice for it to be done automatically (since this
particular CTE behavior bites enough people already).


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-20 Thread Jon Nelson
My perspective on this is that CTEs *should* be just like creating a
temporary table and then joining to it, but without the
materialization costs. In that respect, they seem like they should be
like nifty VIEWs. If I wanted the behavior of materialization and then
join, I'd do that explicitly with temporary tables, but using CTEs as
an explicit optimization barrier feels like the explaining away
surprising behavior.

As can be seen by the current conversation, not everyone is convinced
that CTEs ought to be an explicit optimization barrier, and setting
that behavior as somehow desirable or explicit (rather than merely an
implementation detail) feels shortsighted to me. I would be delighted
to find that in some future version of PostgreSQL, but  if that is not
to be, at the very least, the verbiage surrounding CTEs might want to
include (perhaps prominently) something along the lines of CTEs are
currently an optimization barrier, but this is an implementation
detail and may change in future versions.  Perhaps even including a
small blurb about what an optimization barrier even means (my
understanding is that it merely forces materialization of that part of
the query).

That's just my perspective, coming at the use of CTEs not as a
PostgreSQL developer, but as somebody who learned about CTEs and
started using them - only to discover surprising behavior.

On Tue, Nov 20, 2012 at 1:22 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, Nov 14, 2012 at 8:03 PM, Peter Geoghegan pe...@2ndquadrant.com 
 wrote:
 On 15 November 2012 01:46, Andrew Dunstan and...@dunslane.net wrote:
 It cuts both ways. I have used CTEs a LOT precisely because this behaviour
 lets me get better plans. Without that I'll be back to using the offset 0
 hack.

 Is the OFFSET 0 hack really so bad? We've been telling people to do
 that for years, so it's already something that we've effectively
 committed to.

 IMSNHO, 'OFFSET 0' is completely unreadable black magic.  I agree with
 Andrew: CTEs allow for manual composition of queries and can be the
 best tool when the planner is outsmarting itself.  In the old days,
 we'd extract data to a temp table and join against that: CTE are
 essentially a formalization of that technique.  I like things the way
 they are; if CTE are hurting your plan, that's an indication you're
 using them inappropriately.

 merlin


 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance



-- 
Jon


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-20 Thread Merlin Moncure
On Tue, Nov 20, 2012 at 1:53 PM, Jon Nelson jnelson+pg...@jamponi.net wrote:
 As can be seen by the current conversation, not everyone is convinced
that CTEs ought to be an explicit optimization barrier

On Tue, Nov 20, 2012 at 1:26 PM, Claudio Freire klaussfre...@gmail.com wrote:
 It *could* just be a lack of imagination on my part. But if it were
 not, then it'd be nice for it to be done automatically (since this
 particular CTE behavior bites enough people already).

Sure.  I just find it personally hard to find a good demarcation line
between A: queries where pushing quals through are universally
beneficial and wanted and B: queries where we are inserting an
explicit materialization step to avoid planner issues, particularly
where there is substantial overlap with between A and C: queries that
are written with a CTE and arguably shouldn't be.

Put another way, I find CTE to express: 'this then that' where joins
express 'this with that'.  So current behavior is not surprising at
all. All that said, there could be a narrow class of low hanging cases
(such as the OP's) that could be sniped...I'm just skeptical.

merlin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-20 Thread Claudio Freire
On Tue, Nov 20, 2012 at 5:24 PM, Merlin Moncure mmonc...@gmail.com wrote:
 On Tue, Nov 20, 2012 at 1:53 PM, Jon Nelson jnelson+pg...@jamponi.net wrote:
 As can be seen by the current conversation, not everyone is convinced
 that CTEs ought to be an explicit optimization barrier

 On Tue, Nov 20, 2012 at 1:26 PM, Claudio Freire klaussfre...@gmail.com 
 wrote:
 It *could* just be a lack of imagination on my part. But if it were
 not, then it'd be nice for it to be done automatically (since this
 particular CTE behavior bites enough people already).

 Sure.  I just find it personally hard to find a good demarcation line
 between A: queries where pushing quals through are universally
 beneficial and wanted and B: queries where we are inserting an
 explicit materialization step to avoid planner issues, particularly
 where there is substantial overlap with between A and C: queries that
 are written with a CTE and arguably shouldn't be.

 Put another way, I find CTE to express: 'this then that' where joins
 express 'this with that'.  So current behavior is not surprising at
 all. All that said, there could be a narrow class of low hanging cases
 (such as the OP's) that could be sniped...I'm just skeptical.

It could work very well towards CTE-including views, where the quals
cannot be added in the view but would be present when the view is
expanded in final queries.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-20 Thread Tom Lane
Jon Nelson jnelson+pg...@jamponi.net writes:
 ... Perhaps even including a
 small blurb about what an optimization barrier even means (my
 understanding is that it merely forces materialization of that part of
 the query).

FWIW, it has nothing to do with materialization; it means that we don't
push conditions down into that subquery, nor pull subexpressions up out
of it, nor rearrange join order across the subquery boundary.  In short
the subquery is planned separately from the outer query.  But it could
then be run by the executor in the usual tuple-at-a-time fashion,
without materializing the whole subquery result.

It is true that CTEScan nodes materialize the subquery output (ie copy
it into a tuplestore), but that's to support multiple CTEScans reading
the same CTE.  One of the optimizations we *should* put in place
sometime is skipping the tuplestore if there's only one CTEScan on the
CTE.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-20 Thread Craig Ringer
On 11/21/2012 12:06 AM, Claudio Freire wrote:
 I meant for postgres to do automatically. Rewriting as a join wouldn't
 work as an optimization fence the way we're used to, but pushing
 constraints upwards can only help (especially if highly selective).
Because people are now used to using CTEs as query hints, it'd probably
cause performance regressions in working queries. Perhaps more
importantly, Pg would have to prove that doing so didn't change queries
that invoked functions with side-effects to avoid changing the results
of currently valid queries.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-20 Thread Claudio Freire
On Tue, Nov 20, 2012 at 8:38 PM, Craig Ringer cr...@2ndquadrant.com wrote:
 On 11/21/2012 12:06 AM, Claudio Freire wrote:
 I meant for postgres to do automatically. Rewriting as a join wouldn't
 work as an optimization fence the way we're used to, but pushing
 constraints upwards can only help (especially if highly selective).
 Because people are now used to using CTEs as query hints, it'd probably
 cause performance regressions in working queries. Perhaps more
 importantly, Pg would have to prove that doing so didn't change queries
 that invoked functions with side-effects to avoid changing the results
 of currently valid queries.

Fair point. Will look into it a bit.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-20 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
 On 11/21/2012 12:06 AM, Claudio Freire wrote:
 I meant for postgres to do automatically. Rewriting as a join wouldn't
 work as an optimization fence the way we're used to, but pushing
 constraints upwards can only help (especially if highly selective).

 Because people are now used to using CTEs as query hints, it'd probably
 cause performance regressions in working queries. Perhaps more
 importantly, Pg would have to prove that doing so didn't change queries
 that invoked functions with side-effects to avoid changing the results
 of currently valid queries.

We could trivially arrange to keep the current semantics if the CTE
query contains any volatile functions (or of course if it's
INSERT/UPDATE/DELETE).  I think we'd also need to not optimize if
it's invoked from more than one place in the outer query.

I think the more interesting question is what cases wouldn't be covered
by such a rule.  Typically you need to use OFFSET 0 in situations where
the planner has guessed wrong about costs or rowcounts, and I think
people are likely using WITH for that as well.  Should we be telling
people that they ought to insert OFFSET 0 in WITH queries if they want
to be sure there's an optimization fence?

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-19 Thread Gavin Flower

On 15/11/12 15:03, Peter Geoghegan wrote:

On 15 November 2012 01:46, Andrew Dunstan and...@dunslane.net wrote:

It cuts both ways. I have used CTEs a LOT precisely because this behaviour
lets me get better plans. Without that I'll be back to using the offset 0
hack.

Is the OFFSET 0 hack really so bad? We've been telling people to do
that for years, so it's already something that we've effectively
committed to.

How about adding the keywords FENCED and NOT FENCED to the SQL 
definition of CTE's - with FENCED being the default?



Cheers,
Gavin



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-14 Thread Andrew Dunstan


On 11/14/2012 10:23 AM, David Greco wrote:


Have a query using a CTE that is performing very poorly. The 
equivalent query against the same data in an Oracle database runs in 
under 1 second, in Postgres  it takes 2000 seconds.


The smp_pkg.get_invoice_charges queries fedexinvoices for some data 
and normalizes it into a SETOF some record type. It is declared to be 
STABLE. Fedexinvoices consists of about 1.3M rows of medium width. 
Fedexinvoices.id is the primary key on that table, and 
trim(fedexinvoices.trackno) is indexed via the function trim.


The plan for the equivalent query in Oracle is much smaller and 
simpler. No sequential (or full table) scans on fedexinvoices.


WITH charges as (

SELECT fi2.id, smp_pkg.get_invoice_charges(fi2.id) 
charge_info from fedexinvoices fi2


)

select fedexinvoices.* from

fedexinvoices

inner join charges on charges.id = fedexinvoices.id AND 
(charges.charge_info).charge_name IN ('ADDRESS CORRECTION 
CHARGE','ADDRESS CORRECTION')


where

trim(fedexinvoices.trackno)='799159791643'

;




Can you explain what you're actually trying to do here? The query looks 
rather odd. Why are you joining this table (or an extract from it) to 
itself?



In any case, you could almost certainly recast it and have it run fast 
by first filtering on the tracking number.



cheers

andrew


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-14 Thread David Greco
You're right. I was translating an oracle query , but looks like PG will allow 
some syntax that is different. Trying to find entries in fedexinvoices where 
smp_pkg.get_invoice_charges(id) returns a record containing charge_name in 
('ADDRESS CORRECTION CHARGE','ADDRESS CORRECTION'). Should return the 
fedexinvoices row and the row from smp_pkg.get_invoice_charges that contains 
the address correction.


Something like this, though this is syntactically incorrect as 
smp_pkg.get_invoice_charges returns a set:


select fedexinvoices.*, (smp_pkg.get_invoice_charges(id)).*
from fedexinvoices
WHERE
trim(fedexinvoices.trackno)='799159791643'
and
(smp_pkg.get_invoice_charges(id)).charge_name IN ('ADDRESS CORRECTION 
CHARGE','ADDRESS CORRECTION')




-Original Message-
From: Andrew Dunstan [mailto:and...@dunslane.net] 
Sent: Wednesday, November 14, 2012 10:51 AM
To: David Greco
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Poor performance using CTE


On 11/14/2012 10:23 AM, David Greco wrote:

 Have a query using a CTE that is performing very poorly. The 
 equivalent query against the same data in an Oracle database runs in 
 under 1 second, in Postgres  it takes 2000 seconds.

 The smp_pkg.get_invoice_charges queries fedexinvoices for some data 
 and normalizes it into a SETOF some record type. It is declared to be 
 STABLE. Fedexinvoices consists of about 1.3M rows of medium width.
 Fedexinvoices.id is the primary key on that table, and
 trim(fedexinvoices.trackno) is indexed via the function trim.

 The plan for the equivalent query in Oracle is much smaller and 
 simpler. No sequential (or full table) scans on fedexinvoices.

 WITH charges as (

 SELECT fi2.id, smp_pkg.get_invoice_charges(fi2.id)
 charge_info from fedexinvoices fi2

 )

 select fedexinvoices.* from

 fedexinvoices

 inner join charges on charges.id = fedexinvoices.id AND 
 (charges.charge_info).charge_name IN ('ADDRESS CORRECTION 
 CHARGE','ADDRESS CORRECTION')

 where

 trim(fedexinvoices.trackno)='799159791643'

 ;



Can you explain what you're actually trying to do here? The query looks rather 
odd. Why are you joining this table (or an extract from it) to itself?


In any case, you could almost certainly recast it and have it run fast by first 
filtering on the tracking number.


cheers

andrew




-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Poor performance using CTE

2012-11-14 Thread Andrew Dunstan


On 11/14/2012 10:56 AM, David Greco wrote:

You're right. I was translating an oracle query , but looks like PG will allow 
some syntax that is different. Trying to find entries in fedexinvoices where 
smp_pkg.get_invoice_charges(id) returns a record containing charge_name in 
('ADDRESS CORRECTION CHARGE','ADDRESS CORRECTION'). Should return the 
fedexinvoices row and the row from smp_pkg.get_invoice_charges that contains 
the address correction.


Something like this, though this is syntactically incorrect as 
smp_pkg.get_invoice_charges returns a set:


select fedexinvoices.*, (smp_pkg.get_invoice_charges(id)).*
from fedexinvoices
WHERE
trim(fedexinvoices.trackno)='799159791643'
and
(smp_pkg.get_invoice_charges(id)).charge_name IN ('ADDRESS CORRECTION 
CHARGE','ADDRESS CORRECTION')



First, please don't top-post when someone has replied underneath your 
post. It makes the thread totally unreadable. See 
http://idallen.com/topposting.html


You could do something like this:

WITH invoices as
(
   select *
   from fedexinvoices
   where trim(fedexinvoices.trackno)='799159791643'
),

charges as
(
   SELECT fi2.id, smp_pkg.get_invoice_charges(fi2.id) charge_info
   from fedexinvoices fi2 join invoices i on i.id = f12.id
)

select invoices.*
from invoices
inner join charges on charges.id = invoices.id
AND (charges.charge_info).charge_name IN ('ADDRESS CORRECTION 
CHARGE','ADDRESS CORRECTION')


;


Or probably something way simpler but I just did this fairly quickly and 
mechanically



cheers

andrew


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-14 Thread David Greco
-Original Message-
From: Andrew Dunstan [mailto:and...@dunslane.net] 
Sent: Wednesday, November 14, 2012 11:08 AM
To: David Greco
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Poor performance using CTE


On 11/14/2012 10:56 AM, David Greco wrote:
 You're right. I was translating an oracle query , but looks like PG will 
 allow some syntax that is different. Trying to find entries in fedexinvoices 
 where smp_pkg.get_invoice_charges(id) returns a record containing charge_name 
 in ('ADDRESS CORRECTION CHARGE','ADDRESS CORRECTION'). Should return the 
 fedexinvoices row and the row from smp_pkg.get_invoice_charges that contains 
 the address correction.


 Something like this, though this is syntactically incorrect as 
 smp_pkg.get_invoice_charges returns a set:


 select fedexinvoices.*, (smp_pkg.get_invoice_charges(id)).*
 from fedexinvoices
 WHERE
 trim(fedexinvoices.trackno)='799159791643'
 and
 (smp_pkg.get_invoice_charges(id)).charge_name IN ('ADDRESS CORRECTION 
 CHARGE','ADDRESS CORRECTION')


First, please don't top-post when someone has replied underneath your post. It 
makes the thread totally unreadable. See http://idallen.com/topposting.html

You could do something like this:

WITH invoices as
(
select *
from fedexinvoices
where trim(fedexinvoices.trackno)='799159791643'
),

charges as
(
SELECT fi2.id, smp_pkg.get_invoice_charges(fi2.id) charge_info
from fedexinvoices fi2 join invoices i on i.id = f12.id
)

select invoices.*
from invoices
inner join charges on charges.id = invoices.id
 AND (charges.charge_info).charge_name IN ('ADDRESS CORRECTION 
CHARGE','ADDRESS CORRECTION')

;


Or probably something way simpler but I just did this fairly quickly and 
mechanically


cheers

andrew







Thanks, that did the trick. Though I'm still not clear as to why. 






-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-14 Thread Tom Lane
David Greco david_gr...@harte-hanks.com writes:
 Thanks, that did the trick. Though I'm still not clear as to why. 

PG treats WITH as an optimization fence --- the WITH query will be
executed pretty much as-is.  It may be that Oracle flattens the query
somehow; though if you're using black-box functions in both cases,
it's not obvious where the optimizer could get any purchase that way.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-14 Thread Craig Ringer
On 11/15/2012 12:29 AM, Tom Lane wrote:
 David Greco david_gr...@harte-hanks.com writes:
 Thanks, that did the trick. Though I'm still not clear as to why. 
 PG treats WITH as an optimization fence --- the WITH query will be
 executed pretty much as-is.  It may be that Oracle flattens the query
 somehow; though if you're using black-box functions in both cases,
 it's not obvious where the optimizer could get any purchase that way.


I was looking through the latest spec drafts I have access to and
couldn't find any reference to Pg's optimisation-fence-for-CTEs
behaviour being required by the standard, though I've repeatedly seen it
said that there is such a requirement.

Do you know where it's specified?

All I can see is that the optimised result must have the same effect as
the original. That'd mean that wCTEs and CTE terms that use VOLATILE
functions or functions with side-effects couldn't be optimised into
other queries. Simple CTEs could be, though, and there are times I've
really wished I could use a CTE but I've had to use a set-returning
subquery to get reasonable plans.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-14 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
 I was looking through the latest spec drafts I have access to and
 couldn't find any reference to Pg's optimisation-fence-for-CTEs
 behaviour being required by the standard, though I've repeatedly seen it
 said that there is such a requirement.

I don't believe it's required by the standard (it's hard to see how it
could be, when query optimization is a topic outside the spec to start
with).  However, we allow INSERT/UPDATE/DELETE RETURNING inside WITH,
and for those I think you really need to treat WITH as an optimization
fence.  It's a lot more debatable for SELECT; there are some advantages
to providing a fence this way but there are definitely downsides too.
I could see adjusting that definition in the future, as we get more
experience with use of CTEs.

regards, tom lane


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-14 Thread Andrew Dunstan


On 11/14/2012 08:17 PM, Craig Ringer wrote:

On 11/15/2012 12:29 AM, Tom Lane wrote:

David Greco david_gr...@harte-hanks.com writes:

Thanks, that did the trick. Though I'm still not clear as to why.

PG treats WITH as an optimization fence --- the WITH query will be
executed pretty much as-is.  It may be that Oracle flattens the query
somehow; though if you're using black-box functions in both cases,
it's not obvious where the optimizer could get any purchase that way.


I was looking through the latest spec drafts I have access to and
couldn't find any reference to Pg's optimisation-fence-for-CTEs
behaviour being required by the standard, though I've repeatedly seen it
said that there is such a requirement.

Do you know where it's specified?

All I can see is that the optimised result must have the same effect as
the original. That'd mean that wCTEs and CTE terms that use VOLATILE
functions or functions with side-effects couldn't be optimised into
other queries. Simple CTEs could be, though, and there are times I've
really wished I could use a CTE but I've had to use a set-returning
subquery to get reasonable plans.



It cuts both ways. I have used CTEs a LOT precisely because this 
behaviour lets me get better plans. Without that I'll be back to using 
the offset 0 hack.


cheers

andrew


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: SOLVED - RE: [PERFORM] Poor performance using CTE

2012-11-14 Thread Peter Geoghegan
On 15 November 2012 01:46, Andrew Dunstan and...@dunslane.net wrote:
 It cuts both ways. I have used CTEs a LOT precisely because this behaviour
 lets me get better plans. Without that I'll be back to using the offset 0
 hack.

Is the OFFSET 0 hack really so bad? We've been telling people to do
that for years, so it's already something that we've effectively
committed to.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance