Re: [HACKERS] Future directions for inheritance-hierarchy statistics

2015-03-21 Thread Jim Nasby

On 3/18/15 8:26 AM, Robert Haas wrote:

In
fact, EnterpriseDB has run into a number of customer situations where
planning time even for non-inheritance queries is substantially higher
than, shall we say, a competing commercial product.


If it's the commercial product I'm thinking of, they use multiple levels 
of caching to avoid both parse costs as well as plan costs. It's always 
impressed me that we didn't have to resort to such shenanigans, but 
perhaps there's only so long we can avoid them.

--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Future directions for inheritance-hierarchy statistics

2015-03-18 Thread Robert Haas
On Tue, Mar 17, 2015 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 This would have one significant drawback, which is that planning for
 large inheritance trees (many children) would probably get noticeably
 slower.  (But in the common case that constraint exclusion limits a
 query to scanning just one or a few children, the hit would be small.)

 That's a pretty big drawback.  I'm not sure whether it's big enough to
 sink the whole idea, but we really need to make planning time on large
 inheritance trees cheaper, not more expensive.

 Ah, but note the point about how there's no added cost for partitions that
 are removed by constraint exclusion.  That should mean that in practical
 use it's not a huge problem.  (If you're going to scan K partitions, you
 should not be surprised that planning time is O(K).  It will be anyway
 thanks to other things such as index selection.)

 Also, you're ignoring the prospect of getting better estimates and hence
 better plans through having stats that dynamically adapt to the set of
 partitions being scanned.  Given the lousy state of maintenance of
 whole-tree stats, I really think that this consideration might outweigh
 even a significant planning-time hit.  Shaving planning time by producing
 crappy estimates isn't usually a good tradeoff.

Perhaps so, but I know that the planning time of large inheritance
trees has been a major issue for some of EnterpriseDB's customers.  In
fact, EnterpriseDB has run into a number of customer situations where
planning time even for non-inheritance queries is substantially higher
than, shall we say, a competing commercial product.  With inheritance,
even people who aren't making comparisons with other products start to
get unhappy.  I've always been very pleased with the quality of plans
that our planner generates, but it's becoming increasingly clear to me
that at least one other product is able to provide good plans at a
significantly lower CPU cost, and inheritance is particular trouble
spot.  I don't know exactly what we ought to do about that and perhaps
it's to one side of the issue you're raising here, but I do think it's
an issue that we (the PostgreSQL community) ought to be thinking
about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Future directions for inheritance-hierarchy statistics

2015-03-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Mar 17, 2015 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Also, you're ignoring the prospect of getting better estimates and hence
 better plans through having stats that dynamically adapt to the set of
 partitions being scanned.  Given the lousy state of maintenance of
 whole-tree stats, I really think that this consideration might outweigh
 even a significant planning-time hit.  Shaving planning time by producing
 crappy estimates isn't usually a good tradeoff.

 Perhaps so, but I know that the planning time of large inheritance
 trees has been a major issue for some of EnterpriseDB's customers.  In
 fact, EnterpriseDB has run into a number of customer situations where
 planning time even for non-inheritance queries is substantially higher
 than, shall we say, a competing commercial product.  With inheritance,
 even people who aren't making comparisons with other products start to
 get unhappy.  I've always been very pleased with the quality of plans
 that our planner generates, but it's becoming increasingly clear to me
 that at least one other product is able to provide good plans at a
 significantly lower CPU cost, and inheritance is particular trouble
 spot.  I don't know exactly what we ought to do about that and perhaps
 it's to one side of the issue you're raising here, but I do think it's
 an issue that we (the PostgreSQL community) ought to be thinking
 about.

Well, we know that the current approach to inheritance isn't very well
attuned to standard partitioning situations, because it treats every
inheritance child as a de novo problem.  I continue to maintain that
the right fix for that is a partitioning feature that forbids any schema
variation across partitions, which the planner would use to avoid doing
O(N) work when dealing with an N-partition table.  Worrying about
changes that would already be involving less than O(N) work is rather
pointless in this context, IMO.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Future directions for inheritance-hierarchy statistics

2015-03-17 Thread Robert Haas
For some reason, I didn't get Tom's email, only this reply.

On Tue, Mar 17, 2015 at 3:44 AM, Etsuro Fujita
fujita.ets...@lab.ntt.co.jp wrote:
 On 2015/03/17 5:18, Tom Lane wrote:
 A few days ago I posted a very-much-WIP patch for making the planner
 dynamically combine statistics for each member of an appendrel:
 http://www.postgresql.org/message-id/22598.1425686...@sss.pgh.pa.us

 That patch was only intended to handle the case of an appendrel generated
 by a UNION ALL construct.  But it occurs to me that we could easily
 change it to also apply to appendrels generated from inheritance trees.
 Then we'd no longer need the whole-inheritance-tree statistics that
 ANALYZE currently produces, because we'd only ever look at per-table
 statistics in pg_statistic.

 This would have one significant drawback, which is that planning for
 large inheritance trees (many children) would probably get noticeably
 slower.  (But in the common case that constraint exclusion limits a
 query to scanning just one or a few children, the hit would be small.)

That's a pretty big drawback.  I'm not sure whether it's big enough to
sink the whole idea, but we really need to make planning time on large
inheritance trees cheaper, not more expensive.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Future directions for inheritance-hierarchy statistics

2015-03-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 For some reason, I didn't get Tom's email, only this reply.
 On 2015/03/17 5:18, Tom Lane wrote:
 This would have one significant drawback, which is that planning for
 large inheritance trees (many children) would probably get noticeably
 slower.  (But in the common case that constraint exclusion limits a
 query to scanning just one or a few children, the hit would be small.)

 That's a pretty big drawback.  I'm not sure whether it's big enough to
 sink the whole idea, but we really need to make planning time on large
 inheritance trees cheaper, not more expensive.

Ah, but note the point about how there's no added cost for partitions that
are removed by constraint exclusion.  That should mean that in practical
use it's not a huge problem.  (If you're going to scan K partitions, you
should not be surprised that planning time is O(K).  It will be anyway
thanks to other things such as index selection.)

Also, you're ignoring the prospect of getting better estimates and hence
better plans through having stats that dynamically adapt to the set of
partitions being scanned.  Given the lousy state of maintenance of
whole-tree stats, I really think that this consideration might outweigh
even a significant planning-time hit.  Shaving planning time by producing
crappy estimates isn't usually a good tradeoff.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Future directions for inheritance-hierarchy statistics

2015-03-17 Thread Etsuro Fujita
On 2015/03/17 5:18, Tom Lane wrote:
 A few days ago I posted a very-much-WIP patch for making the planner
 dynamically combine statistics for each member of an appendrel:
 http://www.postgresql.org/message-id/22598.1425686...@sss.pgh.pa.us
 
 That patch was only intended to handle the case of an appendrel generated
 by a UNION ALL construct.  But it occurs to me that we could easily
 change it to also apply to appendrels generated from inheritance trees.
 Then we'd no longer need the whole-inheritance-tree statistics that
 ANALYZE currently produces, because we'd only ever look at per-table
 statistics in pg_statistic.
 
 This would have one significant drawback, which is that planning for
 large inheritance trees (many children) would probably get noticeably
 slower.  (But in the common case that constraint exclusion limits a
 query to scanning just one or a few children, the hit would be small.)
 
 On the other hand, there would be two very significant benefits.
 First, that we would automatically get statistics that account for
 partitions being eliminated by constraint exclusion, because only the
 non-eliminated partitions are present in the appendrel.  And second,
 that we'd be able to forget the whole problem of getting autovacuum
 to create whole-inheritance-tree stats.  Right now I'm doubtful that
 typical users are getting good up-to-date stats at all for queries of
 this sort, because autovacuum will only update those stats if it decides
 it needs to analyze the parent table.  Which is commonly empty, so that
 there's never a reason to fire an analyze on it.  (We'd left this as
 a problem to be solved later when we put in the whole-tree stats
 feature in 9.0, but no progress has been made on solving it.)
 
 So I think that going in this direction is clearly a win and we ought
 to pursue it.  It's not happening for 9.5 of course, because there's
 still a great deal of work to do before anything like this would be
 committable.  But I would like to establish a consensus that this
 would be a sensible thing to do in 9.6.
 
 The reason I bring it up now is that the inheritance-for-foreign-tables
 patch has some code that I don't much like for controlling what happens
 with those whole-tree stats when some of the children are foreign tables
 that lack ANALYZE support.  If the long-term plan is that whole-tree
 stats are going away altogether, then it won't be terribly important
 exactly what happens in that case, so we can just do some simple/easy
 kluge in the short term and not have to have an argument about what's
 the best thing to do.

That seems like a good idea.

Best regards,
Etsuro Fujita


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Future directions for inheritance-hierarchy statistics

2015-03-16 Thread Tom Lane
A few days ago I posted a very-much-WIP patch for making the planner
dynamically combine statistics for each member of an appendrel:
http://www.postgresql.org/message-id/22598.1425686...@sss.pgh.pa.us

That patch was only intended to handle the case of an appendrel generated
by a UNION ALL construct.  But it occurs to me that we could easily
change it to also apply to appendrels generated from inheritance trees.
Then we'd no longer need the whole-inheritance-tree statistics that
ANALYZE currently produces, because we'd only ever look at per-table
statistics in pg_statistic.

This would have one significant drawback, which is that planning for
large inheritance trees (many children) would probably get noticeably
slower.  (But in the common case that constraint exclusion limits a
query to scanning just one or a few children, the hit would be small.)

On the other hand, there would be two very significant benefits.
First, that we would automatically get statistics that account for
partitions being eliminated by constraint exclusion, because only the
non-eliminated partitions are present in the appendrel.  And second,
that we'd be able to forget the whole problem of getting autovacuum
to create whole-inheritance-tree stats.  Right now I'm doubtful that
typical users are getting good up-to-date stats at all for queries of
this sort, because autovacuum will only update those stats if it decides
it needs to analyze the parent table.  Which is commonly empty, so that
there's never a reason to fire an analyze on it.  (We'd left this as
a problem to be solved later when we put in the whole-tree stats
feature in 9.0, but no progress has been made on solving it.)

So I think that going in this direction is clearly a win and we ought
to pursue it.  It's not happening for 9.5 of course, because there's
still a great deal of work to do before anything like this would be
committable.  But I would like to establish a consensus that this
would be a sensible thing to do in 9.6.

The reason I bring it up now is that the inheritance-for-foreign-tables
patch has some code that I don't much like for controlling what happens
with those whole-tree stats when some of the children are foreign tables
that lack ANALYZE support.  If the long-term plan is that whole-tree
stats are going away altogether, then it won't be terribly important
exactly what happens in that case, so we can just do some simple/easy
kluge in the short term and not have to have an argument about what's
the best thing to do.

Comments?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers