Re: [HACKERS] Future directions for inheritance-hierarchy statistics
On 3/18/15 8:26 AM, Robert Haas wrote: In fact, EnterpriseDB has run into a number of customer situations where planning time even for non-inheritance queries is substantially higher than, shall we say, a competing commercial product. If it's the commercial product I'm thinking of, they use multiple levels of caching to avoid both parse costs as well as plan costs. It's always impressed me that we didn't have to resort to such shenanigans, but perhaps there's only so long we can avoid them. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future directions for inheritance-hierarchy statistics
On Tue, Mar 17, 2015 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote: This would have one significant drawback, which is that planning for large inheritance trees (many children) would probably get noticeably slower. (But in the common case that constraint exclusion limits a query to scanning just one or a few children, the hit would be small.) That's a pretty big drawback. I'm not sure whether it's big enough to sink the whole idea, but we really need to make planning time on large inheritance trees cheaper, not more expensive. Ah, but note the point about how there's no added cost for partitions that are removed by constraint exclusion. That should mean that in practical use it's not a huge problem. (If you're going to scan K partitions, you should not be surprised that planning time is O(K). It will be anyway thanks to other things such as index selection.) Also, you're ignoring the prospect of getting better estimates and hence better plans through having stats that dynamically adapt to the set of partitions being scanned. Given the lousy state of maintenance of whole-tree stats, I really think that this consideration might outweigh even a significant planning-time hit. Shaving planning time by producing crappy estimates isn't usually a good tradeoff. Perhaps so, but I know that the planning time of large inheritance trees has been a major issue for some of EnterpriseDB's customers. In fact, EnterpriseDB has run into a number of customer situations where planning time even for non-inheritance queries is substantially higher than, shall we say, a competing commercial product. With inheritance, even people who aren't making comparisons with other products start to get unhappy. I've always been very pleased with the quality of plans that our planner generates, but it's becoming increasingly clear to me that at least one other product is able to provide good plans at a significantly lower CPU cost, and inheritance is particular trouble spot. I don't know exactly what we ought to do about that and perhaps it's to one side of the issue you're raising here, but I do think it's an issue that we (the PostgreSQL community) ought to be thinking about. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future directions for inheritance-hierarchy statistics
Robert Haas robertmh...@gmail.com writes: On Tue, Mar 17, 2015 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote: Also, you're ignoring the prospect of getting better estimates and hence better plans through having stats that dynamically adapt to the set of partitions being scanned. Given the lousy state of maintenance of whole-tree stats, I really think that this consideration might outweigh even a significant planning-time hit. Shaving planning time by producing crappy estimates isn't usually a good tradeoff. Perhaps so, but I know that the planning time of large inheritance trees has been a major issue for some of EnterpriseDB's customers. In fact, EnterpriseDB has run into a number of customer situations where planning time even for non-inheritance queries is substantially higher than, shall we say, a competing commercial product. With inheritance, even people who aren't making comparisons with other products start to get unhappy. I've always been very pleased with the quality of plans that our planner generates, but it's becoming increasingly clear to me that at least one other product is able to provide good plans at a significantly lower CPU cost, and inheritance is particular trouble spot. I don't know exactly what we ought to do about that and perhaps it's to one side of the issue you're raising here, but I do think it's an issue that we (the PostgreSQL community) ought to be thinking about. Well, we know that the current approach to inheritance isn't very well attuned to standard partitioning situations, because it treats every inheritance child as a de novo problem. I continue to maintain that the right fix for that is a partitioning feature that forbids any schema variation across partitions, which the planner would use to avoid doing O(N) work when dealing with an N-partition table. Worrying about changes that would already be involving less than O(N) work is rather pointless in this context, IMO. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future directions for inheritance-hierarchy statistics
For some reason, I didn't get Tom's email, only this reply. On Tue, Mar 17, 2015 at 3:44 AM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp wrote: On 2015/03/17 5:18, Tom Lane wrote: A few days ago I posted a very-much-WIP patch for making the planner dynamically combine statistics for each member of an appendrel: http://www.postgresql.org/message-id/22598.1425686...@sss.pgh.pa.us That patch was only intended to handle the case of an appendrel generated by a UNION ALL construct. But it occurs to me that we could easily change it to also apply to appendrels generated from inheritance trees. Then we'd no longer need the whole-inheritance-tree statistics that ANALYZE currently produces, because we'd only ever look at per-table statistics in pg_statistic. This would have one significant drawback, which is that planning for large inheritance trees (many children) would probably get noticeably slower. (But in the common case that constraint exclusion limits a query to scanning just one or a few children, the hit would be small.) That's a pretty big drawback. I'm not sure whether it's big enough to sink the whole idea, but we really need to make planning time on large inheritance trees cheaper, not more expensive. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future directions for inheritance-hierarchy statistics
Robert Haas robertmh...@gmail.com writes: For some reason, I didn't get Tom's email, only this reply. On 2015/03/17 5:18, Tom Lane wrote: This would have one significant drawback, which is that planning for large inheritance trees (many children) would probably get noticeably slower. (But in the common case that constraint exclusion limits a query to scanning just one or a few children, the hit would be small.) That's a pretty big drawback. I'm not sure whether it's big enough to sink the whole idea, but we really need to make planning time on large inheritance trees cheaper, not more expensive. Ah, but note the point about how there's no added cost for partitions that are removed by constraint exclusion. That should mean that in practical use it's not a huge problem. (If you're going to scan K partitions, you should not be surprised that planning time is O(K). It will be anyway thanks to other things such as index selection.) Also, you're ignoring the prospect of getting better estimates and hence better plans through having stats that dynamically adapt to the set of partitions being scanned. Given the lousy state of maintenance of whole-tree stats, I really think that this consideration might outweigh even a significant planning-time hit. Shaving planning time by producing crappy estimates isn't usually a good tradeoff. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Future directions for inheritance-hierarchy statistics
On 2015/03/17 5:18, Tom Lane wrote: A few days ago I posted a very-much-WIP patch for making the planner dynamically combine statistics for each member of an appendrel: http://www.postgresql.org/message-id/22598.1425686...@sss.pgh.pa.us That patch was only intended to handle the case of an appendrel generated by a UNION ALL construct. But it occurs to me that we could easily change it to also apply to appendrels generated from inheritance trees. Then we'd no longer need the whole-inheritance-tree statistics that ANALYZE currently produces, because we'd only ever look at per-table statistics in pg_statistic. This would have one significant drawback, which is that planning for large inheritance trees (many children) would probably get noticeably slower. (But in the common case that constraint exclusion limits a query to scanning just one or a few children, the hit would be small.) On the other hand, there would be two very significant benefits. First, that we would automatically get statistics that account for partitions being eliminated by constraint exclusion, because only the non-eliminated partitions are present in the appendrel. And second, that we'd be able to forget the whole problem of getting autovacuum to create whole-inheritance-tree stats. Right now I'm doubtful that typical users are getting good up-to-date stats at all for queries of this sort, because autovacuum will only update those stats if it decides it needs to analyze the parent table. Which is commonly empty, so that there's never a reason to fire an analyze on it. (We'd left this as a problem to be solved later when we put in the whole-tree stats feature in 9.0, but no progress has been made on solving it.) So I think that going in this direction is clearly a win and we ought to pursue it. It's not happening for 9.5 of course, because there's still a great deal of work to do before anything like this would be committable. But I would like to establish a consensus that this would be a sensible thing to do in 9.6. The reason I bring it up now is that the inheritance-for-foreign-tables patch has some code that I don't much like for controlling what happens with those whole-tree stats when some of the children are foreign tables that lack ANALYZE support. If the long-term plan is that whole-tree stats are going away altogether, then it won't be terribly important exactly what happens in that case, so we can just do some simple/easy kluge in the short term and not have to have an argument about what's the best thing to do. That seems like a good idea. Best regards, Etsuro Fujita -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Future directions for inheritance-hierarchy statistics
A few days ago I posted a very-much-WIP patch for making the planner dynamically combine statistics for each member of an appendrel: http://www.postgresql.org/message-id/22598.1425686...@sss.pgh.pa.us That patch was only intended to handle the case of an appendrel generated by a UNION ALL construct. But it occurs to me that we could easily change it to also apply to appendrels generated from inheritance trees. Then we'd no longer need the whole-inheritance-tree statistics that ANALYZE currently produces, because we'd only ever look at per-table statistics in pg_statistic. This would have one significant drawback, which is that planning for large inheritance trees (many children) would probably get noticeably slower. (But in the common case that constraint exclusion limits a query to scanning just one or a few children, the hit would be small.) On the other hand, there would be two very significant benefits. First, that we would automatically get statistics that account for partitions being eliminated by constraint exclusion, because only the non-eliminated partitions are present in the appendrel. And second, that we'd be able to forget the whole problem of getting autovacuum to create whole-inheritance-tree stats. Right now I'm doubtful that typical users are getting good up-to-date stats at all for queries of this sort, because autovacuum will only update those stats if it decides it needs to analyze the parent table. Which is commonly empty, so that there's never a reason to fire an analyze on it. (We'd left this as a problem to be solved later when we put in the whole-tree stats feature in 9.0, but no progress has been made on solving it.) So I think that going in this direction is clearly a win and we ought to pursue it. It's not happening for 9.5 of course, because there's still a great deal of work to do before anything like this would be committable. But I would like to establish a consensus that this would be a sensible thing to do in 9.6. The reason I bring it up now is that the inheritance-for-foreign-tables patch has some code that I don't much like for controlling what happens with those whole-tree stats when some of the children are foreign tables that lack ANALYZE support. If the long-term plan is that whole-tree stats are going away altogether, then it won't be terribly important exactly what happens in that case, so we can just do some simple/easy kluge in the short term and not have to have an argument about what's the best thing to do. Comments? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers