Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-12-05 Thread Merlin Moncure
On Fri, Dec 5, 2014 at 12:46 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 30 September 2014 at 05:53, Simon Riggs si...@2ndquadrant.com wrote:
 On 29 September 2014 16:00, Merlin Moncure mmonc...@gmail.com wrote:
 On Fri, Sep 26, 2014 at 3:06 AM, Simon Riggs si...@2ndquadrant.com wrote:
 The problem, as I see it, is different. We assume that if there are
 100 distinct values and you use LIMIT 1 that you would only need to
 scan 1% of rows. We assume that the data is arranged in the table in a
 very homogenous layout. When data is not, and it seldom is, we get
 problems.

 Hm, good point -- 'data proximity'.  At least in theory, can't this be
 measured and quantified?  For example, given a number of distinct
 values, you could estimate the % of pages read (or maybe non
 sequential seeks relative to the number of pages) you'd need to read
 all instances of a particular value in the average (or perhaps the
 worst) case.   One way of trying to calculate that would be to look at
 proximity of values in sampled pages (and maybe a penalty assigned for
 high update activity relative to table size).  Data proximity would
 then become a cost coefficient to the benefits of LIMIT.

 The necessary first step to this is to realise that we can't simply
 apply the LIMIT as a reduction in query cost, in all cases.

 The way I'm seeing it, you can't assume the LIMIT will apply to any
 IndexScan that doesn't have an index condition. If it has just a
 filter, or nothing at all, just an ordering then it could easily scan
 the whole index if the stats are wrong.

 So plans like this could be wrong, by assuming the scan will end
 earlier because of the LIMIT than it actually will.

 Limit
   IndexScan (no index cond)

 Limit
   NestJoin
 IndexScan (no index cond)
 SomeScan

 Limit
   NestJoin
 NestJoin
   IndexScan (no index cond)
   SomeScan
SomeScan

 and deeper...

 I'm looking for a way to identify and exclude such plans, assuming
 that this captures at least some of the problem plans.

 After looking at this for some time I now have a patch that solves this.

 It relies on the observation that index scans with no bounded quals
 don't play nicely with LIMIT. The solution relies upon the point that
 LIMIT does not reduce the startup cost of plans, only the total cost.
 So we can solve the problem by keeping the total cost estimate, just
 move some of that into startup cost so LIMIT does not reduce costs as
 much as before.

 It's a simple patch, but it solves the test cases I know about and
 does almost nothing to planning time.

 I tried much less subtle approaches involving direct prevention of
 LIMIT pushdown but the code was much too complex for my liking.

Neat -- got any test cases (would this have prevented OP's problem)?

merlin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Yet another abort-early plan disaster on 9.3

2014-12-05 Thread Simon Riggs
On 6 December 2014 at 00:45, Merlin Moncure mmonc...@gmail.com wrote:

 Neat -- got any test cases (would this have prevented OP's problem)?

No test case was posted, so I am unable to confirm.

A test case I produced that appears to be the same issue is fixed.

I await confirmation from the OP.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Query doesn't use index on hstore column

2014-12-05 Thread Michael Barker

 Well, first off, a btree index is fairly useless for this query,
 because btree has no concept that the hstore has any sub-structure.
 A GIN index or GIST index could work though.  Secondly, you have to
 remember that indexable WHERE conditions in Postgres are *always* of
 the form WHERE indexed_column indexable_operator some_comparison_value.


And the student was enlightened

Cheers, seeing sensible explain plans now.


 You might care to read
 http://www.postgresql.org/docs/9.4/static/indexes.html
 to get a better handle on what Postgres indexes can and can't do.


Will do, thanks again.

Mike.