Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-04 Thread Martijn van Oosterhout
Fascinating.

On Fri, Jul 04, 2014 at 10:47:06AM +0800, gotoschool6g wrote:
 slow query(8531 ms):
 SELECT ST_Distance_Sphere(shape,ST_GeomFromText('POINT(116.41386186784513 
 40.12211338311868)')) FROM road order by id LIMIT 1;
 
 explain output:
 Limit  (cost=4653.48..4653.48 rows=1 width=3612)
   -  Sort  (cost=4653.48..4683.06 rows=11832 width=3612)
 Sort Key: id
 -  Seq Scan on road  (cost=0.00..4594.32 rows=11832 width=3612)
 
 fast query(16ms):
 select ST_Distance_Sphere(shape,ST_GeomFromText('POINT(116.41386186784513 
 40.12211338311868)')) from (SELECT shape FROM road order by id  LIMIT 1) a
 
 explain output:
 Subquery Scan on a  (cost=1695.48..1695.74 rows=1 width=3608)
   -  Limit  (cost=1695.48..1695.48 rows=1 width=3612)
 -  Sort  (cost=1695.48..1725.06 rows=11832 width=3612)
   Sort Key: road.id
   -  Seq Scan on road  (cost=0.00..1636.32 rows=11832 
 width=3612)

So Postgres knows perfectly well that it's expensive, it just doesn't
appear to understand it has the option of moving the calculation above
the limit.

In this case though, it seems an index on road(id) would make it
instant in any case.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-03 Thread gotoschool6g
slow query(8531 ms):
SELECT ST_Distance_Sphere(shape,ST_GeomFromText('POINT(116.41386186784513 
40.12211338311868)')) FROM road order by id LIMIT 1;

explain output:
Limit  (cost=4653.48..4653.48 rows=1 width=3612)
  -  Sort  (cost=4653.48..4683.06 rows=11832 width=3612)
Sort Key: id
-  Seq Scan on road  (cost=0.00..4594.32 rows=11832 width=3612)

fast query(16ms):
select ST_Distance_Sphere(shape,ST_GeomFromText('POINT(116.41386186784513 
40.12211338311868)')) from (SELECT shape FROM road order by id  LIMIT 1) a

explain output:
Subquery Scan on a  (cost=1695.48..1695.74 rows=1 width=3608)
  -  Limit  (cost=1695.48..1695.48 rows=1 width=3612)
-  Sort  (cost=1695.48..1725.06 rows=11832 width=3612)
  Sort Key: road.id
  -  Seq Scan on road  (cost=0.00..1636.32 rows=11832 width=3612)

CREATE TABLE road
(
  shape geometry,
  id integer
)
WITH (
  OIDS=FALSE
);

There are redundant call when sorting?


 On Tue, Jul 1, 2014 at 2:16 PM, Martijn van Oosterhout 
 klep...@svana.org wrote: 
  On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote: 
  The simplified scene: 
  select slowfunction(s) from a order by b limit 1; 
  is slow than 
  select slowfunction(s) from (select s from a order by b limit 1) as z; 
  if there are many records in table 'a'. 
  
  
  The real scene. Function  ST_Distance_Sphere is slow, the query: 
  SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road 
  order by c limit 1; 
  is slow than: 
  select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from (SELECT s 
  from road order by c limit 1) as a; 
  There are about 7000 records in 'road'. 
  
  I think to help here I think we need the EXPLAIN ANALYSE output for 
  both queries. 
  
 Well, I think the problem is a well understood one: there is no 
 guarantee that functions-in-select-list are called exactly once per 
 output row.  This is documented -- for example see here: 
 http://www.postgresql.org/docs/9.1/static/explicit-locking.html#ADVISORY-LOCKS.
  
 In short, if you want very precise control of function evaluation use 
 a subquery, or, if you're really paranoid, a CTE. 

I'm probably dense, but I'm not sure I understand. Or it is that the 
slowfunction() is called prior to the sort? That seems insane. 

Have a nice day, 
--  
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/ 
 He who writes carelessly confesses thereby at the very outset that he does 
 not attach much importance to his own thoughts. 
   -- Arthur Schopenhauer 
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-02 Thread Martijn van Oosterhout
On Tue, Jul 01, 2014 at 02:36:55PM -0500, Merlin Moncure wrote:
 On Tue, Jul 1, 2014 at 2:16 PM, Martijn van Oosterhout
 klep...@svana.org wrote:
  On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote:
  The simplified scene:
  select slowfunction(s) from a order by b limit 1;
  is slow than
  select slowfunction(s) from (select s from a order by b limit 1) as z;
  if there are many records in table 'a'.
 
 
  The real scene. Function  ST_Distance_Sphere is slow, the query:
  SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road 
  order by c limit 1;
  is slow than:
  select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from (SELECT s 
  from road order by c limit 1) as a;
  There are about 7000 records in 'road'.
 
  I think to help here I think we need the EXPLAIN ANALYSE output for
  both queries.
 
 Well, I think the problem is a well understood one: there is no
 guarantee that functions-in-select-list are called exactly once per
 output row.  This is documented -- for example see here:
 http://www.postgresql.org/docs/9.1/static/explicit-locking.html#ADVISORY-LOCKS.
 In short, if you want very precise control of function evaluation use
 a subquery, or, if you're really paranoid, a CTE.

I'm probably dense, but I'm not sure I understand. Or it is that the
slowfunction() is called prior to the sort? That seems insane.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-02 Thread David G Johnston
Martijn van Oosterhout wrote
 On Tue, Jul 01, 2014 at 02:36:55PM -0500, Merlin Moncure wrote:
 On Tue, Jul 1, 2014 at 2:16 PM, Martijn van Oosterhout
 lt;

 kleptog@

 gt; wrote:
  On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote:
  The simplified scene:
  select slowfunction(s) from a order by b limit 1;
  is slow than
  select slowfunction(s) from (select s from a order by b limit 1) as z;
  if there are many records in table 'a'.
 
 
  The real scene. Function  ST_Distance_Sphere is slow, the query:
  SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road
 order by c limit 1;
  is slow than:
  select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from
 (SELECT s from road order by c limit 1) as a;
  There are about 7000 records in 'road'.
 
  I think to help here I think we need the EXPLAIN ANALYSE output for
  both queries.
 
 Well, I think the problem is a well understood one: there is no
 guarantee that functions-in-select-list are called exactly once per
 output row.  This is documented -- for example see here:
 http://www.postgresql.org/docs/9.1/static/explicit-locking.html#ADVISORY-LOCKS.
 In short, if you want very precise control of function evaluation use
 a subquery, or, if you're really paranoid, a CTE.
 
 I'm probably dense, but I'm not sure I understand. Or it is that the
 slowfunction() is called prior to the sort? That seems insane.

The basic reality is that limit applies to the final set of rows that could
be output.  Since stuff like group by and distinct require knowledge of the
exact values of every output column all expressions must necessarily be
evaluated before limit.

If you want to pick just 10 rows and then process them you need a subquery.

David J.






--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Can-simplify-limit-1-with-slow-function-tp5809997p5810297.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-02 Thread Tom Lane
David G Johnston david.g.johns...@gmail.com writes:
 Martijn van Oosterhout wrote
 I'm probably dense, but I'm not sure I understand. Or it is that the
 slowfunction() is called prior to the sort? That seems insane.

 The basic reality is that limit applies to the final set of rows that could
 be output.

It's not so much the limit as that the sort has to happen before the
limit, and yes, evaluation of the targetlist happens before the sort.

This is fundamental to the SQL conceptual model; remember that SQL92 had
SELECT slowfunction(), ... ORDER BY 1, which certainly requires the
function to be evaluated before the sort happens.  And there's nothing in
the conceptual model suggesting that different targetlist entries should
be evaluated at different times, so just ordering by something other than
the slowfunction() entry doesn't get you out of that.

I'm not sure how much of this there is chapter and verse for in the
SQL standard, but ISTM the stage sequencing we lay out in our SELECT
reference page is pretty much forced by the standard.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-02 Thread Martijn van Oosterhout
On Wed, Jul 02, 2014 at 04:17:13PM -0400, Tom Lane wrote:
 David G Johnston david.g.johns...@gmail.com writes:
  Martijn van Oosterhout wrote
  I'm probably dense, but I'm not sure I understand. Or it is that the
  slowfunction() is called prior to the sort? That seems insane.
 
  The basic reality is that limit applies to the final set of rows that could
  be output.
 
 It's not so much the limit as that the sort has to happen before the
 limit, and yes, evaluation of the targetlist happens before the sort.

I guess I assumed the column c was indexable, and it that case I
beleive the slowfunction() would indeed only be called once.

 This is fundamental to the SQL conceptual model; remember that SQL92 had
 SELECT slowfunction(), ... ORDER BY 1, which certainly requires the
 function to be evaluated before the sort happens.  And there's nothing in
 the conceptual model suggesting that different targetlist entries should
 be evaluated at different times, so just ordering by something other than
 the slowfunction() entry doesn't get you out of that.
 
 I'm not sure how much of this there is chapter and verse for in the
 SQL standard, but ISTM the stage sequencing we lay out in our SELECT
 reference page is pretty much forced by the standard.

In the conceptual model the limit must happen after the select. But as
an optimisation moving the evaluation above the limit node (when
possible) should always be a win.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-02 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 On Wed, Jul 02, 2014 at 04:17:13PM -0400, Tom Lane wrote:
 It's not so much the limit as that the sort has to happen before the
 limit, and yes, evaluation of the targetlist happens before the sort.

 I guess I assumed the column c was indexable, and it that case I
 beleive the slowfunction() would indeed only be called once.

There are cases where we can avoid an explicit sort step by relying on
some earlier phase of the processing pipeline to generate the rows in the
right order to begin with.  Evidently this wasn't one of them though :-(.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-01 Thread Martijn van Oosterhout
On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote:
 The simplified scene: 
 select slowfunction(s) from a order by b limit 1;
 is slow than
 select slowfunction(s) from (select s from a order by b limit 1) as z;
 if there are many records in table 'a'.
 
 
 The real scene. Function  ST_Distance_Sphere is slow, the query:
 SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road order 
 by c limit 1;
 is slow than:
 select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from (SELECT s 
 from road order by c limit 1) as a;
 There are about 7000 records in 'road'.  

I think to help here I think we need the EXPLAIN ANALYSE output for
both queries.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature


Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-01 Thread Merlin Moncure
On Tue, Jul 1, 2014 at 2:16 PM, Martijn van Oosterhout
klep...@svana.org wrote:
 On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote:
 The simplified scene:
 select slowfunction(s) from a order by b limit 1;
 is slow than
 select slowfunction(s) from (select s from a order by b limit 1) as z;
 if there are many records in table 'a'.


 The real scene. Function  ST_Distance_Sphere is slow, the query:
 SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road order 
 by c limit 1;
 is slow than:
 select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from (SELECT s 
 from road order by c limit 1) as a;
 There are about 7000 records in 'road'.

 I think to help here I think we need the EXPLAIN ANALYSE output for
 both queries.

Well, I think the problem is a well understood one: there is no
guarantee that functions-in-select-list are called exactly once per
output row.  This is documented -- for example see here:
http://www.postgresql.org/docs/9.1/static/explicit-locking.html#ADVISORY-LOCKS.
In short, if you want very precise control of function evaluation use
a subquery, or, if you're really paranoid, a CTE.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-01 Thread David G Johnston
Merlin Moncure-2 wrote
 On Tue, Jul 1, 2014 at 2:16 PM, Martijn van Oosterhout
 lt;

 kleptog@

 gt; wrote:
 On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote:
 The simplified scene:
 select slowfunction(s) from a order by b limit 1;
 is slow than
 select slowfunction(s) from (select s from a order by b limit 1) as z;
 if there are many records in table 'a'.


 The real scene. Function  ST_Distance_Sphere is slow, the query:
 SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road
 order by c limit 1;
 is slow than:
 select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from (SELECT
 s from road order by c limit 1) as a;
 There are about 7000 records in 'road'.

 I think to help here I think we need the EXPLAIN ANALYSE output for
 both queries.
 
 Well, I think the problem is a well understood one: there is no
 guarantee that functions-in-select-list are called exactly once per
 output row.  This is documented -- for example see here:
 http://www.postgresql.org/docs/9.1/static/explicit-locking.html#ADVISORY-LOCKS.
 In short, if you want very precise control of function evaluation use
 a subquery, or, if you're really paranoid, a CTE.
 
 merlin

I would have to disagree on the this is documented comment - the linked
section on advisory locks does not constitute documentation of the fact that
limit can be applied after expressions in the select-list are evaluated.

http://www.postgresql.org/docs/9.3/static/sql-select.html

In the select command documentation item 5 covers select-list evaluation
while item 9 covers limit thus implying what we are saying - though keep in
mind each select statement gets processed independently and possibly in a
correlated fashion (i.e. potentially multiple times).

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Can-simplify-limit-1-with-slow-function-tp5809997p5810061.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-01 Thread Merlin Moncure
On Tue, Jul 1, 2014 at 3:06 PM, David G Johnston
david.g.johns...@gmail.com wrote:
 Merlin Moncure-2 wrote
 On Tue, Jul 1, 2014 at 2:16 PM, Martijn van Oosterhout
 lt;

 kleptog@

 gt; wrote:
 On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote:
 The simplified scene:
 select slowfunction(s) from a order by b limit 1;
 is slow than
 select slowfunction(s) from (select s from a order by b limit 1) as z;
 if there are many records in table 'a'.


 The real scene. Function  ST_Distance_Sphere is slow, the query:
 SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road
 order by c limit 1;
 is slow than:
 select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from (SELECT
 s from road order by c limit 1) as a;
 There are about 7000 records in 'road'.

 I think to help here I think we need the EXPLAIN ANALYSE output for
 both queries.

 Well, I think the problem is a well understood one: there is no
 guarantee that functions-in-select-list are called exactly once per
 output row.  This is documented -- for example see here:
 http://www.postgresql.org/docs/9.1/static/explicit-locking.html#ADVISORY-LOCKS.
 In short, if you want very precise control of function evaluation use
 a subquery, or, if you're really paranoid, a CTE.

 merlin

 I would have to disagree on the this is documented comment - the linked
 section on advisory locks does not constitute documentation of the fact that
 limit can be applied after expressions in the select-list are evaluated.

 http://www.postgresql.org/docs/9.3/static/sql-select.html

 In the select command documentation item 5 covers select-list evaluation
 while item 9 covers limit thus implying what we are saying - though keep in
 mind each select statement gets processed independently and possibly in a
 correlated fashion (i.e. potentially multiple times).

Sure, although I did not claim that..the select documentation *does*
cover this behavior but I find the syntax driven doc pages to be
fairly arcane and unhelpful -- they don't say (for the most part)
avoid this or do that.  I pointed out this particular section
because it proved an example that matched the OP's problem case.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers