Re: [HACKERS] View Index and UNION

2013-05-27 Thread Stefan Keller
Hi Tom

You are right: UNION ALL is correct in terms of contents (tables
contents are disjunct) and of performance (no separate sort required
theoretically).
In my specific case even with UNION ALL the planner still chose a "Seq Scan".
Note that there is a KNN index with "ORDER BY ... <-> ..." involved.
I have to dig into my tests in order to give you the EXPLAIN ANALYZE.

Yours, Stefan


2013/5/26 Tom Lane :
> Stefan Keller  writes:
>> Given following schema:
>
>> 1. TABLE a and TABLE b, each with INDEX on attribute geom.
>
>> 2. A VIEW with union:
>
>> CREATE VIEW myview AS
>>   SELECT * FROM a
>>   UNION
>>   SELECT * FROM b;
>
>> 3. And a simple query with KNN index and a coordinate "mypos" :
>
>> SELECT * FROM myview
>> ORDER BY ST_Geomfromtext(mypos) <-> myview.geom
>
> I think this would work out-of-the-box in 9.1 or later, if you
> made the view use UNION ALL instead of UNION.
>
> regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] View Index and UNION

2013-05-26 Thread Tom Lane
Stefan Keller  writes:
> Given following schema:

> 1. TABLE a and TABLE b, each with INDEX on attribute geom.

> 2. A VIEW with union:

> CREATE VIEW myview AS
>   SELECT * FROM a
>   UNION
>   SELECT * FROM b;

> 3. And a simple query with KNN index and a coordinate "mypos" :

> SELECT * FROM myview
> ORDER BY ST_Geomfromtext(mypos) <-> myview.geom

I think this would work out-of-the-box in 9.1 or later, if you
made the view use UNION ALL instead of UNION.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] View Index and UNION

2013-05-26 Thread William King
I appear to have been able to replicate what you are talking about, but
it required explicitly binding the order by in different ways. See
attached files.

William King
Senior Engineer
Quentus Technologies, INC
1037 NE 65th St Suite 273
Seattle, WA 98115
Main:   (877) 211-9337
Office: (206) 388-4772
Cell:   (253) 686-5518
william.k...@quentustech.com

On 05/26/2013 02:22 AM, Stefan Keller wrote:
> Yes, it actually does, but the planner chooses a seq scan to prepare for that.
> 
> -S.
> 
> 2013/5/26 William King :
>> Could this scenario not be handled by a step that orders the two tables
>> independently, then for the view interleaves the presorted results?
>> Merging two sorted sets into a single sorted set is usually a trivial
>> task, and it could still take advantage of the existing indexes.
>>
>> William King
>> Senior Engineer
>> Quentus Technologies, INC
>> 1037 NE 65th St Suite 273
>> Seattle, WA 98115
>> Main:   (877) 211-9337
>> Office: (206) 388-4772
>> Cell:   (253) 686-5518
>> william.k...@quentustech.com
>>
>> On 05/25/2013 05:35 PM, Stefan Keller wrote:
>>> Hi
>>>
>>> I've encountered a fundamental problem which - to me - can only be
>>> solved with an (future/possible) real index on views in PostgreSQL
>>> (like the exist already in MS SQL Server and Ora):
>>>
>>> Given following schema:
>>>
>>> 1. TABLE a and TABLE b, each with INDEX on attribute geom.
>>>
>>> 2. A VIEW with union:
>>>
>>> CREATE VIEW myview AS
>>>   SELECT * FROM a
>>>   UNION
>>>   SELECT * FROM b;
>>>
>>> 3. And a simple query with KNN index and a coordinate "mypos" :
>>>
>>> SELECT * FROM myview
>>> ORDER BY ST_Geomfromtext(mypos) <-> myview.geom
>>>
>>> Now, the problem is, that for the "order by" it is not enough that
>>> each on the two tables calculate the ordering separately: We want a
>>> total ordering over all involved tables!
>>>
>>> In fact, the planner realizes that and chooses a seq scan over all
>>> tuples of table a and b - which is slow and suboptimal!
>>>
>>> To me, that's a use case where we would wish to have a distinct index on 
>>> views.
>>>
>>> Any opinions on this?
>>>
>>> Yours, Stefan
>>>
>>>
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
quentusrex=# \i sql/view_index_union.sql
CREATE TABLE
CREATE INDEX
CREATE TABLE
CREATE INDEX
INSERT 0 5
INSERT 0 5
CREATE VIEW
 QUERY PLAN 


 Index Scan using named_a_idx on a  (cost=0.41..4329.78 rows=51291 width=36) 
(actual time=0.078..33.854 rows=5 loops=1)
 Total runtime: 36.226 ms
(2 rows)

 QUERY PLAN 


 Index Scan using named_b_idx on b  (cost=0.41..4385.78 rows=51291 width=36) 
(actual time=0.036..27.166 rows=5 loops=1)
 Total runtime: 29.418 ms
(2 rows)

  QUERY PLAN
  
--
 Sort  (cost=27371.05..27627.51 rows=102582 width=36) (actual 
time=435.566..535.213 rows=10 loops=1)
   Sort Key: a.named
   Sort Method: external merge  Disk: 4576kB
   ->  Unique  (cost=14230.75..15000.12 rows=102582 width=36) (actual 
time=75.540..131.131 rows=10 loops=1)
 ->  Sort  (cost=14230.75..14487.21 rows=102582 width=36) (actual 
time=75.539..102.016 rows=10 loops=1)
   Sort Key: a.id, a.named
   Sort Method: external merge  Disk: 4584kB
   ->  Append  (cost=0.00..2885.64 rows=102582 width=36) (actual 
time=0.005..22.103 rows=10 loops=1)
 ->  Seq Scan on a  (cost=0.00..929.91 rows=51291 width=36) 
(actual time=0.005..6.389 rows=5 loops=1)
 ->  Seq Scan on b  (cost=0.00..929.91 rows=51291 width=36) 
(actual time=0.003..5.811 rows=5 loops=1)
 Total runtime: 541.763 ms
(11 rows)

  QUERY PLAN
  
--
 Sort  (cost=26345.23..26601.69 rows=102582 width=36) (actual 
time=435.116..534.755 rows=10 loops=1)
   Sort Key: a.named
   Sort Method: external merge  Disk: 4576kB
   ->  Unique  (cost=14230.75..15000.12 rows=102582 width=36) (actual 
time=75.025..130.706 rows=10 loops=1)
 ->  Sort  (c

Re: [HACKERS] View Index and UNION

2013-05-26 Thread Stefan Keller
Yes, it actually does, but the planner chooses a seq scan to prepare for that.

-S.

2013/5/26 William King :
> Could this scenario not be handled by a step that orders the two tables
> independently, then for the view interleaves the presorted results?
> Merging two sorted sets into a single sorted set is usually a trivial
> task, and it could still take advantage of the existing indexes.
>
> William King
> Senior Engineer
> Quentus Technologies, INC
> 1037 NE 65th St Suite 273
> Seattle, WA 98115
> Main:   (877) 211-9337
> Office: (206) 388-4772
> Cell:   (253) 686-5518
> william.k...@quentustech.com
>
> On 05/25/2013 05:35 PM, Stefan Keller wrote:
>> Hi
>>
>> I've encountered a fundamental problem which - to me - can only be
>> solved with an (future/possible) real index on views in PostgreSQL
>> (like the exist already in MS SQL Server and Ora):
>>
>> Given following schema:
>>
>> 1. TABLE a and TABLE b, each with INDEX on attribute geom.
>>
>> 2. A VIEW with union:
>>
>> CREATE VIEW myview AS
>>   SELECT * FROM a
>>   UNION
>>   SELECT * FROM b;
>>
>> 3. And a simple query with KNN index and a coordinate "mypos" :
>>
>> SELECT * FROM myview
>> ORDER BY ST_Geomfromtext(mypos) <-> myview.geom
>>
>> Now, the problem is, that for the "order by" it is not enough that
>> each on the two tables calculate the ordering separately: We want a
>> total ordering over all involved tables!
>>
>> In fact, the planner realizes that and chooses a seq scan over all
>> tuples of table a and b - which is slow and suboptimal!
>>
>> To me, that's a use case where we would wish to have a distinct index on 
>> views.
>>
>> Any opinions on this?
>>
>> Yours, Stefan
>>
>>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] View Index and UNION

2013-05-25 Thread William King
Could this scenario not be handled by a step that orders the two tables
independently, then for the view interleaves the presorted results?
Merging two sorted sets into a single sorted set is usually a trivial
task, and it could still take advantage of the existing indexes.

William King
Senior Engineer
Quentus Technologies, INC
1037 NE 65th St Suite 273
Seattle, WA 98115
Main:   (877) 211-9337
Office: (206) 388-4772
Cell:   (253) 686-5518
william.k...@quentustech.com

On 05/25/2013 05:35 PM, Stefan Keller wrote:
> Hi
> 
> I've encountered a fundamental problem which - to me - can only be
> solved with an (future/possible) real index on views in PostgreSQL
> (like the exist already in MS SQL Server and Ora):
> 
> Given following schema:
> 
> 1. TABLE a and TABLE b, each with INDEX on attribute geom.
> 
> 2. A VIEW with union:
> 
> CREATE VIEW myview AS
>   SELECT * FROM a
>   UNION
>   SELECT * FROM b;
> 
> 3. And a simple query with KNN index and a coordinate "mypos" :
> 
> SELECT * FROM myview
> ORDER BY ST_Geomfromtext(mypos) <-> myview.geom
> 
> Now, the problem is, that for the "order by" it is not enough that
> each on the two tables calculate the ordering separately: We want a
> total ordering over all involved tables!
> 
> In fact, the planner realizes that and chooses a seq scan over all
> tuples of table a and b - which is slow and suboptimal!
> 
> To me, that's a use case where we would wish to have a distinct index on 
> views.
> 
> Any opinions on this?
> 
> Yours, Stefan
> 
> 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] View Index and UNION

2013-05-25 Thread Stefan Keller
Hi

I've encountered a fundamental problem which - to me - can only be
solved with an (future/possible) real index on views in PostgreSQL
(like the exist already in MS SQL Server and Ora):

Given following schema:

1. TABLE a and TABLE b, each with INDEX on attribute geom.

2. A VIEW with union:

CREATE VIEW myview AS
  SELECT * FROM a
  UNION
  SELECT * FROM b;

3. And a simple query with KNN index and a coordinate "mypos" :

SELECT * FROM myview
ORDER BY ST_Geomfromtext(mypos) <-> myview.geom

Now, the problem is, that for the "order by" it is not enough that
each on the two tables calculate the ordering separately: We want a
total ordering over all involved tables!

In fact, the planner realizes that and chooses a seq scan over all
tuples of table a and b - which is slow and suboptimal!

To me, that's a use case where we would wish to have a distinct index on views.

Any opinions on this?

Yours, Stefan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers