Re: Optimising a two column OR check

2019-10-12 Thread Andrew Gierth
> "MichaelDBA" == MichaelDBA   writes:

 MichaelDBA> Yep, you're right, Andrew, adding a couple rows made it do
 MichaelDBA> the index only scan.  I reckon I got misled by turning off
 MichaelDBA> sequential scans, thinking that actual rows were not
 MichaelDBA> important anymore.  Overly simplistic reasonings can get
 MichaelDBA> one into trouble, lol.

We do some odd stuff with the statistics estimates for completely empty
tables because (a) it's not common in practice for a table to be always
empty (i.e. the emptiness is usually transient) and (b) if you take the
emptiness of a table at face value, you end up generating insanely bad
plans for certain FK check queries that may not get replanned quickly
enough to mitigate the performance impact.

-- 
Andrew (irc:RhodiumToad)




Re: Optimising a two column OR check

2019-10-12 Thread MichaelDBA
Yep, you're right, Andrew, adding a couple rows made it do the index 
only scan.  I reckon I got misled by turning off sequential scans, 
thinking that actual rows were not important anymore.  Overly simplistic 
reasonings can get one into trouble, lol.


Regards,
Michael Vitale


Andrew Gierth wrote on 10/12/2019 11:46 AM:

"MichaelDBA" == MichaelDBA   writes:

  MichaelDBA> Nope, vacuumed it and still got the bitmap index scans.

Let's see your explains. Here's mine:

# set enable_seqscan=false;  -- because I only have a few rows
SET
# insert into friend values (1,2),(2,5);
INSERT 0 2
# vacuum analyze friend;
VACUUM
# explain analyze SELECT user1_id FROM friend WHERE user2_id=2 UNION ALL select 
user2_id FROM friend WHERE user1_id=2;
QUERY PLAN

  Append  (cost=0.13..8.32 rows=2 width=4) (actual time=0.009..0.014 rows=2 
loops=1)
->  Index Only Scan using friend_user2_id_user1_id_idx on friend  
(cost=0.13..4.15 rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=1)
  Index Cond: (user2_id = 2)
  Heap Fetches: 0
->  Index Only Scan using friend_pkey on friend friend_1  (cost=0.13..4.15 
rows=1 width=4) (actual time=0.003..0.004 rows=1 loops=1)
  Index Cond: (user1_id = 2)
  Heap Fetches: 0
  Planning Time: 0.271 ms
  Execution Time: 0.045 ms
(9 rows)

Note that you have to put some actual rows in the table; if it is
completely empty, you'll not get a representative result.







Re: Optimising a two column OR check

2019-10-12 Thread Andrew Gierth
> "MichaelDBA" == MichaelDBA   writes:

 MichaelDBA> Nope, vacuumed it and still got the bitmap index scans.

Let's see your explains. Here's mine:

# set enable_seqscan=false;  -- because I only have a few rows
SET
# insert into friend values (1,2),(2,5);
INSERT 0 2
# vacuum analyze friend;
VACUUM
# explain analyze SELECT user1_id FROM friend WHERE user2_id=2 UNION ALL select 
user2_id FROM friend WHERE user1_id=2;
   QUERY PLAN   


 Append  (cost=0.13..8.32 rows=2 width=4) (actual time=0.009..0.014 rows=2 
loops=1)
   ->  Index Only Scan using friend_user2_id_user1_id_idx on friend  
(cost=0.13..4.15 rows=1 width=4) (actual time=0.009..0.009 rows=1 loops=1)
 Index Cond: (user2_id = 2)
 Heap Fetches: 0
   ->  Index Only Scan using friend_pkey on friend friend_1  (cost=0.13..4.15 
rows=1 width=4) (actual time=0.003..0.004 rows=1 loops=1)
 Index Cond: (user1_id = 2)
 Heap Fetches: 0
 Planning Time: 0.271 ms
 Execution Time: 0.045 ms
(9 rows)

Note that you have to put some actual rows in the table; if it is
completely empty, you'll not get a representative result.

-- 
Andrew (irc:RhodiumToad)




Re: Optimising a two column OR check

2019-10-12 Thread MichaelDBA
Perhaps the fix by Alexander Kuzmenkov in V11 added this VM 
consideration for having a preference of bitmap index scan over an index 
only scan.  Looks like I'm goin' down the rabbit hole...


Regards,
Michael Vitale

MichaelDBA wrote on 10/12/2019 11:35 AM:

Nope, vacuumed it and still got the bitmap index scans.

Andrew Gierth wrote on 10/12/2019 11:33 AM:

"MichaelDBA" == MichaelDBA   writes:

MichaelDBA> BTW, to Andrew, the UNION ALL alternative still results in
  MichaelDBA> bitmap index scans from my testing.

You probably forgot to vacuum the table.









Re: Optimising a two column OR check

2019-10-12 Thread MichaelDBA

Nope, vacuumed it and still got the bitmap index scans.

Andrew Gierth wrote on 10/12/2019 11:33 AM:

"MichaelDBA" == MichaelDBA   writes:

  MichaelDBA> BTW, to Andrew, the UNION ALL alternative still results in
  MichaelDBA> bitmap index scans from my testing.

You probably forgot to vacuum the table.







Re: Optimising a two column OR check

2019-10-12 Thread MichaelDBA

Yikes, apologies to all, my wording is the opposite of what I meant!

Index only scans are preferred for infrequently updated ones, not 
heavily updated ones where the visibility map is updated often.


Regards,
Michael Vitale


MichaelDBA wrote on 10/12/2019 11:27 AM:
Another thing to consider is the visibility map.  From what I 
understand, index only scans are preferred for heavily updated tables, 
not infrequently updated ones.  Even though index only scans imply 
ONLY they really aren't in the sense that they may need to visit the 
Visibility Map for the heap. This can be costly and the planner may 
remove index only scan consideration if the VM has tuples that are not 
visible.


BTW, to Andrew, the UNION ALL alternative still results in bitmap 
index scans from my testing.


Regards,
Michael Vitale



Jeff Janes wrote on 10/12/2019 11:17 AM:
On Sat, Oct 12, 2019 at 10:43 AM Justin Pryzby > wrote:


On Sat, Oct 12, 2019 at 04:39:56PM +0200, Ivan Voras wrote:
> With seqscan disabled, I get this plan on 9.6:
>  Bitmap Heap Scan on friend  (cost=8.42..19.01 rows=14 width=8)
...
> I expected to get an index-only scan in this situation, as that
would be a
> very common query. Is there a way to actually make this sort of
query
> resolvable with an index-only scan? Maybe a different table
structure would
> help?


It would have to scan the entire index to find the cases where 
user2_id=42 but user1_id is not constrained. Technically User1_id 
would be constrained to be less than 42, but I don't think the 
planner will take that into account.



The v11 release notes have this relevant item:

https://www.postgresql.org/docs/11/release-11.html
|Allow bitmap scans to perform index-only scans when possible
(Alexander Kuzmenkov)


But this is not one of those cases.  It is only possible when the 
only data needed is whether the row exists or not.


Cheers,

Jeff






Re: Optimising a two column OR check

2019-10-12 Thread Andrew Gierth
> "MichaelDBA" == MichaelDBA   writes:

 MichaelDBA> BTW, to Andrew, the UNION ALL alternative still results in
 MichaelDBA> bitmap index scans from my testing.

You probably forgot to vacuum the table.

-- 
Andrew (irc:RhodiumToad)




Re: Optimising a two column OR check

2019-10-12 Thread MichaelDBA
Another thing to consider is the visibility map.  From what I 
understand, index only scans are preferred for heavily updated tables, 
not infrequently updated ones.  Even though index only scans imply ONLY 
they really aren't in the sense that they may need to visit the 
Visibility Map for the heap. This can be costly and the planner may 
remove index only scan consideration if the VM has tuples that are not 
visible.


BTW, to Andrew, the UNION ALL alternative still results in bitmap index 
scans from my testing.


Regards,
Michael Vitale



Jeff Janes wrote on 10/12/2019 11:17 AM:
On Sat, Oct 12, 2019 at 10:43 AM Justin Pryzby > wrote:


On Sat, Oct 12, 2019 at 04:39:56PM +0200, Ivan Voras wrote:
> With seqscan disabled, I get this plan on 9.6:
>  Bitmap Heap Scan on friend  (cost=8.42..19.01 rows=14 width=8)
...
> I expected to get an index-only scan in this situation, as that
would be a
> very common query. Is there a way to actually make this sort of
query
> resolvable with an index-only scan? Maybe a different table
structure would
> help?


It would have to scan the entire index to find the cases where 
user2_id=42 but user1_id is not constrained. Technically User1_id 
would be constrained to be less than 42, but I don't think the planner 
will take that into account.



The v11 release notes have this relevant item:

https://www.postgresql.org/docs/11/release-11.html
|Allow bitmap scans to perform index-only scans when possible
(Alexander Kuzmenkov)


But this is not one of those cases.  It is only possible when the only 
data needed is whether the row exists or not.


Cheers,

Jeff




Re: Optimising a two column OR check

2019-10-12 Thread Jeff Janes
On Sat, Oct 12, 2019 at 10:43 AM Justin Pryzby  wrote:

> On Sat, Oct 12, 2019 at 04:39:56PM +0200, Ivan Voras wrote:
> > With seqscan disabled, I get this plan on 9.6:
> >  Bitmap Heap Scan on friend  (cost=8.42..19.01 rows=14 width=8)
> ...
> > I expected to get an index-only scan in this situation, as that would be
> a
> > very common query. Is there a way to actually make this sort of query
> > resolvable with an index-only scan? Maybe a different table structure
> would
> > help?
>

It would have to scan the entire index to find the cases where   user2_id=42
but user1_id is not constrained.  Technically User1_id would be constrained
to be less than 42, but I don't think the planner will take that into
account.


> The v11 release notes have this relevant item:
>
> https://www.postgresql.org/docs/11/release-11.html
> |Allow bitmap scans to perform index-only scans when possible (Alexander
> Kuzmenkov)
>
>
But this is not one of those cases.  It is only possible when the only data
needed is whether the row exists or not.

Cheers,

Jeff


Re: Optimising a two column OR check

2019-10-12 Thread Andrew Gierth
> "Ivan" == Ivan Voras  writes:

 Ivan> Hello,
 Ivan> There's a "users" table with the following structure:

 Ivan> CREATE TABLE "user" (
 Ivan>   id SERIAL PRIMARY KEY,
 Ivan>  -- other fields
 Ivan> );

 Ivan> and there's a "friends" table with the following structure:

 Ivan> CREATE TABLE friend (
 Ivan>   user1_id INTEGER NOT NULL REFERENCES "user"(id),
 Ivan>   user2_id INTEGER NOT NULL REFERENCES "user"(id),
 Ivan>  -- other fields
 Ivan>   CHECK (user1_id < user2_id),
 Ivan>   PRIMARY KEY (user1_id, user2_id)
 Ivan> );

 Ivan> And I'm running this query:

 Ivan> SELECT user1_id,user2_id FROM friend WHERE user1_id=42 OR user2_id=42;

To get friends of user 42:

SELECT user1_id FROM friend WHERE user2_id=42
UNION ALL
SELECT user2_id FROM friend WHERE user1_id=42;

assuming you create the (user2_id,user1_id) index, this should get you
an Append of two index-only scans. We can use UNION ALL here rather than
UNION because the table constraints ensure there are no duplicates.

-- 
Andrew (irc:RhodiumToad)




Re: Optimising a two column OR check

2019-10-12 Thread Justin Pryzby
On Sat, Oct 12, 2019 at 04:39:56PM +0200, Ivan Voras wrote:
> With seqscan disabled, I get this plan on 9.6:
>  Bitmap Heap Scan on friend  (cost=8.42..19.01 rows=14 width=8)
...
> I expected to get an index-only scan in this situation, as that would be a
> very common query. Is there a way to actually make this sort of query
> resolvable with an index-only scan? Maybe a different table structure would
> help?

The v11 release notes have this relevant item:

https://www.postgresql.org/docs/11/release-11.html
|Allow bitmap scans to perform index-only scans when possible (Alexander 
Kuzmenkov)

Justin