Re: [HACKERS] Optimizer boolean syntax

2002-11-23 Thread Daniele Orlandi
Tom Lane wrote:


Only two of them are logically equivalent.  Consider NULL.


Ohhh IS NOT TRUE or IS NOT FALSE also match NULL, I never knew this :)


Even for the first two, assuming equivalence requires hard-wiring an
assumption about the behavior of the bool = bool operator; which is
a user-redefinable operator.

  I'm not totally comfortable with the idea.

Ok, I see your point and the problems that may arise, but I hope wou 
will agree with me that from the point of view of the user, both clauses 
have the same meaning and the index usage should be consistant with it.

Unfortunatelly I don't know very well PostgreSQL internals, so I may be 
saying a load of bullshits, but wouldn't be possible to consider any 
evaluation of a bool expression in the form of bool=bool with true as 
the second 'bool'[1] ? At least as a TODO item ?


Thanks!
Bye!

[1] Eventually including the var IS TRUE and var IS FALSE (not var 
IS NOT ...) which already are special syntax cases if I am not wrong.

--
 Daniele Orlandi
 Planet Srl


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Optimizer boolean syntax

2002-11-22 Thread scott.marlowe
On Thu, 21 Nov 2002, Stephan Szabo wrote:

 
 On Thu, 21 Nov 2002, Christopher Kings-Lynne wrote:
 
 col isn't of the general form indexkey op constant or constant op
 indexkey which I presume it's looking for given the comments in
 indxpath.c.  I'm not sure what the best way to make it work would be
  given
 that presumably we'd want to make col IS TRUE/FALSE use an index at
  the
 same time (since that appears to not do so as well).
   
Not that I see the point of indexing booleans, but hey :)
  
   also, in reference to my last message, even if the % was 50/50, if the
   table was such that the bool was in a table next to a text field with 20k
   or text in it, an index on the bool would be much faster to go through
   than to seq scan the table.
 
  Hmmm...I'm not sure about that.  Postgres's storage strategry with text will
  be to keep it in a side table (or you can use ALTER TABLE/SET STORAGE) and
  it will only be retrieved if it's in the select parameters.
 
 True, but replace that text with 1500 integers. :)
 
 The only problem with the partial index solution is that it seems to still
 only work for the same method of asking for the result, so if you make an
 index where col=true, using col IS TRUE or col in a query doesn't seem to
 use it.

True.  I always use the syntax:

select * from table where field IS TRUE

OR IS FALSE  for consistency.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Optimizer boolean syntax

2002-11-21 Thread Robert Treat
Using the famous WAG tech, in your first query the optimizer has to
evaluate monitored for each record to determine its value.

Robert Treat

On Thu, 2002-11-21 at 13:39, Daniele Orlandi wrote:
 
 Are those two syntaxes eqivalent ?
 
 select * from users where monitored;
 select * from users where monitored=true;
 
 If the answer is yes, the optimimer probably doesn't agree with you :)
 
 Tested on RC1:
 
 template1=# create table a (a boolean, b text);
 CREATE TABLE
 
 
  inserted ~18000 rows with just one true (just to make an index scan 
   meaningful)
 
 template1=# vacuum analyze a;
 VACUUM
 template1=# explain select * from a where a;
   QUERY PLAN
 
   Seq Scan on a  (cost=0.00..802.64 rows=1 width=11)
 Filter: a
 (2 rows)
 
 template1=# explain select * from a where a=true;
QUERY PLAN
 --
   Index Scan using a_a on a  (cost=0.00..2.01 rows=1 width=11)
 Index Cond: (a = true)
 (2 rows)
 
 Bye!
 
 -- 
   Daniele Orlandi
   Planet Srl
 
 
 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?
 
 http://www.postgresql.org/users-lounge/docs/faq.html




---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Optimizer boolean syntax

2002-11-21 Thread Stephan Szabo
On Thu, 21 Nov 2002, Daniele Orlandi wrote:

 Are those two syntaxes eqivalent ?

 select * from users where monitored;
 select * from users where monitored=true;

 If the answer is yes, the optimimer probably doesn't agree with you :)

That depends on the definition of equivalent.  They presumably give the
same answer (I'm assuming monitored is a boolean), but the latter has
something that's considered an indexable condition and I believe the
former does not (even with enable_seqscan=off the former syntax
appears to give a sequence scan, usually a good sign it's not considered
indexable).


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [HACKERS] Optimizer boolean syntax

2002-11-21 Thread Christopher Kings-Lynne
  Are those two syntaxes eqivalent ?
 
  select * from users where monitored;
  select * from users where monitored=true;
 
  If the answer is yes, the optimimer probably doesn't agree with you :)

 That depends on the definition of equivalent.  They presumably give the
 same answer (I'm assuming monitored is a boolean), but the latter has
 something that's considered an indexable condition and I believe the
 former does not (even with enable_seqscan=off the former syntax
 appears to give a sequence scan, usually a good sign it's not considered
 indexable).

I think his point is that they _should_ be equivalent.  Surely there's
something in the optimiser that discards '=true' stuff, like 'a=a' should be
discarded?

Chris



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



Re: [HACKERS] Optimizer boolean syntax

2002-11-21 Thread Stephan Szabo

On Thu, 21 Nov 2002, Christopher Kings-Lynne wrote:

   Are those two syntaxes eqivalent ?
  
   select * from users where monitored;
   select * from users where monitored=true;
  
   If the answer is yes, the optimimer probably doesn't agree with you :)
 
  That depends on the definition of equivalent.  They presumably give the
  same answer (I'm assuming monitored is a boolean), but the latter has
  something that's considered an indexable condition and I believe the
  former does not (even with enable_seqscan=off the former syntax
  appears to give a sequence scan, usually a good sign it's not considered
  indexable).

 I think his point is that they _should_ be equivalent.  Surely there's
 something in the optimiser that discards '=true' stuff, like 'a=a' should be
 discarded?

I figure that's what he meant, but it isn't what was said. ;)

col isn't of the general form indexkey op constant or constant op
indexkey which I presume it's looking for given the comments in
indxpath.c.  I'm not sure what the best way to make it work would be given
that presumably we'd want to make col IS TRUE/FALSE use an index at the
same time (since that appears to not do so as well).



---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Optimizer boolean syntax

2002-11-21 Thread Christopher Kings-Lynne
  I think his point is that they _should_ be equivalent.  Surely there's
  something in the optimiser that discards '=true' stuff, like 'a=a'
should be
  discarded?

 I figure that's what he meant, but it isn't what was said. ;)

 col isn't of the general form indexkey op constant or constant op
 indexkey which I presume it's looking for given the comments in
 indxpath.c.  I'm not sure what the best way to make it work would be given
 that presumably we'd want to make col IS TRUE/FALSE use an index at the
 same time (since that appears to not do so as well).

Not that I see the point of indexing booleans, but hey :)

Chris



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Optimizer boolean syntax

2002-11-21 Thread scott.marlowe
On Thu, 21 Nov 2002, Christopher Kings-Lynne wrote:

   I think his point is that they _should_ be equivalent.  Surely there's
   something in the optimiser that discards '=true' stuff, like 'a=a'
 should be
   discarded?
 
  I figure that's what he meant, but it isn't what was said. ;)
 
  col isn't of the general form indexkey op constant or constant op
  indexkey which I presume it's looking for given the comments in
  indxpath.c.  I'm not sure what the best way to make it work would be given
  that presumably we'd want to make col IS TRUE/FALSE use an index at the
  same time (since that appears to not do so as well).
 
 Not that I see the point of indexing booleans, but hey :)

While full indexes do seem futile, partial indexes can be quite useful.

select articles from forum where approved is false

if 99.9% of all articles are approved would be quite common.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Re: [HACKERS] Optimizer boolean syntax

2002-11-21 Thread scott.marlowe
On Thu, 21 Nov 2002, Christopher Kings-Lynne wrote:

   I think his point is that they _should_ be equivalent.  Surely there's
   something in the optimiser that discards '=true' stuff, like 'a=a'
 should be
   discarded?
 
  I figure that's what he meant, but it isn't what was said. ;)
 
  col isn't of the general form indexkey op constant or constant op
  indexkey which I presume it's looking for given the comments in
  indxpath.c.  I'm not sure what the best way to make it work would be given
  that presumably we'd want to make col IS TRUE/FALSE use an index at the
  same time (since that appears to not do so as well).
 
 Not that I see the point of indexing booleans, but hey :)

also, in reference to my last message, even if the % was 50/50, if the 
table was such that the bool was in a table next to a text field with 20k 
or text in it, an index on the bool would be much faster to go through 
than to seq scan the table.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Optimizer boolean syntax

2002-11-21 Thread Christopher Kings-Lynne
  Not that I see the point of indexing booleans, but hey :)
 
 If one of the values is much more infrequent than the other, you can
 probably get a substantial win using a partial index, can't you?

Yes, I thought of the partial index after I wrote that email :)

Chris



---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly



Re: [HACKERS] Optimizer boolean syntax

2002-11-21 Thread Christopher Kings-Lynne
   col isn't of the general form indexkey op constant or constant op
   indexkey which I presume it's looking for given the comments in
   indxpath.c.  I'm not sure what the best way to make it work would be
given
   that presumably we'd want to make col IS TRUE/FALSE use an index at
the
   same time (since that appears to not do so as well).
 
  Not that I see the point of indexing booleans, but hey :)

 also, in reference to my last message, even if the % was 50/50, if the
 table was such that the bool was in a table next to a text field with 20k
 or text in it, an index on the bool would be much faster to go through
 than to seq scan the table.

Hmmm...I'm not sure about that.  Postgres's storage strategry with text will
be to keep it in a side table (or you can use ALTER TABLE/SET STORAGE) and
it will only be retrieved if it's in the select parameters.

Chris


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Optimizer boolean syntax

2002-11-21 Thread Stephan Szabo

On Thu, 21 Nov 2002, Christopher Kings-Lynne wrote:

col isn't of the general form indexkey op constant or constant op
indexkey which I presume it's looking for given the comments in
indxpath.c.  I'm not sure what the best way to make it work would be
 given
that presumably we'd want to make col IS TRUE/FALSE use an index at
 the
same time (since that appears to not do so as well).
  
   Not that I see the point of indexing booleans, but hey :)
 
  also, in reference to my last message, even if the % was 50/50, if the
  table was such that the bool was in a table next to a text field with 20k
  or text in it, an index on the bool would be much faster to go through
  than to seq scan the table.

 Hmmm...I'm not sure about that.  Postgres's storage strategry with text will
 be to keep it in a side table (or you can use ALTER TABLE/SET STORAGE) and
 it will only be retrieved if it's in the select parameters.

True, but replace that text with 1500 integers. :)

The only problem with the partial index solution is that it seems to still
only work for the same method of asking for the result, so if you make an
index where col=true, using col IS TRUE or col in a query doesn't seem to
use it.


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])



Re: [HACKERS] Optimizer boolean syntax

2002-11-21 Thread Daniele Orlandi
Stephan Szabo wrote:

On Thu, 21 Nov 2002, Daniele Orlandi wrote:



Are those two syntaxes eqivalent ?

select * from users where monitored;
select * from users where monitored=true;

If the answer is yes, the optimimer probably doesn't agree with you :)



That depends on the definition of equivalent.


By equivalent I mean means the same thing so, behaves in the same way.

I consider the former syntax to be cleaner and I would tend to use it 
most of times.

For what concerns partial indexes, I agree, it's a better approach for 
this kind of indexing and I did some test:

-
ctonet=# create index users_monitored on users (monitored) where monitored;
CREATE
ctonet=# explain select * from users where monitored;
NOTICE:  QUERY PLAN:

Index Scan using users_monitored on users  (cost=0.00..9.44 rows=6 
width=186)

EXPLAIN

Nice, it appears to use the index, but:


ctonet=# explain select * from users where monitored=true;
NOTICE:  QUERY PLAN:

Seq Scan on users  (cost=0.00..8298.84 rows=59 width=186)

EXPLAIN
-

The problem is the opposite... so, effectively, seems that the optimizer 
considers monitored and monitored=true as two different expressions...

The viceversa is analog and we also can see that the syntax monitored 
is true is considered different from the other two syntaxes:

---
ctonet=# drop index users_monitored;
DROP
ctonet=# create index users_monitored on users (monitored) where 
monitored=true;
CREATE
ctonet=# explain select * from users where monitored=true;
NOTICE:  QUERY PLAN:

Index Scan using users_monitored on users  (cost=0.00..9.45 rows=6 
width=186)

EXPLAIN
ctonet=# explain select * from users where monitored;
NOTICE:  QUERY PLAN:

Seq Scan on users  (cost=0.00..8077.07 rows=59 width=186)

EXPLAIN

ctonet=# create index users_monitored on users (monitored) where 
monitored=true;
CREATE
ctonet=# explain select * from users where monitored is true;
NOTICE:  QUERY PLAN:

Seq Scan on users  (cost=0.00..8077.07 rows=59 width=186)

EXPLAIN
-

What I propose is that all those syntaxes are made equivalent (by, for 
example, rewriting boolean comparisons to a common form) in order to 
have a more consistent index usage.

Bye!

--
 Daniele Orlandi
 Planet Srl


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html


Re: [HACKERS] Optimizer boolean syntax

2002-11-21 Thread Tom Lane
Daniele Orlandi [EMAIL PROTECTED] writes:
 The problem is the opposite... so, effectively, seems that the optimizer 
 considers monitored and monitored=true as two different expressions...

Check.

 The viceversa is analog and we also can see that the syntax monitored 
 is true is considered different from the other two syntaxes:

As it should be.

 What I propose is that all those syntaxes are made equivalent

Only two of them are logically equivalent.  Consider NULL.

Even for the first two, assuming equivalence requires hard-wiring an
assumption about the behavior of the bool = bool operator; which is
a user-redefinable operator.  I'm not totally comfortable with the idea.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to [EMAIL PROTECTED] so that your
message can get through to the mailing list cleanly