-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

while changing a column from base255 encoded text (all except null byte) to
bytea, I found the following bug in Postgresql's LIKE operator with indexes
(it follows a more detailed description then my old mails in -bugs and
- -general, including the proof of the bug):


The index condition in the query plan for "where bytea_column like 'a%'" is:

   Index Cond: (bytea_col >= 'a'::bytea) AND (bytea_col < 'b'::bytea))
   Filter: (bcol ~~ 'a%'::bytea)

This is correct.


The index condition in the query plan for "bytea_column like '\\141%'" ("a"
in octal is 141) is exaclty the same, including filter condition. 

   Index Cond: ((bcol >= 'a'::bytea) AND (bcol < 'b'::bytea))
   Filter: (bcol ~~ 'a%'::bytea)

This is also correct.


The index condition in the query plan for "bytea_column like '\\001%'" is:

   Index Cond: (bcol = '0'::bytea)
   Filter: (bcol ~~ '\\001%'::bytea)


THIS IS WRONG! Isn't it?


If the byte is displayable in ASCII, then all is OK. If not, it seems that
Postgres takes the first character of the octal number and uses this as
comparison parameter.
With "ä" (344) it takes "3" ...


When index scan is disabled or from other reasons seqscan is used, the
query plan and the result is correct.

The result differs, if index is used or not used.

I guess there is too much conversion between different character sets etc.


A piece of test SQL and the results are attached.

My Version is: 
PostgreSQL 7.4 on i386-portbld-freebsd4.8, compiled by GCC 2.95.4

The same was with 7.3.4


Ciao
  Alvar


- -- 
** Alvar C.H. Freude -- http://alvar.a-blast.org/ -- http://odem.org/
**   Berufsverbot? http://odem.org/aktuelles/staatsanwalt.de.html
**   ODEM.org-Tour: http://tour.odem.org/
**   Informationsgesellschaft: http://www.wsis-koordinierungskreis.de/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (FreeBSD)

iD8DBQE/z6YbOndlH63J86wRAr+qAKCo6yi3/0HGO13IkKP2KbyH147kMACeKq7T
WEKPu3dNKnesLqQUd9puyh0=
=Sivh
-----END PGP SIGNATURE-----
rollback;
begin;


create table test (bcol bytea);
create index tst_idx on test(bcol);

insert into test values ('abc');
insert into test values ('abcdef');
insert into test values ('abc\\003');
insert into test values ('abc\\002');
insert into test values ('abc\\004');
insert into test values ('abc\\005');
insert into test values ('\\001abc\\006');
insert into test values ('\\001xabc\\006');
insert into test values ('\\001\\002abc\\006');
insert into test values ('\\002\\003abc\\006');
insert into test values ('\\001abc\\006');
insert into test values ('\\001xabc\\006');
insert into test values ('\\001\\002abc\\006');
insert into test values ('\\000\\001\\002abc\\006');
insert into test values ('\\002\\003abc\\006');

set enable_indexscan=on;
set enable_seqscan=off;

select * from test where bcol like '\\001%';
select * from test where bcol like '\\001\002%';
select * from test where bcol like 'a%';
select * from test where bcol like '\\141%';
select * from test where bcol like 'ä%';
select * from test where bcol like '\\344%';
select * from test where bcol like '\1%';

explain analyze select * from test where bcol like '\\001%';
explain analyze select * from test where bcol like '\\001\002%';
explain analyze select * from test where bcol like 'a%';
explain analyze select * from test where bcol like '\\141%';
explain analyze select * from test where bcol like 'ä%';
explain analyze select * from test where bcol like '\\344%';
explain analyze select * from test where bcol like '\1%';


set enable_indexscan=off;
set enable_seqscan=on;


select * from test where bcol like '\\001%';
select * from test where bcol like '\\001\002%';
select * from test where bcol like 'a%';
select * from test where bcol like '\\141%';
select * from test where bcol like 'ä%';
select * from test where bcol like '\\344%';
select * from test where bcol like '\1%';


explain analyze select * from test where bcol like '\\001%';
explain analyze select * from test where bcol like '\\001\002%';
explain analyze select * from test where bcol like 'a%';
explain analyze select * from test where bcol like '\\141%';
explain analyze select * from test where bcol like 'ä%';
explain analyze select * from test where bcol like '\\344%';
explain analyze select * from test where bcol like '\1%';




Attachment: pgp00000.pgp
Description: PGP signature

db=> \i /tmp/sql-bug.txt
ROLLBACK
BEGIN
CREATE TABLE
CREATE INDEX
INSERT 184807 1
INSERT 184808 1
INSERT 184809 1
INSERT 184810 1
INSERT 184811 1
INSERT 184812 1
INSERT 184813 1
INSERT 184814 1
INSERT 184815 1
INSERT 184816 1
INSERT 184817 1
INSERT 184818 1
INSERT 184819 1
INSERT 184820 1
INSERT 184821 1
SET
SET
 bcol
------
(0 Zeilen)

 bcol
------
(0 Zeilen)

  bcol
---------
 abc
 abc\002
 abc\003
 abc\004
 abc\005
 abcdef
(6 Zeilen)

  bcol
---------
 abc
 abc\002
 abc\003
 abc\004
 abc\005
 abcdef
(6 Zeilen)

 bcol
------
(0 Zeilen)

 bcol
------
(0 Zeilen)

 bcol
------
(0 Zeilen)

                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using tst_idx on test  (cost=0.00..17.07 rows=5 width=32) (actual 
time=0.047..0.047 rows=0 loops=1)
   Index Cond: (bcol = '0'::bytea)
   Filter: (bcol ~~ '\\001%'::bytea)
 Total runtime: 0.151 ms
(4 Zeilen)

                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using tst_idx on test  (cost=0.00..17.07 rows=5 width=32) (actual 
time=0.048..0.048 rows=0 loops=1)
   Index Cond: (bcol = '00'::bytea)
   Filter: (bcol ~~ '\\001\\002%'::bytea)
 Total runtime: 0.151 ms
(4 Zeilen)

                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using tst_idx on test  (cost=0.00..17.07 rows=5 width=32) (actual 
time=0.079..0.134 rows=6 loops=1)
   Index Cond: ((bcol >= 'a'::bytea) AND (bcol < 'b'::bytea))
   Filter: (bcol ~~ 'a%'::bytea)
 Total runtime: 0.275 ms
(4 Zeilen)

                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using tst_idx on test  (cost=0.00..17.07 rows=5 width=32) (actual 
time=0.077..0.152 rows=6 loops=1)
   Index Cond: ((bcol >= 'a'::bytea) AND (bcol < 'b'::bytea))
   Filter: (bcol ~~ 'a%'::bytea)
 Total runtime: 0.535 ms
(4 Zeilen)

                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using tst_idx on test  (cost=0.00..17.07 rows=5 width=32) (actual 
time=0.047..0.047 rows=0 loops=1)
   Index Cond: (bcol = '3'::bytea)
   Filter: (bcol ~~ '\\344%'::bytea)
 Total runtime: 0.149 ms
(4 Zeilen)

                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using tst_idx on test  (cost=0.00..17.07 rows=5 width=32) (actual 
time=0.046..0.046 rows=0 loops=1)
   Index Cond: (bcol = '3'::bytea)
   Filter: (bcol ~~ '\\344%'::bytea)
 Total runtime: 0.147 ms
(4 Zeilen)

                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Index Scan using tst_idx on test  (cost=0.00..17.07 rows=5 width=32) (actual 
time=0.048..0.048 rows=0 loops=1)
   Index Cond: (bcol = '0'::bytea)
   Filter: (bcol ~~ '\\001%'::bytea)
 Total runtime: 0.150 ms
(4 Zeilen)

SET
SET
      bcol
-----------------
 \001abc\006
 \001xabc\006
 \001\002abc\006
 \001abc\006
 \001xabc\006
 \001\002abc\006
(6 Zeilen)

      bcol
-----------------
 \001\002abc\006
 \001\002abc\006
(2 Zeilen)

  bcol
---------
 abc
 abcdef
 abc\003
 abc\002
 abc\004
 abc\005
(6 Zeilen)

  bcol
---------
 abc
 abcdef
 abc\003
 abc\002
 abc\004
 abc\005
(6 Zeilen)

 bcol
------
(0 Zeilen)

 bcol
------
(0 Zeilen)

      bcol
-----------------
 \001abc\006
 \001xabc\006
 \001\002abc\006
 \001abc\006
 \001xabc\006
 \001\002abc\006
(6 Zeilen)

                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..22.50 rows=5 width=32) (actual time=0.038..0.085 rows=6 
loops=1)
   Filter: (bcol ~~ '\\001%'::bytea)
 Total runtime: 0.182 ms
(3 Zeilen)

                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..22.50 rows=5 width=32) (actual time=0.043..0.065 rows=2 
loops=1)
   Filter: (bcol ~~ '\\001\\002%'::bytea)
 Total runtime: 0.141 ms
(3 Zeilen)

                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..22.50 rows=5 width=32) (actual time=0.038..0.096 rows=6 
loops=1)
   Filter: (bcol ~~ 'a%'::bytea)
 Total runtime: 0.201 ms
(3 Zeilen)

                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..22.50 rows=5 width=32) (actual time=0.042..0.100 rows=6 
loops=1)
   Filter: (bcol ~~ 'a%'::bytea)
 Total runtime: 0.211 ms
(3 Zeilen)

                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..22.50 rows=5 width=32) (actual time=0.051..0.051 rows=0 
loops=1)
   Filter: (bcol ~~ '\\344%'::bytea)
 Total runtime: 0.115 ms
(3 Zeilen)

                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..22.50 rows=5 width=32) (actual time=0.051..0.051 rows=0 
loops=1)
   Filter: (bcol ~~ '\\344%'::bytea)
 Total runtime: 0.113 ms
(3 Zeilen)

                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Seq Scan on test  (cost=0.00..22.50 rows=5 width=32) (actual time=0.039..0.087 rows=6 
loops=1)
   Filter: (bcol ~~ '\\001%'::bytea)
 Total runtime: 0.187 ms
(3 Zeilen)


Attachment: pgp00001.pgp
Description: PGP signature

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faqs/FAQ.html

Reply via email to