-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Hi,
while changing a column from base255 encoded text (all except null byte) to bytea, I found the following bug in Postgresql's LIKE operator with indexes (it follows a more detailed description then my old mails in -bugs and - -general, including the proof of the bug): The index condition in the query plan for "where bytea_column like 'a%'" is: Index Cond: (bytea_col >= 'a'::bytea) AND (bytea_col < 'b'::bytea)) Filter: (bcol ~~ 'a%'::bytea) This is correct. The index condition in the query plan for "bytea_column like '\\141%'" ("a" in octal is 141) is exaclty the same, including filter condition. Index Cond: ((bcol >= 'a'::bytea) AND (bcol < 'b'::bytea)) Filter: (bcol ~~ 'a%'::bytea) This is also correct. The index condition in the query plan for "bytea_column like '\\001%'" is: Index Cond: (bcol = '0'::bytea) Filter: (bcol ~~ '\\001%'::bytea) THIS IS WRONG! Isn't it? If the byte is displayable in ASCII, then all is OK. If not, it seems that Postgres takes the first character of the octal number and uses this as comparison parameter. With "ä" (344) it takes "3" ... When index scan is disabled or from other reasons seqscan is used, the query plan and the result is correct. The result differs, if index is used or not used. I guess there is too much conversion between different character sets etc. A piece of test SQL and the results are attached. My Version is: PostgreSQL 7.4 on i386-portbld-freebsd4.8, compiled by GCC 2.95.4 The same was with 7.3.4 Ciao Alvar - -- ** Alvar C.H. Freude -- http://alvar.a-blast.org/ -- http://odem.org/ ** Berufsverbot? http://odem.org/aktuelles/staatsanwalt.de.html ** ODEM.org-Tour: http://tour.odem.org/ ** Informationsgesellschaft: http://www.wsis-koordinierungskreis.de/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (FreeBSD) iD8DBQE/z6YbOndlH63J86wRAr+qAKCo6yi3/0HGO13IkKP2KbyH147kMACeKq7T WEKPu3dNKnesLqQUd9puyh0= =Sivh -----END PGP SIGNATURE-----
rollback; begin; create table test (bcol bytea); create index tst_idx on test(bcol); insert into test values ('abc'); insert into test values ('abcdef'); insert into test values ('abc\\003'); insert into test values ('abc\\002'); insert into test values ('abc\\004'); insert into test values ('abc\\005'); insert into test values ('\\001abc\\006'); insert into test values ('\\001xabc\\006'); insert into test values ('\\001\\002abc\\006'); insert into test values ('\\002\\003abc\\006'); insert into test values ('\\001abc\\006'); insert into test values ('\\001xabc\\006'); insert into test values ('\\001\\002abc\\006'); insert into test values ('\\000\\001\\002abc\\006'); insert into test values ('\\002\\003abc\\006'); set enable_indexscan=on; set enable_seqscan=off; select * from test where bcol like '\\001%'; select * from test where bcol like '\\001\002%'; select * from test where bcol like 'a%'; select * from test where bcol like '\\141%'; select * from test where bcol like 'ä%'; select * from test where bcol like '\\344%'; select * from test where bcol like '\1%'; explain analyze select * from test where bcol like '\\001%'; explain analyze select * from test where bcol like '\\001\002%'; explain analyze select * from test where bcol like 'a%'; explain analyze select * from test where bcol like '\\141%'; explain analyze select * from test where bcol like 'ä%'; explain analyze select * from test where bcol like '\\344%'; explain analyze select * from test where bcol like '\1%'; set enable_indexscan=off; set enable_seqscan=on; select * from test where bcol like '\\001%'; select * from test where bcol like '\\001\002%'; select * from test where bcol like 'a%'; select * from test where bcol like '\\141%'; select * from test where bcol like 'ä%'; select * from test where bcol like '\\344%'; select * from test where bcol like '\1%'; explain analyze select * from test where bcol like '\\001%'; explain analyze select * from test where bcol like '\\001\002%'; explain analyze select * from test where bcol like 'a%'; explain analyze select * from test where bcol like '\\141%'; explain analyze select * from test where bcol like 'ä%'; explain analyze select * from test where bcol like '\\344%'; explain analyze select * from test where bcol like '\1%';
pgp00000.pgp
Description: PGP signature
db=> \i /tmp/sql-bug.txt ROLLBACK BEGIN CREATE TABLE CREATE INDEX INSERT 184807 1 INSERT 184808 1 INSERT 184809 1 INSERT 184810 1 INSERT 184811 1 INSERT 184812 1 INSERT 184813 1 INSERT 184814 1 INSERT 184815 1 INSERT 184816 1 INSERT 184817 1 INSERT 184818 1 INSERT 184819 1 INSERT 184820 1 INSERT 184821 1 SET SET bcol ------ (0 Zeilen) bcol ------ (0 Zeilen) bcol --------- abc abc\002 abc\003 abc\004 abc\005 abcdef (6 Zeilen) bcol --------- abc abc\002 abc\003 abc\004 abc\005 abcdef (6 Zeilen) bcol ------ (0 Zeilen) bcol ------ (0 Zeilen) bcol ------ (0 Zeilen) QUERY PLAN ---------------------------------------------------------------------------------------------------------------- Index Scan using tst_idx on test (cost=0.00..17.07 rows=5 width=32) (actual time=0.047..0.047 rows=0 loops=1) Index Cond: (bcol = '0'::bytea) Filter: (bcol ~~ '\\001%'::bytea) Total runtime: 0.151 ms (4 Zeilen) QUERY PLAN ---------------------------------------------------------------------------------------------------------------- Index Scan using tst_idx on test (cost=0.00..17.07 rows=5 width=32) (actual time=0.048..0.048 rows=0 loops=1) Index Cond: (bcol = '00'::bytea) Filter: (bcol ~~ '\\001\\002%'::bytea) Total runtime: 0.151 ms (4 Zeilen) QUERY PLAN ---------------------------------------------------------------------------------------------------------------- Index Scan using tst_idx on test (cost=0.00..17.07 rows=5 width=32) (actual time=0.079..0.134 rows=6 loops=1) Index Cond: ((bcol >= 'a'::bytea) AND (bcol < 'b'::bytea)) Filter: (bcol ~~ 'a%'::bytea) Total runtime: 0.275 ms (4 Zeilen) QUERY PLAN ---------------------------------------------------------------------------------------------------------------- Index Scan using tst_idx on test (cost=0.00..17.07 rows=5 width=32) (actual time=0.077..0.152 rows=6 loops=1) Index Cond: ((bcol >= 'a'::bytea) AND (bcol < 'b'::bytea)) Filter: (bcol ~~ 'a%'::bytea) Total runtime: 0.535 ms (4 Zeilen) QUERY PLAN ---------------------------------------------------------------------------------------------------------------- Index Scan using tst_idx on test (cost=0.00..17.07 rows=5 width=32) (actual time=0.047..0.047 rows=0 loops=1) Index Cond: (bcol = '3'::bytea) Filter: (bcol ~~ '\\344%'::bytea) Total runtime: 0.149 ms (4 Zeilen) QUERY PLAN ---------------------------------------------------------------------------------------------------------------- Index Scan using tst_idx on test (cost=0.00..17.07 rows=5 width=32) (actual time=0.046..0.046 rows=0 loops=1) Index Cond: (bcol = '3'::bytea) Filter: (bcol ~~ '\\344%'::bytea) Total runtime: 0.147 ms (4 Zeilen) QUERY PLAN ---------------------------------------------------------------------------------------------------------------- Index Scan using tst_idx on test (cost=0.00..17.07 rows=5 width=32) (actual time=0.048..0.048 rows=0 loops=1) Index Cond: (bcol = '0'::bytea) Filter: (bcol ~~ '\\001%'::bytea) Total runtime: 0.150 ms (4 Zeilen) SET SET bcol ----------------- \001abc\006 \001xabc\006 \001\002abc\006 \001abc\006 \001xabc\006 \001\002abc\006 (6 Zeilen) bcol ----------------- \001\002abc\006 \001\002abc\006 (2 Zeilen) bcol --------- abc abcdef abc\003 abc\002 abc\004 abc\005 (6 Zeilen) bcol --------- abc abcdef abc\003 abc\002 abc\004 abc\005 (6 Zeilen) bcol ------ (0 Zeilen) bcol ------ (0 Zeilen) bcol ----------------- \001abc\006 \001xabc\006 \001\002abc\006 \001abc\006 \001xabc\006 \001\002abc\006 (6 Zeilen) QUERY PLAN ------------------------------------------------------------------------------------------------ Seq Scan on test (cost=0.00..22.50 rows=5 width=32) (actual time=0.038..0.085 rows=6 loops=1) Filter: (bcol ~~ '\\001%'::bytea) Total runtime: 0.182 ms (3 Zeilen) QUERY PLAN ------------------------------------------------------------------------------------------------ Seq Scan on test (cost=0.00..22.50 rows=5 width=32) (actual time=0.043..0.065 rows=2 loops=1) Filter: (bcol ~~ '\\001\\002%'::bytea) Total runtime: 0.141 ms (3 Zeilen) QUERY PLAN ------------------------------------------------------------------------------------------------ Seq Scan on test (cost=0.00..22.50 rows=5 width=32) (actual time=0.038..0.096 rows=6 loops=1) Filter: (bcol ~~ 'a%'::bytea) Total runtime: 0.201 ms (3 Zeilen) QUERY PLAN ------------------------------------------------------------------------------------------------ Seq Scan on test (cost=0.00..22.50 rows=5 width=32) (actual time=0.042..0.100 rows=6 loops=1) Filter: (bcol ~~ 'a%'::bytea) Total runtime: 0.211 ms (3 Zeilen) QUERY PLAN ------------------------------------------------------------------------------------------------ Seq Scan on test (cost=0.00..22.50 rows=5 width=32) (actual time=0.051..0.051 rows=0 loops=1) Filter: (bcol ~~ '\\344%'::bytea) Total runtime: 0.115 ms (3 Zeilen) QUERY PLAN ------------------------------------------------------------------------------------------------ Seq Scan on test (cost=0.00..22.50 rows=5 width=32) (actual time=0.051..0.051 rows=0 loops=1) Filter: (bcol ~~ '\\344%'::bytea) Total runtime: 0.113 ms (3 Zeilen) QUERY PLAN ------------------------------------------------------------------------------------------------ Seq Scan on test (cost=0.00..22.50 rows=5 width=32) (actual time=0.039..0.087 rows=6 loops=1) Filter: (bcol ~~ '\\001%'::bytea) Total runtime: 0.187 ms (3 Zeilen)
pgp00001.pgp
Description: PGP signature
---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html