Re: [PATCHES] [HACKERS] like/ilike improvements

Andrew Dunstan Fri, 01 Jun 2007 19:07:44 -0700


Tom Lane wrote:

Andrew Dunstan <[EMAIL PROTECTED]> writes:

OK, here is a patch that I think incorporates all the ideas discussed (including part of Mark Mielke's suggestion about optimising %_). There is now no special treatment of UTF8 other than its use of a faster NextChar macro.


Looks mostly pretty good.  I would suggest replacing tests "tlen == 0"
and "plen == 0" with "<= 0", just so the code doesn't go completely
insane if presented with invalidly-encoded data that causes it to step
beyond the end of data.  Also, this comment is not really good enough:

!               /*
!                * It is safe to use NextByte instead of NextChar here, even for
!                * multi-byte character sets, because we are not following
!                * immediately after a wildcard character.
!                */
!               NextByte(t, tlen);
!               NextByte(p, plen);
        }
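
To illustrate the "<= 0" point made before the diff: the following is a minimal sketch, not the patch's code. The helper names (advance, exhausted_eq, exhausted_le) are hypothetical, and advance() only stands in for a multi-byte NextChar that trusts the length implied by a lead byte. With a truncated character the remaining length jumps from 1 to -1, so an "== 0" test never fires while a "<= 0" test does.

```c
/* Hypothetical stand-in for a multi-byte NextChar: advances by the length
 * the lead byte claims, without checking it against the bytes remaining. */
static void advance(const char **p, int *len, int step)
{
    *p += step;
    *len -= step;
}

/* The two candidate end-of-data tests under discussion. */
static int exhausted_eq(int len) { return len == 0; }
static int exhausted_le(int len) { return len <= 0; }

/* Remaining length after stepping over a 2-byte character of which only
 * the lead byte is actually present (invalidly-encoded input). */
static int remaining_after_truncated_char(void)
{
    const char buf[] = "\xC3";   /* UTF-8 lead byte of a 2-byte sequence */
    const char *p = buf;
    int len = 1;                 /* only one data byte is available */

    advance(&p, &len, 2);        /* pointer arithmetic only, no dereference */
    return len;                  /* length has skipped past zero */
}
```

The "<= 0" form costs nothing on valid input and stops the loop as soon as the length goes negative.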


I'd suggest adding something like "If we are in the middle of a
multibyte character, we must already have matched at least one byte of
the character from both text and pattern; so we cannot get out-of-sync
on character boundaries.  And we know that no backend-legal encoding
allows ASCII characters such as '%' to appear as non-first bytes of
characters, so we won't mistakenly detect a new wildcard."
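
The argument above can be checked for UTF-8 specifically with a small sketch. This is an illustration under stated assumptions, not the committed code: is_utf8_continuation and match_literal_bytes are hypothetical helpers, and only UTF-8 is covered, not every backend-legal encoding. It shows that no ASCII byte has the form of a UTF-8 non-first byte, and that comparing literal runs one byte at a time stops at the same place a character-wise comparison would reject.

```c
/* In UTF-8, every non-first byte of a multi-byte character is 10xxxxxx. */
static int is_utf8_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Count ASCII bytes (0x00-0x7F) that could be mistaken for a non-first
 * byte of a UTF-8 character. */
static int ascii_bytes_that_look_like_continuations(void)
{
    int count = 0;

    for (int b = 0; b < 0x80; b++)
        if (is_utf8_continuation((unsigned char) b))
            count++;
    return count;
}

/* Hypothetical simplified matcher for a literal run: compares text and
 * pattern byte by byte and returns how many bytes matched before a
 * mismatch, a wildcard in the pattern, or the end of either input. */
static int match_literal_bytes(const char *t, int tlen,
                               const char *p, int plen)
{
    int matched = 0;

    while (tlen > 0 && plen > 0 && *p != '%' && *p != '_' && *t == *p)
    {
        t++, tlen--;
        p++, plen--;
        matched++;
    }
    return matched;
}
```

Because text and pattern advance in lockstep and only on equal bytes, both are always at the same offset within the same character, which is why byte-wise stepping cannot get out of sync here.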


Done, and committed.

cheers

andrew

