Re: Support LIKE with nondeterministic collations

Peter Eisentraut Fri, 15 Nov 2024 07:51:25 -0800

On 15.11.24 05:26, jian he wrote:

/*
* Now build a substring of the text and try to match it against
* the subpattern.  t is the start of the text, t1 is one past the
* last byte.  We start with a zero-length string.
*/
t1 = t;
t1len = tlen;
for (;;)
{
int cmp;
CHECK_FOR_INTERRUPTS();
cmp = pg_strncoll(subpat, subpatlen, t, (t1 - t), locale);


select '.foo.' LIKE '_oo' COLLATE ign_punct;
The first four argument values across pg_strncoll's iterations:
oo      2       foo. 0
oo      2       foo. 1
oo      2       foo. 2
oo      2       foo. 3
oo      2       foo. 4

It seems there is a possible shortcut/optimization:
if subpat doesn't contain a wildcard (percent sign, underscore),
then we could make fewer pg_strncoll calls?

How would you do that? You need to try all combinations to find one that matches.
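
To illustrate why every prefix length has to be tried, here is a minimal standalone sketch. It is not the patch's code: toy_strncoll() is a hypothetical stand-in for pg_strncoll() that simply ignores ASCII punctuation, and first_matching_prefix() is an invented helper name. The point is only that the matching stretch of text can have a different byte length than the subpattern, so the length cannot be derived from subpatlen.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_strncoll(): compares two byte ranges while
 * skipping ASCII punctuation, loosely imitating a punctuation-ignoring
 * collation.  Returns 0 if equal under that rule, else -1 or 1.
 */
static int
toy_strncoll(const char *a, size_t alen, const char *b, size_t blen)
{
    size_t i = 0, j = 0;

    for (;;)
    {
        /* Skip ignorable bytes on both sides before comparing. */
        while (i < alen && ispunct((unsigned char) a[i]))
            i++;
        while (j < blen && ispunct((unsigned char) b[j]))
            j++;
        if (i == alen || j == blen)
            break;
        if (a[i] != b[j])
            return (unsigned char) a[i] < (unsigned char) b[j] ? -1 : 1;
        i++;
        j++;
    }
    if (i == alen && j == blen)
        return 0;
    return (i == alen) ? -1 : 1;
}

/*
 * Try every prefix of t, starting with the zero-length one, and return the
 * length of the first prefix that compares equal to subpat, or -1 if no
 * prefix does.
 */
static long
first_matching_prefix(const char *subpat, size_t subpatlen,
                      const char *t, size_t tlen)
{
    for (size_t n = 0; n <= tlen; n++)
    {
        if (toy_strncoll(subpat, subpatlen, t, n) == 0)
            return (long) n;
    }
    return -1;
}
```

With this toy comparison, the two-byte subpattern "oo" matches the three-byte prefix "o.o" of the text "o.o.", while no prefix of "foo." matches it, which corresponds to the five calls in the trace above.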

A minimal case to trigger the error within GenericMatchText,
since there are no related tests:
create table t1(a text collate case_insensitive, b text collate "C");
insert into t1 values ('a','a');
select a like b from t1;


This results in

ERROR:  42P22: could not determine which collation to use for LIKE
HINT:  Use the COLLATE clause to set the collation explicitly.

which is the expected behavior.

In section 9.7.1 (LIKE), we still don't say what a "wildcard" is;
we only mention it in 9.7.2.
maybe we can add a sentence at the end of:
     <para>
      If <replaceable>pattern</replaceable> does not contain percent
      signs or underscores, then the pattern only represents the string
      itself; in that case <function>LIKE</function> acts like the
      equals operator.  An underscore (<literal>_</literal>) in
      <replaceable>pattern</replaceable> stands for (matches) any single
      character; a percent sign (<literal>%</literal>) matches any sequence
      of zero or more characters.
     </para>

saying underscore and percent sign are wildcards in LIKE.
other than that, I can understand the doc.


Ok, I agree that could be clarified.
