On Fri, Nov 15, 2024 at 11:42 PM Peter Eisentraut <pe...@eisentraut.org> wrote: > > On 15.11.24 05:26, jian he wrote: > > /* > > * Now build a substring of the text and try to match it against > > * the subpattern. t is the start of the text, t1 is one past the > > * last byte. We start with a zero-length string. > > */ > > t1 = t > > t1len = tlen; > > for (;;) > > { > > int cmp; > > CHECK_FOR_INTERRUPTS(); > > cmp = pg_strncoll(subpat, subpatlen, t, (t1 - t), locale); > > > > select '.foo.' LIKE '_oo' COLLATE ign_punct; > > pg_strncoll's iteration of the first 4 argument values. > > oo 2 foo. 0 > > oo 2 foo. 1 > > oo 2 foo. 2 > > oo 2 foo. 3 > > oo 2 foo. 4 > > > > seems there is a shortcut/optimization. > > if subpat don't have wildcard(percent sign, underscore) > > then we can have less pg_strncoll calls? > > How would you do that? You need to try all combinations to find one > that matches. >
we can optimize when trailing (last character) is not wildcards. SELECT 'Ha12foo' LIKE '%foo' COLLATE ignore_accents; within the for loop for(;;) { int cmp; CHECK_FOR_INTERRUPTS(); .... } pg_strncoll comparison will become Ha12foo foo a12foo foo 12foo foo 2foo foo foo foo it's safe because in MatchText we have: else if (*p == '%') { while (tlen > 0) { if (GETCHAR(*t, locale) == firstpat || (locale && !locale->deterministic)) { int matched = MatchText(t, tlen, p, plen, locale); if (matched != LIKE_FALSE) return matched; /* TRUE or ABORT */ } NextChar(t, tlen); } } please check attached. > > minimum case to trigger error within GenericMatchText > > since no related tests. > > create table t1(a text collate case_insensitive, b text collate "C"); > > insert into t1 values ('a','a'); > > select a like b from t1; > > This results in > > ERROR: 42P22: could not determine which collation to use for LIKE > HINT: Use the COLLATE clause to set the collation explicitly. > > which is the expected behavior. > sorry, didn't mention it clearly, i mean we can add it to the regress test.
v8-0001-LIKE-with-nondeterministic-collations-no-trail.no-cfbot
Description: Binary data