Re: Lower or Upper case for F.33. pg_trgm
Daniel Gustafsson writes:
> Looking at this, I'm leaning towards paring down the diff posted upthread to
> pretty much this; I think that will provide value while avoiding
> confusion.

WFM.

> As a related side note, there are four instances of "case insensitive{ly}" in
> the docs with all other instances using "case-insensitive{ly}". I'm inclined
> to fix those four to use a dash while at it to be consistent across all pages.

+1

regards, tom lane
Re: Lower or Upper case for F.33. pg_trgm
> On 16 Aug 2022, at 15:53, Tom Lane wrote:
>
> Erik Rijkers writes:
>> (bluntly stating 'similarity comparisons are case-insensitive' -
>> although I'm not really sure..)
>
> Perhaps like "similarity comparisons are case-insensitive in a
> standard build of pg_trgm", if you want to nod to the existence
> of a compile option without going into detail.

Looking at this, I'm leaning towards paring down the diff posted upthread to
pretty much this; I think that will provide value while avoiding confusion.

As a related side note, there are four instances of "case insensitive{ly}" in
the docs with all other instances using "case-insensitive{ly}". I'm inclined
to fix those four to use a dash while at it to be consistent across all pages.

--
Daniel Gustafsson https://vmware.com/

Attachment: pg_trgm_case.diff
Re: Lower or Upper case for F.33. pg_trgm
Sounds good to me.

On Tue, 16 Aug 2022 at 15:53, Tom Lane wrote:
> Erik Rijkers writes:
> > (bluntly stating 'similarity comparisons are case-insensitive' -
> > although I'm not really sure..)
>
> Perhaps like "similarity comparisons are case-insensitive in a
> standard build of pg_trgm", if you want to nod to the existence
> of a compile option without going into detail.
>
> regards, tom lane
Re: Lower or Upper case for F.33. pg_trgm
Erik Rijkers writes:
> (bluntly stating 'similarity comparisons are case-insensitive' -
> although I'm not really sure..)

Perhaps like "similarity comparisons are case-insensitive in a
standard build of pg_trgm", if you want to nod to the existence
of a compile option without going into detail.

regards, tom lane
Re: Lower or Upper case for F.33. pg_trgm
On 16-08-2022 at 13:46, Daniel Gustafsson wrote:
>> On 16 Aug 2022, at 12:54, Erik Rijkers wrote:
>>
>> On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>>>> On 16 Aug 2022, at 12:17, PG Doc comments form wrote:
>>>> I have a question regarding the trigram algorithm and I can not find any
>>>> information about it in your documentation:
>>> Maybe we should add something about this?
>>
>> Yeah, it's a bit strange that none of the following strings yield any info
>> on that page: 'case', 'sensitiv', 'upper', 'lower', and that there is no
>> mention of the ~ versus ~* difference.
>>
>> Maybe worth to (already in pgtrgm.html) give the simple hint:
>>   ~ is case-sensitive
>>   ~* is case-insensitive
>>
>> In any case a link to functions-matching.html seems indicated.
>
> Yeah, I think there is room for improvements here. Are you up for drafting a
> patch for this?
>
> --
> Daniel Gustafsson https://vmware.com/

How is this?

(bluntly stating 'similarity comparisons are case-insensitive' -
although I'm not really sure..)

Erik

--- ./doc/src/sgml/pgtrgm.sgml.orig 2022-08-16 14:50:08.586555358 +0200
+++ ./doc/src/sgml/pgtrgm.sgml 2022-08-16 14:56:39.358617804 +0200
@@ -416,6 +416,8 @@
 the above-described similarity operators, and additionally support
 trigram-based index searches for LIKE, ILIKE, ~, ~* and = queries.
+ The similarity comparisons are case-insensitive, but these queries can be
+ case-sensitive (see ).
 Inequality operators are not supported.
 Note that those indexes may not be as efficient as regular B-tree indexes
 for equality operator.
@@ -534,7 +536,8 @@
 Beginning in PostgreSQL 9.3, these index types also support
 index searches for regular-expression matches
- (~ and ~* operators), for example
+ (~ and ~* operators, resp. case-sensitive and
+ case-insensitive), for example
 SELECT * FROM test_trgm WHERE t ~ '(foo|bar)';
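The ~ versus ~* distinction mentioned in the patch above can be illustrated outside the database. The following is only a rough analogy in Python: PostgreSQL uses its own POSIX-style regular-expression engine, so Python's re module stands in for it here, and the helper name matches() is hypothetical.

```python
import re


def matches(pattern, text, case_insensitive=False):
    """Return True if pattern is found anywhere in text.

    case_insensitive=False loosely mirrors PostgreSQL's ~ operator,
    case_insensitive=True loosely mirrors ~*. This is an analogy only:
    Python's regex dialect is not the one PostgreSQL implements.
    """
    flags = re.IGNORECASE if case_insensitive else 0
    return re.search(pattern, text, flags) is not None


if __name__ == "__main__":
    # Same pattern as in the documentation example: '(foo|bar)'
    print(matches("(foo|bar)", "FOO"))                         # like: 'FOO' ~  '(foo|bar)'
    print(matches("(foo|bar)", "FOO", case_insensitive=True))  # like: 'FOO' ~* '(foo|bar)'
```

The first call does not match because the pattern is lowercase and the text is uppercase; the second matches because case is ignored.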
Re: Lower or Upper case for F.33. pg_trgm
Thanks for your fast response.

Is this a question for me?

I am fine with a short hint regarding the default. A link to other
documentation is also fine.

On Tue, 16 Aug 2022 at 13:46, Daniel Gustafsson <dan...@yesql.se> wrote:
> > On 16 Aug 2022, at 12:54, Erik Rijkers wrote:
> >
> > On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
> >>> On 16 Aug 2022, at 12:17, PG Doc comments form wrote:
> >>> I have a question regarding the trigram algorithm and I can not find any
> >>> information about it in your documentation:
> >> Maybe we should add something about this?
> >
> > Yeah, it's a bit strange that none of the following strings yield any
> > info on that page: 'case', 'sensitiv', 'upper', 'lower', and that there is
> > no mention of the ~ versus ~* difference.
> >
> > Maybe worth to (already in pgtrgm.html) give the simple hint:
> >   ~ is case-sensitive
> >   ~* is case-insensitive
> >
> > In any case a link to functions-matching.html seems indicated.
>
> Yeah, I think there is room for improvements here. Are you up for
> drafting a patch for this?
>
> --
> Daniel Gustafsson https://vmware.com/
Re: Lower or Upper case for F.33. pg_trgm
> On 16 Aug 2022, at 12:54, Erik Rijkers wrote:
>
> On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>>> On 16 Aug 2022, at 12:17, PG Doc comments form wrote:
>>> I have a question regarding the trigram algorithm and I can not find any
>>> information about it in your documentation:
>> Maybe we should add something about this?
>
> Yeah, it's a bit strange that none of the following strings yield any info on
> that page: 'case', 'sensitiv', 'upper', 'lower', and that there is no
> mention of the ~ versus ~* difference.
>
> Maybe worth to (already in pgtrgm.html) give the simple hint:
>   ~ is case-sensitive
>   ~* is case-insensitive
>
> In any case a link to functions-matching.html seems indicated.

Yeah, I think there is room for improvements here. Are you up for drafting a
patch for this?

--
Daniel Gustafsson https://vmware.com/
Re: Lower or Upper case for F.33. pg_trgm
On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>> On 16 Aug 2022, at 12:17, PG Doc comments form wrote:
>> I have a question regarding the trigram algorithm and I can not find any
>> information about it in your documentation:
> Maybe we should add something about this?

Yeah, it's a bit strange that none of the following strings yield any info on
that page: 'case', 'sensitiv', 'upper', 'lower', and that there is no
mention of the ~ versus ~* difference.

Maybe worth to (already in pgtrgm.html) give the simple hint:
  ~ is case-sensitive
  ~* is case-insensitive

In any case a link to functions-matching.html seems indicated.

Erik Rijkers

>> Do you distinguish between lower and uppercase? Or do you consider all words
>> in lowercase?
> There is support for compiling pg_trgm to be case-sensitive, but by default
> it is case-insensitive.
>
> # SELECT word_similarity('word', 'WORD');
>  word_similarity
> -----------------
>                1
> (1 row)
>
>> Happy to get a short feedback from you,
> I would recommend the pgsql-general mailing list as that will be a safer way
> to get general questions answered.
>
> --
> Daniel Gustafsson https://vmware.com/
Re: Lower or Upper case for F.33. pg_trgm
> On 16 Aug 2022, at 12:17, PG Doc comments form wrote:

> I have a question regarding the trigram algorithm and I can not find any
> information about it in your documentation:

Maybe we should add something about this?

> Do you distinguish between lower and uppercase? Or do you consider all words
> in lowercase?

There is support for compiling pg_trgm to be case-sensitive, but by default
it is case-insensitive.

# SELECT word_similarity('word', 'WORD');
 word_similarity
-----------------
               1
(1 row)

> Happy to get a short feedback from you,

I would recommend the pgsql-general mailing list as that will be a safer way
to get general questions answered.

--
Daniel Gustafsson https://vmware.com/
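To make the default behaviour described above concrete, here is a minimal sketch in Python of trigram similarity with and without case folding. It is a simplified model, not pg_trgm's actual implementation: the function names and the case_sensitive parameter are hypothetical, the padding (two spaces before each word, one after) and the shared/total trigram ratio follow the description in the pg_trgm documentation, and the real module's case folding is locale-dependent and fixed at compile time rather than chosen per call.

```python
import re


def trigrams(text, case_sensitive=False):
    """Return the set of trigrams of text (simplified model).

    Each alphanumeric word is padded with two leading spaces and one
    trailing space before its three-character windows are collected.
    Lowercasing stands in for the default case-insensitive build.
    """
    if not case_sensitive:
        text = text.lower()
    grams = set()
    # Non-alphanumeric characters only act as word separators.
    for word in re.findall(r"[^\W_]+", text):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b, case_sensitive=False):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta = trigrams(a, case_sensitive)
    tb = trigrams(b, case_sensitive)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    print(similarity("word", "WORD"))                       # folded case
    print(similarity("word", "WORD", case_sensitive=True))  # no folding
```

With case folding, 'word' and 'WORD' produce identical trigram sets, which is consistent with the similarity of 1 in the query result above; without folding, the two sets share nothing.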
Lower or Upper case for F.33. pg_trgm
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/14/pgtrgm.html
Description:

Hey guys,

I have a question regarding the trigram algorithm and I can not find any
information about it in your documentation:
Do you distinguish between lower and uppercase? Or do you consider all words
in lowercase?

Happy to get a short feedback from you,

Greetings,
Marc