Re: Lower or Upper case for F.33. pg_trgm

2022-08-16 Thread Tom Lane
Daniel Gustafsson  writes:
> Looking at this I'm leaning towards paring down the diff posted upthread with
> pretty much this; I think that will provide value while avoiding
> confusion.

WFM.

> As a related side note, there are four instances of "case insensitive{ly}" in
> the docs with all other instances using "case-insensitive{ly}".  I'm inclined
> to fix those four to use a dash while at it to be consistent across all pages.

+1

regards, tom lane




Re: Lower or Upper case for F.33. pg_trgm

2022-08-16 Thread Daniel Gustafsson
> On 16 Aug 2022, at 15:53, Tom Lane  wrote:
> 
> Erik Rijkers  writes:
>> (bluntly stating 'similarity comparisons are case-insensitive' - 
>> although I'm not really sure..)
> 
> Perhaps like "similarity comparisons are case-insensitive in a
> standard build of pg_trgm", if you want to nod to the existence
> of a compile option without going into detail.

Looking at this I'm leaning towards paring down the diff posted upthread with
pretty much this; I think that will provide value while avoiding
confusion.

As a related side note, there are four instances of "case insensitive{ly}" in
the docs with all other instances using "case-insensitive{ly}".  I'm inclined
to fix those four to use a dash while at it to be consistent across all pages.

--
Daniel Gustafsson   https://vmware.com/



pg_trgm_case.diff
Description: Binary data


Re: Lower or Upper case for F.33. pg_trgm

2022-08-16 Thread Marc M.
Sounds good to me.

On Tue, 16 Aug 2022 at 15:53, Tom Lane wrote:

> Erik Rijkers  writes:
> > (bluntly stating 'similarity comparisons are case-insensitive' -
> > although I'm not really sure..)
>
> Perhaps like "similarity comparisons are case-insensitive in a
> standard build of pg_trgm", if you want to nod to the existence
> of a compile option without going into detail.
>
> regards, tom lane
>


Re: Lower or Upper case for F.33. pg_trgm

2022-08-16 Thread Tom Lane
Erik Rijkers  writes:
> (bluntly stating 'similarity comparisons are case-insensitive' - 
> although I'm not really sure..)

Perhaps like "similarity comparisons are case-insensitive in a
standard build of pg_trgm", if you want to nod to the existence
of a compile option without going into detail.

regards, tom lane




Re: Lower or Upper case for F.33. pg_trgm

2022-08-16 Thread Erik Rijkers

On 16-08-2022 at 13:46, Daniel Gustafsson wrote:
>> On 16 Aug 2022, at 12:54, Erik Rijkers  wrote:
>>
>> On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>>>> On 16 Aug 2022, at 12:17, PG Doc comments form  wrote:
>>>> I have a question regarding the trigram algorithm and I can not find any
>>>> information about it in your documentation:
>>> Maybe we should add something about this?
>>
>> Yeah, it's a bit strange that none of the following strings yield any info on
>> that page:  'case', 'sensitiv', 'upper', 'lower', and that there is no mention
>> of the  ~  versus  ~*  difference.
>>
>> Maybe worth to (already in pgtrgm.html) give the simple hint:
>>   ~  is case-sensitive
>>   ~* is case-insensitive
>>
>> In any case a link to  functions-matching.html  seems indicated.
>
> Yeah, I think there is room for improvements here.  Are you up for drafting a
> patch for this?

How is this?

(bluntly stating 'similarity comparisons are case-insensitive' -
although I'm not really sure..)

Erik

> --
> Daniel Gustafsson   https://vmware.com/
--- ./doc/src/sgml/pgtrgm.sgml.orig	2022-08-16 14:50:08.586555358 +0200
+++ ./doc/src/sgml/pgtrgm.sgml	2022-08-16 14:56:39.358617804 +0200
@@ -416,6 +416,8 @@
the above-described similarity operators, and additionally support
trigram-based index searches for LIKE, ILIKE,
~, ~* and = queries.
+   The similarity comparisons are case-insensitive, but these queries can be
+   case-sensitive (see ).
Inequality operators are not supported.
Note that those indexes may not be as efficient as regular B-tree indexes
for equality operator.
@@ -534,7 +536,8 @@
   
Beginning in PostgreSQL 9.3, these index types also support
index searches for regular-expression matches
-   (~ and ~* operators), for example
+   (~ and ~* operators, resp. case-sensitive and
+   case-insensitive), for example
 
 SELECT * FROM test_trgm WHERE t ~ '(foo|bar)';
 


Re: Lower or Upper case for F.33. pg_trgm

2022-08-16 Thread Marc M.
Thanks for your fast response.

Is this a question for me? I am fine with a short hint regarding the
default.
A link to another documentation is also fine.

On Tue, 16 Aug 2022 at 13:46, Daniel Gustafsson <
dan...@yesql.se> wrote:

> > On 16 Aug 2022, at 12:54, Erik Rijkers  wrote:
> >
> > On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
> >>> On 16 Aug 2022, at 12:17, PG Doc comments form  wrote:
> >>> I have a question regarding the trigram algorithm and I can not find any
> >>> information about it in your documentation:
> >> Maybe we should add something about this?
> >
> > Yeah, it's a bit strange that none of the following strings yield any info
> > on that page:  'case', 'sensitiv', 'upper', 'lower', and that there is no
> > mention of the  ~  versus  ~*  difference.
> >
> > Maybe worth to (already in pgtrgm.html) give the simple hint:
> >  ~  is case-sensitive
> >  ~* is case-insensitive
> >
> > In any case a link to  functions-matching.html  seems indicated.
>
> Yeah, I think there is room for improvements here.  Are you up for drafting a
> patch for this?
>
> --
> Daniel Gustafsson   https://vmware.com/
>
>


Re: Lower or Upper case for F.33. pg_trgm

2022-08-16 Thread Daniel Gustafsson
> On 16 Aug 2022, at 12:54, Erik Rijkers  wrote:
> 
> On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>>> On 16 Aug 2022, at 12:17, PG Doc comments form  wrote:
>>> I have a question regarding the trigram algorithm and I can not find any
>>> information about it in your documentation:
>> Maybe we should add something about this?
> 
> Yeah, it's a bit strange that none of the following strings yield any info on 
> that page:  'case', 'sensitiv', 'upper', 'lower', and that there is no 
> mention of the  ~  versus  ~*  difference.
> 
> Maybe worth to (already in pgtrgm.html) give the simple hint:
>  ~  is case-sensitive
>  ~* is case-insensitive
> 
> In any case a link to  functions-matching.html  seems indicated.

Yeah, I think there is room for improvements here.  Are you up for drafting a
patch for this?

--
Daniel Gustafsson   https://vmware.com/





Re: Lower or Upper case for F.33. pg_trgm

2022-08-16 Thread Erik Rijkers




On 16-08-2022 at 12:36, Daniel Gustafsson wrote:
>> On 16 Aug 2022, at 12:17, PG Doc comments form  wrote:
>>
>> I have a question regarding the trigram algorithm and I can not find any
>> information about it in your documentation:
>
> Maybe we should add something about this?


Yeah, it's a bit strange that none of the following strings yield any 
info on that page:  'case', 'sensitiv', 'upper', 'lower', and that there 
is no mention of the  ~  versus  ~*  difference.


Maybe worth to (already in pgtrgm.html) give the simple hint:
  ~  is case-sensitive
  ~* is case-insensitive


In any case a link to  functions-matching.html  seems indicated.
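
To make the ~ versus ~* distinction concrete outside of PostgreSQL, here is a small analogy in Python. It is only a sketch: it uses Python's re module rather than PostgreSQL's regex engine, and the helper name matches() is made up for illustration. The default call plays the role of ~ and the ignore_case=True call plays the role of ~*.

```python
import re


def matches(text, pattern, ignore_case=False):
    """Return True if pattern is found anywhere in text.

    ignore_case=False mimics the case-sensitive ~ operator,
    ignore_case=True mimics the case-insensitive ~* operator.
    This is an analogy only, not PostgreSQL's implementation.
    """
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, text, flags) is not None


# Same pattern, same text, different case handling.
print(matches("FOO", "(foo|bar)"))                    # like  t ~  '(foo|bar)'
print(matches("FOO", "(foo|bar)", ignore_case=True))  # like  t ~* '(foo|bar)'
```

The first call does not match because the case differs; the second one does.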


Erik Rijkers





>> Do you distinguish between lower and uppercase? Or do you consider all words
>> in lowercase?
>
> There is support for compiling pg_trgm to be case-sensitive, but by default it is
> case-insensitive.
>
> # SELECT word_similarity('word', 'WORD');
>  word_similarity
> -----------------
>                1
> (1 row)
>
>> Happy to get a short feedback from you,
>
> I would recommend the pgsql-general mailing list, as that will be a safer way to get
> general questions answered.
>
> --
> Daniel Gustafsson   https://vmware.com/







Re: Lower or Upper case for F.33. pg_trgm

2022-08-16 Thread Daniel Gustafsson
> On 16 Aug 2022, at 12:17, PG Doc comments form  wrote:

> I have a question regarding the trigram algorithm and I can not find any
> information about it in your documentation:

Maybe we should add something about this?

> Do you distinguish between lower and uppercase? Or do you consider all words
> in lowercase?

There is support for compiling pg_trgm to be case-sensitive, but by default it is
case-insensitive.

# SELECT word_similarity('word', 'WORD');
 word_similarity
-----------------
               1
(1 row)
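
As a rough illustration of why the result is 1, here is a minimal Python sketch of trigram matching with case folding. It is not the pg_trgm source. It assumes the behaviour described in the pg_trgm documentation (only alphanumeric characters count, and each word gets two spaces prefixed and one suffixed) and computes a plain shared-over-total ratio in the style of similarity(), not word_similarity(). The function names and the ignore_case flag are hypothetical stand-ins for the default build versus a case-sensitive build.

```python
import re


def trigrams(text, ignore_case=True):
    """Return the set of trigrams of text (sketch, not pg_trgm itself)."""
    if ignore_case:
        # Default build: fold case before extracting trigrams.
        text = text.lower()
    grams = set()
    # Each run of alphanumeric characters is treated as one word.
    for word in re.findall(r"[^\W_]+", text):
        # Two spaces in front and one behind, as in the pg_trgm docs.
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a, b, ignore_case=True):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a, ignore_case), trigrams(b, ignore_case)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


print(similarity("word", "WORD"))                     # case folded first
print(similarity("word", "WORD", ignore_case=False))  # case kept as-is
```

With case folding both strings produce the same trigram set, so the ratio is 1; without it they share no trigrams at all.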

> Happy to get a short feedback from you,

I would recommend the pgsql-general mailing list, as that will be a safer way to get
general questions answered.

--
Daniel Gustafsson   https://vmware.com/





Lower or Upper case for F.33. pg_trgm

2022-08-16 Thread PG Doc comments form
The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/14/pgtrgm.html
Description:

Hey guys,

I have a question regarding the trigram algorithm and I can not find any
information about it in your documentation:

Do you distinguish between lower and uppercase? Or do you consider all words
in lowercase?

Happy to get a short feedback from you,

Greetings, Marc