[Koha-bugs] [Bug 35455] New: ICU does not strip = when indexing/searching

bugzilla-daemon Thu, 30 Nov 2023 17:50:45 -0800

https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35455


            Bug ID: 35455
           Summary: ICU does not strip = when indexing/searching
 Change sponsored?: ---
           Product: Koha
           Version: master
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P5 - low
         Component: Searching - Zebra
          Assignee: [email protected]
          Reporter: [email protected]

The CHR config file /etc/koha/zebradb/etc/word-phrase-utf.chr replaces = with a
space, but the ICU config files /etc/koha/zebradb/etc/words-icu.xml and
/etc/koha/zebradb/etc/phrases-icu.xml do not transform the = character.

I encountered this problem because the "Show analytics" link wasn't appearing
for a journal record, but it will affect other search contexts too.

Note that https://unicode-org.github.io/icu/userguide/strings/regexp.html
says:

"The property expression \p{punct} differs in what it matches. Java
matches any of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~. From that list, ICU omits
$+<=>^`|~. ICU follows the recommendations from Unicode UTS-18"
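
The omitted characters are the ones that Unicode classifies as symbols rather than punctuation. A minimal sketch to check this, using Python's standard unicodedata module (this is not Zebra or ICU code, and the function name is only illustrative):

```python
import string
import unicodedata


def non_punctuation_symbols(chars=string.punctuation):
    """Return the characters whose Unicode general category is not P* (punctuation)."""
    return "".join(
        c for c in chars if not unicodedata.category(c).startswith("P")
    )


if __name__ == "__main__":
    # "=" has general category Sm (math symbol), so a rule that only
    # targets the Punctuation category would not touch it.
    print(unicodedata.category("="))
    print(non_punctuation_symbols())
```

Run against the ASCII punctuation list quoted above, this should return exactly the characters the ICU documentation says are omitted.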

Arguably we should remove <=>^`|~ during ICU normalization but I'm just
focusing on = for now, as the colon : gets changed to = during searches, and
subtitles frequently are introduced by : in English.
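
As a rough illustration of the intended behaviour (not the actual Zebra ICU chain; the function and constant names are hypothetical), the following sketch replaces those symbols with a space, the way the CHR config treats =, and then lowercases and collapses whitespace:

```python
# Hypothetical set of symbols left in place by a punctuation-only rule.
# "=" is the one this bug focuses on; the others are the "arguably" cases.
SYMBOLS_TO_STRIP = "<=>^`|~"

# Map each symbol to a space, mirroring what the CHR file does for "=".
_TRANSLATION = str.maketrans({c: " " for c in SYMBOLS_TO_STRIP})


def normalize(text):
    """Replace the listed symbols with spaces, lowercase, and collapse whitespace."""
    return " ".join(text.translate(_TRANSLATION).lower().split())


if __name__ == "__main__":
    # A subtitle separator that has been rewritten from ":" to "=".
    print(normalize("Journal of testing = Revue d'essai"))
```

With this kind of normalization applied at both index and search time, a term such as "testing" would match whether or not a stray = sits next to it.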

