[Koha-bugs] [Bug 27299] New: Zebra phrase register is incorrectly tokenized

bugzilla-daemon Tue, 22 Dec 2020 14:56:47 -0800

https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27299


            Bug ID: 27299
           Summary: Zebra phrase register is incorrectly tokenized
 Change sponsored?: ---
           Product: Koha
           Version: master
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P5 - low
         Component: Searching - Zebra
          Assignee: [email protected]
          Reporter: [email protected]

Recently, I noticed issues with "exact" matching for authority linking when
using Zebra ICU. 

I've documented those issues upstream on the idzebra project on GitHub:
https://github.com/indexdata/idzebra/issues/24

Adam Dickmeiss and I are still working through this issue, but it seems very
likely to me that the cause is that we are tokenizing strings for the "p"
register when we should not be.

Looking at Zebra CHR, the "p" register is not tokenized. According to Zebra's
own documentation
(https://software.indexdata.com/zebra/doc/querymodel-zebra.html#querymodel-pqf-apt-mapping-structuretype),
the "p" register is supposed to be "Character normalized, but not tokenized
index for phrase matches". 
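
To illustrate the distinction between "normalized" and "tokenized", here is a
minimal Python sketch. It is a toy model only, not Zebra's code or its ICU
configuration: the function names (normalize, phrase_keys, tokenized_keys,
exact_match) and the sample heading are hypothetical, and the normalization
rules (strip accents, lowercase, collapse whitespace) are assumed for the sake
of the example.

```python
import unicodedata
from typing import List


def normalize(text: str) -> str:
    """Toy character normalization: strip accents, lowercase, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.lower().split())


def phrase_keys(text: str) -> List[str]:
    """Normalized but NOT tokenized: the whole string is one index key."""
    return [normalize(text)]


def tokenized_keys(text: str) -> List[str]:
    """Normalized AND tokenized: every word becomes its own index key."""
    return normalize(text).split()


def exact_match(query: str, keys: List[str]) -> bool:
    """An 'exact' lookup succeeds only if the normalized query equals a key."""
    return normalize(query) in keys


heading = "Science fiction"  # hypothetical authority heading

# Untokenized phrase index: only the complete heading matches.
print(exact_match("Science fiction", phrase_keys(heading)))
print(exact_match("Science", phrase_keys(heading)))

# Tokenized index: a single word now also counts as an "exact" match.
print(exact_match("Science", tokenized_keys(heading)))
```

In this toy model, the untokenized index accepts only the full heading, while
the tokenized one also accepts the single word "Science". That is the kind of
difference that could make "exact" matching unreliable if a phrase register is
tokenized.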

I'm still waiting for Adam to confirm my solution, but I've opened this bug
report to track things on the Koha side, and to include a patch which I hope
will resolve these problems.

-- 
You are receiving this mail because:
You are watching all bug changes.
You are the assignee for the bug.
_______________________________________________
Koha-bugs mailing list
[email protected]
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
