http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=13064
--- Comment #1 from Fridolin SOMERS <fridolyn.som...@biblibre.com> ---
Created attachment 32136
  --> http://bugs.koha-community.org/bugzilla3/attachment.cgi?id=32136&action=edit
Bug 13064 - Indexing problem with ICU on control characters

The ICU configuration files contain a rule to remove control characters:
<transform rule="[:Control:] Any-Remove"/>
This rule is applied before tokenization.

The problem is that the "[:Control:]" character class includes line feed, carriage return and tab.
See http://www.regular-expressions.info/posixbrackets.html.
So when a field containing several lines is indexed, the last word of each line is joined with the first word of the next line. Those words are then not searchable.

For example:
First line
Second line
This will become "First lineSecond line", tokenized as "First", "lineSecond" and "line".

Test plan:
- Use ICU in Zebra configuration
- Choose an indexed field, like 300$a
- Create a new record
- Enter several lines in the chosen field, like:
First line
Second line
- Index this record
=> Without patch the search on "Second" does not return the record
=> With patch the search on "Second" returns the record
- Same tests with tab and carriage return instead of line feed
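
The joining effect described in this comment can be reproduced outside Zebra with a small Python sketch. It rests on several assumptions: Python's Unicode general category "Cc" stands in for ICU's "[:Control:]" set, a plain whitespace split stands in for ICU tokenization, and the function names are hypothetical. The second function (mapping control characters to a space) is only one illustrative way to keep the words apart, not necessarily what the attached patch does.

```python
import unicodedata


def remove_control(text):
    """Drop every character in Unicode general category Cc (Control).

    Assumption: this approximates the effect of the ICU rule
    "[:Control:] Any-Remove" when applied before tokenization.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cc")


def control_to_space(text):
    """Hypothetical alternative: turn each control character into a space."""
    return "".join(" " if unicodedata.category(ch) == "Cc" else ch for ch in text)


def tokenize(text):
    """Stand-in for ICU tokenization: split on runs of whitespace."""
    return text.split()


field = "First line\nSecond line"

# Line feed is removed, so "line" and "Second" are glued together.
print(tokenize(remove_control(field)))    # ['First', 'lineSecond', 'line']

# Line feed becomes a space, so "Second" stays a separate token.
print(tokenize(control_to_space(field)))  # ['First', 'line', 'Second', 'line']
```

Replacing "\n" with "\t" or "\r" in the sample field gives the same two results, which matches the last step of the test plan.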