[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

Nick Clemens changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           See Also|                            |https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35621

-- 
You are receiving this mail because:
You are watching all bug changes.
___
Koha-bugs mailing list
Koha-bugs@lists.koha-community.org
https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-bugs
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

--- Comment #10 from David Cook ---
(In reply to Katrin Fischer from comment #8)
> You have to look at the full example in the links I posted. 3 lines:
>
>
>
>
>
> So yes, but then it uses that form to remove the diacritics:
> https://www.compart.com/en/unicode/category/Mn

Ahhh right. I should've been more thorough.

I was thinking recently about how Zebra ICU has been seen as inferior to
Elasticsearch ICU on the listserv.

Looking at
ftp://ftp.software.ibm.com/software/globalization/icu/3.6/icu-3_6-userguide.pdf,
it looks like ICU actually originated in Java (ICU4J) and was later ported to
C++ and C (ICU4C).

According to
https://wiki.koha-community.org/wiki/Record_Indexing_and_Retrieval_Options_for_Koha,
the Zebra use of libicu is inferior to Lucene ICU, which uses ICU4J. There's no
evidence given for the claim, but it seems believable (especially considering
the global prominence of Solr and Elasticsearch).

Looking at https://lucene.apache.org/core/4_4_0/analyzers-icu/index.html, it
seems that some writing systems can use dictionary-based algorithms (for
systems like Thai script, Chinese, etc). That explains a lot. I know a bit of
Chinese, and I've wondered how indexers could handle such a context-dependent
language...
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

Nick Clemens changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Failed QA                   |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #9 from Nick Clemens ---
You are correct, Katrin - it looks like there was confusion about whether a
site was using ICU when we wrote these patches.

Testing on master, everything works correctly under ICU without this patch.
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

--- Comment #8 from Katrin Fischer ---
(In reply to David Cook from comment #7)
> (In reply to Katrin Fischer from comment #6)
> > It means, the change here should not be necessary... Nick, can you please
> > double check?
>
> Although wouldn't that NFD change make Žižek into something like...
> Zizek?

You have to look at the full example in the links I posted. 3 lines:

So yes, but then it uses that form to remove the diacritics:
https://www.compart.com/en/unicode/category/Mn
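
For readers following the thread, the decompose-then-strip approach discussed here can be sketched outside of Zebra. The snippet below is only an illustration in Python using the standard unicodedata module; it is not the ICU chain configuration itself, and the function name strip_diacritics is made up for this example. It assumes the chain does roughly three things: decompose to NFD, drop nonspacing marks (Unicode category Mn), and recompose.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Hypothetical helper: reduce each letter to its base letter."""
    # Step 1: canonical decomposition, e.g. "Ž" (U+017D) becomes
    # "Z" followed by the combining caron (U+030C).
    decomposed = unicodedata.normalize("NFD", text)
    # Step 2: drop every nonspacing mark (general category "Mn").
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Step 3: recompose whatever is left.
    return unicodedata.normalize("NFC", stripped)


if __name__ == "__main__":
    print(strip_diacritics("Slavoj Žižek"))  # prints: Slavoj Zizek
```

Because the rule works on the mark category rather than on individual letters, the same function would also handle other Czech and Slavic letters without a per-character rule.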
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

David Cook changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dc...@prosentient.com.au

--- Comment #7 from David Cook ---
(In reply to Katrin Fischer from comment #6)
> It means, the change here should not be necessary... Nick, can you please
> double check?

Although wouldn't that NFD change make Žižek into something like... Zizek?
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

Katrin Fischer changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|Needs Signoff               |Failed QA

--- Comment #6 from Katrin Fischer ---
It means the change here should not be necessary... Nick, can you please
double check?
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

--- Comment #5 from Katrin Fischer ---
(In reply to Katrin Fischer from comment #4)
> I wonder if adding the rules is the best way of achieving this. You can add
> a general rule for using the 'base letter'. We have been doing this I think.
> Found a hint about the rule here:
>
> https://wiki.koha-community.org/wiki/ICU_do_not_undiacritic

Also see the documentation here:
http://userguide.icu-project.org/transforms/general

And our sample files using it:
https://wiki.koha-community.org/wiki/ICU_Chains_Library

This makes it unnecessary to add transliteration rules for every
character-diacritic combination.
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

--- Comment #4 from Katrin Fischer ---
I wonder if adding the rules is the best way of achieving this. You can add a
general rule for using the 'base letter'. We have been doing this, I think.
Found a hint about the rule here:

https://wiki.koha-community.org/wiki/ICU_do_not_undiacritic
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

--- Comment #3 from Michal Denar ---
Sorry, I forgot some ... all here:

I'm ready for test :-)
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

--- Comment #2 from Michal Denar ---
Hi Nick,
can we add some other Czech and Slavic letters into ICU too?

I'm ready to test :-)
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

Michal Denar changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |blac...@gmail.com
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

Nick Clemens changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |Needs Signoff
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

--- Comment #1 from Nick Clemens ---
Created attachment 109669
  --> https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=109669&action=edit
Bug 26390: Add transliteration for Z with caron in ICU chains

Bug 26390: Add transliteration for Z with caron in ICU chains

https://en.wikipedia.org/wiki/Caron

From RT 52831. Under ICU chains, most patrons cannot search for Slavoj Žižek.

To test:
1 - Add a record with Slavoj Žižek as author
2 - Enable ICU chains
    https://wiki.koha-community.org/wiki/ICU_chains_configuration
3 - Ensure Koha is using Zebra
4 - Restart all the things and reindex
5 - Try to search for 'Zizek'
6 - Not found
7 - Apply patch
8 - Restart all the things and reindex
9 - Try to search for 'Zizek'
10 - It works!
[Koha-bugs] [Bug 26390] Add transliteration of Ž in ICU chains
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=26390

Nick Clemens changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|koha-b...@lists.koha-community.org |n...@bywatersolutions.com