http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=14759
--- Comment #15 from David Cook ---
(In reply to Galen Charlton from comment #14)
> Other way around: Text::Unaccent is not, as it would be much preferable,
> emitting Perl Unicode strings; rather, it is emitting octet-sequences.

Sorry, I must have been unclear; I meant to say that Text::Unaccent is emitting octet-sequences (which is why using decode() on the string returned by Text::Unaccent would create a Perl Unicode string). Perl itself then caused problems when it tried to build a new string by combining an octet-sequence string with a Perl Unicode string.

> A good pattern to aim for is using *only* Unicode strings within core code,
> and relegating use of Encode and friends to input and output; Text::Unaccent
> would get in the way of that.

Fair enough. I'm not in favour of Text::Unaccent per se. I was curious why it seemed to mangle some strings, and I shared what answers I found.

I suspect Unicode::Normalize will really be the way to go, as you suggest. It seems much more comprehensive than Text::Unaccent and Text::Unaccent::PurePerl. I imagine we just need feedback from people experienced in Arabic, Hebrew, and CJK languages.
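
For anyone following along, here is a minimal sketch of the two points above: what happens when an octet-sequence string is concatenated with a Perl Unicode string, and how Unicode::Normalize could be used to strip accents. The helper name strip_marks and the sample strings are hypothetical and are not taken from Koha or from any patch on this bug; this is just one possible approach, not a proposed implementation.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use Unicode::Normalize qw(NFD NFC);

# Hypothetical helper (not Koha code): decompose, drop combining marks,
# then recompose whatever is left.
sub strip_marks {
    my ($text) = @_;
    my $decomposed = NFD($text);      # e.g. U+00E9 becomes "e" + U+0301
    $decomposed =~ s/\p{Mn}+//g;      # remove nonspacing (combining) marks
    return NFC($decomposed);
}

# 1. Mixing an octet sequence with a character string.
my $chars  = "\x{E9}";                    # one character: e with acute
my $octets = encode('UTF-8', $chars);     # two octets: 0xC3 0xA9

# Perl treats each octet as its own character here, so the result holds
# three characters instead of two (the classic "mangled" string).
my $mixed = $chars . $octets;

# Decoding the octets first keeps everything as characters.
my $fixed = $chars . decode('UTF-8', $octets);

printf "mixed: %d characters, fixed: %d characters\n",
    length($mixed), length($fixed);

# 2. Accent stripping on a character string (input given as escapes so the
#    example does not depend on the source file's encoding).
my $plain = strip_marks("\x{17D}i\x{17E}ek \x{E0} la cr\x{E8}me");
print "$plain\n";
```

Note that \p{Mn} also matches Arabic vowel marks and Hebrew points, so whether stripping them is the right behaviour for those scripts is exactly the kind of thing that needs feedback from people who know the languages.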
