Edit report at https://bugs.php.net/bug.php?id=65361&edit=1
ID:                 65361
User updated by:    pascal dot chevrel at free dot fr
Reported by:        pascal dot chevrel at free dot fr
Summary:            Transliteration has uppercase problems with letter J in Serbian
-Status:            Feedback
+Status:            Open
Type:               Bug
Package:            Unicode Engine related
Operating System:   Linux
PHP Version:        5.5.1
Block user comment: N
Private report:     N

New Comment:

"but with UTF-8 source it gives the translit you expect"

That's not the case for me. Do you have an example online showing my example working? A gist on GitHub, for example.

Previous Comments:
------------------------------------------------------------------------
[2013-07-30 17:16:30] a...@php.net

Ok, then it has to be ICU itself. I was testing on Windows previously, which has ICU 50, but Ubuntu 13.04 ships with ICU 48 and I can repro what you say there. Which ICU version do you use? Most Linux distros have 48 at this time. Maybe you could try a newer ICU, even 51? But even now, from what I can see, it's unlikely to be a PHP bug.

Thanks.

------------------------------------------------------------------------
[2013-07-30 16:49:20] pascal dot chevrel at free dot fr

All my sources are in UTF-8, I rechecked with the isutf8 bash command.

------------------------------------------------------------------------
[2013-07-30 16:43:45] a...@php.net

Is your source Cyrillic string UTF-8 encoded? No idea how to encode it otherwise, but with UTF-8 source it gives the translit you expect. So that might be the key.

------------------------------------------------------------------------
[2013-07-30 14:44:15] pascal dot chevrel at free dot fr

Description:
------------
The Transliterator class does not work well when converting from Cyrillic Serbian to Latin-script Serbian. All the j letters in Cyrillic are systematically converted to uppercase J in Latin-script Serbian, while it should be lowercase j inside a word. Online conversion tools, probably also based on ICU, don't have this bug and do the conversion correctly.
I am attaching a code sample that shows that bug. I tested that the bug exists in both PHP 5.4 and 5.5.

Thanks!

Test script:
---------------
<?php
$t = Transliterator::create('Serbian-Latin/BGN');
$source = 'Најгледанији сајтови';

echo '<ul>'
   . '<li>Cyrillic source: ' . $source . '</li>'
   . '<li>Expected transliteration: Najgledaniji sajtovi</li>'
   . '<li>Actual transliteration: ' . $t->transliterate($source) . '</li>'
   . '</ul>';

Expected result:
----------------
This string: Најгледанији сајтови
Should be transliterated to: Najgledaniji sajtovi

Actual result:
--------------
But PHP transliterates it to: NaJgledaniJi saJtovi

------------------------------------------------------------------------
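
Since the thread narrows the problem down to the source encoding and the ICU version, a small diagnostic along these lines might help anyone trying to reproduce it. This is only a sketch: the helper names is_valid_utf8() and misplaced_uppercase_j() are hypothetical and not part of the original test script, and the check for a "misplaced" J (an uppercase J directly after a lowercase letter) is an assumption about how the symptom shows up. INTL_ICU_VERSION is the constant the intl extension exposes for the linked ICU version.

```php
<?php
// Diagnostic sketch. Helper names are hypothetical, not from the report.

// Returns true when $s is well-formed UTF-8 (PCRE's /u flag rejects invalid input).
function is_valid_utf8($s)
{
    return preg_match('//u', $s) === 1;
}

// Returns the byte offsets of every uppercase "J" that directly follows
// a lowercase letter, which is the symptom described in this report.
function misplaced_uppercase_j($s)
{
    preg_match_all('/(?<=\p{Ll})J/u', $s, $m, PREG_OFFSET_CAPTURE);
    return array_column($m[0], 1);
}

$source = 'Најгледанији сајтови';
echo 'Source is valid UTF-8: ', is_valid_utf8($source) ? 'yes' : 'no', PHP_EOL;

if (extension_loaded('intl')) {
    // ICU version the intl extension was built against.
    echo 'ICU version: ', INTL_ICU_VERSION, PHP_EOL;

    $t = Transliterator::create('Serbian-Latin/BGN');
    if ($t !== null) {
        $out = $t->transliterate($source);
        echo 'Transliteration: ', $out, PHP_EOL;
        echo 'Misplaced uppercase J count: ',
            count(misplaced_uppercase_j($out)), PHP_EOL;
    }
} else {
    echo 'intl extension not loaded', PHP_EOL;
}
```

Running this on both a machine with ICU 48 and one with a newer ICU, and comparing the reported version against the misplaced J count, should show whether the behaviour follows the ICU version rather than PHP itself.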