Edit report at https://bugs.php.net/bug.php?id=65361&edit=1
ID: 65361
User updated by: pascal dot chevrel at free dot fr
Reported by: pascal dot chevrel at free dot fr
Summary: Transliteration has uppercase problems with letter J in Serbian
-Status: Feedback
+Status: Open
Type: Bug
Package: Unicode Engine related
Operating System: Linux
PHP Version: 5.5.1
New Comment:
All my sources are in UTF-8; I rechecked them with the isutf8 command-line tool.
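For reference, the same check can also be made from inside PHP. This is a minimal sketch and not part of the original test script: the helper name describe_encoding is hypothetical, and it relies only on preg_match() with the "u" modifier, which fails on subjects that are not valid UTF-8.

```php
<?php
// Hypothetical helper: reports whether a string is valid UTF-8 and shows its raw bytes.
function describe_encoding($s)
{
    // With the "u" modifier, preg_match() returns false on malformed UTF-8 subjects.
    $valid = preg_match('//u', $s) === 1;
    return ($valid ? 'valid UTF-8' : 'NOT valid UTF-8') . ': ' . bin2hex($s);
}

echo describe_encoding("\xD1\x98"), "\n"; // Cyrillic je (U+0458) as UTF-8 bytes
echo describe_encoding("\xD1"), "\n";     // truncated multi-byte sequence
```

Passing $source from the test script to this helper would show the exact bytes that reach the transliterator.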
Previous Comments:
------------------------------------------------------------------------
[2013-07-30 16:43:45] [email protected]
Is your source Cyrillic string UTF-8 encoded? I have no idea how else it could
be encoded, but with a UTF-8 source it gives the transliteration you expect, so
that might be the key.
------------------------------------------------------------------------
[2013-07-30 14:44:15] pascal dot chevrel at free dot fr
Description:
------------
The Transliterator class does not work correctly when converting Serbian from
Cyrillic to Latin script. Every Cyrillic letter ј is systematically converted
to an uppercase Latin J, while it should be a lowercase j inside a word.
Online conversion tools, which are probably also based on ICU, do not have this
bug and convert the text correctly.
I am attaching a code sample that shows the bug. I tested that it exists in
both PHP 5.4 and 5.5.
Thanks!
Test script:
---------------
<?php
$t = Transliterator::create('Serbian-Latin/BGN');
$source = 'Најгледанији сајтови';
echo '<ul>'
. '<li>Cyrillic source: ' . $source . '</li>'
. '<li>Expected transliteration: Najgledaniji sajtovi</li>'
. '<li>Actual transliteration: ' . $t->transliterate($source) . '</li>'
. '</ul>';
Expected result:
----------------
This string :
Најгледанији сајтови
Should be transliterated to:
Najgledaniji sajtovi
Actual result:
--------------
But PHP transliterates it to:
NaJgledaniJi saJtovi
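
To rule out the encoding of the script file itself, the test string can be written as explicit UTF-8 byte escapes. The following is only a sketch under that assumption: the byte sequence spells "Најгледанији сајтови", the hex dump is a diagnostic aid, and nothing here claims what ICU will return.

```php
<?php
// "Најгледанији сајтови" spelled out as UTF-8 bytes, so the file encoding cannot interfere.
$source = "\xD0\x9D\xD0\xB0\xD1\x98\xD0\xB3\xD0\xBB\xD0\xB5\xD0\xB4\xD0\xB0\xD0\xBD\xD0\xB8\xD1\x98\xD0\xB8"
        . " "
        . "\xD1\x81\xD0\xB0\xD1\x98\xD1\x82\xD0\xBE\xD0\xB2\xD0\xB8";

// Show the raw bytes that are handed to the transliterator.
echo bin2hex($source), "\n";

// The intl extension may not be loaded, so guard the call.
if (class_exists('Transliterator')) {
    $t = Transliterator::create('Serbian-Latin/BGN');
    if ($t !== null) {
        echo $t->transliterate($source), "\n";
    }
}
```

If the output still contains an uppercase J with this input, the encoding of the source file can be excluded as the cause.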
------------------------------------------------------------------------