https://bugs.exim.org/show_bug.cgi?id=1894

            Bug ID: 1894
           Summary: In UTF8 Locale Russian Cyrillic [а-я] range contains
                    only 32 of 33 letters
           Product: PCRE
           Version: 8.32
          Hardware: x86-64
                OS: Linux
            Status: NEW
          Severity: bug
          Priority: medium
         Component: Code
          Assignee: p...@hermes.cam.ac.uk
          Reporter: iko...@yandex.ru
                CC: pcre-dev@exim.org

Originally found on EL7 (with pcre-8.32), but the issue seems to be more general.

The modern Russian alphabet contains 33 letters.
The standard contiguous Unicode range covers the 32 most common ones, but misses one ('ё').

Standard range:
U+0410    А
…
U+044F  я

Exceptions:
U+0401    Ё
U+0451    ё
http://www.utf8-chartable.de/unicode-utf8-table.pl?start=1024

The [а-я] range should include the letter 'ё' (and [А-Я] should include 'Ё'),
but actually it does not.
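
The gap follows from code point order: in UTF mode a PCRE range is compared by code point, and 'ё'/'Ё' lie outside U+0410..U+044F. A minimal PHP sketch that checks this, assuming the mbstring extension is available (mb_ord needs PHP 7.2 or later) and the file is saved as UTF-8. The helper name in_codepoint_range is made up for illustration:

```php
<?php
// Hypothetical helper: is $ch inside the code point range [$lo, $hi]?
function in_codepoint_range(string $ch, string $lo, string $hi): bool
{
    $cp = mb_ord($ch, 'UTF-8');
    return mb_ord($lo, 'UTF-8') <= $cp && $cp <= mb_ord($hi, 'UTF-8');
}

// Print the code points of the range boundaries and of the two exceptions.
foreach (['а', 'я', 'ё', 'А', 'Я', 'Ё'] as $ch) {
    printf("%s U+%04X\n", $ch, mb_ord($ch, 'UTF-8'));
}

var_dump(in_codepoint_range('щ', 'а', 'я')); // a letter inside the range
var_dump(in_codepoint_range('ё', 'а', 'я')); // U+0451, above U+044F
var_dump(in_codepoint_range('Ё', 'А', 'Я')); // U+0401, below U+0410
```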

Forwarded here from downstream tracker, see
https://bugs.php.net/bug.php?id=73251

$valid_string_expr = '/^[а-я]+$/u';
$str = "еще"; // assumed: the first test string is missing from the report
var_dump(preg_match($valid_string_expr, $str));
$str = "ещё";
var_dump(preg_match($valid_string_expr, $str));

The second match fails, although it should not.
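
As long as the range behaves this way, one possible workaround is to list 'ё' explicitly next to the range. This is only a sketch, not a statement about how PCRE should resolve the report. The helper name is_lower_russian is hypothetical, and the file is assumed to be saved as UTF-8:

```php
<?php
// Hypothetical helper: true if $s consists only of lower-case Russian letters.
// 'ё' is listed explicitly because it lies outside the а-я code point range.
function is_lower_russian(string $s): bool
{
    return preg_match('/^[а-яё]+$/u', $s) === 1;
}

var_dump(is_lower_russian("еще")); // word without 'ё'
var_dump(is_lower_russian("ещё")); // word with 'ё'
var_dump(is_lower_russian("abc")); // Latin letters only
```

The analogous upper-case class would be [А-ЯЁ].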
