https://bugs.exim.org/show_bug.cgi?id=1894

Petr Pisar <ppi...@redhat.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ppi...@redhat.com

--- Comment #1 from Petr Pisar <ppi...@redhat.com> ---
The [а-я] range does not mean all Cyrillic symbols. It means the Unicode characters
in the range U+0430 to U+044F. As you correctly noted, ё (U+0451) lies outside that
range, so it should not match:

$ printf '/[а-я]*/8\nещё\n' | pcretest
PCRE version 8.39 2016-06-14

  re> data>  0: \x{435}\x{449}
data> 

If you extend the range up to ё, it will match:

$ printf '/[а-ё]*/8\nещё\n' | pcretest 
PCRE version 8.39 2016-06-14

  re> data>  0: \x{435}\x{449}\x{451}
data> 
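
The same range behaviour can be reproduced outside pcretest. The following is only a sketch using Python's built-in re module (not PCRE, so it illustrates the code point ranges rather than PCRE itself); the variable names are mine, and the characters are written as escapes so the code points are unambiguous:

```python
import re

# "ещё" spelled with explicit code points: U+0435, U+0449, U+0451
text = "\u0435\u0449\u0451"

# [а-я] covers U+0430..U+044F, so the match stops before ё (U+0451)
narrow = re.match("[\u0430-\u044f]*", text).group()

# [а-ё] covers U+0430..U+0451, so ё is included
wide = re.match("[\u0430-\u0451]*", text).group()

print([f"U+{ord(c):04X}" for c in narrow])  # ['U+0435', 'U+0449']
print([f"U+{ord(c):04X}" for c in wide])    # ['U+0435', 'U+0449', 'U+0451']
```

This mirrors the two pcretest runs above: the narrow range matches two characters, the widened one matches all three.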

But there is a better way: instead of a Unicode range you can use a Unicode script
name. This is because Unicode ranges sometimes contain characters from foreign
scripts or unassigned code points:

$ printf '/\p{Cyrillic}*/8\nещё\n' | pcretest 
PCRE version 8.39 2016-06-14

  re> data>  0: \x{435}\x{449}\x{451}
data>
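
To see why a property-based test is more robust than a hand-written range, here is a rough Python sketch. Note the assumptions: Python's built-in re module has no \p{Cyrillic}, so the hypothetical helper looks_cyrillic() below approximates script membership by checking the character name prefix. That is only a stand-in for illustration, not the real Unicode Script property that PCRE uses.

```python
import unicodedata

def looks_cyrillic(ch):
    # Hypothetical helper: approximates "is in the Cyrillic script" by
    # checking the Unicode character name. Not the real Script property.
    return unicodedata.name(ch, "").startswith("CYRILLIC")

text = "\u0435\u0449\u0451"  # "ещё"
all_cyrillic = all(looks_cyrillic(c) for c in text)

# Code points that [а-ё] (U+0430..U+0451) accepts beyond [а-я] (U+0430..U+044F)
extra = [
    f"U+{cp:04X} {unicodedata.name(chr(cp))}"
    for cp in range(0x0450, 0x0452)
]

print(all_cyrillic)
print(extra)
```

The second list shows that widening the range to reach ё also admits U+0450, which sits between я and ё. A range always takes in everything between its endpoints, whereas a script or property test states the intent directly.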

-- 
You are receiving this mail because:
You are on the CC list for the bug.
-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev 
