Re: [PHP-DEV] [RFC] Desire to move RFC add_str_begin_and_end_functions to a vote

Ben Ramsey Sun, 23 Jun 2019 08:30:09 -0700

> On Jun 23, 2019, at 05:35, Rowan Collins <rowan.coll...@gmail.com> wrote:
> 
> On 22 June 2019 20:56:24 BST, Ben Ramsey <b...@benramsey.com> wrote:
>> Perhaps it would only be an issue with the case-insensitive versions,
>> as Nikita points out? If so, can someone provide some example strings
>> where an mb_starts_with_ci() would return true, while
>> str_starts_with_ci() would return false?
> 
> 
> That's easy: any character that has a lower- and uppercase form, and is not 
> represented as one byte in the target encoding. For that matter, any such 
> character in the non-ASCII section of a single-byte encoding, since a 
> non-mbstring case insensitive flag would presumably leave everything other 
> than ASCII letters untouched.
> 
> So, any non-Latin script, like Greek or Cyrillic; any accented characters, 
> unless you're lucky and they're represented by ASCII-letter plus combining 
> modifier; the Turkish "i", which if I remember rightly has three forms not 
> two; and so on.
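
To make the point above concrete, here is a minimal sketch that contrasts a byte-based, ASCII-only comparison with a multibyte-aware one. The helper names starts_with_ci_bytes() and starts_with_ci_mb() are hypothetical, chosen only for illustration; they are not the functions proposed in the RFC. It assumes the source file is UTF-8 and that the mbstring extension is loaded.

```php
<?php
// Hypothetical helper (not part of the RFC): byte-based prefix check.
// strncasecmp() compares bytes and does not fold multibyte characters.
function starts_with_ci_bytes(string $haystack, string $needle): bool
{
    return strncasecmp($haystack, $needle, strlen($needle)) === 0;
}

// Hypothetical helper (not part of the RFC): multibyte-aware prefix check
// built on the existing mb_stripos().
function starts_with_ci_mb(string $haystack, string $needle): bool
{
    return mb_stripos($haystack, $needle, 0, 'UTF-8') === 0;
}

$haystack = 'ΑΒΓΔ'; // Greek capitals, two bytes each in UTF-8
$needle   = 'αβ';   // Greek lowercase

var_dump(starts_with_ci_bytes($haystack, $needle)); // bool(false)
var_dump(starts_with_ci_mb($haystack, $needle));    // bool(true)
var_dump(starts_with_ci_bytes('ABCD', 'ab'));       // bool(true): ASCII still folds
```

The two helpers agree on ASCII input and disagree once the letters fall outside ASCII, which is the case Rowan describes.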



According to Google, “İyi akşamlar” is the Turkish phrase for “Good evening” 
(Turkish speakers, please correct me if this is wrong). However, using the 
existing mb_* functions, I can’t get mb_stripos() to return 0 when checking 
whether the string “İYI AKŞAMLAR” begins with “i̇yi”.

I’m just using UTF-8, so maybe there’s an encoding issue here?

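$string = 'İyi akşamlar';
// Uppercase the whole phrase to use as the haystack.
$upper = mb_strtoupper($string);
// Lowercase the first three characters to use as the needle.
$lowerChars = mb_strtolower(mb_substr($string, 0, 3));

var_dump($string, $upper, $lowerChars);
// I expected 0 here, i.e. a case-insensitive match at the start.
var_dump(mb_stripos($upper, $lowerChars));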
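
One way to narrow this down is to look at the code points on each side rather than the rendered text. The sketch below is only a diagnostic aid: code_points() is a hypothetical helper, and the encoding is passed explicitly as 'UTF-8' on the assumption that the source file is saved as UTF-8. Depending on the PHP version, lowercasing U+0130 may produce either a single "i" or an "i" followed by the combining dot U+0307, so no particular output is assumed for the needle.

```php
<?php
// Hypothetical helper: render a UTF-8 string as a list of U+XXXX code points.
function code_points(string $s): string
{
    $out = [];
    // The /u modifier makes preg_split() break on code points, not bytes.
    foreach (preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY) as $ch) {
        $out[] = sprintf('U+%04X', mb_ord($ch, 'UTF-8'));
    }
    return implode(' ', $out);
}

$string = 'İyi akşamlar';
$upper  = mb_strtoupper($string, 'UTF-8');
$needle = mb_strtolower(mb_substr($string, 0, 3, 'UTF-8'), 'UTF-8');

// Compare the start of the haystack with the needle, code point by code point.
echo 'haystack start: ', code_points(mb_substr($upper, 0, 3, 'UTF-8')), PHP_EOL;
echo 'needle:         ', code_points($needle), PHP_EOL;
var_dump(mb_stripos($upper, $needle, 0, 'UTF-8'));
```

If the needle turns out to contain more code points than the three characters it was built from, the mismatch comes from case mapping rather than from the encoding.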
