Hey all.

Just a short note on why I voted against the current implementation of
the str_contains functionality.

While it is mainly meant as a mere convenience function that could
also easily be implemented in userland, IMO it misses one main thing
when handling Unicode strings: normalization.

It is correct that the binary representation of the string "äöüß"
within the string "Täöüßtstring" seems to be the same and that a simple
`strpos('Täöüßtstring', 'äöüß')` does not return false.
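
To make the userland point concrete, here is a minimal sketch of such a
helper. The name `userland_str_contains` is made up for illustration and
is not part of the RFC; it compares bytes only, just like strpos.

```php
<?php
// Hypothetical userland helper (name is an assumption, not from the RFC).
// It performs a plain byte-wise search and knows nothing about Unicode.
function userland_str_contains(string $haystack, string $needle): bool
{
    // strpos() returns an int offset or false; offset 0 is a valid match,
    // so the comparison has to be strict. The empty needle is handled
    // up front because older PHP versions reject it in strpos().
    return $needle === '' || strpos($haystack, $needle) !== false;
}

var_dump(userland_str_contains('Täöüßtstring', 'äöüß')); // bool(true)
```

This works as long as haystack and needle happen to use the same byte
sequences for the same characters.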

But with Unicode, the two strings might use different normalization
forms. To the human eye the two strings look (almost) identical, but
internally their byte sequences differ (and even mb_strpos might not be
able to detect the similarity).
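
A small sketch of the mismatch, assuming ext/intl is loaded so that the
Normalizer class is available. The helper name `normalized_contains` is
made up for illustration, and normalizing both sides to NFC is just one
possible approach, not what the RFC does.

```php
<?php
// Assumption: ext/intl is installed (provides the Normalizer class).

$nfc = "\u{00E4}";   // "ä" as one precomposed code point (NFC)
$nfd = "a\u{0308}";  // "a" followed by a combining diaeresis (NFD)

$haystack = "T{$nfc}st";

// Byte-wise search fails: the needle bytes do not occur in the haystack.
var_dump(strpos($haystack, $nfd)); // bool(false)

// Hypothetical helper: bring both strings to the same normalization
// form before searching.
function normalized_contains(string $haystack, string $needle): bool
{
    $h = Normalizer::normalize($haystack, Normalizer::FORM_C);
    $n = Normalizer::normalize($needle, Normalizer::FORM_C);

    // normalize() returns false on failure, e.g. for invalid UTF-8 input.
    if ($h === false || $n === false) {
        return false;
    }

    return $n === '' || strpos($h, $n) !== false;
}

var_dump(normalized_contains($haystack, $nfd)); // bool(true)
```

Whether a built-in function should pay the cost of normalizing on every
call is a separate question; the sketch only shows that the two strings
do match once they share a normalization form.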

See https://3v4l.org/fasO4 for more information.

As we are creating new functionality here, it would have been great to
solve this issue. But as it is IMO merely a convenience add-on that can
easily be implemented in userland, I vote against it.

Cheers

Andreas

On 17.02.20 at 15:23, Rowan Tommins wrote:
> On Mon, 17 Feb 2020 at 13:38, Pierre Joye <pierre....@gmail.com> wrote:
> 
>>
>> Btw, while I mentioned some mbstring references, I do like the ICU search
>> implementation as well.
>>
>> http://userguide.icu-project.org/collation/icu-string-search-service
>>
>> It handles a lot of cases based on locales.
>>
> 
> 
> That's a lovely example of why treating Unicode as a character encoding is
> the wrong mindset.
> 
> I would love to see more people using ext/intl rather than ext/mbstring,
> and more ICU features like this being included.
> 
> Regards,
> 

-- 
                                                              ,,,
                                                             (o o)
+---------------------------------------------------------ooO-(_)-Ooo-+
| Andreas Heigl                                                       |
| mailto:andr...@heigl.org                  N 50°22'59.5" E 08°23'58" |
| http://andreas.heigl.org                       http://hei.gl/wiFKy7 |
+---------------------------------------------------------------------+
| http://hei.gl/root-ca                                               |
+---------------------------------------------------------------------+
