Re: [PHP-DEV] Multibyte strings

2022-02-11 Thread Pierre Joye
Good morning, On Sat, Feb 12, 2022, 3:47 AM Rowan Tommins wrote: > On 11/02/2022 18:42, Michał wrote: > > Considering the given example, the description from the documentation > > of strlen function: "Returns the length of the given string". > > > Which is exactly what it does. Using Unicode

Re: [PHP-DEV] Multibyte strings

2022-02-11 Thread Rowan Tommins
On 11/02/2022 18:42, Michał wrote: Considering the given example, the description from the documentation of strlen function: "Returns the length of the given string". Which is exactly what it does. Using Unicode terminology [see https://unicode.org/glossary], here are a few different things

Re: [PHP-DEV] Multibyte strings

2022-02-11 Thread Mel Dafert
On 11 February 2022 07:26:45 CET, "Michał" wrote: >Hi everyone. >It's a known fact that nowadays most websites use at least UTF-8 >encoding. Unfortunately PHP itself has stopped a bit in the previous >century. Is there any reason why the mbstring extension cannot be >introduced to core in the

Re: [PHP-DEV] Multibyte strings

2022-02-11 Thread Michał
This++. Unicode is not a static standard definition of all characters. New emoji are being added to the specification daily and while a glyph like  might look like a single "character" to a set of human eyes, and indeed in Unicode 6.0 is a single codepoint (U+1F46A), prior to Unicode 6.0

Re: [PHP-DEV] Multibyte strings

2022-02-11 Thread Michał
On 11.02.2022 at 16:41, Kirill Nesmeyanov wrote: ``` $string = 'Hell  or  world!'; echo 'Bytes: ' . \strlen($string) . "\n"; echo 'Chars: ' . \mb_strlen($string); ``` Thanks Kirill for your answer. I totally agree that stream and text functions are two different things. However, in the

Re: [PHP-DEV] Multibyte strings

2022-02-11 Thread Sara Golemon
On Fri, Feb 11, 2022 at 3:14 AM Rowan Tommins wrote: > There's also I think a myth in people's minds that something like > "string length" has a single meaning, and PHP gets it "wrong" for > multibyte strings; > This++. Unicode is not a static standard definition of all characters. New emoji

Re: [PHP-DEV] Multibyte strings

2022-02-11 Thread Sara Golemon
On Fri, Feb 11, 2022 at 12:26 AM Michał wrote: > It's a known fact that nowadays most websites use at least UTF-8 > encoding. Unfortunately PHP itself has stopped a bit in the previous > century. Is there any reason why the mbstring extension cannot be > introduced to core in the next major

Re: [PHP-DEV] Multibyte strings

2022-02-11 Thread Kirill Nesmeyanov
>Friday, 11 February 2022, 9:27 +03:00 from Michał: > >Hi everyone. >It's a known fact that nowadays most websites use at least UTF-8 >encoding. Unfortunately PHP itself has stopped a bit in the previous >century. Is there any reason why the mbstring extension cannot be >introduced to core in the

Re: [PHP-DEV] run-tests SKIPIF caching can be problematic for third-party extensions

2022-02-11 Thread Jeremy Mikola
The reply below is from Sept 24, 2021. I just realized that I inadvertently responded to Nikita directly and never shared it with the mailing list. Doing so now to close the loop on some open questions and provide more context for a PR I recently opened (https://github.com/php/php-src/pull/8076).

Re: [PHP-DEV] Multibyte strings

2022-02-11 Thread Rowan Tommins
On 11/02/2022 06:26, Michał wrote: Hi everyone. It's a known fact that nowadays most websites use at least UTF-8 encoding. Unfortunately PHP itself has stopped a bit in the previous century. Is there any reason why the mbstring extension cannot be introduced to core in the next major version