On Wed, 13 May 2026 at 19:27, Derick Rethans <[email protected]> wrote:
>
> On Tue, 12 May 2026, youkidearitai wrote:
>
> > On Fri, 16 Dec 2022 at 00:34, Derick Rethans <[email protected]> wrote:
> >
> > > I have just published an initial draft of the "Unicode Text
> > > Processing" RFC, a proposal to have performant Unicode text
> > > processing always available to PHP users, by introducing a new
> > > "Text" class.
> > >
> > > You can find it at:
> > > https://wiki.php.net/rfc/unicode_text_processing
> > >
> > > I'm looking forward to hearing your opinions, additions, and
> > > suggestions; the RFC specifically asks for these in places.
> >
> > Is this topic still open?
> > I am interested in this Text class.
> > I would be glad to work with grapheme clusters directly, as with
> > Swift's String type.
>
> I am still interested in extending this to support even more things.
> Since I wrote that draft RFC, I have added a few more features:
>
> https://github.com/derickr/php-text/commits/main/
>
> >
> > I have some ideas.
> >
> > 1. Move it into the Intl extension, for example as \Intl\Text
> >   * I think that keeps the implementation simple.
>
> I don't agree with this, as although it builds on top of ICU like the
> classes in the Intl extension, it isn't following ICU's API style at
> all.
>
> It is meant to be a much more opinionated API that does the simple 80%
> case well.
>
> > 2. Only widen the grapheme_* functions to accept Text, such as string|Text.
> >    * This makes the implementation somewhat more complex, but keeps
> >      userland simple.
>
> I am not too sure about this. The grapheme_* functions closely match
> ICU's internal, and powerful, API. If you want them to accept a Text
> object too, that means these grapheme_* functions' signatures need to be
> overloaded.
>
> For example:
>
> grapheme_strstr(string $haystack, string $needle,
>     bool $beforeNeedle = false, string $locale = ""): string|false
>
> would need to change into:
>
> grapheme_strstr(string|Text $haystack, string|Text $needle,
>     bool $beforeNeedle = false, string $locale = ""): string|false
>
> And then '$locale' makes no sense, as this is already part of each of
> the Text objects themselves.
>
> Instead, the 'contains' method on the Text object already does something
> very similar:
>
> https://github.com/derickr/php-text/blob/main/tests/text-contains.phpt
>
> I think the grapheme functions should stay as they are, and additional
> methods can be added on the Text class, where there is currently
> functionality missing that the grapheme_* functions already support.
>
> The RFC document also already lists more functions than I have
> implemented so far.
>
> > 3. If UTF-8 validation fails, throw an exception
>
> It already does that, see this test case:
> https://github.com/derickr/php-text/blob/main/tests/text-in-out-basic.phpt#L13
> — although the exception message itself could be improved.
>
> > A __toString method that returns a string seems good.
> > Please consider this.
>
> This is already implemented too:
> https://github.com/derickr/php-text/blob/main/text.c#L323
>
> cheers,
> Derick
>
> --
> https://derickrethans.nl | https://xdebug.org | https://dram.io
>
> Author of Xdebug. Like it? Consider supporting me: https://xdebug.org/support
>
> mastodon: @[email protected] @[email protected]

Thanks, Derick.

I confirmed that almost all of this is already implemented.
Indeed, the grapheme_* functions already take a `$locale` parameter, and
that would conflict with `Text::$locale`.
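
To make the conflict concrete, here is a minimal sketch. `SketchText` and
`sketch_competing_locales()` are hypothetical names of my own, not part of
php-text or the RFC, and the locale values are only illustrative. The class is
a plain userland stand-in that carries a locale, rejects invalid UTF-8, and
implements `__toString`; it does no grapheme-aware processing.

```php
<?php
declare(strict_types=1);

// Hypothetical userland stand-in for the proposed Text class.
// This is NOT the php-text API; it only carries a value and a locale.
final class SketchText
{
    public function __construct(
        private string $value,
        public readonly string $locale = 'root',
    ) {
        // preg_match() with the /u modifier fails on malformed UTF-8.
        if (preg_match('//u', $value) !== 1) {
            throw new InvalidArgumentException('Input is not valid UTF-8');
        }
    }

    public function __toString(): string
    {
        return $this->value;
    }
}

// Hypothetical overloaded signature, mirroring the string|Text idea.
// It only reports which locales compete for a single call.
function sketch_competing_locales(
    string|SketchText $haystack,
    string|SketchText $needle,
    string $locale = '',
): array {
    $locales = [];
    if ($locale !== '') {
        $locales['argument'] = $locale;
    }
    if ($haystack instanceof SketchText) {
        $locales['haystack'] = $haystack->locale;
    }
    if ($needle instanceof SketchText) {
        $locales['needle'] = $needle->locale;
    }
    return $locales;
}

$haystack = new SketchText('straße', 'de_DE');
$needle   = new SketchText('SS', 'tr_TR');

// Up to three locales can apply to one call, with no obvious precedence.
$competing = sketch_competing_locales($haystack, $needle, 'en_US');
print_r($competing);
```

With an overloaded signature, some rule would have to decide which of these
locales wins. Keeping the grapheme_* functions string-only, as Derick
suggests, avoids having to define that precedence.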

Regards
Yuya

-- 
---------------------------
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
-----------------------------
