On Sat, Oct 5, 2024 at 1:20, Tim Düsterhus <t...@bastelstu.be> wrote:
>
> Hi
>
> On 2024-09-25 09:21, youkidearitai wrote:
> > I tried to implement an mb_levenshtein function and created an RFC.
> > https://wiki.php.net/rfc/mb_levenshtein
> > https://github.com/php/php-src/pull/16043
> >
> > I would like to discuss it, so feel free to comment.
>
> Thank you for your RFC. I share the concern raised by cmb in the PR
> discussion:
> https://github.com/php/php-src/pull/16043#issuecomment-2374574538
>
> Generally working with codepoints is going to be confusing for a user,
> but sometimes it is necessary when dealing with external systems that
> themselves work with codepoints (MySQL comes to my mind). However
> calculating the Levenshtein distance is most certainly something that
> purely is "user-facing" and not constrained by external systems.
> Calculating the distance of codepoints is going to be extremely
> confusing when dealing with things like Emoji. It would probably be best
> to either only offer a `grapheme_*` function here or to leave this fully
> to userland.
>
> Best regards
> Tim Düsterhus
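To make the codepoint-versus-grapheme concern concrete, here is a minimal sketch in Python, chosen only for illustration; it is not the proposed PHP implementation. The helper `rough_clusters` is a hypothetical, deliberately simplified approximation of grapheme clusters (it only glues combining marks, skin-tone modifiers, and zero-width-joiner sequences onto the preceding character) and does not implement the full Unicode segmentation rules that a real `grapheme_*` function would get from ICU.

```python
import unicodedata

ZWJ = "\u200d"  # zero width joiner: glues several emoji into one visible symbol


def rough_clusters(text):
    """Split text into APPROXIMATE grapheme clusters (assumption: not full UAX #29)."""
    clusters = []
    for ch in text:
        is_mark = unicodedata.category(ch) in ("Mn", "Me", "Mc")
        is_skin_tone = "\U0001F3FB" <= ch <= "\U0001F3FF"
        joins_previous = bool(clusters) and (
            is_mark or is_skin_tone or ch == ZWJ or clusters[-1].endswith(ZWJ)
        )
        if joins_previous:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def levenshtein(a, b):
    """Classic dynamic-programming edit distance over any two sequences."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# One visible "family" emoji: man + ZWJ + woman + ZWJ + girl (5 codepoints).
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467"
man = "\U0001F468"  # a single codepoint

# Python strings iterate by codepoint, so this is the codepoint-level distance.
print(levenshtein(family, man))                                  # 4
# Comparing what the user sees as one symbol each gives a single substitution.
print(levenshtein(rough_clusters(family), rough_clusters(man)))  # 1
```

A user who sees one symbol replaced by another would likely expect a distance of 1, which is what the cluster-based comparison reports; the codepoint-based comparison reports 4 for the same visible change.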
Hi Tim,

Thank you for your response.

I have been thinking about what users actually want from a Levenshtein distance.
I agree that the Levenshtein distance should be measured in terms of grapheme clusters.
Most userland code is based on UTF-8, so moving this to a grapheme function makes sense.

I will think more about the use cases for Levenshtein distance, but I will probably go with a grapheme function.

Thanks
Yuya

-- 
---------------------------
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
-----------------------------