okadakk commented on a change in pull request #11298:
URL: https://github.com/apache/arrow/pull/11298#discussion_r720815471
##########
File path: cpp/src/arrow/compute/kernels/scalar_string.cc
##########
@@ -4299,6 +4390,10 @@ void RegisterScalarStringAscii(FunctionRegistry* registry) {
&utf8_ltrim_whitespace_doc);
MakeUnaryStringBatchKernel<UTF8RTrimWhitespace>("utf8_rtrim_whitespace", registry,
&utf8_rtrim_whitespace_doc);
+ MakeUnaryStringBatchKernel<Utf8Nfd>("utf8_nfd", registry, &utf8_nfd_doc);
+ MakeUnaryStringBatchKernel<Utf8Nfkd>("utf8_nfkd", registry, &utf8_nfkd_doc);
+ MakeUnaryStringBatchKernel<Utf8Nfc>("utf8_nfc", registry, &utf8_nfc_doc);
+ MakeUnaryStringBatchKernel<Utf8Nfkc>("utf8_nfkc", registry, &utf8_nfkc_doc);
Review comment:
Thank you!
I based this on the utf8proc code linked below:
https://github.com/JuliaStrings/utf8proc/blob/master/utf8proc.c#L759-L785
That code provides four separate functions, one per normalization form.
I think four kernels without options are easier for users to call.
On the other hand, a single kernel that takes the normalization form as an
option would give users more control.
I'm not sure which approach is better. If you have any advice, I would
appreciate it.
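
To make the trade-off concrete, here is a minimal sketch of what the
option-based alternative could look like. Every name in it
(`NormalizationForm`, `Utf8NormalizeOptions`, `StepsFor`,
`FormFromKernelName`) is hypothetical and is not an existing Arrow or
utf8proc API. It does not normalize anything; it only shows how one option
value could select among the four forms, and how the four kernel names in
this PR could map onto that option.

```cpp
#include <string>

// Hypothetical names throughout: none of these are Arrow or utf8proc APIs.
enum class NormalizationForm { NFC, NFKC, NFD, NFKD };

// Hypothetical options struct for a single "normalize" kernel.
struct Utf8NormalizeOptions {
  NormalizationForm form = NormalizationForm::NFC;
};

// The two independent choices that distinguish the four Unicode forms.
struct NormalizationSteps {
  bool compatibility;  // "K" forms use compatibility decomposition
  bool compose;        // "C" forms recompose after decomposing
};

// Map a form to the steps a real implementation would have to perform.
inline NormalizationSteps StepsFor(NormalizationForm form) {
  switch (form) {
    case NormalizationForm::NFC:
      return {false, true};
    case NormalizationForm::NFKC:
      return {true, true};
    case NormalizationForm::NFD:
      return {false, false};
    case NormalizationForm::NFKD:
      return {true, false};
  }
  return {false, true};  // unreachable for valid enum values
}

// Show how the four kernel names from this PR could become thin aliases
// over the single option. Returns false for an unknown name.
inline bool FormFromKernelName(const std::string& name, NormalizationForm* out) {
  if (name == "utf8_nfc") { *out = NormalizationForm::NFC; return true; }
  if (name == "utf8_nfkc") { *out = NormalizationForm::NFKC; return true; }
  if (name == "utf8_nfd") { *out = NormalizationForm::NFD; return true; }
  if (name == "utf8_nfkd") { *out = NormalizationForm::NFKD; return true; }
  return false;
}
```

With a shape like this, the four option-free kernels and the single
option-taking kernel are not mutually exclusive: the named kernels could
simply forward to the option-based one with a fixed form.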