projjal commented on a change in pull request #10604:
URL: https://github.com/apache/arrow/pull/10604#discussion_r669010300
##########
File path: cpp/src/gandiva/gdv_function_stubs.cc
##########
@@ -635,30 +638,31 @@ const char* gdv_fn_initcap_utf8(int64_t context, const char* data, int32_t data_
int32_t out_char_len = 0;
int32_t out_idx = 0;
uint32_t char_codepoint;
+
+ // Any character is considered as space, except if it is alphanumeric
bool last_char_was_space = true;
for (int32_t i = 0; i < data_len; i += char_len) {
char_len = gdv_fn_utf8_char_length(data[i]);
- // For single byte characters:
- // If it is a lowercase ASCII character, set the output to its corresponding uppercase
- // character; else, set the output to the read character
+ // An optimization for single byte characters:
if (char_len == 1) {
Review comment:
Calculate char_len only in the non-ASCII case. For ASCII, just set
char_len = 1 for iterating, so the length lookup is skipped on the
single-byte path.
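A minimal sketch of what this suggestion could look like, not the actual Gandiva code. It assumes a byte below 0x80 is a complete ASCII character, so the loop can set char_len = 1 directly and only compute the length for multi-byte sequences. `utf8_char_length` and `initcap_sketch` are hypothetical stand-ins (the former for `gdv_fn_utf8_char_length`). As a further simplification, multi-byte characters are copied unchanged and treated as word characters, and letters after the first in a word are lowercased.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for gdv_fn_utf8_char_length: byte length of a UTF-8
// sequence judged from its lead byte. Returns 1 for an invalid lead byte so
// that the caller's loop always advances.
static int32_t utf8_char_length(unsigned char c) {
  if (c < 0x80) return 1;
  if ((c & 0xE0) == 0xC0) return 2;
  if ((c & 0xF0) == 0xE0) return 3;
  if ((c & 0xF8) == 0xF0) return 4;
  return 1;
}

// Simplified initcap illustrating the loop shape only.
std::string initcap_sketch(const char* data, int32_t data_len) {
  assert(data_len >= 0 && (data != nullptr || data_len == 0));
  std::string out;
  out.reserve(static_cast<std::size_t>(data_len));

  bool last_char_was_space = true;
  int32_t char_len = 1;
  for (int32_t i = 0; i < data_len; i += char_len) {
    unsigned char c = static_cast<unsigned char>(data[i]);

    if (c < 0x80) {
      // ASCII fast path: no length lookup, just step one byte.
      char_len = 1;
      bool is_lower = c >= 'a' && c <= 'z';
      bool is_upper = c >= 'A' && c <= 'Z';
      bool is_digit = c >= '0' && c <= '9';
      if (last_char_was_space && is_lower) {
        c = static_cast<unsigned char>(c - 32);  // start of word: uppercase
      } else if (!last_char_was_space && is_upper) {
        c = static_cast<unsigned char>(c + 32);  // inside word: lowercase
      }
      out.push_back(static_cast<char>(c));
      // Any non-alphanumeric character counts as a separator.
      last_char_was_space = !(is_lower || is_upper || is_digit);
      continue;
    }

    // Non-ASCII: only here is the character length computed.
    char_len = utf8_char_length(c);
    if (i + char_len > data_len) char_len = data_len - i;  // truncated input
    out.append(data + i, static_cast<std::size_t>(char_len));
    last_char_was_space = false;  // assumption: treated as a word character
  }
  return out;
}
```

With this shape, purely ASCII input never reaches the length helper, which seems to be the point of the comment.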