tlm365 commented on code in PR #13691: URL: https://github.com/apache/datafusion/pull/13691#discussion_r1877208207
##########
datafusion/functions/src/string/initcap.rs:
##########
@@ -132,21 +132,22 @@ fn initcap_utf8view(args: &[ArrayRef]) -> Result<ArrayRef> {
     Ok(Arc::new(result) as ArrayRef)
 }
 
-fn initcap_string(string: Option<&str>) -> Option<String> {
-    let mut char_vector = Vec::<char>::new();
-    string.map(|string: &str| {
-        char_vector.clear();
-        let mut previous_character_letter_or_number = false;
-        for c in string.chars() {
-            if previous_character_letter_or_number {
-                char_vector.push(c.to_ascii_lowercase());
+fn initcap_string(input: Option<&str>) -> Option<String> {
+    input.map(|s| {
+        let mut result = String::with_capacity(s.len());
+        let mut prev_is_alphanumeric = false;
+
+        for c in s.chars() {
+            let transformed = if prev_is_alphanumeric {
+                c.to_ascii_lowercase()
             } else {
-                char_vector.push(c.to_ascii_uppercase());
-            }
-            previous_character_letter_or_number =
-                c.is_ascii_uppercase() || c.is_ascii_lowercase() || c.is_ascii_digit();
+                c.to_ascii_uppercase()
+            };
+            result.push(transformed);
+            prev_is_alphanumeric = c.is_ascii_alphanumeric();
         }

Review Comment:
> I think that's not correct and/or possible with utf8?
> Seems this answer (last snippet) https://stackoverflow.com/a/38406885 might be close?

@Dandandan you're right. I tested it with some unit tests for special (Unicode) characters: neither this PR's implementation nor the old one handles them correctly. But I think that is acceptable for now, since the `initcap` function is in `datafusion::function::string`, not `datafusion::function::unicode`. Perhaps it would be better to update the documentation to say that `initcap` does not support Unicode characters yet, and to handle them in another PR if necessary.
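For reference, here is a rough sketch of what a Unicode-aware variant might look like, next to the ASCII-only behavior discussed above. This is not code from the PR: `initcap_unicode` and `initcap_ascii` are hypothetical names used only for this illustration, and the sketch assumes that "word character" means `char::is_alphanumeric`. It also relies on `char::to_uppercase`, which is not true titlecasing and can expand one character into several (for example `ß` becomes `SS`), so it should be treated as a starting point rather than a finished implementation.

```rust
/// Hypothetical Unicode-aware variant (illustration only, not the PR's code).
fn initcap_unicode(input: Option<&str>) -> Option<String> {
    input.map(|s| {
        let mut result = String::with_capacity(s.len());
        let mut prev_is_alphanumeric = false;

        for c in s.chars() {
            if prev_is_alphanumeric {
                // `to_lowercase` returns an iterator because a single char
                // may map to more than one char.
                result.extend(c.to_lowercase());
            } else {
                result.extend(c.to_uppercase());
            }
            // Unicode-aware check instead of `is_ascii_alphanumeric`.
            prev_is_alphanumeric = c.is_alphanumeric();
        }
        result
    })
}

/// ASCII-only behavior, mirroring the logic in the diff above, for comparison.
fn initcap_ascii(input: Option<&str>) -> Option<String> {
    input.map(|s| {
        let mut result = String::with_capacity(s.len());
        let mut prev_is_alphanumeric = false;

        for c in s.chars() {
            let transformed = if prev_is_alphanumeric {
                c.to_ascii_lowercase()
            } else {
                c.to_ascii_uppercase()
            };
            result.push(transformed);
            // Non-ASCII letters such as 'é' are not counted as word characters here.
            prev_is_alphanumeric = c.is_ascii_alphanumeric();
        }
        result
    })
}
```

With the input `"élan vital"`, the ASCII-only version returns `"éLan Vital"` (the `é` is left unchanged and is not treated as part of the word), while the Unicode-aware sketch returns `"Élan Vital"`.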
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org