tlm365 commented on code in PR #13691:
URL: https://github.com/apache/datafusion/pull/13691#discussion_r1877208207


##########
datafusion/functions/src/string/initcap.rs:
##########
@@ -132,21 +132,22 @@ fn initcap_utf8view(args: &[ArrayRef]) -> Result<ArrayRef> {
     Ok(Arc::new(result) as ArrayRef)
 }
 
-fn initcap_string(string: Option<&str>) -> Option<String> {
-    let mut char_vector = Vec::<char>::new();
-    string.map(|string: &str| {
-        char_vector.clear();
-        let mut previous_character_letter_or_number = false;
-        for c in string.chars() {
-            if previous_character_letter_or_number {
-                char_vector.push(c.to_ascii_lowercase());
+fn initcap_string(input: Option<&str>) -> Option<String> {
+    input.map(|s| {
+        let mut result = String::with_capacity(s.len());
+        let mut prev_is_alphanumeric = false;
+
+        for c in s.chars() {
+            let transformed = if prev_is_alphanumeric {
+                c.to_ascii_lowercase()
             } else {
-                char_vector.push(c.to_ascii_uppercase());
-            }
-            previous_character_letter_or_number =
-                c.is_ascii_uppercase() || c.is_ascii_lowercase() || c.is_ascii_digit();
+                c.to_ascii_uppercase()
+            };
+            result.push(transformed);
+            prev_is_alphanumeric = c.is_ascii_alphanumeric();
         }

Review Comment:
   > I think that's not correct and/or possible with utf8?
   > Seems this answer (last snippet) https://stackoverflow.com/a/38406885 might be close?
   
   @Dandandan you're right. I tested it with some unit tests containing non-ASCII (Unicode) characters: neither this PR's implementation nor the old implementation handles them correctly. But I think that is acceptable at the moment, since the `initcap` function is in `datafusion::function::string`, not `datafusion::function::unicode`.
   
   Perhaps it would be better to update the documentation to state that `initcap` does not support Unicode characters yet, and to handle them in another PR if necessary.
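
   To illustrate the limitation discussed above, here is a minimal sketch comparing the ASCII-only logic from the diff with a Unicode-aware variant. `initcap_ascii` and `initcap_unicode` are hypothetical names used only for this illustration and are not part of the PR. The Unicode variant is an assumption about what a follow-up could look like, built on the standard library's `char::to_uppercase`, `char::to_lowercase`, and `char::is_alphanumeric`.

```rust
/// ASCII-only capitalization, mirroring the loop in the diff above.
/// Non-ASCII characters pass through unchanged and count as word separators.
fn initcap_ascii(s: &str) -> String {
    let mut result = String::with_capacity(s.len());
    let mut prev_is_alphanumeric = false;
    for c in s.chars() {
        result.push(if prev_is_alphanumeric {
            c.to_ascii_lowercase()
        } else {
            c.to_ascii_uppercase()
        });
        prev_is_alphanumeric = c.is_ascii_alphanumeric();
    }
    result
}

/// Hypothetical Unicode-aware variant (an assumption, not the PR's code).
/// `to_uppercase`/`to_lowercase` return iterators because one character
/// can map to several, so the result is extended rather than pushed.
fn initcap_unicode(s: &str) -> String {
    let mut result = String::with_capacity(s.len());
    let mut prev_is_alphanumeric = false;
    for c in s.chars() {
        if prev_is_alphanumeric {
            result.extend(c.to_lowercase());
        } else {
            result.extend(c.to_uppercase());
        }
        prev_is_alphanumeric = c.is_alphanumeric();
    }
    result
}

fn main() {
    // 'é' is not ASCII, so the ASCII version leaves it as-is and then
    // treats the following 'l' as the start of a new word.
    println!("{}", initcap_ascii("élan vital")); // prints "éLan Vital"
    println!("{}", initcap_unicode("élan vital")); // prints "Élan Vital"
}
```

   For pure-ASCII input the two functions agree, so the difference only shows up once a non-ASCII letter appears in the string.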


