jpedroantunes commented on a change in pull request #10173:
URL: https://github.com/apache/arrow/pull/10173#discussion_r623128543



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -1422,6 +1422,239 @@ const char* replace_utf8_utf8_utf8(gdv_int64 context, const char* text,
                                              out_len);
 }
 
+FORCE_INLINE
+const char* lpad(gdv_int64 context, const char* text, gdv_int32 text_len,
+                 gdv_int32 return_length, const char* fill_text, gdv_int32 fill_text_len,
+                 gdv_int32* out_len) {
+  // if the text length or the defined return length (number of characters to return)
+  // is <=0, then return an empty string.
+  if (text_len == 0 || return_length <= 0) {
+    *out_len = 0;
+    return "";
+  }
+
+  // initially counts the number of utf8 characters in the defined text and fill_text
+  int32_t text_char_count = utf8_length(context, text, text_len);
+  int32_t fill_char_count = utf8_length(context, fill_text, fill_text_len);
+  // text_char_count is zero if input has invalid utf8 char
+  // fill_char_count is zero if fill_text_len is > 0 and its value has invalid utf8 char
+  if (text_char_count == 0 || (fill_text_len > 0 && fill_char_count == 0)) {

Review comment:
       Waiting for confirmation on whether we should really implement this
behavior of ignoring invalid characters.
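
To make the open question concrete, here is a minimal standalone sketch of one possible policy: treat malformed UTF-8 in either argument as "no result" and return an empty string. It is not the code in this PR. `count_utf8_chars`, `byte_offset_of`, and `lpad_sketch` are hypothetical names, the validation only checks lead/continuation byte structure, and returning the text unchanged when the fill is empty is an assumption made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: number of UTF-8 characters in s, or -1 if s is malformed.
// Only lead/continuation byte structure is checked (no overlong or surrogate checks).
int32_t count_utf8_chars(const std::string& s) {
  int32_t count = 0;
  std::size_t i = 0;
  while (i < s.size()) {
    unsigned char c = static_cast<unsigned char>(s[i]);
    std::size_t len;
    if (c < 0x80) len = 1;
    else if ((c & 0xE0) == 0xC0) len = 2;
    else if ((c & 0xF0) == 0xE0) len = 3;
    else if ((c & 0xF8) == 0xF0) len = 4;
    else return -1;                     // stray continuation or invalid lead byte
    if (i + len > s.size()) return -1;  // truncated sequence
    for (std::size_t k = 1; k < len; ++k) {
      if ((static_cast<unsigned char>(s[i + k]) & 0xC0) != 0x80) return -1;
    }
    i += len;
    ++count;
  }
  return count;
}

// Byte offset just past the first n characters; assumes s is valid UTF-8.
std::size_t byte_offset_of(const std::string& s, int32_t n) {
  std::size_t i = 0;
  while (n > 0 && i < s.size()) {
    ++i;  // step over the lead byte, then over its continuation bytes
    while (i < s.size() && (static_cast<unsigned char>(s[i]) & 0xC0) == 0x80) ++i;
    --n;
  }
  return i;
}

// Sketch of lpad where invalid UTF-8 in text or fill yields an empty result.
std::string lpad_sketch(const std::string& text, int32_t return_length,
                        const std::string& fill) {
  if (text.empty() || return_length <= 0) return "";
  int32_t text_chars = count_utf8_chars(text);
  int32_t fill_chars = count_utf8_chars(fill);
  // Policy under discussion: malformed input produces an empty string.
  if (text_chars < 0 || fill_chars < 0) return "";
  // Requested length fits inside the text: truncate on a character boundary.
  if (return_length <= text_chars) {
    return text.substr(0, byte_offset_of(text, return_length));
  }
  // Assumption: with nothing to pad with, return the text unchanged.
  if (fill_chars == 0) return text;
  std::string out;
  int32_t needed = return_length - text_chars;
  while (needed >= fill_chars) {  // whole copies of the fill text
    out += fill;
    needed -= fill_chars;
  }
  out += fill.substr(0, byte_offset_of(fill, needed));  // partial last copy
  out += text;
  return out;
}
```

Under this policy `lpad_sketch("abc", 6, "xy")` pads on whole-character boundaries, while any malformed byte in `text` or `fill` collapses the result to `""`. The alternative raised above (skipping invalid bytes instead) would change only the two `-1` branches.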




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

