sagnikc-dremio commented on a change in pull request #7641:
URL: https://github.com/apache/arrow/pull/7641#discussion_r459212543



##########
File path: cpp/src/gandiva/precompiled/string_ops.cc
##########
@@ -305,21 +342,104 @@ const char* trim_utf8(gdv_int64 context, const char* data, gdv_int32 data_len,
     --end;
   }
 
-  // string with no leading/trailing spaces, return original string
-  if (start == 0 && end == data_len - 1) {
-    *out_len = data_len;
-    return data;
+  // string has some leading/trailing spaces and some non-space characters
+  *out_len = end - start + 1;
+  return data + start;
+}
+
+// Trims characters present in the trim text from the left end of the base text
+FORCE_INLINE
+const char* ltrim_utf8_utf8(gdv_int64 context, const char* basetext,

Review comment:
       Added logic that detects an invalid byte or an incomplete multibyte
character and raises an error, as discussed offline. Also added unit tests
for this behavior.
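
For readers without the full diff, here is a minimal sketch of the kind of check described above. It is not the code from this pull request: the names Utf8CharLength and ValidateUtf8 are hypothetical, and the error is reported through a std::string rather than through the Gandiva execution context. It only checks lead bytes, truncation, and continuation bytes; it does not reject overlong encodings or surrogate code points.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not from the PR): number of bytes in a UTF-8
// character, derived from its lead byte. Returns 0 if the byte cannot
// start a character (a stray continuation byte or an invalid lead byte).
int Utf8CharLength(unsigned char lead) {
  if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
  if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
  if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
  if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
  return 0;
}

// Hypothetical validator (not from the PR): returns true if `data` holds
// only well-formed characters by the checks above. On failure, returns
// false and describes the problem in `*error`.
bool ValidateUtf8(const char* data, int32_t len, std::string* error) {
  int32_t i = 0;
  while (i < len) {
    const int char_len = Utf8CharLength(static_cast<unsigned char>(data[i]));
    if (char_len == 0) {
      *error = "invalid byte at position " + std::to_string(i);
      return false;
    }
    if (i + char_len > len) {
      // The lead byte promises more bytes than the buffer contains.
      *error = "incomplete multibyte character at position " + std::to_string(i);
      return false;
    }
    for (int j = 1; j < char_len; ++j) {
      // Every continuation byte must look like 10xxxxxx.
      if ((static_cast<unsigned char>(data[i + j]) & 0xC0) != 0x80) {
        *error = "invalid byte at position " + std::to_string(i + j);
        return false;
      }
    }
    i += char_len;
  }
  return true;
}
```

A trim function could run a check like this on both the base text and the trim text before comparing characters, so that malformed input produces an error instead of a silently wrong result.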




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

