zeroshade commented on code in PR #35161:
URL: https://github.com/apache/arrow/pull/35161#discussion_r1171445177


##########
go/arrow/array/compare.go:
##########
@@ -630,6 +631,34 @@ func validityBitmapEqual(left, right arrow.Array) bool {
        return true
 }
 
+func stripNulls(s string) string {
+       return strings.ReplaceAll(s, "\x00", "")
+}
+

Review Comment:
   For performance reasons, we currently don't validate UTF-8 strings on 
Append. It's up to the producer to ensure that they pass valid UTF-8 strings 
when constructing the array (if the data isn't valid UTF-8, they should be 
using a Binary array). 
   
   On my to-do list is a "Validate" method for each array type, like the one 
the C++ library has. That "Validate" method would run the UTF-8 validity check 
on the buffer, so that a consumer can choose *when* to take the performance 
hit of validating the UTF-8.
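   
   A minimal sketch of what such a consumer-side check could look like, using 
only the standard library. Everything named here is hypothetical: 
`stringColumn` is a stand-in interface for a String array's read accessors, 
and `firstInvalidUTF8` and `sliceColumn` are illustration names, not part of 
the Arrow Go API or of this PR.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// stringColumn is a hypothetical stand-in for the read-only accessors of a
// String array (length, null check, value lookup). It is not an Arrow type.
type stringColumn interface {
	Len() int
	IsNull(i int) bool
	Value(i int) string
}

// firstInvalidUTF8 returns the index of the first non-null value that is not
// valid UTF-8, or -1 if every non-null value is valid. The cost is paid only
// when the caller chooses to run the check, not on every append.
func firstInvalidUTF8(col stringColumn) int {
	for i := 0; i < col.Len(); i++ {
		if col.IsNull(i) {
			continue // null slots carry no string data to validate
		}
		if !utf8.ValidString(col.Value(i)) {
			return i
		}
	}
	return -1
}

// sliceColumn is a toy in-memory column used only to exercise the check.
// A nil entry represents a null slot.
type sliceColumn []*string

func (c sliceColumn) Len() int           { return len(c) }
func (c sliceColumn) IsNull(i int) bool  { return c[i] == nil }
func (c sliceColumn) Value(i int) string { return *c[i] }

// str returns a pointer to s so literals can be placed in a sliceColumn.
func str(s string) *string { return &s }

func main() {
	col := sliceColumn{str("ok"), nil, str("bad\xff"), str("nul\x00byte")}
	fmt.Println(firstInvalidUTF8(col)) // index of the "bad\xff" entry
}
```

   Note that an embedded `\x00` byte is itself valid UTF-8 (it encodes 
U+0000), so a check like this would accept it; only byte sequences such as 
`\xff` are rejected.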



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
