zeroshade commented on code in PR #35161:
URL: https://github.com/apache/arrow/pull/35161#discussion_r1168824967
##########
go/arrow/array/binarybuilder.go:
##########
@@ -319,7 +320,10 @@ func (b *BinaryBuilder) UnmarshalOne(dec *json.Decoder)
error {
case string:
data, err := base64.StdEncoding.DecodeString(v)
if err != nil {
- return err
+ data, err = hex.DecodeString(v)
+ if err != nil {
+ return err
+ }
Review Comment:
this seems odd to add here. Should we also add this to
`AppendValueFromString` and update the docs that you can also use hex to
specify binary strings?
##########
go/arrow/array/compare.go:
##########
@@ -630,6 +631,34 @@ func validityBitmapEqual(left, right arrow.Array) bool {
return true
}
+func stripNulls(s string) string {
+ return strings.ReplaceAll(s, "\x00", "")
+}
+
Review Comment:
In the Arrow spec a String array should be all valid utf8. So shouldn't this
only be for trimming trailing Nulls? (ie: use `strings.TrimRight` instead of
`strings.ReplaceAll` )?
To this end, can you have a test with a string that contains a null inside
it that *isn't* at the end and ensure that this approxequal only trims nulls,
and doesn't strip them from within the string?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]