rip-nsk commented on PR #36814: URL: https://github.com/apache/arrow/pull/36814#issuecomment-1648874404
> Hmmm, for example, if you're using int32 and uint32, the encoded bytes might be the same, but the handling method would be different. As for strings, they might have different collations. So just comparing the binary usually doesn't give the right result.

Just to confirm: say we have a column chunk with values V1..Vn. I assume the statistics' min is <= every one of V1..Vn and, correspondingly, max is >= every one of V1..Vn.

If I understand correctly, this can be guaranteed only for min_value/max_value, and only if the sort order is not UNDEFINED. If so, silently returning min/max in other cases may be wrong.
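To make the concern concrete, here is a minimal sketch in plain Python (not the Arrow or Parquet API; `decode` and `bounds_hold` are hypothetical helpers made up for illustration). It assumes 4-byte little-endian values and shows that a min/max computed under one ordering need not bound the same bytes read under another:

```python
import struct


def decode(raw: bytes, signed: bool) -> int:
    """Interpret 4 little-endian bytes as int32 (signed) or uint32 (unsigned)."""
    return int.from_bytes(raw, byteorder="little", signed=signed)


def bounds_hold(values, lo, hi) -> bool:
    """Check the invariant lo <= v <= hi for every value."""
    return all(lo <= v <= hi for v in values)


# Three values encoded once; the bytes are identical for both interpretations.
raw_values = [struct.pack("<i", v) for v in (-1, 0, 1)]

signed_vals = [decode(r, signed=True) for r in raw_values]
unsigned_vals = [decode(r, signed=False) for r in raw_values]

# Statistics computed with signed ordering.
signed_min, signed_max = min(signed_vals), max(signed_vals)

# "Statistics" computed by comparing the raw bytes lexicographically.
bytes_min, bytes_max = min(raw_values), max(raw_values)

# The signed min/max bound the signed values, but not the unsigned ones.
print(bounds_hold(signed_vals, signed_min, signed_max))
print(bounds_hold(unsigned_vals, signed_min, signed_max))

# The byte-wise "max" is actually the smallest value under signed ordering.
print(decode(bytes_max, signed=True) == signed_min)
```

So a reader that picks the wrong ordering, or compares raw bytes, could get a "max" that is smaller than some value in the chunk, which is why I think returning such statistics silently is risky.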
