markjschreiber commented on code in PR #15106:
URL: https://github.com/apache/arrow/pull/15106#discussion_r1059169767


##########
java/algorithm/src/main/java/org/apache/arrow/algorithm/sort/VectorValueComparator.java:
##########
@@ -76,6 +88,10 @@ public void attachVector(V vector) {
   public void attachVectors(V vector1, V vector2) {
     this.vector1 = vector1;
     this.vector2 = vector2;
+
+    final boolean v1MayHaveNulls = vector1.getField().getFieldType().isNullable();
+    final boolean v2MayHaveNulls = vector2.getField().getFieldType().isNullable();

Review Comment:
   Oddly, the validity buffer does seem to have a capacity set as soon as any
   values are added to the value buffer. There is a method that returns the
   number of nulls in the vector, and it is relatively fast. It only needs to be
   called when attaching the vector, and it can be skipped for `FieldType`s that
   are not nullable.
   
   It turns out I also need to add a safety check that falls back to the
   "null safe" comparison when the vector contains no values
   (`valueCount == 0`), as the `TreeBasedDictionaryBuilder` seems to need this.
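
   A minimal sketch of the check described above, not the code in this PR. To
   stay self-contained it uses a hypothetical stand-in class, `SimpleIntVector`,
   instead of an Arrow vector. In Arrow itself the corresponding calls would
   presumably be `getValueCount()`, `getNullCount()` and
   `getField().getFieldType().isNullable()`. The names `AttachSketch`,
   `mayHaveNulls` and `checkNulls` are illustrative assumptions, as is the
   "nulls sort first" ordering.

```java
/** Hypothetical stand-in for an Arrow vector; null entries model unset validity bits. */
final class SimpleIntVector {
  private final Integer[] values;
  private final boolean nullable;

  SimpleIntVector(boolean nullable, Integer... values) {
    this.nullable = nullable;
    this.values = values;
  }

  boolean isNullable() {
    return nullable;
  }

  int getValueCount() {
    return values.length;
  }

  /** Counts null entries with a single pass over the values. */
  int getNullCount() {
    int count = 0;
    for (Integer v : values) {
      if (v == null) {
        count++;
      }
    }
    return count;
  }

  Integer get(int index) {
    return values[index];
  }
}

/** Illustrative comparator that decides once, at attach time, whether null checks are needed. */
final class AttachSketch {
  private SimpleIntVector vector1;
  private SimpleIntVector vector2;
  private boolean checkNulls = true;

  void attachVectors(SimpleIntVector v1, SimpleIntVector v2) {
    this.vector1 = v1;
    this.vector2 = v2;
    // Evaluated once per attach, not once per comparison.
    this.checkNulls = mayHaveNulls(v1) || mayHaveNulls(v2);
  }

  static boolean mayHaveNulls(SimpleIntVector v) {
    if (v.getValueCount() == 0) {
      // Safety check: an empty vector always takes the null-safe path.
      return true;
    }
    if (!v.isNullable()) {
      // Non-nullable field: the null count does not need to be computed.
      return false;
    }
    return v.getNullCount() > 0;
  }

  boolean usesNullSafeCompare() {
    return checkNulls;
  }

  int compare(int index1, int index2) {
    if (checkNulls) {
      Integer a = vector1.get(index1);
      Integer b = vector2.get(index2);
      if (a == null || b == null) {
        // Assumed ordering: nulls compare as smaller than any value.
        if (a == null && b == null) {
          return 0;
        }
        return a == null ? -1 : 1;
      }
    }
    return Integer.compare(vector1.get(index1), vector2.get(index2));
  }
}
```

   With this shape, attaching two populated vectors that contain no nulls
   selects the fast path, while an empty vector or one containing a null keeps
   the null-safe comparison.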


