Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1164#discussion_r174327511

    --- Diff: exec/vector/src/main/codegen/templates/VariableLengthVectors.java ---
    @@ -514,6 +516,22 @@ public boolean isNull(int index){
      * The equivalent Java primitive is '${minor.javaType!type.javaType}'
      *
      * NB: this class is automatically generated from ValueVectorTypes.tdd using FreeMarker.
    + * </p>
    + * <h2>Contract</h2>
    + * <p>
    + * Variable length vectors do not support random writes. All set methods must be called for with a monotonically increasing consecutive sequence of indexes.
    --- End diff --

    This is very important to know. It is why spill-to-disk for hash agg will eventually cause a serious customer failure. Aggregate UDFs write to vectors to store intermediate group values. A "max" string can't, because updating the max means overwriting a value at an arbitrary index, which variable-length vectors do not allow. Instead, it writes to a Java object. That object will be lost on spill and reread, so the prior max values are lost and the aggregate starts over. So, this little note is not just a nuisance; it is the fatal flaw in how we handle the (albeit obscure) string aggregate values.
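
To make the contract concrete, here is a minimal sketch of why a variable-width layout forces consecutive writes. It assumes the usual design of one packed data buffer plus an offsets array; `ToyVarCharVector` and its methods are hypothetical names for illustration only, not the classes generated from this template.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/**
 * Toy model of a variable-width vector (hypothetical, not Drill's generated code).
 * Value i occupies data[offsets[i] .. offsets[i + 1]), so values are packed
 * back to back and a value can only be appended after the previous one.
 */
public class ToyVarCharVector {
    private final int[] offsets;          // offsets[i + 1] marks the end of value i
    private byte[] data = new byte[16];   // packed bytes of all values
    private int valueCount = 0;           // number of values written so far

    public ToyVarCharVector(int capacity) {
        offsets = new int[capacity + 1];
    }

    /** Append-only write: index must equal the number of values already written. */
    public void set(int index, String value) {
        if (index != valueCount) {
            throw new IllegalStateException(
                "expected index " + valueCount + " but got " + index);
        }
        if (index >= offsets.length - 1) {
            throw new IllegalStateException("vector is full");
        }
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        int start = offsets[index];
        int end = start + bytes.length;
        if (end > data.length) {
            // Grow the data buffer; existing values keep their offsets.
            data = Arrays.copyOf(data, Math.max(end, data.length * 2));
        }
        System.arraycopy(bytes, 0, data, start, bytes.length);
        offsets[index + 1] = end;
        valueCount++;
    }

    public String get(int index) {
        if (index < 0 || index >= valueCount) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        int start = offsets[index];
        int length = offsets[index + 1] - start;
        return new String(data, start, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ToyVarCharVector groups = new ToyVarCharVector(4);
        groups.set(0, "apple");   // max so far for group 0
        groups.set(1, "fig");     // max so far for group 1
        try {
            // A longer max for group 0 would not fit in its 5-byte slot
            // without shifting group 1, so the toy vector rejects the write.
            groups.set(0, "watermelon");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch, replacing the value for group 0 with a longer string would require moving every later value and rewriting its offset, which is the kind of random write the contract rules out. A per-group string "max" therefore cannot be kept in such a vector and updated in place.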
---