ASF GitHub Bot commented on DRILL-5709:

Github user paul-rogers commented on a diff in the pull request:

    --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/BaseValueVector.java ---
    @@ -133,5 +134,21 @@ public static boolean checkBufRefs(final ValueVector vv) {
       public BufferAllocator getAllocator() {
         return allocator;
    +  public static void fillBitsVector(UInt1Vector bits, int valueCount) {
    +    // Create a new bits vector, all values non-null
    +    bits.allocateNew(valueCount);
    +    UInt1Vector.Mutator bitsMutator = bits.getMutator();
    +    for (int i = 0; i < valueCount; i++) {
    +      bitsMutator.set(i, 1);
    +    }
    --- End diff --
    Sadly, Netty's `PlatformDependent` provides no equivalent of `memset`. Some 
tricks we could do, if tests show we have a performance issue:
    * Use `setLong()` to write the value 0x0101010101010101L once per 8 values, 
then write the last < 8 values byte by byte.
    * Create a heap buffer of some length n, filled with 0x01, and do a copy 
from that buffer to the direct memory, as many times as needed.
    Since both of these will make the code much more complex, let's measure to 
see the cost of the simple solution before we try them.
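For reference, the two tricks above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses `java.nio.ByteBuffer` as a stand-in for the vector's direct memory, not Drill's `UInt1Vector` or Netty's `PlatformDependent`, and the class name, method names, and the 256-byte block size are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical stand-in: operates on java.nio.ByteBuffer, not on Drill's
// UInt1Vector or DrillBuf. All names here are illustrative assumptions.
final class BitsFillSketch {

  // Eight 0x01 bytes packed into one long. Byte order does not matter
  // because every byte of the pattern is identical.
  private static final long ONES = 0x0101010101010101L;

  // Heap block pre-filled with 0x01, used as the copy source for trick 2.
  private static final byte[] ONES_BLOCK = new byte[256];
  static {
    Arrays.fill(ONES_BLOCK, (byte) 1);
  }

  // Trick 1: write eight flags per putLong(), then finish byte by byte.
  static void fillWithLongs(ByteBuffer buf, int valueCount) {
    int i = 0;
    for (; i + 8 <= valueCount; i += 8) {
      buf.putLong(i, ONES);
    }
    for (; i < valueCount; i++) {
      buf.put(i, (byte) 1);
    }
  }

  // Trick 2: copy from the pre-filled heap block as many times as needed.
  static void fillWithCopy(ByteBuffer buf, int valueCount) {
    ByteBuffer dest = buf.duplicate(); // shares memory, independent position
    dest.clear();                      // position 0, limit = capacity
    int remaining = valueCount;
    while (remaining > 0) {
      int n = Math.min(remaining, ONES_BLOCK.length);
      dest.put(ONES_BLOCK, 0, n);
      remaining -= n;
    }
  }

  // Counts how many of the first valueCount bytes are set to 1.
  static int countSet(ByteBuffer buf, int valueCount) {
    int count = 0;
    for (int i = 0; i < valueCount; i++) {
      if (buf.get(i) == 1) {
        count++;
      }
    }
    return count;
  }
}
```

Both variants need a tail case (the last < 8 values, or a final partial block), which is where the extra complexity over the simple loop comes from.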
    Also, at a higher level, we should not even need this trick. A 
properly-designed value vector hierarchy would treat a non-nullable vector as a 
nullable vector that always returns `false` for `isNull()`. This is what the 
new readers do and it vastly simplifies the code.
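As a rough illustration of that idea (not Drill's actual reader classes; every name below is hypothetical, and plain Java arrays stand in for vectors), a required column can implement the same interface as a nullable one and simply report that nothing is null:

```java
// Hypothetical sketch: these are not Drill's reader classes, and plain
// arrays stand in for value vectors.
interface IntColumnReader {
  boolean isNull(int index);
  int getInt(int index);
}

// Required (non-nullable) column: same interface, isNull() is always false.
final class RequiredIntReader implements IntColumnReader {
  private final int[] values;

  RequiredIntReader(int[] values) {
    this.values = values;
  }

  @Override
  public boolean isNull(int index) {
    return false;
  }

  @Override
  public int getInt(int index) {
    return values[index];
  }
}

// Nullable column: consults a per-value flag (1 = present, 0 = null).
final class NullableIntReader implements IntColumnReader {
  private final int[] values;
  private final byte[] bits;

  NullableIntReader(int[] values, byte[] bits) {
    this.values = values;
    this.bits = bits;
  }

  @Override
  public boolean isNull(int index) {
    return bits[index] == 0;
  }

  @Override
  public int getInt(int index) {
    return values[index];
  }
}

// Client code is written once and works for both kinds of column.
final class ReaderDemo {
  static long sumNonNull(IntColumnReader reader, int rowCount) {
    long sum = 0;
    for (int i = 0; i < rowCount; i++) {
      if (!reader.isNull(i)) {
        sum += reader.getInt(i);
      }
    }
    return sum;
  }
}
```

With this shape, a caller never needs a materialized "bits" vector for a required column, so nothing has to be filled with 1s in the first place.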
    Sometimes it helps to go back to basics and remember Abstractions 101: 
well-designed abstractions are our friends.

> Provide a value vector method to convert a vector to nullable
> -------------------------------------------------------------
>                 Key: DRILL-5709
>                 URL: https://issues.apache.org/jira/browse/DRILL-5709
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Minor
>             Fix For: 1.12.0
> The hash agg spill work needs to convert a non-null scalar vector to the 
> nullable equivalent. For efficiency, the code should simply transfer the 
> underlying data buffer(s) and create the required "bits" vector, rather than 
> generating code that does the transfer row-by-row.
> The solution is to add a {{toNullable(ValueVector nullableVector)}} method to 
> the {{ValueVector}} interface, then implement it where needed.
> Since the target code only works with scalars (that is, no arrays, no maps, 
> no lists), the code only handles these cases, throwing an 
> {{UnsupportedOperationException}} in other cases.
> Usage:
> {code}
> ValueVector nonNullableVector = // your non-nullable vector
> MajorType type = MajorType.newBuilder(nonNullableVector.getField().getType())
>     .setMode(DataMode.OPTIONAL)
>     .build();
> MaterializedField field = MaterializedField.create(name, type);
> ValueVector nullableVector = TypeHelper.getNewVector(field, oContext.getAllocator());
> nonNullableVector.toNullable(nullableVector);
> // Data is now in nullableVector
> {code}
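
The buffer-transfer idea in the description above can be modelled with a toy sketch. Everything here is an assumption for illustration: the class names are hypothetical, Java arrays stand in for Drill's direct-memory buffers, and the source vector is assumed to be left empty after the transfer.

```java
import java.util.Arrays;

// Hypothetical toy model: Java arrays stand in for Drill's data buffers.
final class ToyNullableIntVector {
  int[] data = new int[0];
  byte[] bits = new byte[0]; // 1 = value present, 0 = null
  int valueCount;

  boolean isNull(int index) {
    return bits[index] == 0;
  }

  int get(int index) {
    return data[index];
  }
}

final class ToyIntVector {
  private int[] data;
  private int valueCount;

  ToyIntVector(int[] data) {
    this.data = data;
    this.valueCount = data.length;
  }

  int getValueCount() {
    return valueCount;
  }

  // Hands the data buffer to the target without copying any values, then
  // builds a "bits" vector marking every value as non-null.
  void toNullable(ToyNullableIntVector target) {
    target.data = data;
    target.bits = new byte[valueCount];
    Arrays.fill(target.bits, (byte) 1);
    target.valueCount = valueCount;

    // Assumption: a transfer leaves the source vector empty.
    data = new int[0];
    valueCount = 0;
  }
}
```

The only per-row work is filling the bits; the values themselves move as a single reference hand-off.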

This message was sent by Atlassian JIRA
