[ https://issues.apache.org/jira/browse/DRILL-5602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Paul Rogers updated DRILL-5602:
-------------------------------
    Description:
The code that allocates a new {{RepeatedListVector}} does not initialize the first offset to zero as required:

{code}
@Override
public void allocateNew(int valueCount, int innerValueCount) {
  clear();
  getOffsetVector().allocateNew(valueCount + 1);
  getMutator().reset();
}
{code}

Since Netty does not zero-fill vectors, the result is vector corruption.

If the code worked correctly, here is the behavior when writing to the first element of the list:

* Access the offset vector at offset 0. Should be 0.
* Write the new value at that offset. Since the first offset is 0, the first value is written at 0 in the value vector.
* Write into offset 1 the value at offset 0 plus the length of the new value.

But, the offset vector is not initialized to zero. Instead, offset 0 contains the value 16 million. Now:

* Access the offset vector at offset 0. Value is 16 million.
* Write the new value at that offset. Write at position 16 million. This requires growing the value vector from its present size to 16 MB.

  was:
The query in DRILL-5513 highlighted a problem described in DRILL-5594: that the external sort did not properly allocate its spill batch vectors, and instead allowed them to grow by doubling. While fixing that issue, a new issue became clear. The method to allocate a repeated map vector, however, has a serious bug, as described in DRILL-5530: value vectors do not zero-fill the first allocation for a vector (though subsequent reallocs are zero-filled.)

If the code worked correctly, here is the behavior when writing to the first element of the list:

* Access the offset vector at offset 0. Should be 0.
* Write the new value at that offset. Since the first offset is 0, the first value is written at 0 in the value vector.
* Write into offset 1 the value at offset 0 plus the length of the new value.

But, the offset vector is not initialized to zero. Instead, offset 0 contains the value 16 million.
Now:

* Access the offset vector at offset 0. Value is 16 million.
* Write the new value at that offset. Write at position 16 million. This requires growing the value vector from its present size to 16 MB.

The problem is here in {{RepeatedMapVector}}:

{code}
public void allocateOffsetsNew(int groupCount) {
  offsets.allocateNew(groupCount + 1);
}
{code}

Notice that there is no code to set the value at offset 0.

Then, in the {{UInt4Vector}}:

{code}
public void allocateNew(final int valueCount) {
  allocateBytes(valueCount * 4);
}

private void allocateBytes(final long size) {
  ...
  data = allocator.buffer(curSize);
  ...
{code}

The above eventually calls the Netty memory allocator, which explicitly states that, for performance reasons, it does not zero-fill its buffers. The code works in small tests because the new buffer comes from Java direct memory, which *does* zero-fill the buffer.


> Repeated List Vector fails to initialize the offset vector
> ----------------------------------------------------------
>
>                 Key: DRILL-5602
>                 URL: https://issues.apache.org/jira/browse/DRILL-5602
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.11.0
>
>
> The code that allocates a new {{RepeatedListVector}} does not initialize the
> first offset to zero as required:
> {code}
> @Override
> public void allocateNew(int valueCount, int innerValueCount) {
>   clear();
>   getOffsetVector().allocateNew(valueCount + 1);
>   getMutator().reset();
> }
> {code}
> Since Netty does not zero-fill vectors, the result is vector corruption.
> If the code worked correctly, here is the behavior when writing to the first
> element of the list:
> * Access the offset vector at offset 0. Should be 0.
> * Write the new value at that offset. Since the first offset is 0, the first
> value is written at 0 in the value vector.
> * Write into offset 1 the value at offset 0 plus the length of the new value.
> But, the offset vector is not initialized to zero. Instead, offset 0 contains
> the value 16 million. Now:
> * Access the offset vector at offset 0. Value is 16 million.
> * Write the new value at that offset. Write at position 16 million. This
> requires growing the value vector from its present size to 16 MB.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
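
The write sequence described in the issue can be illustrated with a minimal sketch. This is not Drill code: the class {{OffsetVectorSketch}} and its methods are hypothetical, a plain {{int[]}} stands in for the offset vector, and the array is deliberately filled with the 16 million figure from the description to mimic a buffer that was not zero-filled. It only shows why the first write lands at position 0 when offset 0 is initialized, and at the garbage position when it is not.

```java
import java.util.Arrays;

public class OffsetVectorSketch {

    // Stand-in for whatever an unzeroed buffer happens to hold
    // (the issue description cites a value of 16 million).
    public static final int GARBAGE = 16_000_000;

    // Simulates allocating an offset vector with valueCount + 1 entries.
    // Every slot starts as GARBAGE to mimic memory that was not zero-filled;
    // offset 0 is set to zero only if initFirstOffset is true.
    public static int[] allocateOffsets(int valueCount, boolean initFirstOffset) {
        int[] offsets = new int[valueCount + 1];
        Arrays.fill(offsets, GARBAGE);
        if (initFirstOffset) {
            offsets[0] = 0;
        }
        return offsets;
    }

    // Follows the three steps from the description: read the start offset
    // at index, treat it as the write position, then store start + length
    // at index + 1. Returns the position the value would be written to.
    public static int writeEntry(int[] offsets, int index, int length) {
        int start = offsets[index];
        offsets[index + 1] = start + length;
        return start;
    }

    public static void main(String[] args) {
        int[] broken = allocateOffsets(4, false);
        int[] fixed = allocateOffsets(4, true);

        // prints "uninitialized: first write at 16000000"
        System.out.println("uninitialized: first write at " + writeEntry(broken, 0, 3));
        // prints "initialized: first write at 0"
        System.out.println("initialized: first write at " + writeEntry(fixed, 0, 3));
    }
}
```

In this sketch the only difference between the two cases is the single assignment to offset 0 at allocation time, which is the step the description says is missing.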