[ 
https://issues.apache.org/jira/browse/DRILL-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001706#comment-16001706
 ] 

Paul Rogers edited comment on DRILL-5488 at 5/8/17 10:57 PM:
-------------------------------------------------------------

Consider the {{setValueCount()}} method again:

{code}
    public void setValueCount(int valueCount) {
      final int currentValueCapacity = getValueCapacity();
      ...
      while(valueCount > getValueCapacity()) {
        reAlloc();
      }
      if (valueCount > 0 && currentValueCapacity > valueCount * 2) {
        ...
{code}

The {{if}} statement is very probably wrong. We get the current capacity. If 
too small, we realloc. Then, we check the value, after the possible 
reallocation, without updating the capacity. It is not clear if we are trying 
to do check the before or after.

Then:

{code}
      if (valueCount > 0 && currentValueCapacity > valueCount * 2) {
        incrementAllocationMonitor(); // is ++allocationMonitor
      } else if (allocationMonitor > 0) {
        allocationMonitor = 0;
      }
{code}

It seems we are trying to guess if we did a realloc rather than simply 
incrementing the monitor when we actually do the reAlloc. It is also not clear 
that the if-statement mimics the if that does the reAlloc.

In fact, a close inspection of the {{setValueCount()}} method, shows that it 
does not even work correctly. It is allocating enough space to hold the value 
count, which will implicitly set the unused slots to 0. But, this is an offset 
vector; we should be copying forward the last set offset. Consider an example:

\[0, 3, 9] as set values.

Set row count as 5. What we create is:

\[0, 3, 9, 0, 0, 0]

What we should have set is:

\[0, 3, 9, 9, 9, 9]

It seems to be the responsibility of the caller to set the last value, so what 
we actually set today, in this case, is:

\[0, 3, 9, 0, 0, 9]


was (Author: paul-rogers):
Consider the {{setValueCount()}} method again:

{code}
    public void setValueCount(int valueCount) {
      final int currentValueCapacity = getValueCapacity();
      ...
      while(valueCount > getValueCapacity()) {
        reAlloc();
      }
      if (valueCount > 0 && currentValueCapacity > valueCount * 2) {
        ...
{code}

The {{if}} statement is very probably wrong. We get the current capacity. If 
too small, we realloc. Then, we check the value, after the possible 
reallocation, without updating the capacity. It is not clear if we are trying 
to do check the before or after.

Then:

{code}
      if (valueCount > 0 && currentValueCapacity > valueCount * 2) {
        incrementAllocationMonitor(); // is ++allocationMonitor
      } else if (allocationMonitor > 0) {
        allocationMonitor = 0;
      }
{code}

It seems we are trying to guess if we did a realloc rather than simply 
incrementing the monitor when we actually do the reAlloc. It is also not clear 
that the if-statement mimics the if that does the reAlloc.

In fact, a close inspection of the {{setValueCount()}} method, shows that it 
does not even work correctly. It is allocating enough space to hold the value 
count, which will implicitly set the unused slots to 0. But, this is an offset 
vector; we should be copying forward the last set offset. Consider an example:

\[0, 3, 9] as set values.

Set row count as 5. What we create is:

\[0, 3, 9, 0, 0]

What we should have set is:

\[0, 3, 9, 9, 9]

> Useless code in VectorTrimmer, fixed-width vector setValueCount()
> -----------------------------------------------------------------
>
>                 Key: DRILL-5488
>                 URL: https://issues.apache.org/jira/browse/DRILL-5488
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.10.0
>            Reporter: Paul Rogers
>            Priority: Trivial
>
> Consider this code in a generated fixed-width vector, such as UInt4Vector:
> {code}
>     @Override
>     public void setValueCount(int valueCount) {
>       ...
>       final int idx = (VALUE_WIDTH * valueCount);
>       ...
>       VectorTrimmer.trim(data, idx);
>       data.writerIndex(valueCount * VALUE_WIDTH);
>     }
> {code}
> Consider the {{trim()}} method:
> {code}
> public class VectorTrimmer {
>   ...
>   public static void trim(ByteBuf data, int idx) {
>     data.writerIndex(idx);
>     if (data instanceof DrillBuf) {
>       // data.capacity(idx);
>       data.writerIndex(idx);
>     }
>   }
> }
> {code}
> This method is called {{trim}}, but it actually sets the writer index in the 
> buffer (though we never use that index.) Since all buffers we use are 
> {{DrillBuf}}, the if-statement is a no-op: we simply set the writer index 
> twice.
> But, notice that the {{setValueCount()}} method itself calls the same 
> {{writerIndex()}} method, so it is actually being called three times.
> It seems this code can simply be discarded: it is called from only two 
> places; neither of which end up using the writer index.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to