[ 
https://issues.apache.org/jira/browse/DRILL-4410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15151781#comment-15151781
 ] 

MinJi Kim edited comment on DRILL-4410 at 2/18/16 6:10 AM:
-----------------------------------------------------------

In ListVector.allocateNew(), the bits buffer is never allocated.  This causes 
problems in ValueVectors, since the bits are used to decide whether 
re-allocation is necessary.  For example, in UInt1Vector.java, the bits 
determine (via getValueCapacity) whether we need to reAlloc.  reAlloc always 
doubles the allocation size, so because bits is never set, the allocation keeps 
doubling even though most of the allocated space is unused.

  public void copyFromSafe(int fromIndex, int thisIndex, UInt1Vector from) {
    // getValueCapacity() uses the "bits" buffer; it returns 0 if bits is empty,
    // so this loop keeps doubling the allocation.
    while (thisIndex >= getValueCapacity()) {
      reAlloc();
    }
    copyFrom(fromIndex, thisIndex, from);
  }

  public void reAlloc() {
    // Always doubles, regardless of how much of the buffer is actually used.
    final long newAllocationSize = allocationSizeInBytes * 2L;
    if (newAllocationSize > MAX_ALLOCATION_SIZE) {
      throw new OversizedAllocationException("Unable to expand the buffer. Max allowed buffer size is reached.");
    }
...
  }
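To make the failure mode concrete, here is a minimal, self-contained sketch (not Drill's actual classes; the names ReallocDoublingDemo, bitsCapacity, and the 1 MB cap are assumptions for illustration) of how a never-allocated bits buffer keeps getValueCapacity() at 0 and forces reAlloc() to double until the size cap is hit:

```java
// Hypothetical model of the DRILL-4410 failure: bits is never allocated,
// so capacity stays 0 and reAlloc() doubles until it exceeds the cap.
public class ReallocDoublingDemo {
    static final long MAX_ALLOCATION_SIZE = 1L << 20; // assumed cap for the demo

    long allocationSizeInBytes = 4096;
    long bitsCapacity = 0; // bits buffer was never allocated in allocateNew()

    // Capacity is bounded by the bits buffer, which stays 0 here.
    long getValueCapacity() {
        return Math.min(bitsCapacity, allocationSizeInBytes);
    }

    // Doubling reallocation, as in the quoted reAlloc() above.
    void reAlloc() {
        long newSize = allocationSizeInBytes * 2L;
        if (newSize > MAX_ALLOCATION_SIZE) {
            throw new IllegalStateException("Unable to expand the buffer.");
        }
        allocationSizeInBytes = newSize;
        // bitsCapacity is NOT grown, so getValueCapacity() remains 0.
    }

    public static void main(String[] args) {
        ReallocDoublingDemo v = new ReallocDoublingDemo();
        int doublings = 0;
        try {
            // copyFromSafe's loop with thisIndex = 0: never terminates normally.
            while (0 >= v.getValueCapacity()) {
                v.reAlloc();
                doublings++;
            }
        } catch (IllegalStateException e) {
            System.out.println("oversized allocation after " + doublings + " doublings");
        }
    }
}
```

Because the capacity check never observes the doubled data buffer, the loop can only exit by throwing, which matches the OversizedAllocationException in the stack trace below.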



> ListVector causes OversizedAllocationException
> ----------------------------------------------
>
>                 Key: DRILL-4410
>                 URL: https://issues.apache.org/jira/browse/DRILL-4410
>             Project: Apache Drill
>          Issue Type: Bug
>          Components:  Server
>            Reporter: MinJi Kim
>            Assignee: MinJi Kim
>
> Reading large data set with array/list causes the following problem.  This 
> happens when union type is enabled.
> (org.apache.drill.exec.exception.OversizedAllocationException) Unable to 
> expand the buffer. Max allowed buffer size is reached.
> org.apache.drill.exec.vector.UInt1Vector.reAlloc():214
> org.apache.drill.exec.vector.UInt1Vector$Mutator.setSafe():406
> org.apache.drill.exec.vector.complex.ListVector$Mutator.setNotNull():298
> org.apache.drill.exec.vector.complex.ListVector$Mutator.startNewValue():307
> org.apache.drill.exec.vector.complex.impl.UnionListWriter.startList():563
> org.apache.drill.exec.vector.complex.impl.ComplexCopier.writeValue():115
> org.apache.drill.exec.vector.complex.impl.ComplexCopier.copy():100
> org.apache.drill.exec.vector.complex.ListVector.copyFrom():97
> org.apache.drill.exec.vector.complex.ListVector.copyFromSafe():89
> org.apache.drill.exec.test.generated.HashJoinProbeGen197.projectBuildRecord():356
> org.apache.drill.exec.test.generated.HashJoinProbeGen197.executeProbePhase():173
> org.apache.drill.exec.test.generated.HashJoinProbeGen197.probeAndProject():223
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext():233
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():257
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():251
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1657
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():251
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1142
> java.util.concurrent.ThreadPoolExecutor$Worker.run():617
> java.lang.Thread.run():745 (state=,code=0)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)