[ https://issues.apache.org/jira/browse/DRILL-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093509#comment-16093509 ]

ASF GitHub Bot commented on DRILL-5601:
---------------------------------------

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/860#discussion_r128127882
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
    @@ -189,30 +238,29 @@ public RecordBatchSizer(VectorAccessible va) {
       public RecordBatchSizer(VectorAccessible va, SelectionVector2 sv2) {
         rowCount = va.getRecordCount();
         for (VectorWrapper<?> vw : va) {
    -      int size = measureColumn(vw.getValueVector());
    -      if ( size > maxSize ) { maxSize = size; }
    -      if ( vw.getField().isNullable() ) { numNullables++; }
    +      measureColumn(vw.getValueVector(), "", rowCount);
    +    }
    +
    +    for (BufferLedger ledger : ledgers) {
    +      accountedMemorySize += ledger.getAccountedSize();
         }
     
         if (rowCount > 0) {
    -      grossRowWidth = roundUp(totalBatchSize, rowCount);
    +      grossRowWidth = roundUp(accountedMemorySize, rowCount);
         }
     
         if (sv2 != null) {
           sv2Size = sv2.getBuffer(false).capacity();
    -      grossRowWidth += roundUp(sv2Size, rowCount);
    -      netRowWidth += 2;
    +      accountedMemorySize += sv2Size;
         }
     
    -    int totalDensity = 0;
    -    int usableCount = 0;
    -    for (ColumnSize colSize : columnSizes) {
    -      if ( colSize.density > 0 ) {
    -        usableCount++;
    -      }
    -      totalDensity += colSize.density;
    -    }
    -    avgDensity = roundUp(totalDensity, usableCount);
    +    computeEstimates();
    +  }
    +
    +  private void computeEstimates() {
    +    grossRowWidth = roundUp(accountedMemorySize, rowCount);
    +    netRowWidth = roundUp(netBatchSize, rowCount);
    +    avgDensity = roundUp(netBatchSize * 100, accountedMemorySize);
       }
     
       public void applySv2() {
    --- End diff --
    
    Fixed.


> Rollup of External Sort memory management fixes
> -----------------------------------------------
>
>                 Key: DRILL-5601
>                 URL: https://issues.apache.org/jira/browse/DRILL-5601
>             Project: Apache Drill
>          Issue Type: Task
>    Affects Versions: 1.11.0
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>             Fix For: 1.12.0
>
>
> Rollup of a set of specific JIRA entries that all relate to the very 
> difficult problem of managing memory within Drill in order for the external 
> sort to stay within a memory budget. In general, the fixes relate to better 
> estimating memory used by the three ways that Drill allocates vector memory 
> (see DRILL-5522) and to predicting the size of vectors that the sort will 
> create, to avoid repeated realloc-copy cycles (see DRILL-5594).
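
For readers following the diff, the arithmetic in computeEstimates() can be tried in isolation. The following is a minimal sketch, not the actual RecordBatchSizer: the class name BatchEstimateSketch and the sample numbers are hypothetical, roundUp() is assumed to be a ceiling division that returns 0 for a zero denominator, and long arithmetic is a choice of this sketch rather than something the diff shows.

```java
// Hypothetical stand-alone sketch of the estimates computed in the diff;
// it is not the real org.apache.drill RecordBatchSizer class.
final class BatchEstimateSketch {
  final int grossRowWidth;  // bytes of accounted (allocated) memory per row
  final int netRowWidth;    // bytes of actual data per row
  final int avgDensity;     // percentage of accounted memory holding data

  BatchEstimateSketch(long accountedMemorySize, long netBatchSize, int rowCount) {
    grossRowWidth = roundUp(accountedMemorySize, rowCount);
    netRowWidth = roundUp(netBatchSize, rowCount);
    avgDensity = roundUp(netBatchSize * 100, accountedMemorySize);
  }

  // Assumption: ceiling division for non-negative inputs, 0 if denom is 0.
  static int roundUp(long num, long denom) {
    if (denom == 0) {
      return 0;
    }
    return (int) ((num + denom - 1) / denom);
  }

  public static void main(String[] args) {
    // Made-up sample: 4096 bytes accounted, 3000 bytes of data, 100 rows.
    BatchEstimateSketch e = new BatchEstimateSketch(4096, 3000, 100);
    System.out.println("grossRowWidth=" + e.grossRowWidth
        + " netRowWidth=" + e.netRowWidth
        + " avgDensity=" + e.avgDensity);
  }
}
```

With these sample values the gross row width is 41 bytes, the net row width is 30 bytes, and the density is 74 percent. The zero-denominator guard in roundUp() matters here because an empty batch has a row count of 0.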



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
