[GitHub] drill pull request #938: DRILL-5694: Handle HashAgg OOM by spill and retry, ...

paul-rogers Sat, 09 Sep 2017 19:56:48 -0700

Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/938#discussion_r137939361
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ---
    @@ -1335,7 +1470,7 @@ private void updateStats(HashTable[] htables) {
         }
         if ( rowsReturnedEarly > 0 ) {
           stats.setLongStat(Metric.SPILL_MB, // update stats - est. total MB returned early
    -          (int) Math.round( rowsReturnedEarly * estRowWidth / 1024.0D / 1024.0));
    +          (int) Math.round( rowsReturnedEarly * estOutputRowWidth / 1024.0D / 1024.0));
    --- End diff --
    
    This file is a template, which means we copy *all* of this code each time
    we generate a new class. How does doing so help stability, customer value,
    or performance? Should all this code live in a template that is copied on
    every query? Or should it be refactored into a driver class, with only a
    very light wrapper appearing in the copied template?
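
    A rough sketch of the split I have in mind, as an illustration only. All
    names here (`GeneratedAggOps`, `HashAggDriver`, `GeneratedSumOps`,
    `estimateMb`) are hypothetical and are not existing Drill classes; the
    real operator is far more involved. The idea is that the bulky, unchanging
    logic is compiled once, and only a tiny query-specific piece is generated:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical: the only part that differs per query, so the only part
// that would need to be generated (and walked by the byte-code fixup).
interface GeneratedAggOps {
  long combine(long current, long incoming);
}

// Hypothetical driver: plain Java, compiled once, never copied per query.
// Table management, stats, and spill bookkeeping would live here.
final class HashAggDriver {
  private final GeneratedAggOps ops;
  private final Map<Long, Long> table = new HashMap<>();

  HashAggDriver(GeneratedAggOps ops) {
    this.ops = ops;
  }

  // Folds one incoming row into the aggregate for its key.
  void addRow(long key, long value) {
    table.merge(key, value, ops::combine);
  }

  // Returns the aggregate for a key, or 0 if the key was never seen.
  long result(long key) {
    return table.getOrDefault(key, 0L);
  }

  // Same arithmetic as the stats line in the diff: bytes to whole MB.
  static int estimateMb(long rows, long estRowWidth) {
    return (int) Math.round(rows * estRowWidth / 1024.0D / 1024.0);
  }
}

// Stand-in for what code generation would emit for a SUM aggregate.
final class GeneratedSumOps implements GeneratedAggOps {
  @Override
  public long combine(long current, long incoming) {
    return current + incoming;
  }
}
```

    With a split like this, the generated class shrinks to a single method,
    so there is very little for the fixup pass to process on each query.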
    
    As this code gets ever more complex, it puts a strain on the Java code
    that must walk through it and do method fixup, scalar replacement, etc.
    That work takes time. What value accrues to the user from doing this fixup
    on code that never changes from one query to the next?
    
    Filed [DRILL-5779](https://issues.apache.org/jira/browse/DRILL-5779) for this issue.
