Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/938#discussion_r137939361

    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggTemplate.java ---
    @@ -1335,7 +1470,7 @@ private void updateStats(HashTable[] htables) {
         }
         if ( rowsReturnedEarly > 0 ) {
           stats.setLongStat(Metric.SPILL_MB, // update stats - est. total MB returned early
    -          (int) Math.round( rowsReturnedEarly * estRowWidth / 1024.0D / 1024.0));
    +          (int) Math.round( rowsReturnedEarly * estOutputRowWidth / 1024.0D / 1024.0));
    --- End diff --

    This file is a template. This means we copy *all* of this code each time we generate a new class. How does doing so help stability, customer value or performance?

    Should all this code be in a template that is copied on every query? Or should it be refactored into a driver class, with only a very light wrapper appearing in the copied template?

    As this code gets ever more complex, it puts a strain on the Java code that must walk through it and do method fixup, scalar replacement, etc. That work takes time. What value accrues to the user from doing this fixup on code that never changes from one query to the next?

    Filed [DRILL-5779](https://issues.apache.org/jira/browse/DRILL-5779) for this issue.
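
To make the driver/wrapper split concrete, here is a minimal sketch of the shape such a refactoring could take. It is not Drill code: `QuerySpecificOps`, `HashAggDriver`, `LightAggTemplate` and `GeneratedAggForQuery` are hypothetical names, and the hand-written subclass only stands in for whatever the code generator would emit. The one piece taken from the diff is the MB arithmetic.

```java
// Hypothetical sketch; none of these types exist in Drill.

/** The part that differs per query. Only this would need to be generated. */
interface QuerySpecificOps {
  long estOutputRowWidth();   // estimated bytes per output row
}

/** Fixed logic: compiled once as an ordinary class, never copied per query. */
final class HashAggDriver {
  private final QuerySpecificOps ops;
  private long rowsReturnedEarly;

  HashAggDriver(QuerySpecificOps ops) {
    this.ops = ops;
  }

  void recordEarlyReturn(long rows) {
    rowsReturnedEarly += rows;
  }

  /** Same arithmetic as the SPILL_MB update in the diff above. */
  long estSpillMb() {
    return Math.round(rowsReturnedEarly * ops.estOutputRowWidth() / 1024.0D / 1024.0);
  }
}

/** Thin template: the only code that would be copied for each query. */
abstract class LightAggTemplate implements QuerySpecificOps {
  private final HashAggDriver driver = new HashAggDriver(this);

  HashAggDriver driver() {
    return driver;
  }
}

/** Stand-in for a generated class; the width of 128 bytes is an assumed value. */
final class GeneratedAggForQuery extends LightAggTemplate {
  @Override
  public long estOutputRowWidth() {
    return 128;
  }
}
```

With this split, the per-query class holds one small method, so any bytecode fixup would only have to walk that method; the spill accounting lives in `HashAggDriver` and is loaded once. Whether the extra interface call matters on a hot path would need measuring.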
---