Github user paul-rogers commented on a diff in the pull request:

    https://github.com/apache/drill/pull/1091#discussion_r162225383
  
    --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractBase.java ---
    @@ -29,9 +28,12 @@
     
       public static long INIT_ALLOCATION = 1_000_000L;
       public static long MAX_ALLOCATION = 10_000_000_000L;
    +  // Default output batch size, 512MB
    +  public static long OUTPUT_BATCH_SIZE = 512 * 1024 * 1024L;
    --- End diff --
    
    Too large. The sort and hash agg operators often receive just 20-40 MB on a 
large cluster. (That is itself an issue, but one that has proven very 
difficult to resolve.) So the output batch size must be no larger than 1/3 of 
that allocation (for sort), which works out to roughly 7-13 MB, far below the 
proposed 512 MB. Some team discussion is probably required to agree on a good 
number, and on the work needed to ensure that sort, hash agg, and hash join are 
given sufficient memory for the selected batch size.
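
To make the arithmetic concrete, here is a minimal sketch of the budget check. The class `BatchSizeBudget` and the method `maxBatchBytesForSort` are hypothetical names used only for illustration; they are not part of Drill. The 1/3 factor is the rule of thumb for sort stated above, not a value read from the code base.

```java
// Hypothetical helper, not part of Drill: illustrates the 1/3 rule of thumb.
final class BatchSizeBudget {
    static final long MB = 1024L * 1024L;

    // Value proposed in the diff under review.
    static final long PROPOSED_OUTPUT_BATCH_SIZE = 512 * MB;

    // Assumption from the review comment: for sort, one output batch
    // may use at most 1/3 of the memory given to the operator.
    static long maxBatchBytesForSort(long operatorMemoryBytes) {
        return operatorMemoryBytes / 3;
    }

    // True if the proposed batch size fits within the sort budget.
    static boolean proposedFits(long operatorMemoryBytes) {
        return PROPOSED_OUTPUT_BATCH_SIZE <= maxBatchBytesForSort(operatorMemoryBytes);
    }

    public static void main(String[] args) {
        // Typical per-operator memory on a large cluster, per the comment.
        for (long memMb : new long[] {20, 40}) {
            long limit = maxBatchBytesForSort(memMb * MB);
            System.out.printf(
                "operator memory %d MB -> max batch ~%.1f MB, proposed 512 MB fits: %b%n",
                memMb, (double) limit / MB, proposedFits(memMb * MB));
        }
    }
}
```

With 20-40 MB per operator, the proposed default exceeds the budget by well over an order of magnitude, which is why the default and the operators' memory grants need to be decided together.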

