Github user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/1091#discussion_r162225383
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/physical/base/AbstractBase.java
---
@@ -29,9 +28,12 @@
public static long INIT_ALLOCATION = 1_000_000L;
public static long MAX_ALLOCATION = 10_000_000_000L;
+ // Default output batch size, 512MB
+ public static long OUTPUT_BATCH_SIZE = 512 * 1024 * 1024L;
--- End diff --
Too large. The sort and hash agg operators often receive just 20-40 MB on a
large cluster. (That is, itself, an issue, but one that has proven very
difficult to resolve.) So, the output batch size must be no larger than 1/3 of
that allocation (for sort), which works out to roughly 7-13 MB, far below the
proposed 512 MB. Some team discussion is probably required to agree on a good
number, and on the work needed to ensure that sort, hash agg and hash join
are given sufficient memory for the selected batch size.
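
To make the arithmetic concrete, here is a minimal sketch of the 1/3 rule for sort applied to the 20-40 MB figures above. The class and method names (`BatchSizeBudget`, `maxOutputBatchSize`, `fits`) are hypothetical and exist only for this illustration; they are not part of Drill, and the 1/3 fraction is taken directly from the comment rather than from any Drill setting.

```java
public class BatchSizeBudget {
    static final long MB = 1024L * 1024L;

    // Hypothetical helper: the largest output batch allowed when a batch may
    // use at most 1/divisor of the operator's memory (divisor = 3 for sort,
    // per the comment above).
    static long maxOutputBatchSize(long operatorMemoryBytes, int divisor) {
        if (operatorMemoryBytes <= 0 || divisor <= 0) {
            throw new IllegalArgumentException("memory and divisor must be positive");
        }
        return operatorMemoryBytes / divisor;
    }

    // Hypothetical helper: does a proposed batch size respect that limit?
    static boolean fits(long batchSizeBytes, long operatorMemoryBytes, int divisor) {
        return batchSizeBytes <= maxOutputBatchSize(operatorMemoryBytes, divisor);
    }

    public static void main(String[] args) {
        long proposed = 512 * MB; // OUTPUT_BATCH_SIZE from the diff
        for (long memMb : new long[] {20, 40}) {
            long limit = maxOutputBatchSize(memMb * MB, 3);
            System.out.printf("operator memory %d MB: limit %.1f MB, 512 MB fits: %b%n",
                memMb, (double) limit / MB, fits(proposed, memMb * MB, 3));
        }
    }
}
```

With 20 MB of operator memory the limit is about 6.7 MB, and with 40 MB it is about 13.3 MB, so the proposed 512 MB default fails the check in both cases.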
---