[ https://issues.apache.org/jira/browse/DRILL-5601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16093822#comment-16093822 ]
ASF GitHub Bot commented on DRILL-5601:
---------------------------------------
Github user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/860#discussion_r128367397
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/spill/RecordBatchSizer.java ---
@@ -70,52 +72,90 @@
*/
public final int estSize;
+
+ /**
+ * Number of times the value here (possibly repeated) appears in
+ * the record batch.
+ */
+
public final int valueCount;
- public final int entryCount;
- public final int dataSize;
- public final int estElementCount;
+
+ /**
+ * Number of times the entries of this column appears. If this is a
+ * scalar, the entry count is the same as the value count. If this
--- End diff --
The comment is poorly worded. I meant "array of scalar" vs. "array of
variable-width". There is also "array of array of scalar | variable-width"...
Please check whether the revised comment makes this any clearer. (It is hard
to explain without first explaining the whole column data model...)
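
To make the value count vs. entry count distinction concrete, here is a
minimal sketch. It is only one reading of the Javadoc quoted in the diff, not
the RecordBatchSizer code: the class ColumnCounts and its methods are
hypothetical, and a column is modeled as a plain list of per-row arrays
rather than as Drill vectors.

```java
import java.util.List;

// Hypothetical illustration; none of these names exist in Drill.
final class ColumnCounts {

    /** Value count: one value per row, even if that value is an array. */
    static int valueCount(List<List<Integer>> column) {
        return column.size();
    }

    /** Entry count: total number of array elements across all rows. */
    static int entryCount(List<List<Integer>> column) {
        int total = 0;
        for (List<Integer> row : column) {
            total += row.size();
        }
        return total;
    }

    public static void main(String[] args) {
        // A scalar column, modeled here as exactly one entry per row.
        List<List<Integer>> scalar =
            List.of(List.of(10), List.of(20), List.of(30));

        // An "array of scalar" column: rows hold 2, 0 and 3 entries.
        List<List<Integer>> repeated =
            List.of(List.of(1, 2), List.of(), List.of(3, 4, 5));

        // Scalar: the entry count equals the value count.
        System.out.println(valueCount(scalar) + " " + entryCount(scalar));
        // Array: the value count is still the row count; entries differ.
        System.out.println(valueCount(repeated) + " " + entryCount(repeated));
    }
}
```

Under these assumptions the scalar column reports 3 values and 3 entries,
while the array column reports 3 values and 5 entries. An "array of
variable-width" column would need a further per-entry byte length, which
this sketch leaves out.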
> Rollup of External Sort memory management fixes
> -----------------------------------------------
>
> Key: DRILL-5601
> URL: https://issues.apache.org/jira/browse/DRILL-5601
> Project: Apache Drill
> Issue Type: Task
> Affects Versions: 1.11.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.12.0
>
>
> Rollup of a set of specific JIRA entries that all relate to the very
> difficult problem of managing memory within Drill in order for the external
> sort to stay within a memory budget. In general, the fixes relate to better
> estimating memory used by the three ways that Drill allocates vector memory
> (see DRILL-5522) and to predicting the size of vectors that the sort will
> create, to avoid repeated realloc-copy cycles (see DRILL-5594).