[
https://issues.apache.org/jira/browse/DRILL-5808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16187257#comment-16187257
]
ASF GitHub Bot commented on DRILL-5808:
---------------------------------------
Github user paul-rogers commented on a diff in the pull request:
https://github.com/apache/drill/pull/958#discussion_r140960874
--- Diff:
exec/memory/base/src/main/java/org/apache/drill/exec/memory/Accountant.java ---
@@ -80,6 +119,23 @@ public Accountant(Accountant parent, long reservation,
long maxAllocation) {
}
/**
+ * Request lenient allocations: allows exceeding the allocation limit
+ * by the configured grace amount. The request is granted only if strict
+ * limits are not required.
+ *
+ * @param enable
+ */
+ public boolean setLenient() {
--- End diff --
Added a comment to explain the purpose. Fixed Javadoc comments.
> Reduce memory allocator strictness for "managed" operators
> ----------------------------------------------------------
>
> Key: DRILL-5808
> URL: https://issues.apache.org/jira/browse/DRILL-5808
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.11.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Fix For: 1.12.0
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> Drill 1.11 and 1.12 introduce new "managed" versions of the sort and hash
> aggregate operators that enforce memory limits, spilling to disk when necessary.
> Drill's internal memory system is very "lumpy" and unpredictable. The
> operators have no control over the incoming batch size; an overly large batch
> can cause the operator to exceed its memory limit before it has a chance to
> do any work.
> Vector allocations grow in power-of-two sizes. Adding a single record can
> double the memory allocated to a vector.
> Drill has no metadata, so operators cannot predict the size of VarChar
> columns nor the cardinality of arrays. The "Record Batch Sizer" tries to
> extract this information on each batch, but it works with averages, and
> specific column patterns can still throw off the memory calculations. (For
> example, having a series of very wide columns for A-M and very narrow columns
> for N-Z will yield a moderate average. But, once sorted, the A-M rows, and
> batches, will be much larger than expected, causing out-of-memory errors.)
> At present, if an operator is wrong in its memory usage by a single byte, the
> entire query is killed. That is, the user pays the death penalty (of queries)
> for poor design decisions within Drill. This leads to a less-than-optimal
> user experience.
> The proposal here is to make the memory allocator less strict for "managed"
> operators.
> First, we recognize that the managed operators do attempt to control memory
> and, if designed well, will, on average, hit their targets.
> Second, we recognize that, due to the lumpiness issues above, any single
> operator may exceed, or be under, the configured maximum memory.
> Given this, the proposal here is:
> 1. An operator identifies itself as managed to the memory allocator.
> 2. In managed mode, the allocator has soft limits. It emits a warning to the
> log when the limit is exceeded.
> 3. For safety, in managed mode, the allocator enforces a hard limit larger
> than the configured limit.
> The enforcement limit might be:
> * For memory sizes < 100MB, up to 2x the configured limit.
> * For larger memory sizes, no more than 100MB over the configured limit.
> The exact numbers can be made configurable.
> Now, during testing, scripts should look for over-memory warnings. Each
> should be fixed as we fix OOM issues today. But, during production, user
> queries are far less likely to fail due to any remaining corner cases that
> throw off the memory calculations.
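
The enforcement rule proposed in the issue description above (a hard limit of up to 2x the configured limit for sizes under 100MB, and no more than 100MB over it for larger sizes) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `LenientLimitSketch`, the `hardLimit` and `check` methods, and the `Outcome` enum are hypothetical and are not part of Drill's `Accountant` API; the 100MB figure is taken from the proposal, which also notes that the exact numbers could be made configurable.

```java
// Hypothetical sketch of the proposed soft/hard limit rule.
// None of these names come from Drill; they are assumptions for illustration.
final class LenientLimitSketch {
    static final long MB = 1024L * 1024L;

    // Assumed grace cap, based on the "100MB" figure in the proposal.
    static final long MAX_GRACE = 100L * MB;

    /**
     * Hard (enforced) limit for a managed operator: twice the configured
     * limit for small sizes, but never more than MAX_GRACE above it.
     */
    static long hardLimit(long configuredLimit) {
        long grace = Math.min(configuredLimit, MAX_GRACE);
        return configuredLimit + grace;
    }

    enum Outcome { OK, WARN, FAIL }

    /**
     * Classifies an allocation that would bring total usage to allocatedAfter.
     * In lenient (managed) mode, exceeding the configured limit only warns
     * until the hard limit is reached; in strict mode it fails immediately.
     */
    static Outcome check(long configuredLimit, long allocatedAfter, boolean lenient) {
        if (allocatedAfter <= configuredLimit) {
            return Outcome.OK;
        }
        if (lenient && allocatedAfter <= hardLimit(configuredLimit)) {
            return Outcome.WARN; // the proposal would log a warning here
        }
        return Outcome.FAIL;     // the query would fail with an OOM error
    }

    public static void main(String[] args) {
        System.out.println(hardLimit(50 * MB) / MB);         // 100
        System.out.println(hardLimit(1024 * MB) / MB);       // 1124
        System.out.println(check(50 * MB, 60 * MB, true));   // WARN
        System.out.println(check(50 * MB, 60 * MB, false));  // FAIL
        System.out.println(check(50 * MB, 101 * MB, true));  // FAIL
    }
}
```

Using `Math.min(configuredLimit, MAX_GRACE)` as the grace amount covers both bullet points of the proposal with one expression, and the two cases agree at exactly 100MB (a 200MB hard limit either way).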