[
https://issues.apache.org/jira/browse/DRILL-5741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kunal Khatua updated DRILL-5741:
--------------------------------
Description:
Currently, during startup, a Drillbit can be assigned large values for the
following:
* Xmx (Heap)
* XX:MaxDirectMemorySize
* XX:ReservedCodeCacheSize
* XX:MaxPermSize
Together, these values can exceed the memory available on a system when a
Drillbit is under heavy load. The Drillbit should verify at startup that the
sum of these parameters does not exceed a predefined upper limit for the Drill
process.
This JIRA is a *proposal* to allow for automatic configuration (based on
configuration patterns observed in production Drill clusters). It builds on the
distribution-specific (and user-specific) startup checks introduced in
DRILL-6068.
The idea is to spare users from managing the tuning parameters by providing
optimal values for them. It also allows memory allocation to be managed
implicitly: the user gives the Drill process a single value for total process
memory (either as an absolute value or as a percentage of the total system
memory), and {{distrib-auto.sh}} derives the individual allocations.
This allocation is then partitioned into allocations for Heap and Direct
Memory, with a small portion allocated for the Generated Java CodeCache as
well. If any of the individual allocations are also specified (via
{{distrib-env.sh}} or {{drill-env.sh}}), the remaining unspecified allocations
are adjusted to stay +within the limits+ of the total memory allocation.
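The partitioning described above can be sketched roughly as follows. This is
only an illustration: the function name {{partition_memory}}, the 1/16 code
cache share (capped at 1024 MB) and the even heap/direct split are assumptions
made for the example, not the equations from the attached proposal.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: split a total process budget (all sizes in MB) into
# heap, direct memory and code cache. Ratios are assumed, not the proposal's.
# Usage: partition_memory TOTAL_MB [HEAP_MB] [DIRECT_MB] [CODE_CACHE_MB]
# Pass "" for any allocation that the user did not specify.
partition_memory() {
  local total_mb=$1 heap_mb=${2:-} direct_mb=${3:-} code_mb=${4:-}

  # Code cache: keep a user value, else take a small slice (assumed 1/16,
  # capped at 1024 MB).
  if [ -z "$code_mb" ]; then
    code_mb=$(( total_mb / 16 ))
    if [ "$code_mb" -gt 1024 ]; then code_mb=1024; fi
  fi

  # Whatever is left is shared between heap and direct memory.
  local remaining=$(( total_mb - code_mb ))
  if [ -z "$heap_mb" ] && [ -z "$direct_mb" ]; then
    heap_mb=$(( remaining / 2 ))
    direct_mb=$(( remaining - heap_mb ))
  elif [ -z "$heap_mb" ]; then
    heap_mb=$(( remaining - direct_mb ))
  elif [ -z "$direct_mb" ]; then
    direct_mb=$(( remaining - heap_mb ))
  fi

  # Reject combinations that cannot fit within the total.
  if [ "$heap_mb" -le 0 ] || [ "$direct_mb" -le 0 ] ||
     [ $(( heap_mb + direct_mb + code_mb )) -gt "$total_mb" ]; then
    echo "ERROR: allocations do not fit within ${total_mb} MB" >&2
    return 1
  fi
  echo "heap=${heap_mb} direct=${direct_mb} code=${code_mb}"
}
```

For example, {{partition_memory 8192 "" 4096}} keeps the user-specified 4096 MB
of direct memory and derives the heap from what remains after the code cache.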
The *details* of the proposal are here:
https://docs.google.com/spreadsheets/d/1N6VYlQFiPoTV4iD46XbkIrvEQesiGFUU9-GWXYsAPXs/edit#gid=0
For those unable to access the Google Document, PDFs are attached:
* [^Auto Mem Allocation Proposal - Computation Logic.pdf] - Provides the
equation used for computing the heap, direct and code cache allocations for a
given input
* [^Auto Mem Allocation Proposal - Scenarios.pdf] - Describes the various
inputs, and their expected allocations
The variables that are (_optionally_) defined (in the shell environment,
{{distrib-env.sh}} or {{drill-env.sh}}) are:
* {{DRILLBIT_MAX_PROC_MEM}} : Total Process Memory
* {{DRILL_HEAP}} : JVM Max Heap Size
* {{DRILL_MAX_DIRECT_MEMORY}} : JVM Max Direct Memory Size
* {{DRILLBIT_CODE_CACHE_SIZE}} : JVM Code Cache Size
Note: _With JDK8, MaxPermSize is no longer supported, so we do not account for
this any more, and will unset the variable if JDK8 or higher is detected._
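As an illustration of how the variables listed above might be set, here is a
minimal {{drill-env.sh}} fragment. The sizes are placeholder values chosen for
the example, not recommendations from the proposal.

```shell
# drill-env.sh fragment (illustrative values only)

# Total memory budget for the Drillbit process.
export DRILLBIT_MAX_PROC_MEM=${DRILLBIT_MAX_PROC_MEM:-"13G"}

# Optional: pin one individual allocation. Under the proposal, the
# unspecified allocations would be adjusted to fit within the total.
export DRILL_HEAP=${DRILL_HEAP:-"4G"}

# Left commented out here, so these would be computed automatically:
#export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"8G"}
#export DRILLBIT_CODE_CACHE_SIZE=${DRILLBIT_CODE_CACHE_SIZE:-"1G"}
```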
was:
Currently, during startup, a Drillbit can be assigned large values for the
following:
* Xmx (Heap)
* XX:MaxDirectMemorySize
* XX:ReservedCodeCacheSize
* XX:MaxPermSize
All of this, potentially, can exceed the available memory on a system when a
Drillbit is under heavy load. It would be good to have the Drillbit ensure
during startup itself that the cumulative value of these parameters does not
exceed a pre-defined upper limit for the Drill process.
The proposal is to have the
[runbit|https://github.com/apache/drill/blob/master/distribution/src/resources/runbit]
script look for an additional environment variable:
{{DRILLBIT_MAX_PROC_MEM}}
The parameter can specify the maximum in GB/MB (using the same syntax as
Java's maximum heap size), or as a percentage of available memory (not to
exceed 95%).
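A rough sketch of how such a value could be converted to megabytes is shown
below. The helper name {{to_mb}}, the accepted suffixes ({{G}}, {{M}}, {{%}})
and the choice to clamp percentages above 95 rather than reject them are all
assumptions for illustration. The second argument is the memory, in MB, that
the percentage is applied to.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert a DRILLBIT_MAX_PROC_MEM-style value to MB.
# Usage: to_mb VALUE [SYSTEM_MB]   e.g. to_mb 8G, to_mb 512M, to_mb 50% 16000
to_mb() {
  local value=$1 system_mb=${2:-0}
  local number=${value%?}          # everything except the last character
  local unit=${value#"$number"}    # the last character (unit suffix)

  # The numeric part must be a plain decimal integer without leading zeros.
  case "$number" in
    ''|*[!0-9]*|0?*)
      echo "ERROR: unrecognised memory value '$value'" >&2
      return 1 ;;
  esac

  case "$unit" in
    G|g) echo $(( number * 1024 )) ;;
    M|m) echo "$number" ;;
    '%')
      # Assumption: percentages above 95 are clamped to 95.
      if [ "$number" -gt 95 ]; then number=95; fi
      echo $(( system_mb * number / 100 )) ;;
    *)
      echo "ERROR: unrecognised memory value '$value'" >&2
      return 1 ;;
  esac
}
```

On Linux, the system total could be read from {{/proc/meminfo}}, for example
with {{awk '/MemTotal/ {print int($2/1024)}' /proc/meminfo}}.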
The
[runbit|https://github.com/apache/drill/blob/master/distribution/src/resources/runbit]
script will sum the memory required by the memory spaces (heap, direct, etc.)
and ensure that the total is within the limit defined by the
{{DRILLBIT_MAX_PROC_MEM}} environment variable.
In the absence of this parameter, there will be no restriction. A node admin
can then define this variable in the default shell's environment files (e.g.
{{/root/.bashrc}}).
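The check described above could look roughly like the sketch below. The
function name {{check_within_limit}} and the decision to pass all sizes as
plain MB integers are assumptions for the example; this is not the actual
runbit code.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the startup check (all sizes in MB).
# Usage: check_within_limit LIMIT_MB HEAP_MB DIRECT_MB CODE_CACHE_MB
# An empty LIMIT_MB means DRILLBIT_MAX_PROC_MEM was not set.
check_within_limit() {
  local limit_mb=${1:-} heap_mb=$2 direct_mb=$3 code_mb=$4

  # No limit configured: there is no restriction to enforce.
  if [ -z "$limit_mb" ]; then
    return 0
  fi

  local total_mb=$(( heap_mb + direct_mb + code_mb ))
  if [ "$total_mb" -gt "$limit_mb" ]; then
    echo "ERROR: requested ${total_mb} MB exceeds limit of ${limit_mb} MB" >&2
    return 1
  fi
}
```

A startup script could call it as
{{check_within_limit 8192 4096 3072 512 || exit 1}} before launching the JVM.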
> During startup Drill should not exceed the available memory
> -----------------------------------------------------------
>
> Key: DRILL-5741
> URL: https://issues.apache.org/jira/browse/DRILL-5741
> Project: Apache Drill
> Issue Type: Improvement
> Components: Server
> Affects Versions: 1.11.0
> Reporter: Kunal Khatua
> Assignee: Kunal Khatua
> Fix For: 1.13.0
>
> Attachments: Auto Mem Allocation Proposal - Computation Logic.pdf,
> Auto Mem Allocation Proposal - Scenarios.pdf
>
> Original Estimate: 336h
> Remaining Estimate: 336h