[ 
https://issues.apache.org/jira/browse/DRILL-6543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boaz Ben-Zvi updated DRILL-6543:
--------------------------------
    Description: 
Introduce a new option that forces (or at least reminds) users to reserve some 
allowance when budgeting their memory:

The problem: When the "planner.memory.max_query_memory_per_node" (MQMPN) option 
is set equal (or "nearly equal") to the allocated *Direct Memory*, an OOM is 
still possible. The reason is that the memory used by the "non-buffered" 
operators is not taken into account.

For example, MQMPN == Direct-Memory == 100 MB. Run a query with 5 buffered 
operators (e.g., 5 instances of a Hash-Join), so each gets "promised" 20 MB. 
When other non-buffered operators (e.g., a Scanner, or a Sender) also grab some 
of the Direct Memory, less than 100 MB is left for the buffered operators. If 
all 5 Hash-Joins are pushing their limits, one HJ may have allocated only 12 MB 
so far, yet its next 1 MB allocation hits an OOM (from the JVM, as all 100 MB 
of Direct Memory is already used).
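
The example above can be replayed with a small toy model. This is only a 
sketch: the pool class is a stand-in for the JVM direct-memory limit (it is not 
Drill's allocator), the 40 MB taken by non-buffered operators is an assumed 
figure chosen so that each Hash-Join ends up at 12 MB, and growing the 
operators round-robin in 1 MB steps is a simplification.

```python
MB = 1024 * 1024


class DirectMemoryPool:
    """Toy stand-in for the JVM direct-memory limit (not Drill's allocator)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0

    def allocate(self, nbytes):
        if self.used + nbytes > self.capacity:
            raise MemoryError("direct memory exhausted")
        self.used += nbytes


def simulate(direct_mb=100, buffered_ops=5, non_buffered_mb=40, chunk_mb=1):
    """Return (hit_oom, MB held by each buffered operator when growth stops)."""
    pool = DirectMemoryPool(direct_mb * MB)
    # MQMPN == Direct Memory, split evenly among the buffered operators.
    per_op_limit = direct_mb * MB // buffered_ops
    chunk = chunk_mb * MB

    # Non-buffered operators (scanner, sender, ...) take memory that the
    # budget never accounted for. The amount is an assumption.
    pool.allocate(non_buffered_mb * MB)

    held = [0] * buffered_ops
    oom = False
    progressed = True
    while progressed and not oom:
        progressed = False
        for i in range(buffered_ops):
            if held[i] + chunk > per_op_limit:
                continue  # reached its promised share; it would spill instead
            try:
                pool.allocate(chunk)
            except MemoryError:
                oom = True  # still below the promised share, yet out of memory
                break
            held[i] += chunk
            progressed = True
    return oom, [h // MB for h in held]


if __name__ == "__main__":
    print(simulate())                   # OOM while every operator is below 20 MB
    print(simulate(non_buffered_mb=0))  # no OOM when nothing else uses memory
```

With the assumed 40 MB of non-buffered usage, every Hash-Join stops at 12 MB, 
well below its promised 20 MB, and the next 1 MB request fails.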

A solution -- a new option to _*reserve*_ some of the Direct Memory for those 
non-buffered operators (e.g., default 25%). This *allowance* may prevent many 
cases like the example above. The new option would return an error (when a 
query starts) if the MQMPN is set too high. Note that this option +cannot+ 
address concurrent queries.
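
A minimal sketch of such a check at query start, assuming the reserve is 
expressed as a percentage of the Direct Memory. The function name and its 
parameters are hypothetical; no option name or implementation is defined yet.

```python
MB = 1024 * 1024


def check_reserve(mqmpn_bytes, direct_bytes, reserve_percent=25):
    """Return the largest MQMPN allowed; raise ValueError if MQMPN exceeds it.

    Hypothetical helper: reserve_percent is the share of Direct Memory kept
    aside for the non-buffered operators (25 is the example default).
    """
    if not 0 <= reserve_percent < 100:
        raise ValueError("reserve_percent must be in [0, 100)")
    ceiling = direct_bytes * (100 - reserve_percent) // 100
    if mqmpn_bytes > ceiling:
        raise ValueError(
            "max_query_memory_per_node (%d MB) leaves less than %d%% of the "
            "%d MB Direct Memory for non-buffered operators; maximum is %d MB"
            % (mqmpn_bytes // MB, reserve_percent,
               direct_bytes // MB, ceiling // MB)
        )
    return ceiling


if __name__ == "__main__":
    print(check_reserve(75 * MB, 100 * MB) // MB)  # accepted
    try:
        check_reserve(100 * MB, 100 * MB)          # the case from the example
    except ValueError as err:
        print("rejected:", err)
```

The check looks at a single query only, which is why it cannot help when 
several queries run concurrently and share the same Direct Memory.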

This should also apply to the alternative to the MQMPN - the 
{{"planner.memory.percent_per_query"}} option (PPQ). The PPQ does not 
_*reserve*_ such memory (e.g., it can be set to 100%); only its documentation 
explains this issue (that doc suggests reserving a 50% allowance, as it was 
written when the Hash-Join was non-buffered; i.e., before spill was 
implemented).

The memory given to the buffered operators is the higher of the two values 
calculated from the MQMPN and the PPQ. The new reserve option would verify that 
this figure still leaves room for the allowance.
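
The combined rule could look roughly like the sketch below. It assumes the PPQ 
is applied as a percentage of the Direct Memory and that the reserve is a 
percentage as well; both function names are hypothetical and only illustrate 
the arithmetic.

```python
MB = 1024 * 1024


def buffered_budget(mqmpn_bytes, ppq_percent, direct_bytes):
    """Memory given to the buffered operators: the higher of MQMPN and PPQ."""
    from_ppq = direct_bytes * ppq_percent // 100  # assumed: PPQ as a percent
    return max(mqmpn_bytes, from_ppq)


def leaves_allowance(mqmpn_bytes, ppq_percent, direct_bytes, reserve_percent=25):
    """True when the buffered budget still leaves the reserved allowance."""
    budget = buffered_budget(mqmpn_bytes, ppq_percent, direct_bytes)
    return budget <= direct_bytes * (100 - reserve_percent) // 100


if __name__ == "__main__":
    direct = 100 * MB
    # MQMPN wins (60 MB > 50 MB) and 60 MB fits under the 75 MB ceiling.
    print(leaves_allowance(60 * MB, 50, direct))
    # PPQ wins (90 MB > 60 MB) and breaks the 25% reserve.
    print(leaves_allowance(60 * MB, 90, direct))
```

Because the higher value wins, a modest MQMPN does not help if the PPQ is set 
too high, so the check has to run on the combined figure.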

 

  was:
Changes to options related to memory budgeting:

(1) Change the default for "drill.exec.hashjoin.fallback.enabled" to *false* 
(same as for the similar Hash-Agg option). This would force users to calculate 
and assign sufficient memory for the query, or explicitly choose to fall back.

(2) When the "planner.memory.max_query_memory_per_node" (MQMPN) option is set 
equal (or "nearly equal") to the allocated *Direct Memory*, an OOM is still 
possible. The reason is that the memory used by the "non-buffered" operators is 
not taken into account.

For example, MQMPN == Direct-Memory == 100 MB. Run a query with 5 buffered 
operators (e.g., 5 instances of a Hash-Join), so each gets "promised" 20 MB. 
When other non-buffered operators (e.g., a Scanner, or a Sender) also grab some 
of the Direct Memory, less than 100 MB is left for the buffered operators. If 
all 5 Hash-Joins are pushing their limits, one HJ may have allocated only 12 MB 
so far, yet its next 1 MB allocation hits an OOM (from the JVM, as all 100 MB 
of Direct Memory is already used).

A solution -- a new option to _*reserve*_ some of the Direct Memory for those 
non-buffered operators (e.g., default 25%). This *allowance* may prevent many 
cases like the example above. The new option would return an error (when a 
query starts) if the MQMPN is set too high. Note that this option +cannot+ 
address concurrent queries.

This should also apply to the alternative to the MQMPN - the 
{{"planner.memory.percent_per_query"}} option (PPQ). The PPQ does not 
_*reserve*_ such memory (e.g., it can be set to 100%); only its documentation 
explains this issue (that doc suggests reserving a 50% allowance, as it was 
written when the Hash-Join was non-buffered; i.e., before spill was 
implemented).

The memory given to the buffered operators is the higher of the two values 
calculated from the MQMPN and the PPQ. The new reserve option would verify that 
this figure still leaves room for the allowance.

 

        Summary: Option for memory mgmt: Reserve allowance for non-buffered
(was: Options for memory mgmt: Reserve allowance for non-buffered, and 
Hash-Join default to not fallback)

> Option for memory mgmt: Reserve allowance for non-buffered
> ----------------------------------------------------------
>
>                 Key: DRILL-6543
>                 URL: https://issues.apache.org/jira/browse/DRILL-6543
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Relational Operators
>    Affects Versions: 1.13.0
>            Reporter: Boaz Ben-Zvi
>            Assignee: Boaz Ben-Zvi
>            Priority: Major
>             Fix For: 1.15.0
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
