[
https://issues.apache.org/jira/browse/HAWQ-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074846#comment-15074846
]
Ruilong Huo commented on HAWQ-296:
----------------------------------
Root cause analysis shows the following: with a total memory quota of 12G, the
red zone threshold is set at about 10.8G (memory_total *
runaway_detector_activation_percent = 12G * 90%), and the remaining 1.2G is
reserved so that the runaway terminator can cancel the query with high memory
usage. The reserved amount grows with the quota, so it is especially large on
servers with large physical memory. For example, a server with 32G of memory
reserves about 3.2G; a server with 64G reserves about 6.4G; and a server with
128G reserves about 12.8G.
Given that memory is allocated at a granularity of 1G, it is reasonable to
shrink this reserve (by the formula above, that means raising
runaway_detector_activation_percent) so that queries are not terminated while
several gigabytes of memory are still available.
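The arithmetic above can be checked with a small sketch. It is only an
illustration of the formula quoted in this comment, not HAWQ code: the function
name red_zone_and_reserve and its parameters are hypothetical, and the integer
rounding is an assumption, so the result may differ by a few MB from what the
server logs (the log in the issue reports a red zone of 11056MB for a 12G
quota, which is close to the figure computed here).

```python
def red_zone_and_reserve(memory_total_mb, activation_percent):
    """Split a memory quota (in MB) into the red-zone threshold and the reserve above it.

    Hypothetical helper: applies memory_total * activation_percent / 100
    with integer rounding, which is an assumption made for this sketch.
    """
    red_zone_mb = memory_total_mb * activation_percent // 100
    reserve_mb = memory_total_mb - red_zone_mb
    return red_zone_mb, reserve_mb


if __name__ == "__main__":
    # Quotas mentioned in the comment, evaluated at the 90% setting.
    for total_gb in (12, 32, 64, 128):
        red, reserve = red_zone_and_reserve(total_gb * 1024, 90)
        print(f"{total_gb}G quota at 90%: red zone {red}MB, reserve {reserve}MB")

    # Effect of the percentage on the reserve for the 12G quota.
    for percent in (90, 95):
        red, reserve = red_zone_and_reserve(12 * 1024, percent)
        print(f"12G quota at {percent}%: red zone {red}MB, reserve {reserve}MB")
```

Because the reserve is memory_total * (1 - percent / 100), it scales linearly
with the quota, which is why it reaches several gigabytes on large servers.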
> TPC-H Query 5 encounters OOM in large HAWQ cluster with None mode
> -----------------------------------------------------------------
>
> Key: HAWQ-296
> URL: https://issues.apache.org/jira/browse/HAWQ-296
> Project: Apache HAWQ
> Issue Type: Bug
> Components: Resource Manager
> Affects Versions: 2.0.0-beta-incubating
> Reporter: Ruilong Huo
> Assignee: Ruilong Huo
>
> TPC-H Query 5 encounters OOM in large HAWQ cluster with None mode.
> OOM in EXPLAIN ANALYZE of TPC-H Query 5:
> {noformat}
> explain analyze select
> n_name,
> sum(l_extendedprice * (1 - l_discount)) as revenue
> from
> customer,
> orders,
> lineitem,
> supplier,
> nation,
> region
> where
> c_custkey = o_custkey
> and l_orderkey = o_orderkey
> and l_suppkey = s_suppkey
> and c_nationkey = s_nationkey
> and s_nationkey = n_nationkey
> and n_regionkey = r_regionkey
> and r_name = 'ASIA'
> and o_orderdate >= date '1997-01-01'
> and o_orderdate < date '1997-01-01' + interval '1 year'
> group by
> n_name
> order by
> revenue desc;
> 532767 [2015-12-25 23:52:39]
> NOTICE: Canceling query because of high VMEM usage. Used: 11060MB, available
> 1228MB, red zone: 11056MB (runaway_cleaner.c:152) (seg969 test45.ic:40000
> pid=738560)
> {noformat}