GitHub user selvaganesang opened a pull request:
https://github.com/apache/incubator-trafodion/pull/1228
[TRAFODION-2733] Provide an improved memory quota assignment for big
memory operators (BMO)
1) Enabled memory quota per node. The CQD BMO_MEMORY_LIMIT_PER_NODE
(renamed from EXE_MEMORY_LIMIT_PER_CPU) is set to 10240 MB by default.
Old attribute | Old value | Renamed Attribute | New value
-- | -- | -- | --
EXE_MEMORY_LIMIT_PER_CPU | 0 | BMO_MEMORY_LIMIT_PER_NODE | 10240
EXE_MEMORY_LIMIT_LOWER_BOUND_HASHGROUPBY | 10 | BMO_MEMORY_LIMIT_LOWER_BOUND_HASHGROUPBY | 25
EXE_MEMORY_LIMIT_LOWER_BOUND_HASHJOIN | 10 | BMO_MEMORY_LIMIT_LOWER_BOUND_HASHJOIN | 25
EXE_MEMORY_LIMIT_LOWER_BOUND_PROBE_CACHE | 10 | BMO_MEMORY_LIMIT_LOWER_BOUND_PROBE_CACHE | 25
EXE_MEMORY_LIMIT_LOWER_BOUND_SORT | 10 | BMO_MEMORY_LIMIT_LOWER_BOU
2) Changes in EXPLAIN
Estimated memory per node for all BMOs at ROOT operator
Estimated memory per instance for every BMO operator
Memory quota per instance for every BMO operator
3) BMO TDB contains the memory quota per esp instance now.
4) Root TDB now contains the limit per node and estimated memory per node.
This can be used by WMS to change the memory allocation during
runtime without compilation. - Not yet implemented.
5) Added a CQD BMO_MEMORY_LIMIT_UPPER_BOUND to cap the memory
consumed by a BMO in queries with fewer BMOs.
6) The unused memory quota is also yielded to other fragments in the
process.
7) Removed the code to limit the ESPs from being assigned to a fragment
based on the BMO memory quota.
8) Added a new CQD BMO_MEMORY_ESTIMATE_RATIO_CAP to cap the memory
estimate skew of any one BMO operator at 0.7.
9) To disable the memory quota per node, set BMO_MEMORY_LIMIT_PER_NODE to 0.
10) This memory quota is distributed proportionally based on the estimated
memory, taking into consideration the number of BMO instances per operator
and the number of nodes available in the cluster to host these instances.
Hence, this memory quota should be valid for multi-fragment ESPs,
independent of the number of fragments in an ESP.
11) Removed the CQDs EXE_MEMORY_LIMIT_NONBMOS_PERCENT and
EXE_MEMORY_RESERVED_FOR_MXOSRVR_IN_MB.
12) Fixed BMO stats WM to be at least the allocated memory.
13) Changed the sort operator to account for the BMO memory correctly.
14) Introduced a new CQD BMO_MEMORY_EQUAL_QUOTA_SHARE_RATIO
to determine how much of BMO_MEMORY_LIMIT_PER_NODE is divided equally
between BMO operators; the rest is divided according to the memory
estimate. This ratio is set to 0.5 by default. Changing it to 0 divides the
memory quota in proportion to the memory estimate of each BMO operator.
This change is needed to get better BMO quota allocation when the memory
estimate goes awry for certain types of query.
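The split controlled by BMO_MEMORY_EQUAL_QUOTA_SHARE_RATIO and
BMO_MEMORY_ESTIMATE_RATIO_CAP can be illustrated with a small sketch. This is
not the code from the pull request: the function name, the dictionary of
estimates, and the exact formula are assumptions made for illustration, and the
sketch ignores the per-operator lower and upper bounds as well as the number of
BMO instances and nodes.

```python
def split_bmo_quota(limit_mb, estimates_mb, equal_share_ratio=0.5, ratio_cap=0.7):
    """Hypothetical per-operator quota split, in MB (illustration only).

    limit_mb          -- stands in for BMO_MEMORY_LIMIT_PER_NODE
    estimates_mb      -- assumed mapping of operator name to memory estimate
    equal_share_ratio -- stands in for BMO_MEMORY_EQUAL_QUOTA_SHARE_RATIO
    ratio_cap         -- stands in for BMO_MEMORY_ESTIMATE_RATIO_CAP
    """
    if not estimates_mb:
        return {}
    if limit_mb == 0:
        # Per the description, a limit of 0 disables the per-node quota.
        return {name: None for name in estimates_mb}

    total_estimate = sum(estimates_mb.values())
    equal_pool = limit_mb * equal_share_ratio   # divided equally
    estimate_pool = limit_mb - equal_pool       # divided by estimate
    equal_part = equal_pool / len(estimates_mb)

    quotas = {}
    for name, estimate in estimates_mb.items():
        share = estimate / total_estimate if total_estimate > 0 else 0.0
        # Assumed reading of the cap: no operator's share of the
        # estimate-driven pool may exceed ratio_cap.
        share = min(share, ratio_cap)
        quotas[name] = equal_part + estimate_pool * share
    return quotas


if __name__ == "__main__":
    example = split_bmo_quota(10240, {"hash_join": 3000, "sort": 1000})
    for name, quota in example.items():
        print(name, round(quota))
```

Under these assumptions, with the default limit of 10240 MB and estimates of
3000 MB and 1000 MB, each operator first receives 2560 MB from the equal pool.
The hash join's 0.75 share of the remaining 5120 MB is capped at 0.7, while the
sort keeps its 0.25 share. Passing equal_share_ratio=0 makes the split depend
on the estimates alone.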
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/selvaganesang/incubator-trafodion bmo_memory_quota
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-trafodion/pull/1228.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1228
----
commit 00e3b87432feb9d84455ca8d20e9899611fdb7ff
Author: selvaganesang <[email protected]>
Date: 2017-09-01T06:31:57Z
Following changes are done in BMO memory quota
1) Enabled memory quota per node. The CQD BMO_MEMORY_LIMIT_PER_NODE
(renamed from EXE_MEMORY_LIMIT_PER_CPU) is set to 10240 MB by default.
Old attribute                             Old value  Renamed attribute                         New value
EXE_MEMORY_LIMIT_PER_CPU                  0          BMO_MEMORY_LIMIT_PER_NODE                 10240
EXE_MEMORY_LIMIT_LOWER_BOUND_HASHGROUPBY  10         BMO_MEMORY_LIMIT_LOWER_BOUND_HASHGROUPBY  25
EXE_MEMORY_LIMIT_LOWER_BOUND_HASHJOIN     10         BMO_MEMORY_LIMIT_LOWER_BOUND_HASHJOIN     25
EXE_MEMORY_LIMIT_LOWER_BOUND_PROBE_CACHE  10         BMO_MEMORY_LIMIT_LOWER_BOUND_PROBE_CACHE  25
EXE_MEMORY_LIMIT_LOWER_BOUND_SORT         10         BMO_MEMORY_LIMIT_LOWER_BOUND_SORT         200
2) Changes in EXPLAIN
Estimated memory per node for all BMOs at ROOT operator
Estimated memory per instance for every BMO operator
Memory quota per instance for every BMO operator
3) BMO TDB contains the memory quota per esp instance now.
4) Root TDB now contains the limit per node and estimated memory per node.
This can be used by WMS to change the memory allocation during
runtime without compilation. - Not yet implemented.
5) Added a CQD BMO_MEMORY_LIMIT_UPPER_BOUND to cap the memory
consumed by a BMO in queries with fewer BMOs.
6) The unused memory quota is also yielded to other fragments in the
process.
7) Removed the code to limit the ESPs from being assigned to a fragment
based on the BMO memory quota.
8) Added a new CQD BMO_MEMORY_ESTIMATE_RATIO_CAP to cap the memory
estimate skew of any one BMO operator at 0.7.
9) To disable the memory quota per node, set BMO_MEMORY_LIMIT_PER_NODE to 0.
10) This memory quota is distributed proportionally based on the estimated
memory, taking into consideration the number of BMO instances per operator
and the number of nodes available in the cluster to host these instances.
Hence, this memory quota should be valid for multi-fragment ESPs,
independent of the number of fragments in an ESP.
11) Removed the CQDs EXE_MEMORY_LIMIT_NONBMOS_PERCENT and
EXE_MEMORY_RESERVED_FOR_MXOSRVR_IN_MB.
12) Fixed BMO stats WM to be at least the allocated memory.
13) Changed the sort operator to account for the BMO memory correctly.
(cherry picked from commit ba19c04a58890fdd845b03f8d915abdd487b6407)
Conflicts:
core/sqf/src/seatrans/hbase-trx/src/main/java/org/apache/hadoop/hbase/coprocessor/transactional/SplitBalanceHelper.java
core/sql/cli/Context.cpp
core/sql/executor/ex_frag_rt.cpp
core/sql/executor/ex_sort.cpp
core/sql/regress/executor/EXPECTED131
core/sql/regress/executor/EXPECTED140
core/sql/regress/hive/EXPECTED009
core/sql/regress/hive/EXPECTED030
core/sql/regress/hive/FILTER009
core/sql/regress/seabase/EXPECTED010
core/sql/regress/seabase/EXPECTED011
core/sql/regress/seabase/EXPECTED016
core/sql/sqlcomp/DefaultConstants.h
commit 175402bfcec383049f85d11bc237c3d1d6b80e03
Author: selvaganesang <[email protected]>
Date: 2017-09-07T04:35:59Z
Merge branch 'master' of git://git.apache.org/incubator-trafodion into
bmo_memory_quota
Conflicts:
core/sql/cli/Globals.cpp
core/sql/executor/cluster.cpp
core/sql/regress/hive/EXPECTED009
core/sql/regress/seabase/EXPECTED010
core/sql/regress/seabase/EXPECTED011
core/sql/regress/seabase/EXPECTED016
commit f538c2f9d45872a3d5a081032a849563b9d053be
Author: selvaganesang <[email protected]>
Date: 2017-09-08T05:20:05Z
Ensured that CQD ESP_NUM_FRAGMENTS can take a value up to 8.
Removed the environment variable concept to set the number of ESP
fragments. Removed the CQDs ESP_NUM_FRAGMENTS_WITH_QUOTAS and
ESP_MULTI_FRAGMENT_QUOTAS. Use the corresponding CQDs
ESP_NUM_FRAGMENTS and ESP_MULTI_FRAGMENT instead.
(cherry picked from commit 72534e89633f3d4b8a61ab26d471aaa8b7f3e12a)
Conflicts:
core/sql/executor/ex_frag_rt.cpp
core/sql/executor/ex_frag_rt.h
core/sql/generator/GenRelMisc.cpp
commit 7d2c4a85b1ab09702e6bf52e456050aab200e5dd
Author: selvaganesang <[email protected]>
Date: 2017-09-08T05:42:20Z
Provision to tune the BMO memory quota.
Currently, BMO quota is allocated based on the memory estimate of the BMO
operator and the BMO_MEMORY_LIMIT_PER_NODE. Introducing a new CQD
BMO_MEMORY_EQUAL_QUOTA_SHARE_RATIO to determine how much of
BMO_MEMORY_LIMIT_PER_NODE is divided equally between BMO operators; the
rest is divided according to the memory estimate. This ratio is set to 0.5
by default. Changing it to 0 will get the current behavior. This change is
needed to get better BMO quota allocation when the memory estimate goes
awry for certain types of query.
(cherry picked from commit f3dd246b50905f1e83e852f1fb2e556dc4189761)
Conflicts:
core/sql/optimizer/RelExpr.cpp
commit eece8702212df8d32924717e344059c893e4fa5b
Author: selvaganesang <[email protected]>
Date: 2017-09-09T01:19:24Z
[TRAFODION-2733] Provide an improved memory quota assignment for big memory
operators (BMO)
Enabled memory quota per node. The CQD BMO_MEMORY_LIMIT_PER_NODE
(renamed from EXE_MEMORY_LIMIT_PER_CPU) is set to 10240 MB by default.
Old attribute                             Old value  Renamed attribute                         New value
EXE_MEMORY_LIMIT_PER_CPU                  0          BMO_MEMORY_LIMIT_PER_NODE                 10240
EXE_MEMORY_LIMIT_LOWER_BOUND_HASHGROUPBY  10         BMO_MEMORY_LIMIT_LOWER_BOUND_HASHGROUPBY  25
EXE_MEMORY_LIMIT_LOWER_BOUND_HASHJOIN     10         BMO_MEMORY_LIMIT_LOWER_BOUND_HASHJOIN     25
EXE_MEMORY_LIMIT_LOWER_BOUND_PROBE_CACHE  10         BMO_MEMORY_LIMIT_LOWER_BOUND_PROBE_CACHE  25
EXE_MEMORY_LIMIT_LOWER_BOUND_SORT         10         BMO_MEMORY_LIMIT_LOWER_BOUND_SORT         200
Changes in EXPLAIN
Estimated memory per node for all BMOs at ROOT operator
Estimated memory per instance for every BMO operator
Memory quota per instance for every BMO operator
BMO TDB contains the memory quota per esp instance now.
Root TDB now contains the limit per node and estimated memory per node.
This can be used by WMS to change the memory allocation during
runtime without compilation. - Not yet implemented.
Added a CQD BMO_MEMORY_LIMIT_UPPER_BOUND to cap the memory
consumed by a BMO in queries with fewer BMOs.
The unused memory quota is also yielded to other fragments in the
process.
Removed the code to limit the ESPs from being assigned to a fragment
based on the BMO memory quota.
Added a new CQD BMO_MEMORY_ESTIMATE_RATIO_CAP to cap the memory
estimate skew of any one BMO operator at 0.7.
To disable the memory quota per node, set BMO_MEMORY_LIMIT_PER_NODE to 0.
This memory quota is distributed proportionally based on the estimated
memory, taking into consideration the number of BMO instances per operator
and the number of nodes available in the cluster to host these instances.
Hence, this memory quota should be valid for multi-fragment ESPs,
independent of the number of fragments in an ESP.
Removed the CQDs EXE_MEMORY_LIMIT_NONBMOS_PERCENT and
EXE_MEMORY_RESERVED_FOR_MXOSRVR_IN_MB.
Fixed BMO stats WM to be at least the allocated memory.
Changed the sort operator to account for the BMO memory correctly.
----
---