Ezra Zerihun created IMPALA-13075:
-------------------------------------
Summary: Setting very high BATCH_SIZE can blow up memory usage of fragments
Key: IMPALA-13075
URL: https://issues.apache.org/jira/browse/IMPALA-13075
Project: IMPALA
Issue Type: Improvement
Components: Backend
Affects Versions: Impala 4.0.0
Reporter: Ezra Zerihun
In Impala 4.0, setting BATCH_SIZE very high, at or near the maximum of 65536,
can cause a fragment's memory usage to spike far past the query's MEM_LIMIT or
the pool's Maximum Query Memory Limit with Clamp on. So even when MEM_LIMIT is
set to a reasonable value, the query can still fail with an out-of-memory error
and a huge amount of memory used by a single fragment. Reducing BATCH_SIZE to a
reasonable value, or back to the default, lets the query run without issue and
keeps its memory usage within the query's MEM_LIMIT or the pool's Maximum Query
Memory Limit.
1) set BATCH_SIZE=65536; set MEM_LIMIT=1g;
{code:java}
Query State: EXCEPTION
Impala Query State: ERROR
Query Status: Memory limit exceeded: Error occurred on backend ...:27000 by fragment ...
Memory left in process limit: 145.53 GB
Memory left in query limit: -6.80 GB
Query(...): memory limit exceeded. Limit=1.00 GB Reservation=86.44 MB ReservationLimit=819.20 MB OtherMemory=7.71 GB Total=7.80 GB Peak=7.84 GB
  Unclaimed reservations: Reservation=8.50 MB OtherMemory=0 Total=8.50 MB Peak=56.44 MB
  Runtime Filter Bank: Reservation=4.00 MB ReservationLimit=4.00 MB OtherMemory=0 Total=4.00 MB Peak=4.00 MB
  Fragment ...: Reservation=1.94 MB OtherMemory=7.59 GB Total=7.59 GB Peak=7.63 GB
    HASH_JOIN_NODE (id=8): Reservation=1.94 MB OtherMemory=7.57 GB Total=7.57 GB Peak=7.57 GB
      Exprs: Total=7.57 GB Peak=7.57 GB
      Hash Join Builder (join_node_id=8): Total=0 Peak=1.95 MB
...
Query Options (set by configuration):
BATCH_SIZE=65536,MEM_LIMIT=1073741824,CLIENT_IDENTIFIER=Impala Shell v4.0.0.7.2.16.0-287 (5ae3917) built on Mon Jan 9 21:23:59 UTC 2023,DEFAULT_FILE_FORMAT=PARQUET,...
...
ExecSummary:
...
09:AGGREGATE 32 32 0.000ns 0.000ns 0 4.83M 36.31 MB 212.78 MB STREAMING
08:HASH JOIN 32 32 5s149ms 2m44s 0 194.95M 7.57 GB 1.94 MB RIGHT OUTER JOIN, PARTITIONED
|--18:EXCHANGE 32 32 93.750us 1.000ms 10.46K 1.55K 1.65 MB 2.56 MB HASH(...
{code}
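As a sanity check on the error message above, the following minimal Python sketch parses two of the tracker lines and recomputes the "Memory left in query limit" figure as Limit minus Total. The helper name parse_counters and the assumption that the units are binary multiples (1 GB = 1024 MB) are illustrative and not taken from Impala; the counter values are copied from the message.

```python
import re

# Byte multipliers for the units printed in the dump (assumed binary multiples).
UNIT_BYTES = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3}
GB = UNIT_BYTES["GB"]


def parse_counters(line):
    """Return {name: bytes} for each 'Name=<number> [unit]' pair in one tracker line.

    Hypothetical helper for illustration; a bare number such as 'OtherMemory=0'
    is treated as bytes.
    """
    pattern = r"(\w+)=([\d.]+)(?: (B|KB|MB|GB)\b)?"
    return {
        name: float(value) * UNIT_BYTES[unit or "B"]
        for name, value, unit in re.findall(pattern, line)
    }


# Counter values copied from the error message above.
query_line = ("Limit=1.00 GB Reservation=86.44 MB ReservationLimit=819.20 MB "
              "OtherMemory=7.71 GB Total=7.80 GB Peak=7.84 GB")
join_line = "Reservation=1.94 MB OtherMemory=7.57 GB Total=7.57 GB Peak=7.57 GB"

query = parse_counters(query_line)
join = parse_counters(join_line)

# How far the query is over its limit, and how much of that the join accounts for.
left_gb = (query["Limit"] - query["Total"]) / GB
join_share = join["Total"] / query["Total"]

print(f"Memory left in query limit: {left_gb:.2f} GB")
print(f"HASH_JOIN_NODE share of query total: {join_share:.0%}")
```

Under these assumptions the recomputed shortfall agrees with the reported -6.80 GB, and HASH_JOIN_NODE (id=8) accounts for nearly all of the query's total, almost entirely as OtherMemory rather than Reservation.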
2) set BATCH_SIZE=0; set MEM_LIMIT=1g;
{code:java}
Query State: FINISHED
Impala Query State: FINISHED
...
Query Options (set by configuration and planner):
MEM_LIMIT=1073741824,CLIENT_IDENTIFIER=Impala Shell v4.0.0.7.2.16.0-287 (5ae3917) built on Mon Jan 9 21:23:59 UTC 2023,DEFAULT_FILE_FORMAT=PARQUET,...
...
ExecSummary:
...
09:AGGREGATE 32 32 593.748us 18.999ms 45 4.83M 34.06 MB 212.78 MB STREAMING
08:HASH JOIN 32 32 10s873ms 5m47s 10.47K 194.95M 123.48 MB 1.94 MB RIGHT OUTER JOIN, PARTITIONED
|--18:EXCHANGE 32 32 0.000ns 0.000ns 10.46K 1.55K 344.00 KB 1.69 MB HASH(...
{code}
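To put the two runs side by side, here is a small Python sketch that compares the peak memory of 08:HASH JOIN with the batch size used in each run. It assumes that BATCH_SIZE=0 resolves to a default of 1024 rows and that 1 GB = 1024 MB; the peak values are copied from the two ExecSummary excerpts. Two data points can only suggest a trend, so treat the result as a hint, not a measurement.

```python
# Assumption: BATCH_SIZE=0 resolves to a default batch size of 1024 rows.
DEFAULT_BATCH_SIZE = 1024
MB_PER_GB = 1024  # assumed binary units

# Peak memory of 08:HASH JOIN in each run, copied from the ExecSummary excerpts.
runs = {
    "high": {"batch_size": 65536, "join_peak_mb": 7.57 * MB_PER_GB},
    "default": {"batch_size": DEFAULT_BATCH_SIZE, "join_peak_mb": 123.48},
}

batch_ratio = runs["high"]["batch_size"] / runs["default"]["batch_size"]
mem_ratio = runs["high"]["join_peak_mb"] / runs["default"]["join_peak_mb"]

print(f"batch size ratio:       {batch_ratio:.0f}x")
print(f"join peak memory ratio: {mem_ratio:.1f}x")
```

If the default assumption holds, the join's peak memory grows by nearly the same factor as the batch size, which would be consistent with the join's expression memory scaling roughly linearly with BATCH_SIZE.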