Matthew Jacobs created IMPALA-5838:
--------------------------------------

             Summary: Suggested MEM_LIMIT in rejected query error may be too low
                 Key: IMPALA-5838
                 URL: https://issues.apache.org/jira/browse/IMPALA-5838
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 2.10.0
            Reporter: Matthew Jacobs


If you set a small mem_limit, the rejection error may suggest setting mem_limit 
to at least 79.75 MB. If you then rerun the query with mem_limit=80mb, it is 
rejected again, this time with a suggestion of 92.00 MB.

{code}
[philip-dev.gce.cloudera.com:21000] > set mem_limit=1024;
MEM_LIMIT set to 1024
[philip-dev.gce.cloudera.com:21000] > select count(*) from functional.alltypes;
Query: select count(*) from functional.alltypes
Query submitted at: 2017-08-24 08:36:37 (Coordinator: 
http://philip-dev.gce.cloudera.com:25000)
ERROR: Memory limit exceeded: Could not allocate aggregate expression 
intermediate value
Exprs could not allocate 16.00 B without exceeding limit.
Error occurred on backend philip-dev.gce.cloudera.com:22000 by fragment 
704f7351ee03f7f1:5ed9277d00000003
Memory left in process limit: 31.34 GB
Memory left in query limit: 1.00 KB
Query(704f7351ee03f7f1:5ed9277d00000000): Limit=1.00 KB Reservation=0 
ReservationLimit=0 OtherMemory=0 Total=0 Peak=0
  Fragment 704f7351ee03f7f1:5ed9277d00000000: Reservation=0 OtherMemory=0 
Total=0 Peak=0
    AGGREGATION_NODE (id=3): Total=0 Peak=0
    EXCHANGE_NODE (id=2): Total=0 Peak=0
    DataStreamRecvr: Total=0 Peak=0
  Fragment 704f7351ee03f7f1:5ed9277d00000003: Reservation=0 OtherMemory=0 
Total=0 Peak=0
    AGGREGATION_NODE (id=1): Total=0 Peak=0
    HDFS_SCAN_NODE (id=0): Total=0 Peak=0

[philip-dev.gce.cloudera.com:21000] >       select count(*)
      from tpch_parquet.lineitem join tpch_parquet.orders on l_orderkey = 
o_orderkey;
Query: select count(*)
      from tpch_parquet.lineitem join tpch_parquet.orders on l_orderkey = 
o_orderkey
Query submitted at: 2017-08-24 08:36:46 (Coordinator: 
http://philip-dev.gce.cloudera.com:25000)
ERROR: Rejected query from pool default-pool: minimum memory reservation is 
greater than memory available to the query for buffer reservations. Mem 
available for buffer reservations based on mem_limit: 1.00 KB, memory 
reservation needed: 4.75 MB. Set mem_limit to at least 79.75 MB. See the query 
profile for more information.

[philip-dev.gce.cloudera.com:21000] > set mem_limit=80mb;
MEM_LIMIT set to 80mb
[philip-dev.gce.cloudera.com:21000] >       select count(*)
      from tpch_parquet.lineitem join tpch_parquet.orders on l_orderkey = 
o_orderkey;
Query: select count(*)
      from tpch_parquet.lineitem join tpch_parquet.orders on l_orderkey = 
o_orderkey
Query submitted at: 2017-08-24 08:37:01 (Coordinator: 
http://philip-dev.gce.cloudera.com:25000)
ERROR: Rejected query from pool default-pool: minimum memory reservation is 
greater than memory available to the query for buffer reservations. Mem 
available for buffer reservations based on mem_limit: 80.00 MB, memory 
reservation needed: 17.00 MB. Set mem_limit to at least 92.00 MB. See the query 
profile for more information.
{code}

The join switches from broadcast to partitioned based on the mem_limit. That 
changes the per-node bytes estimate, which changes the buffer size chosen by 
the planner, which in turn changes the reservation. The relevant code is:
https://github.com/apache/incubator-impala/blob/c7db60aa46565c19634e8a791df3af8d116b9017/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java#L539

The good news is that the suggestion converges once the mem_limit is bumped to 
the recommended value the second time. The behavior is still confusing, since 
the first suggested value is too low.
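
The feedback loop can be sketched with a toy model. This is not Impala's 
planner or admission logic, only an illustration built from the numbers in the 
session above. In both error messages the suggested mem_limit is the needed 
reservation plus 75 MB (79.75 - 4.75 and 92.00 - 17.00), so the sketch assumes 
a fixed 75 MB headroom. The switch threshold, the admission check, and all 
function names are hypothetical.

```python
MB = 1024 * 1024

# Inferred from the two error messages: 79.75 - 4.75 == 92.00 - 17.00 == 75 MB.
ASSUMED_HEADROOM = 75 * MB

# Hypothetical cut-over point. The real planner decides per query from its
# cost estimates; we only know 1 KB and 80 MB fall on opposite sides.
ASSUMED_SWITCH_THRESHOLD = int(79.75 * MB)


def min_reservation(mem_limit: int) -> int:
    """Toy stand-in for the planner: the plan, and so the minimum
    reservation, depends on the mem_limit the query was planned with."""
    if mem_limit < ASSUMED_SWITCH_THRESHOLD:
        return int(4.75 * MB)  # reservation reported with mem_limit=1024
    return 17 * MB             # reservation reported with mem_limit=80mb


def suggested_mem_limit(mem_limit: int) -> int:
    """Suggestion computed from the plan chosen under the current mem_limit."""
    return min_reservation(mem_limit) + ASSUMED_HEADROOM


def bumps_until_admitted(mem_limit: int, max_rounds: int = 10) -> list:
    """Follow each suggestion in turn; return the suggestions that were needed."""
    history = []
    for _ in range(max_rounds):
        suggestion = suggested_mem_limit(mem_limit)
        if mem_limit >= suggestion:  # assumed admission condition
            return history
        history.append(suggestion)
        mem_limit = suggestion
    raise RuntimeError("suggestion did not converge")


if __name__ == "__main__":
    for s in bumps_until_admitted(1024):
        print(f"Set mem_limit to at least {s / MB:.2f} MB")
```

Under these assumptions, starting from mem_limit=1024 produces two suggestions 
(79.75 MB, then 92.00 MB) before the limit is large enough. The first 
suggestion is computed for a plan that is no longer chosen once the limit is 
raised, which is why it undershoots.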



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
