Michael Ho resolved IMPALA-6116.
       Resolution: Fixed
    Fix Version/s: Impala 2.12.0
                   Impala 3.0


IMPALA-6116: Bound memory usage of DataStreamService's service queue

The fix for IMPALA-6193 added a memory tracker for the memory consumed
by the payloads in the service queue of DataStreamService. This change
extends it by introducing a bound on that service queue's memory usage.
In addition, it deprecates FLAGS_datastream_service_queue_depth and
replaces it with FLAGS_datastream_service_queue_mem_limit. These flags
take effect only when KRPC is in use; since KRPC was never enabled in
any previous release, this flag replacement seems safe. The new flag
FLAGS_datastream_service_queue_mem_limit directly dictates the amount
of memory which can be consumed by the service queue of
DataStreamService. This allows more direct control over the queue's
memory usage instead of inferring it from the number of entries in the
queue. The default value of this flag is 0, in which case it is set to
5% of the process memory limit.
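As a rough illustration of the default described above, the effective queue memory budget could be derived as follows. This is a hedged sketch: the function name and signature are hypothetical, not Impala's actual identifiers.

```cpp
#include <cstdint>

// Hypothetical sketch of the flag semantics described above: a flag value
// of 0 means "use 5% of the process memory limit"; any positive value is
// taken as an explicit byte budget. Not the actual Impala implementation.
int64_t EffectiveQueueMemLimit(int64_t flag_bytes, int64_t process_mem_limit) {
  if (flag_bytes == 0) return process_mem_limit / 20;  // 5% of process limit
  return flag_bytes;
}
```

For example, with the default flag value of 0 and a 100 GiB process memory limit, the queue would be budgeted 5 GiB.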

Testing done: exhaustive debug builds. Updated data-stream-test to
exercise the case in which the payload is larger than the limit.
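The queue bound can be sketched as a memory-tracked admission check. This is an illustrative model only, not Impala's actual code; in particular, the assumption that an oversized payload is still admitted when the queue is empty (so a single large row batch cannot be rejected forever) is mine, and the behavior actually exercised by data-stream-test may differ.

```cpp
#include <cstdint>
#include <queue>

// Illustrative sketch of a service queue that admits a payload only while
// the tracked memory consumption stays within the configured budget.
// Class and method names are hypothetical.
class MemBoundedQueue {
 public:
  explicit MemBoundedQueue(int64_t mem_limit) : mem_limit_(mem_limit) {}

  // Returns false when enqueueing would exceed the budget, signalling the
  // caller to reject the RPC so the sender retries later. An oversized
  // payload is admitted when the queue is empty (assumption, see above).
  bool TryEnqueue(int64_t payload_bytes) {
    if (consumed_ + payload_bytes > mem_limit_ && !sizes_.empty()) return false;
    sizes_.push(payload_bytes);
    consumed_ += payload_bytes;
    return true;
  }

  // Release the oldest payload's memory after it is processed.
  void Dequeue() {
    consumed_ -= sizes_.front();
    sizes_.pop();
  }

  int64_t consumed() const { return consumed_; }

 private:
  int64_t mem_limit_;
  int64_t consumed_ = 0;
  std::queue<int64_t> sizes_;  // payload sizes in arrival order
};
```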

Change-Id: Idea4262dfb0e0aa8d58ff6ea6a8aaaa248e880b9
Reviewed-on: http://gerrit.cloudera.org:8080/9282
 Reviewed-by: Michael Ho <k...@cloudera.com>
 Tested-by: Impala Public Jenkins

> Bound memory usage of KRPC service queue
> ----------------------------------------
>                 Key: IMPALA-6116
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6116
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Distributed Exec
>    Affects Versions: Impala 2.11.0
>            Reporter: Michael Ho
>            Assignee: Michael Ho
>            Priority: Major
>             Fix For: Impala 3.0, Impala 2.12.0
> During testing with KRPC, it was observed that the number of entries in the 
> service queue needs to be raised to a rather large value (e.g. 10,000) in 
> order to avoid excessive RPC retries due to the service queue being full when 
> hosts are under load in a cluster of modest size (100~150 nodes).
> In the short run, we may have to use a rather large default value for the 
> service queue's size. We can consider sizing it based on the cluster's size 
> at startup (if at all possible). This, of course, is not ideal, as the service 
> queue may consume a non-trivial amount of untracked memory (each entry 
> contains a compressed row batch).
> In the long run, we may need to implement flow control (IMPALA-5859) in 
> Impala to avoid re-sending a row batch without checking whether the receiver 
> has space for it. Moreover, we may need to reconsider the role of the service 
> queue: should the actual queuing of incoming row batches be attached to a 
> per-exchange-node queue, with a separate deserialization thread pool that 
> pulls from this queue? That way, we could account for the memory consumption 
> and size the queue based on the number of senders and the memory budget of a 
> query. One hurdle is the undesirable cross-thread allocation pattern: the 
> rpc_context is allocated from service threads but freed by the 
> deserialization thread. Other ideas are welcome.
> cc'ing [~mmokhtar]
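The per-exchange-node queue idea floated above could be modeled roughly as follows. This is a deliberately simplified, single-threaded sketch of a design that was only proposed, not implemented; in the actual proposal the drain step would be performed by a deserialization thread pool, and all names here are hypothetical.

```cpp
#include <cstdint>
#include <deque>
#include <map>
#include <vector>

// Placeholder for a serialized, compressed row batch parked on the queue.
struct SerializedBatch { int64_t bytes; };

// Hypothetical sketch: incoming row batches are parked on a queue owned by
// their exchange node, and a separate step (a deserialization thread pool
// in the proposal) drains and deserializes them. Illustrative only.
class PerExchangeQueues {
 public:
  // Service thread side: park the batch on its exchange node's queue.
  void Enqueue(int exchange_id, SerializedBatch batch) {
    queues_[exchange_id].push_back(batch);
  }

  // Deserialization worker side: drain at most one batch per exchange per
  // pass, recording which exchanges were serviced. Returns batches processed.
  int DrainOnePass(std::vector<int>* serviced_ids) {
    int processed = 0;
    for (auto& [id, q] : queues_) {
      if (q.empty()) continue;
      q.pop_front();  // stands in for deserializing the batch
      serviced_ids->push_back(id);
      ++processed;
    }
    return processed;
  }

 private:
  std::map<int, std::deque<SerializedBatch>> queues_;
};
```

Because each queue belongs to one exchange node, its memory could be charged to that node's tracker and sized from the number of senders and the query's memory budget, as the description suggests.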

This message was sent by Atlassian JIRA
