[jira] [Commented] (IMPALA-14661) Optimize admissiond memory usage by compressing RPC payloads
[ https://issues.apache.org/jira/browse/IMPALA-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18075728#comment-18075728 ]

ASF subversion and git services commented on IMPALA-14661:
----------------------------------------------------------

Commit 1a73414d729e4ab8e519444aac7bfd6b3019e60a in impala's branch refs/heads/master from Yida Wu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1a73414d7 ]

IMPALA-14763: Prevent admissiond OOM during request decompression

When admissiond is close to its memory limit and a very large queued
query is dequeued, decompressing the compressed request can push memory
usage over the limit and cause an OOM.

Previously, IMPALA-14493 added a memory check against TCMalloc's
BYTES_IN_USE to safeguard memory on Submission for uncompressed
requests. After IMPALA-14661, the decompression cases must be
considered as well.

This patch adds a memory safeguard for compressed requests, whose
decompression happens mainly on Submission or Dequeue. All the
rejection logic moves into a static function,
RejectForAdmissionServiceMemory(), and a new memory tracker,
pending_decompression_mem_tracker, tracks the total uncompressed size
of pending compressed requests.

RejectForAdmissionServiceMemory() compares the current tcmalloc
bytes-in-use plus the additional memory to reserve against the process
memory limit. For compressed requests, we first add the request's
uncompressed size to pending_decompression_mem_tracker, then pass the
total pending uncompressed size as the additional reserved memory to
RejectForAdmissionServiceMemory(), ensuring thread safety. For
uncompressed requests, the additional memory is zero. If the check
fails, RejectForAdmissionServiceMemory() returns an error and
admissiond rejects the query.

Additionally, to prevent early decompression of queued compressed
requests when GetQueryStatus() is called, we now cache the
TQueryOptions inside AdmissionExecRequestCompressed during the first
decompression. WaitOnQueued() uses these cached options for the
AC_AFTER_ADMISSION_OUTCOME debug action instead of decompressing the
whole request again.

Testing:
Added a new test to check that compressed requests are rejected on
Submission.
Manually verified that the safeguard also works at Dequeue; an
automated test for the Dequeue case was too flaky to include.
Passed the exhaustive test test_admission_controller.py.

Change-Id: I196455f445f0644d89467a23b4ec1f64f184f2db
Reviewed-on: http://gerrit.cloudera.org:8080/24055
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins

> Optimize admissiond memory usage by compressing RPC payloads
> ------------------------------------------------------------
>
>                 Key: IMPALA-14661
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14661
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>   Affects Versions: Impala 4.5.0
>            Reporter: Yida Wu
>            Assignee: Yida Wu
>            Priority: Critical
>             Fix For: Impala 5.0.0
>
> In global admissiond mode, the TQueryExecRequest in the admission request can
> be very large for complex queries, consuming significant memory in the
> admissiond service.
> This task is to optimize memory usage by compressing the request payload on
> the client side (coordinator) before sending. On the server side
> (admissiond), we store the compressed version in memory and only decompress
> it just-in-time when the admission decision is actually being made. This can
> greatly reduce the memory footprint for queued queries.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
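
For readers following the IMPALA-14763 commit message above, the reservation-then-check order it describes can be sketched roughly as follows. This is a Python illustration, not Impala's C++ code: the class PendingDecompressionTracker, the helper try_reserve_for_decompression, the strict ">" comparison, the error text, and the choice to release the reservation on rejection are all assumptions made for the sketch. Only the names RejectForAdmissionServiceMemory() and pending_decompression_mem_tracker come from the commit message.

```python
import threading
from typing import Optional


class PendingDecompressionTracker:
    """Hypothetical stand-in for pending_decompression_mem_tracker."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending_bytes = 0

    def consume(self, num_bytes: int) -> int:
        # Add the reservation and return the new total in one locked step,
        # so concurrent callers each see a total that includes their own bytes.
        with self._lock:
            self._pending_bytes += num_bytes
            return self._pending_bytes

    def release(self, num_bytes: int) -> None:
        with self._lock:
            self._pending_bytes -= num_bytes

    @property
    def pending_bytes(self) -> int:
        with self._lock:
            return self._pending_bytes


def reject_for_admission_service_memory(
    bytes_in_use: int, additional_bytes: int, process_mem_limit: int
) -> Optional[str]:
    """Return an error string if the projected usage exceeds the limit."""
    projected = bytes_in_use + additional_bytes
    if projected > process_mem_limit:  # assumption: strict comparison
        return (
            f"projected usage {projected} exceeds "
            f"process memory limit {process_mem_limit}"
        )
    return None


def try_reserve_for_decompression(
    tracker: PendingDecompressionTracker,
    uncompressed_size: int,
    bytes_in_use: int,
    process_mem_limit: int,
) -> Optional[str]:
    """Reserve first, then check against the total pending size."""
    total_pending = tracker.consume(uncompressed_size)
    error = reject_for_admission_service_memory(
        bytes_in_use, total_pending, process_mem_limit
    )
    if error is not None:
        # Assumption: a rejected request gives its reservation back.
        tracker.release(uncompressed_size)
    return error
```

In this sketch, an uncompressed request would call reject_for_admission_service_memory() directly with additional_bytes set to 0, matching the commit message. When a successful reservation is released (for example, once decompression finishes and the memory shows up in bytes-in-use) is not stated in the message and is left to the caller here.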
[jira] [Commented] (IMPALA-14661) Optimize admissiond memory usage by compressing RPC payloads
[ https://issues.apache.org/jira/browse/IMPALA-14661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18054406#comment-18054406 ]

ASF subversion and git services commented on IMPALA-14661:
----------------------------------------------------------

Commit 543f0206eb68429c5bc7d69f2623addf71f57f7b in impala's branch refs/heads/master from Yida Wu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=543f0206e ]

IMPALA-14661: Optimize admissiond memory usage by compressing exec requests

In global admissiond, the TQueryExecRequest can be very large for
complex queries, consuming a large amount of memory while queries are
queued.

This patch adds support for compressing TQueryExecRequest when sending
it to the admission control service through the AdmitQuery RPC. This
reduces memory usage in admissiond for large query execution requests.

Compression is controlled by the new startup flag
admission_control_rpc_compress_threshold_bytes. A value of 0 disables
compression, while positive values enable compression for requests
larger than the threshold. The uncompressed path remains unchanged.

Adds a new TQueryExecRequestCompressed thrift struct along with
compression and decompression helper functions. The admission
controller now handles both compressed and uncompressed requests
through a common AdmissionExecRequest abstraction. Compressed requests
are decompressed lazily and cached to reduce decompression overhead.

Decompression timing is carefully controlled. Requests are initially
decompressed at submission, but if a request is queued, the
decompressed request cache is released to reduce memory usage. When a
queued request is later dequeued, it is decompressed again and the
decompressed cache is retained. Since admission uses FIFO ordering, a
dequeued request is expected to be at the head of the queue and may be
accessed multiple times if not admitted immediately. Retaining the
cache in this case avoids repeated decompression.

This patch also removes the query_options field in AdmissionRequest to
eliminate ambiguity between TExecRequest.query_options and the query
options nested in TQueryExecRequest. ClientRequestState is updated to
sync the top-level TExecRequest.query_options into the nested
TQueryExecRequest before admission. As a result, the admission
controller now reads query options from TQueryExecRequest, enforcing a
single source of truth for admission logic.

Adds admissiond metrics to track compressed size, uncompressed size,
and compression ratio for query execution requests.

Testing:
Adds unit tests for Thrift compression and decompression helpers.
Updates admission controller tests to cover compressed requests.
Passed exhaustive tests.

Change-Id: I5a676d1a806451cbf84b0a3f8a706d7c6655e12d
Reviewed-on: http://gerrit.cloudera.org:8080/23852
Tested-by: Impala Public Jenkins
Reviewed-by: Yida Wu

> Optimize admissiond memory usage by compressing RPC payloads
> ------------------------------------------------------------
>
>                 Key: IMPALA-14661
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14661
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Backend
>   Affects Versions: Impala 4.5.0
>            Reporter: Yida Wu
>            Assignee: Yida Wu
>            Priority: Critical
>
> In global admissiond mode, the TQueryExecRequest in the admission request can
> be very large for complex queries, consuming significant memory in the
> admissiond service.
> This task is to optimize memory usage by compressing the request payload on
> the client side (coordinator) before sending. On the server side
> (admissiond), we store the compressed version in memory and only decompress
> it just-in-time when the admission decision is actually being made. This can
> greatly reduce the memory footprint for queued queries.
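
To illustrate the threshold rule and the lazy, cached decompression described in the IMPALA-14661 commit message above, here is a small Python sketch. It is not Impala's implementation: zlib stands in for whatever codec Impala actually uses, raw bytes stand in for a serialized TQueryExecRequest, and the names maybe_compress, LazyCompressedRequest, get, and release_cache are hypothetical. Only the threshold semantics (0 disables, positive values compress requests larger than the threshold) and the release-on-queue, retain-on-dequeue behaviour come from the commit message.

```python
import zlib
from typing import Optional, Tuple


def maybe_compress(payload: bytes, threshold_bytes: int) -> Tuple[bool, bytes]:
    """Compress only when the threshold is positive and the payload exceeds it.

    Mirrors the described semantics of
    admission_control_rpc_compress_threshold_bytes; zlib is an assumption.
    """
    if threshold_bytes > 0 and len(payload) > threshold_bytes:
        return True, zlib.compress(payload)
    return False, payload


class LazyCompressedRequest:
    """Holds the compressed bytes and decompresses on first access."""

    def __init__(self, compressed: bytes) -> None:
        self._compressed = compressed
        self._cache: Optional[bytes] = None
        self.decompress_count = 0  # for illustration only

    def get(self) -> bytes:
        # Decompress lazily and keep the result until it is released.
        if self._cache is None:
            self._cache = zlib.decompress(self._compressed)
            self.decompress_count += 1
        return self._cache

    def release_cache(self) -> None:
        # Called when the request is queued, so that only the compressed
        # form stays in memory while it waits.
        self._cache = None


# Usage: submission decompresses once, queueing drops the cache,
# and dequeueing decompresses again and keeps the result.
payload = b"plan-node " * 2000
is_compressed, blob = maybe_compress(payload, threshold_bytes=1024)
request = LazyCompressedRequest(blob)
request.get()            # decompressed at submission
request.release_cache()  # request was queued
request.get()            # decompressed again at dequeue
request.get()            # served from the retained cache
```

The second get() after dequeue does not decompress again, which is the saving the commit message attributes to retaining the cache for a request at the head of the FIFO queue.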
