[
https://issues.apache.org/jira/browse/KAFKA-20651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ibenchhida updated KAFKA-20651:
-------------------------------
Description:
{{StandardAuthorizerData.findAclRule}} and {{checkSection}} perform linear
scans over the global ACL set ({{aclsByResource}}) for each authorization
request.
With large ACL datasets (e.g. ~7000 ACLs), this results in excessive CPU usage
due to repeated evaluation of irrelevant ACLs for a given principal.
The current design filters ACLs by principal only at evaluation time
({{findResult}}), instead of narrowing the search space earlier.
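
To illustrate what narrowing the search space earlier could look like, here is a minimal sketch of a principal-level index. It is not the attached patch and not Kafka's actual data structure: the {{Acl}} class, {{PrincipalIndexSketch}}, and its methods are hypothetical stand-ins, resources are matched by exact name only, and the sketch assumes "User:*" as the wildcard principal and that a matching DENY overrides any ALLOW.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PrincipalIndexSketch {
    /** Hypothetical, simplified ACL entry; not Kafka's StandardAcl. */
    public static final class Acl {
        final String principal;     // e.g. "User:alice"
        final String resourceName;  // exact-match only in this sketch
        final String operation;     // e.g. "READ"
        final boolean allow;        // true = ALLOW, false = DENY

        public Acl(String principal, String resourceName, String operation, boolean allow) {
            this.principal = principal;
            this.resourceName = resourceName;
            this.operation = operation;
            this.allow = allow;
        }
    }

    // Assumption: ACLs bound to this principal apply to every user.
    static final String WILDCARD_PRINCIPAL = "User:*";

    // Index from principal string to the ACLs that name that principal.
    private final Map<String, List<Acl>> aclsByPrincipal = new HashMap<>();

    public void add(Acl acl) {
        aclsByPrincipal.computeIfAbsent(acl.principal, k -> new ArrayList<>()).add(acl);
    }

    /** Returns only the ACLs that can apply to the principal: its own plus wildcard ACLs. */
    public List<Acl> candidates(String principal) {
        List<Acl> result = new ArrayList<>(
            aclsByPrincipal.getOrDefault(principal, Collections.emptyList()));
        if (!WILDCARD_PRINCIPAL.equals(principal)) {
            result.addAll(aclsByPrincipal.getOrDefault(WILDCARD_PRINCIPAL, Collections.emptyList()));
        }
        return result;
    }

    /** Scans the narrowed candidate list; a matching DENY overrides any ALLOW. */
    public boolean isAllowed(String principal, String resourceName, String operation) {
        boolean allowed = false;
        for (Acl acl : candidates(principal)) {
            if (acl.resourceName.equals(resourceName) && acl.operation.equals(operation)) {
                if (!acl.allow) {
                    return false;
                }
                allowed = true;
            }
        }
        return allowed;
    }
}
```

With an index of this shape, the per-request cost depends on the number of ACLs bound to the requesting principal (plus wildcard ACLs) rather than on the total ACL count. A real change would still need to handle prefixed and wildcard resource patterns, host matching, and concurrent updates, all of which this sketch leaves out.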
was:
In KRaft mode, StandardAuthorizerData.findResult() calls acl.kafkaPrincipal()
for every ACL visited during authorization. Despite being called repeatedly
for the same principal strings (e.g., "User:alice"), kafkaPrincipal() parses
the principal string from scratch on each invocation:
    public KafkaPrincipal kafkaPrincipal() {
        int colonIndex = principal.indexOf(":");
        String principalType = principal.substring(0, colonIndex);  // alloc 1
        String principalName = principal.substring(colonIndex + 1); // alloc 2
        return new KafkaPrincipal(principalType, principalName);    // alloc 3
    }
With a large number of ACLs and repeated authorization requests, this
generates millions of transient String + KafkaPrincipal allocations,
creating unnecessary CPU and GC pressure.
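
One way to avoid the repeated parsing described above is to cache the parsed result per principal string. The following is only a sketch under stated assumptions: {{PrincipalCacheSketch}} and its nested {{Principal}} class are hypothetical stand-ins (the real type is {{KafkaPrincipal}}), and the cache is unbounded, which is only reasonable if the set of distinct principal strings is small.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PrincipalCacheSketch {
    /** Hypothetical stand-in for KafkaPrincipal, kept local so the sketch is self-contained. */
    public static final class Principal {
        public final String type;
        public final String name;

        Principal(String type, String name) {
            this.type = type;
            this.name = name;
        }
    }

    // Assumption: the number of distinct principal strings is small enough
    // that an unbounded cache is acceptable.
    private static final Map<String, Principal> CACHE = new ConcurrentHashMap<>();

    /** Parses "type:name" once per distinct string and reuses the parsed object afterwards. */
    public static Principal parse(String principal) {
        return CACHE.computeIfAbsent(principal, p -> {
            int colonIndex = p.indexOf(':');
            if (colonIndex < 0) {
                throw new IllegalArgumentException("Expected <type>:<name>, got: " + p);
            }
            return new Principal(p.substring(0, colonIndex), p.substring(colonIndex + 1));
        });
    }
}
```

Repeated calls with the same string then return the same object without allocating new substrings. An alternative with the same effect would be to parse once when the ACL object is constructed and store the result in a field.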
> High CPU usage in StandardAuthorizerData.findAclRule due to O(N) ACL scanning
> and lack of principal-level indexing
> ------------------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-20651
> URL: https://issues.apache.org/jira/browse/KAFKA-20651
> Project: Kafka
> Issue Type: Improvement
> Components: core
> Affects Versions: 3.9.2
> Environment: KRaft clusters using StandardAuthorizer (3.4.0+)
> Reporter: ibenchhida
> Priority: Critical
> Labels: authorization, performance
> Attachments: KAFKA-20651.patch