[ 
https://issues.apache.org/jira/browse/KAFKA-20651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ibenchhida updated KAFKA-20651:
-------------------------------
    Description: 
In KRaft mode, StandardAuthorizerData.findResult() calls acl.kafkaPrincipal()
for every ACL visited during authorization. Despite being called repeatedly
for the same principal strings (e.g., "User:alice"), kafkaPrincipal() parses
the principal string from scratch on each invocation:

    public KafkaPrincipal kafkaPrincipal() {
        int colonIndex = principal.indexOf(":");
        String principalType = principal.substring(0, colonIndex);  // alloc 1
        String principalName = principal.substring(colonIndex + 1); // alloc 2
        return new KafkaPrincipal(principalType, principalName);     // alloc 3
    }

With a large number of ACLs and repeated authorization requests, this
generates millions of transient String + KafkaPrincipal allocations,
creating unnecessary CPU and GC pressure.
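
One possible shape for the fix named in the issue title is to parse the principal string once, when the ACL object is constructed, and return the stored instance from kafkaPrincipal(). The sketch below is illustrative only and is not the actual patch: Principal and CachedPrincipalAcl are hypothetical stand-ins for KafkaPrincipal and StandardAcl, reduced to the fields needed to show the idea.

```java
import java.util.Objects;

// Hypothetical stand-in for KafkaPrincipal (assumption: an immutable type/name pair).
final class Principal {
    private final String type;
    private final String name;

    Principal(String type, String name) {
        this.type = Objects.requireNonNull(type);
        this.name = Objects.requireNonNull(name);
    }

    String type() { return type; }
    String name() { return name; }
}

// Hypothetical, simplified stand-in for StandardAcl that keeps only the principal.
final class CachedPrincipalAcl {
    private final String principal;
    // Parsed once in the constructor and reused by every kafkaPrincipal() call.
    private final Principal parsed;

    CachedPrincipalAcl(String principal) {
        this.principal = Objects.requireNonNull(principal);
        int colonIndex = principal.indexOf(':');
        if (colonIndex < 0) {
            // Assumption: a malformed principal is rejected at construction time.
            throw new IllegalArgumentException("No colon in principal: " + principal);
        }
        this.parsed = new Principal(
            principal.substring(0, colonIndex),
            principal.substring(colonIndex + 1));
    }

    String principal() { return principal; }

    // No parsing and no allocation on the authorization path.
    Principal kafkaPrincipal() { return parsed; }
}
```

With this layout the two substring() calls and the principal object are paid once per ACL rather than once per ACL per authorization check. The trade-off is one extra reference field per ACL, and parse failures surface when the ACL is created instead of when it is first evaluated.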

  was:
*StandardAuthorizerData.checkSection()* can enter an infinite loop when 
iterating over ACLs, causing request handler threads to spin at 100% CPU 
indefinitely. The broker becomes unresponsive (metadata timeouts, file 
descriptor leaks, memory growth) because the handler thread never returns.

*Root Cause*

The loop in checkSection iterates over a NavigableSet.tailSet(exemplar, true) 
of sorted ACLs. It narrows the search range on each iteration by computing a 
common prefix length (matchesUpTo) between the queried resource name and the 
current ACL's resource name, then creating a new exemplar with that shortened 
prefix.

_The bug:_ When matchesUpTo equals the length of exemplar.resourceName() (i.e., 
the queried resource is a prefix of the ACL resource name, e.g. queried 
"foobar" vs ACL "foobar-A"), newPrefix = exemplar.resourceName().substring(0, 
matchesUpTo) produces the same string as the original exemplar. The subsequent 
tailSet(exemplar, true) restarts from the same exemplar, and the first ACL in 
the iterator is the same one just processed, so the loop never terminates.


> StandardAuthorizer: cache KafkaPrincipal in StandardAcl to eliminate 
> allocation hotspot
> ---------------------------------------------------------------------------------------
>
>                 Key: KAFKA-20651
>                 URL: https://issues.apache.org/jira/browse/KAFKA-20651
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 3.9.2
>         Environment:  KRaft-based clusters using StandardAuthorizer (3.7.x, 
> 3.8.x, 3.9.x, 4.0.x — all versions with the current checkSection 
> implementation)
>            Reporter: ibenchhida
>            Priority: Critical
>
> In KRaft mode, StandardAuthorizerData.findResult() calls acl.kafkaPrincipal()
> for every ACL visited during authorization. Despite being called repeatedly
> for the same principal strings (e.g., "User:alice"), kafkaPrincipal() parses
> the principal string from scratch on each invocation:
>     public KafkaPrincipal kafkaPrincipal() {
>         int colonIndex = principal.indexOf(":");
>         String principalType = principal.substring(0, colonIndex);  // alloc 1
>         String principalName = principal.substring(colonIndex + 1); // alloc 2
>         return new KafkaPrincipal(principalType, principalName);     // alloc 3
>     }
> With a large number of ACLs and repeated authorization requests, this
> generates millions of transient String + KafkaPrincipal allocations,
> creating unnecessary CPU and GC pressure.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
