[ 
https://issues.apache.org/jira/browse/KAFKA-9685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiao Zhang updated KAFKA-9685:
------------------------------
    Description: 
In version 1.1, 
[https://github.com/apache/kafka/blob/71b1e19fc60b5e1f9bba33025737ec2b7fb1c2aa/core/src/main/scala/kafka/security/auth/SimpleAclAuthorizer.scala#L110]
 the logic for checking acls is preparing a merged acl Set with 
 'acls = getAcls(new Resource(resource.resourceType, 
Resource.WildCardResource)) ++ getAcls(resource);' and then pass it as 
aclMatch's parameter.
 We found scala's Set '++' operation is very slow for example in the case that 
the Set on right hand of '++' has more than 100 entries.
 And the bad performance of '++' is due to iterating every entry of the Set on 
right hand of '++' in which the calculation of HashCode seems heavy.
 The performance of 'authorize' is important as each request delivered to 
broker goes through the logic, that's the reason we can't leave it as-is 
although the change for this proposal seems trivial.

Here is the approach. We propose to solve this issue by introducing a new class 
'AclSets' which takes multiple Sets as parameters and do 'find' against them 
one by one.
 ``` 
 class AclSets(sets: Set[Acl]*){  

  def find(p: Acl => Boolean): Option[Acl] = sets.flatMap(_.find(p)).headOption 
 

  def isEmpty: Boolean = !sets.exists(_.nonEmpty)

}

``` 
 This approach avoids the Set '++' operation, and thus outperforms a lot 
compared to old '++' logic.

The benchmark result(we did the test with kafka version 1.1) shows notable 
difference under the condition:
 1. set on left consists of 60 entries
 2. set of right consists of 30 entries
 3. search for absent entry (so that all entries are iterated)

Benchmark Results is as following.

Mode                                                     Cnt    Score         
Error   Units
 ScalaSetConcatination.Set thrpt          3   281.974  ± 140.029  ops/ms
 ScalaSetConcatination.AclSets thrpt   3   887.426 ± 40.261    ops/ms

As the upstream also use the similar ++ operation, 
[https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/security/authorizer/AclAuthorizer.scala#L360]
 we think it's necessary to fix this issue.

  was:
In version 1.1, 
[https://github.com/apache/kafka/blob/71b1e19fc60b5e1f9bba33025737ec2b7fb1c2aa/core/src/main/scala/kafka/security/auth/SimpleAclAuthorizer.scala#L110]
 the logic for checking acls is preparing a merged acl Set with 
 'acls = getAcls(new Resource(resource.resourceType, 
Resource.WildCardResource)) ++ getAcls(resource);' and then pass it as 
aclMatch's parameter.
 We found scala's Set '++' operation is very slow for example in the case that 
the Set on right hand of '++' has more than 100 entries.
 And the bad performance of '++' is due to iterating every entry of the Set on 
right hand of '++' in which the calculation of HashCode seems heavy.
 The performance of 'authorize' is important as each request delivered to 
broker goes through the logic, that's the reason we can't leave it as-is 
although the change for this proposal seems trivial.

Here is the approach. We propose to solve this issue by introducing a new class 
'AclSets' which takes multiple Sets as parameters and do 'find' against them 
one by one.
 ``` 
 class AclSets(sets: Set[Acl]*)

{   def find(p: Acl => Boolean): Option[Acl] = 
sets.flatMap(_.find(p)).headOption   def isEmpty: Boolean = 
!sets.exists(_.nonEmpty) }

``` 
 This approach avoids the Set '++' operation, and thus outperforms a lot 
compared to old '++' logic.

The benchmark result(we did the test with kafka version 1.1) shows notable 
difference under the condition:
 1. set on left consists of 60 entries
 2. set of right consists of 30 entries
 3. search for absent entry (so that all entries are iterated)

Benchmark Results is as following.

Mode                                                     Cnt    Score         
Error   Units
 ScalaSetConcatination.Set thrpt          3   281.974  ± 140.029  ops/ms
 ScalaSetConcatination.AclSets thrpt   3   887.426 ± 40.261    ops/ms

As the upstream also use the similar ++ operation, 
[https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/security/authorizer/AclAuthorizer.scala#L360]
 we think it's necessary to fix this issue.


> Solve Set concatenation perf issue in AclAuthorizer
> ---------------------------------------------------
>
>                 Key: KAFKA-9685
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9685
>             Project: Kafka
>          Issue Type: Improvement
>          Components: security
>    Affects Versions: 1.1.0
>            Reporter: Jiao Zhang
>            Priority: Minor
>
> In version 1.1, 
> [https://github.com/apache/kafka/blob/71b1e19fc60b5e1f9bba33025737ec2b7fb1c2aa/core/src/main/scala/kafka/security/auth/SimpleAclAuthorizer.scala#L110]
>  the logic for checking acls is preparing a merged acl Set with 
>  'acls = getAcls(new Resource(resource.resourceType, 
> Resource.WildCardResource)) ++ getAcls(resource);' and then pass it as 
> aclMatch's parameter.
>  We found scala's Set '++' operation is very slow for example in the case 
> that the Set on right hand of '++' has more than 100 entries.
>  And the bad performance of '++' is due to iterating every entry of the Set 
> on right hand of '++' in which the calculation of HashCode seems heavy.
>  The performance of 'authorize' is important as each request delivered to 
> broker goes through the logic, that's the reason we can't leave it as-is 
> although the change for this proposal seems trivial.
> Here is the approach. We propose to solve this issue by introducing a new 
> class 'AclSets' which takes multiple Sets as parameters and do 'find' against 
> them one by one.
>  ``` 
>  class AclSets(sets: Set[Acl]*){  
>   def find(p: Acl => Boolean): Option[Acl] = 
> sets.flatMap(_.find(p)).headOption  
>   def isEmpty: Boolean = !sets.exists(_.nonEmpty)
> }
> ``` 
>  This approach avoids the Set '++' operation, and thus outperforms a lot 
> compared to old '++' logic.
> The benchmark result(we did the test with kafka version 1.1) shows notable 
> difference under the condition:
>  1. set on left consists of 60 entries
>  2. set of right consists of 30 entries
>  3. search for absent entry (so that all entries are iterated)
> Benchmark Results is as following.
> Mode                                                     Cnt    Score         
> Error   Units
>  ScalaSetConcatination.Set thrpt          3   281.974  ± 140.029  ops/ms
>  ScalaSetConcatination.AclSets thrpt   3   887.426 ± 40.261    ops/ms
> As the upstream also use the similar ++ operation, 
> [https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/security/authorizer/AclAuthorizer.scala#L360]
>  we think it's necessary to fix this issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to