[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager

2024-05-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846805#comment-17846805
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2114003582

   Thanks @vikaskr22 for the contribution, and @Hexiaoqiao for the review.




> Performance improvement for DelegationTokenSecretManager
> 
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
>  {code}
> We can say that this thread is slow and has blocked remaining all. But 
> following is my observation:
>  
>  # verifyToken() and createPaswword() has been synchronized because one is 
> reading the tokenMap and another is updating the map. If it's only to protect 
> the map, then can't we simply use ConcurrentHashMap and remove the 
> "synchronized" keyword. Because due to this, all reader 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager

2024-05-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846803#comment-17846803
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi merged PR #6803:
URL: https://github.com/apache/hadoop/pull/6803




> Performance improvement for DelegationTokenSecretManager
> 
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
>  {code}
> We can say that this thread is slow and has blocked remaining all. But 
> following is my observation:
>  
>  # verifyToken() and createPaswword() has been synchronized because one is 
> reading the tokenMap and another is updating the map. If it's only to protect 
> the map, then can't we simply use ConcurrentHashMap and remove the 
> "synchronized" keyword. Because due to this, all reader threads ( to 
> verifyToken()) are also blocking each other.
>  # IN HA env, It is recommended to use ZK to 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846407#comment-17846407
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

hadoop-yetus commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2111003939

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 00s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m 01s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m 00s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  | 119m 49s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  56m 02s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   6m 47s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   6m 20s | 
[/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/5/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |   6m 38s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 190m 46s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m 51s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  51m 03s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  51m 03s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   7m 14s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   6m 18s | 
[/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/5/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |   6m 44s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 208m 05s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   8m 54s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 652m 41s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6803 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 2348886a150d 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / b506761282fef3aac9d96d04a68cb1a3afe8def1 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/5/testReport/
 |
   | modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/5/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846296#comment-17846296
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2110134515

   @Hexiaoqiao , thanks for the info about mvnsite. Current, hadoop-yetus is 
passed. This "Apache Yetus", I never see it's all green. Should this github CI 
be all green before the PR can be committed?  
   




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
>  {code}
> We can say that this thread is slow and has blocked remaining all. But 
> following is my observation:
>  
>  # verifyToken() and createPaswword() has been synchronized because one is 
> reading the tokenMap and another is updating the map. If it's only to protect 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846268#comment-17846268
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

hadoop-yetus commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2109897935

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 20s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 22s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   8m 45s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   8m  3s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 38s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 52s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 44s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 30s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 28s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 34s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   9m 51s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   9m 51s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 35s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   8m 35s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 39s |  |  
hadoop-common-project/hadoop-common: The patch generated 0 new + 23 unchanged - 
1 fixed = 23 total (was 24)  |
   | +1 :green_heart: |  mvnsite  |   0m 53s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 39s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 31s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 26s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m 15s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 28s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 35s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 140m  2s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/5/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6803 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux fa3112f0de77 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / b506761282fef3aac9d96d04a68cb1a3afe8def1 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/5/testReport/ |
   | Max. process+thread count | 1775 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
   | Console output | 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17846160#comment-17846160
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

Hexiaoqiao commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2109216164

   > the mvnsite failure is, looks like not relevant, but other merged MR seems 
doesn't have this failure.
   
   Hi @ChenSammi, @vikaskr22 `mvnsite` failure is not related to this PR, it 
has been followed-up by another thread.




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
>  {code}
> We can say that this thread is slow and has blocked remaining all. But 
> following is my observation:
>  
>  # verifyToken() and createPaswword() has been synchronized because one is 
> reading the tokenMap and another is updating the map. 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845987#comment-17845987
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

hadoop-yetus commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2108044944

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 01s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  spotbugs  |   0m 00s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  codespell  |   0m 00s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 00s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m 00s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  91m 48s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  40m 52s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   4m 43s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   4m 21s | 
[/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/4/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |   4m 40s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 147m 22s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 14s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  38m 12s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  38m 13s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 42s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   4m 33s | 
[/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/4/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |   5m 04s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 161m 04s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   6m 04s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 498m 23s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6803 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 908f425e64f3 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 62949a34cc8297996934e20f279afbf4d883afbe |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/4/testReport/
 |
   | modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/4/console
 |
   | versions | git=2.45.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845878#comment-17845878
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

hadoop-yetus commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2107268466

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 31s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  44m 48s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  17m 33s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |  15m 47s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   1m 13s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 40s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 46s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   2m 26s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  35m 54s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 54s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  16m 28s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |  16m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  15m 37s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |  15m 37s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   1m 10s | 
[/results-checkstyle-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/4/artifact/out/results-checkstyle-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common-project/hadoop-common: The patch generated 9 new + 23 
unchanged - 1 fixed = 32 total (was 24)  |
   | +1 :green_heart: |  mvnsite  |   1m 36s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 50s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   2m 38s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  35m  7s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  20m 25s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   1m  0s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 222m 33s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.44 ServerAPI=1.44 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/4/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6803 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 476c6deadd75 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 62949a34cc8297996934e20f279afbf4d883afbe |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/4/testReport/ |
   | Max. process+thread count | 1309 (vs. ulimit of 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845807#comment-17845807
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2106842836

   This mvnsite failure is, looks like not relevant, but other merged MR seems 
doesn't have this failure. 
   ```
   [INFO] -

> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
>  {code}
> We can say that this thread is slow and has blocked remaining all. But 
> following is my observation:
>  
>  # verifyToken() and createPaswword() has been synchronized because one is 
> reading the tokenMap and another is updating the map. If it's only to protect 
> the map, then can't we simply use ConcurrentHashMap and remove the 
> 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845803#comment-17845803
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

Hexiaoqiao commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2106824252

   @ChenSammi Got it. Make sense to me. 
   > Yes, removing the synchronized directly have potential risks to sub 
classes of AbstractDelegationTokenSecretManager. Using LOCK is a safe way to 
replace synchronized and can improve the concurrency too. That's the motivation 
of this implementation.
   
   @vikaskr22 The current changes LGTM. One nit comment about `mvnsite` build 
failed, not sure if related with this changes. Would you mind to double check? 
Thanks. 




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
>  {code}
> We 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845794#comment-17845794
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

vikaskr22 commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2106765757

   @Hexiaoqiao , @ChenSammi , Anything else needed from my side?  I had fixed 
all the review comments that's part of my commit.




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
>  {code}
> We can say that this thread is slow and has blocked remaining all. But 
> following is my observation:
>  
>  # verifyToken() and createPaswword() has been synchronized because one is 
> reading the tokenMap and another is updating the map. If it's only to protect 
> the map, then can't we simply use ConcurrentHashMap and remove the 
> 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845542#comment-17845542
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

hadoop-yetus commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2105598897

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 00s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m 00s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  | 126m 36s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  59m 52s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   7m 21s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   6m 47s | 
[/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/3/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |   7m 15s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 206m 03s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   8m 00s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  56m 30s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  56m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 01s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   7m 08s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   6m 50s | 
[/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/3/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |   7m 28s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 227m 44s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   8m 54s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 705m 05s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6803 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 2000e04dde5c 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 18c4f3a6dfce2157f7829cc99ca406fa308505b5 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/3/testReport/
 |
   | modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/3/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845399#comment-17845399
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

hadoop-yetus commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2104796915

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 21s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 47s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   8m 56s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   8m 12s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 43s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 57s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 43s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 34s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 30s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 58s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 30s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   8m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m  9s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   8m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  1s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 38s | 
[/results-checkstyle-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/3/artifact/out/results-checkstyle-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common-project/hadoop-common: The patch generated 9 new + 23 
unchanged - 1 fixed = 32 total (was 24)  |
   | +1 :green_heart: |  mvnsite  |   0m 54s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 42s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 33s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m  0s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m  1s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 42s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 138m 29s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/3/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6803 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 91ff72ba8905 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 18c4f3a6dfce2157f7829cc99ca406fa308505b5 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/3/testReport/ |
   | Max. process+thread count | 1275 (vs. ulimit of 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845234#comment-17845234
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

hadoop-yetus commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2104125153

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 00s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m 00s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  90m 41s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  40m 44s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   4m 36s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   4m 23s | 
[/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/2/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |   4m 42s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 150m 01s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   5m 07s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  39m 30s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  39m 30s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 01s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 54s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   4m 28s | 
[/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/2/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |   4m 49s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 161m 51s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   5m 44s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 501m 20s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6803 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 0f898775fb38 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / 4c3f83ae8ab0bbf066008f320acf05223bfa2bbb |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/2/testReport/
 |
   | modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/2/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17845178#comment-17845178
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2103782158

   > @ChenSammi , Please add your input as well in case I missed anything. 
Thanks.
   
   Yes, removing the synchronized directly have potential risks to sub classes 
of AbstractDelegationTokenSecretManager.  Using LOCK is a safe way to replace 
synchronized and can improve the concurrency too. That's the motivation of this 
implementation. @Hexiaoqiao . 
   
   




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
>  {code}
> We can say that this thread is slow and has blocked remaining all. But 
> following is my observation:
>  

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844971#comment-17844971
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

vikaskr22 commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2102594497

   Hi @Hexiaoqiao , technically I didn't see or observe any issues till now but 
@ChenSammi has some concerns related to concurrency in other sub classes of 
AbstractDelegationTokenSecretManager.java .
   
   I had done testing with  RangerKMS  and there 
ZKDelegationTokenSecretManager.java is used as implementing class. That means, 
our test suite only verifies ZK specific cases. There are other implementation 
that is for HDFS federation, YARN etc. I 
   don't have any test suite that ensures concurrency stability for other 
implementations as well. 
   
   @ChenSammi point is, as part of last commit, I removed "synchronization" 
from super class and took care of concurrency in ZK sub class. So I was 
breaking concurrency at API level and all the implementing classes needs to 
take care of that. and at the same time, I don't have test suite internally 
that verifies anything except Ranger KMS DT usage.
   
   One more point she raised  that, what about any other subclass outside this 
repo? They might have written their code in assumption that lock has been 
acquired at API level. 
   
   Although I went through the source code of other implementing classes, and 
for me it still seems good. But due to lack of automated test suite and 
concurrency nature, I agreed to implement using Lock API.
   
   Using ReadWriteLock API, at least I am able to unblock multiple reader 
threads, like verifyToken(). Now verifyToken() can be invoked by multiple 
threads due to ReadLock nature. And write lock will ensure thread safety at API 
level.
   
   @ChenSammi , Please add your input as well in case I missed anything. Thanks.




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844848#comment-17844848
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

Hexiaoqiao commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2101893364

   Hi @vikaskr22 @ChenSammi Thanks for your works here. One nit concerns, #6001 
try to remove synchronization then revert and add RWLock here, any issues do 
you meet while remove synchronization? Thanks.




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
>  {code}
> We can say that this thread is slow and has blocked remaining all. But 
> following is my observation:
>  
>  # verifyToken() and createPaswword() has been synchronized because one is 
> reading the tokenMap and another is updating the map. If it's only to protect 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844833#comment-17844833
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi commented on code in PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#discussion_r1594856131


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java:
##
@@ -169,21 +172,29 @@ public AbstractDelegationTokenSecretManager(long 
delegationKeyUpdateInterval,
   public void startThreads() throws IOException {
 Preconditions.checkState(!running);
 updateCurrentKey();
-synchronized (this) {
+this.apiLock.writeLock().lock();
+try {
   running = true;
   tokenRemoverThread = new Daemon(new ExpiredTokenRemover());
   tokenRemoverThread.start();
+}finally {

Review Comment:
   There need be a blank between "}" and finally.  Kindly check other "finally" 
statements too. 
   
   Also please check the checkstyle issues reported by the CI,  
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/2/artifact/out/results-checkstyle-hadoop-common-project_hadoop-common.txt.
 





> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844832#comment-17844832
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi commented on code in PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#discussion_r1594856131


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java:
##
@@ -169,21 +172,29 @@ public AbstractDelegationTokenSecretManager(long 
delegationKeyUpdateInterval,
   public void startThreads() throws IOException {
 Preconditions.checkState(!running);
 updateCurrentKey();
-synchronized (this) {
+this.apiLock.writeLock().lock();
+try {
   running = true;
   tokenRemoverThread = new Daemon(new ExpiredTokenRemover());
   tokenRemoverThread.start();
+}finally {

Review Comment:
   There need be a blank between "}" and finally.  Kindly check other "finally" 
statement. 
   
   Also please check the checkstyle issues reported by the CI,  
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/2/artifact/out/results-checkstyle-hadoop-common-project_hadoop-common.txt.
 





> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844831#comment-17844831
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi commented on code in PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#discussion_r1594856131


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java:
##
@@ -169,21 +172,29 @@ public AbstractDelegationTokenSecretManager(long 
delegationKeyUpdateInterval,
   public void startThreads() throws IOException {
 Preconditions.checkState(!running);
 updateCurrentKey();
-synchronized (this) {
+this.apiLock.writeLock().lock();
+try {
   running = true;
   tokenRemoverThread = new Daemon(new ExpiredTokenRemover());
   tokenRemoverThread.start();
+}finally {

Review Comment:
   There need be a blank between "}" and finally.  Kindly check other "finally" 
statement. 
   
   Also please check the checkstyle issues reported 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/2/artifact/out/results-checkstyle-hadoop-common-project_hadoop-common.txt.
 





> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844615#comment-17844615
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

hadoop-yetus commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2100256835

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 20s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 52s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   8m 54s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   8m  7s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 40s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   0m 54s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 46s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 33s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 30s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  20m 47s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 31s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 42s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   8m 42s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 17s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   8m 17s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 44s | 
[/results-checkstyle-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/2/artifact/out/results-checkstyle-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common-project/hadoop-common: The patch generated 13 new + 23 
unchanged - 1 fixed = 36 total (was 24)  |
   | +1 :green_heart: |  mvnsite  |   0m 56s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 43s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 36s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 34s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 28s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  16m 57s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 41s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 139m 41s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6803 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux abe3780c86f4 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 4c3f83ae8ab0bbf066008f320acf05223bfa2bbb |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/2/testReport/ |
   | Max. process+thread count | 1272 (vs. ulimit 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844613#comment-17844613
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

vikaskr22 commented on code in PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#discussion_r1593797928


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java:
##
@@ -441,18 +496,14 @@ private void updateCurrentKey() throws IOException {
 LOG.info("Updating the current master key for generating delegation 
tokens");
 /* Create a new currentKey with an estimated expiry date. */
 int newCurrentId;
-synchronized (this) {
-  newCurrentId = incrementCurrentKeyId();
-}
+newCurrentId = incrementCurrentKeyId();
 DelegationKey newKey = new DelegationKey(newCurrentId, System
 .currentTimeMillis()
 + keyUpdateInterval + tokenMaxLifetime, generateSecret());
 //Log must be invoked outside the lock on 'this'
 logUpdateMasterKey(newKey);
-synchronized (this) {
-  currentKey = newKey;
-  storeDelegationKey(currentKey);
-}
+currentKey = newKey;

Review Comment:
   Thanks. It has been incorporated.





> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844612#comment-17844612
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

vikaskr22 commented on code in PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#discussion_r1593797668


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java:
##
@@ -169,21 +172,29 @@ public AbstractDelegationTokenSecretManager(long 
delegationKeyUpdateInterval,
   public void startThreads() throws IOException {
 Preconditions.checkState(!running);
 updateCurrentKey();
-synchronized (this) {
+this.apiLock.writeLock().lock();
+try{

Review Comment:
   Thanks. It has been incorporated.





> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844534#comment-17844534
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

hadoop-yetus commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2099774430

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m 00s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  spotbugs  |   0m 01s |  |  spotbugs executables are not 
available.  |
   | +0 :ok: |  codespell  |   0m 01s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m 01s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m 00s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m 00s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  | 112m 27s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  48m 25s |  |  trunk passed  |
   | +1 :green_heart: |  checkstyle  |   5m 17s |  |  trunk passed  |
   | -1 :x: |  mvnsite  |   5m 00s | 
[/branch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/1/artifact/out/branch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in trunk failed.  |
   | +1 :green_heart: |  javadoc  |   5m 38s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  | 172m 01s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   6m 06s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  44m 21s |  |  the patch passed  |
   | +1 :green_heart: |  javac  |  44m 21s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m 00s |  |  The patch has no blanks 
issues.  |
   | +1 :green_heart: |  checkstyle  |   5m 27s |  |  the patch passed  |
   | -1 :x: |  mvnsite  |   5m 04s | 
[/patch-mvnsite-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/1/artifact/out/patch-mvnsite-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch failed.  |
   | +1 :green_heart: |  javadoc  |   5m 25s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  | 190m 40s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   7m 16s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 591m 12s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | GITHUB PR | https://github.com/apache/hadoop/pull/6803 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | MINGW64_NT-10.0-17763 94506f5a0dcd 3.4.10-87d57229.x86_64 
2024-02-14 20:17 UTC x86_64 Msys |
   | Build tool | maven |
   | Personality | /c/hadoop/dev-support/bin/hadoop.sh |
   | git revision | trunk / cd01f45cfe7f9941b3d082f7d6cc26840edd2e09 |
   | Default Java | Azul Systems, Inc.-1.8.0_332-b09 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/1/testReport/
 |
   | modules | C: hadoop-common-project/hadoop-common U: 
hadoop-common-project/hadoop-common |
   | Console output | 
https://ci-hadoop.apache.org/job/hadoop-multibranch-windows-10/job/PR-6803/1/console
 |
   | versions | git=2.44.0.windows.1 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844523#comment-17844523
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi commented on code in PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#discussion_r1593369308


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java:
##
@@ -169,21 +172,29 @@ public AbstractDelegationTokenSecretManager(long 
delegationKeyUpdateInterval,
   public void startThreads() throws IOException {
 Preconditions.checkState(!running);
 updateCurrentKey();
-synchronized (this) {
+this.apiLock.writeLock().lock();
+try{

Review Comment:
   It requires a blank space between try and "{",  "}" and finally. 
   
   Kindly check this checkstyle issue in all new codes. 





> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844522#comment-17844522
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi commented on code in PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#discussion_r1593368398


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java:
##
@@ -120,12 +121,12 @@ private String formatTokenId(TokenIdent id) {
   /**
* Access to currentKey is protected by this object lock
*/
-  private DelegationKey currentKey;
+  private volatile  DelegationKey currentKey;

Review Comment:
   This volatile key word can be removed, since the operation of currentKey 
will all be protected by lock. 





> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844521#comment-17844521
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi commented on code in PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#discussion_r1593339527


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java:
##
@@ -441,18 +496,14 @@ private void updateCurrentKey() throws IOException {
 LOG.info("Updating the current master key for generating delegation 
tokens");
 /* Create a new currentKey with an estimated expiry date. */
 int newCurrentId;
-synchronized (this) {
-  newCurrentId = incrementCurrentKeyId();
-}
+newCurrentId = incrementCurrentKeyId();
 DelegationKey newKey = new DelegationKey(newCurrentId, System
 .currentTimeMillis()
 + keyUpdateInterval + tokenMaxLifetime, generateSecret());
 //Log must be invoked outside the lock on 'this'
 logUpdateMasterKey(newKey);
-synchronized (this) {
-  currentKey = newKey;
-  storeDelegationKey(currentKey);
-}
+currentKey = newKey;

Review Comment:
   Need to keep this synchronized or use Lock, to protect currentKey change and 
storeKey as one atomic operation. 





> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844520#comment-17844520
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi commented on code in PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#discussion_r1593339527


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java:
##
@@ -441,18 +496,14 @@ private void updateCurrentKey() throws IOException {
 LOG.info("Updating the current master key for generating delegation 
tokens");
 /* Create a new currentKey with an estimated expiry date. */
 int newCurrentId;
-synchronized (this) {
-  newCurrentId = incrementCurrentKeyId();
-}
+newCurrentId = incrementCurrentKeyId();
 DelegationKey newKey = new DelegationKey(newCurrentId, System
 .currentTimeMillis()
 + keyUpdateInterval + tokenMaxLifetime, generateSecret());
 //Log must be invoked outside the lock on 'this'
 logUpdateMasterKey(newKey);
-synchronized (this) {
-  currentKey = newKey;
-  storeDelegationKey(currentKey);
-}
+currentKey = newKey;

Review Comment:
   Need to keep this synchronized or use Lock, to protect currentKey change and 
storeKey as one atomic operation.





> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844509#comment-17844509
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

ChenSammi commented on code in PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#discussion_r1593339527


##
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/token/delegation/AbstractDelegationTokenSecretManager.java:
##
@@ -441,18 +496,14 @@ private void updateCurrentKey() throws IOException {
 LOG.info("Updating the current master key for generating delegation 
tokens");
 /* Create a new currentKey with an estimated expiry date. */
 int newCurrentId;
-synchronized (this) {
-  newCurrentId = incrementCurrentKeyId();
-}
+newCurrentId = incrementCurrentKeyId();
 DelegationKey newKey = new DelegationKey(newCurrentId, System
 .currentTimeMillis()
 + keyUpdateInterval + tokenMaxLifetime, generateSecret());
 //Log must be invoked outside the lock on 'this'
 logUpdateMasterKey(newKey);
-synchronized (this) {
-  currentKey = newKey;
-  storeDelegationKey(currentKey);
-}
+currentKey = newKey;

Review Comment:
   Need to keep this synchronized or use Lock, to protect currentKey change and 
storeKey as one operation.





> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844282#comment-17844282
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

hadoop-yetus commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2098357532

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |::|--:|:|::|:---:|
   | +0 :ok: |  reexec  |   0m 19s |  |  Docker mode activated.  |
    _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files 
found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  
|
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain 
any @author tags.  |
   | -1 :x: |  test4tests  |   0m  0s |  |  The patch doesn't appear to include 
any new or modified tests. Please justify why no new tests are needed for this 
patch. Also please list what manual steps were performed to verify this patch.  
|
    _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 28s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   8m 49s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  compile  |   8m  8s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  checkstyle  |   0m 42s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   0m 46s |  |  trunk passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 35s |  |  trunk passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 32s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 23s |  |  branch has no errors 
when building and testing our client artifacts.  |
    _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 30s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 32s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javac  |   8m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   8m 10s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  javac  |   8m 10s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks 
issues.  |
   | -0 :warning: |  checkstyle  |   0m 39s | 
[/results-checkstyle-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/1/artifact/out/results-checkstyle-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common-project/hadoop-common: The patch generated 13 new + 23 
unchanged - 1 fixed = 36 total (was 24)  |
   | +1 :green_heart: |  mvnsite  |   0m 58s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 43s |  |  the patch passed with JDK 
Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1  |
   | +1 :green_heart: |  javadoc  |   0m 37s |  |  the patch passed with JDK 
Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06  |
   | +1 :green_heart: |  spotbugs  |   1m 32s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 28s |  |  patch has no errors 
when building and testing our client artifacts.  |
    _ Other Tests _ |
   | +1 :green_heart: |  unit  |  16m 52s |  |  hadoop-common in the patch 
passed.  |
   | +1 :green_heart: |  asflicense  |   0m 42s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 139m 29s |  |  |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | ClientAPI=1.45 ServerAPI=1.45 base: 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/1/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/6803 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall 
mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 963820cfbb47 5.15.0-94-generic #104-Ubuntu SMP Tue Jan 9 
15:25:40 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / cd01f45cfe7f9941b3d082f7d6cc26840edd2e09 |
   | Default Java | Private Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   | Multi-JDK versions | 
/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.22+7-post-Ubuntu-0ubuntu220.04.1 
/usr/lib/jvm/java-8-openjdk-amd64:Private 
Build-1.8.0_402-8u402-ga-2ubuntu1~20.04-b06 |
   |  Test Results | 
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-6803/1/testReport/ |
   | Max. process+thread count | 3152 (vs. ulimit 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844244#comment-17844244
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

vikaskr22 commented on PR #6803:
URL: https://github.com/apache/hadoop/pull/6803#issuecomment-2098067781

   @ChenSammi , Can you review the changes and provide your input ? Thanks.




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.(Token.java:67)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
>  {code}
> We can say that this thread is slow and has blocked remaining all. But 
> following is my observation:
>  
>  # verifyToken() and createPaswword() has been synchronized because one is 
> reading the tokenMap and another is updating the map. If it's only to protect 
> the map, then can't we simply use ConcurrentHashMap and remove the 
> "synchronized" keyword. Because due to this, all reader 

[jira] [Commented] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.

2024-05-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17844239#comment-17844239
 ] 

ASF GitHub Bot commented on HADOOP-18851:
-

vikaskr22 opened a new pull request, #6803:
URL: https://github.com/apache/hadoop/pull/6803

   
   
   ### Description of PR
   AbstractDelegationTokenSecretManager's method all synchronized and are 
blocking each other, even multiple readers threads are blocking each other. 
This PR is an efforts towards optimising the synchronization contexts.
   For detailed description, please go through the discussion on 
https://issues.apache.org/jira/browse/HADOOP-18851
   
   ### How was this patch tested?
   Build is working fine. And this is more about a logical change like where to 
acquire or release a lock. It requires careful review. There is not any 
functional logic change.
   
   ### For code changes:
   
   - [ Y] Does the title or this PR starts with the corresponding JIRA issue id 
(e.g. 'HADOOP-17799. Your PR title ...')?
   - [N ] Object storage: have the integration tests been executed and the 
endpoint declared according to the connector-specific documentation?
   - [ ] If adding new dependencies to the code, are these dependencies 
licensed in a way that is compatible for inclusion under [ASF 
2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, 
`NOTICE-binary` files?
   
   




> Performance improvement for DelegationTokenSecretManager.
> -
>
> Key: HADOOP-18851
> URL: https://issues.apache.org/jira/browse/HADOOP-18851
> Project: Hadoop Common
>  Issue Type: Task
>  Components: common
>Affects Versions: 3.4.0
>Reporter: Vikas Kumar
>Assignee: Vikas Kumar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: 
> 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 
> 2023-08-16 at 5.36.57 PM.png
>
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing 
> one performance issue and following is out findings:
>  # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at 
> following:
>  ## *AbstractDelegationTokenSecretManager.verifyToken()*
>  ## *AbstractDelegationTokenSecretManager.createPassword()* 
>  # And then process crashed.
>  
> {code:java}
> http-nio-9292-exec-200PRIORITY : 5THREAD ID : 0X7F075C157800NATIVE ID : 
> 0X2C87FNATIVE ID (DECIMAL) : 182399STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at 
> org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at 
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at  {code}
> All the 199 out of 200 were blocked at above point.
> And the lock they are waiting for is acquired by a thread that was trying to 
> createPassword and publishing the same on ZK.
>  
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at 
> org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at 
> org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at 
> org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at 
> org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at 
>