[ 
https://issues.apache.org/jira/browse/HDFS-16671?focusedWorklogId=793216&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-793216
 ]

ASF GitHub Bot logged work on HDFS-16671:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Jul/22 12:35
            Start Date: 20/Jul/22 12:35
    Worklog Time Spent: 10m 
      Work Description: ZanderXu opened a new pull request, #4597:
URL: https://github.com/apache/hadoop/pull/4597

   RouterRpcFairnessPolicyController should support a configurable permit 
acquire timeout. The hardcoded 1s timeout is very long, and it has caused an 
incident in our prod environment when one nameservice was busy.
   
   The optimal timeout should probably be less than p50(avgTime).
   
   All handlers in RBF were waiting to acquire a permit for the busy ns:
   ```
   "IPC Server handler 12 on default port 8888" #2370 daemon prio=5 os_prio=0 tid=? nid=?  waiting on condition [?]
      java.lang.Thread.State: TIMED_WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <?> (a java.util.concurrent.Semaphore$NonfairSync)
        at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
        at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:409)
        at org.apache.hadoop.hdfs.server.federation.fairness.AbstractRouterRpcFairnessPolicyController.acquirePermit(AbstractRouterRpcFairnessPolicyController.java:56)
        at org.apache.hadoop.hdfs.server.federation.fairness.DynamicRouterRpcFairnessPolicyController.acquirePermit(DynamicRouterRpcFairnessPolicyController.java:123)
        at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.acquirePermit(RouterRpcClient.java:1500)
   ```
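
The idea described above can be illustrated with a minimal sketch. This is not the code from the pull request: the class `PermitTimeoutSketch`, its methods, and the `acquireTimeoutMs` parameter are hypothetical names chosen for illustration. It only assumes, as the stack trace suggests, that each nameservice is guarded by a `java.util.concurrent.Semaphore` and that a handler waits in `tryAcquire` with a timeout.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: one semaphore per nameservice, with the acquire
// timeout passed in rather than hardcoded to 1 second.
class PermitTimeoutSketch {
  private final Map<String, Semaphore> permits = new ConcurrentHashMap<>();
  private final long acquireTimeoutMs;

  PermitTimeoutSketch(long acquireTimeoutMs) {
    this.acquireTimeoutMs = acquireTimeoutMs;
  }

  // Assign a fixed number of handler permits to a nameservice.
  void register(String nsId, int handlerCount) {
    permits.put(nsId, new Semaphore(handlerCount));
  }

  // Wait at most acquireTimeoutMs for a permit; return false on timeout,
  // on interruption, or for an unknown nameservice.
  boolean acquirePermit(String nsId) {
    Semaphore semaphore = permits.get(nsId);
    if (semaphore == null) {
      return false;
    }
    try {
      return semaphore.tryAcquire(acquireTimeoutMs, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      // Preserve the interrupt status for the calling handler thread.
      Thread.currentThread().interrupt();
      return false;
    }
  }

  // Return a previously acquired permit.
  void releasePermit(String nsId) {
    Semaphore semaphore = permits.get(nsId);
    if (semaphore != null) {
      semaphore.release();
    }
  }
}
```

With a short timeout, a handler that cannot get a permit for a busy nameservice gives up quickly instead of parking for a full second, so it becomes available again for requests to other nameservices.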
   




Issue Time Tracking
-------------------

            Worklog Id:     (was: 793216)
    Remaining Estimate: 0h
            Time Spent: 10m

> RBF: RouterRpcFairnessPolicyController supports configurable permit acquire 
> timeout
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-16671
>                 URL: https://issues.apache.org/jira/browse/HDFS-16671
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> RouterRpcFairnessPolicyController should support a configurable permit 
> acquire timeout. The hardcoded 1s timeout is very long, and it has caused an 
> incident in our prod environment when one nameservice was busy.
> The optimal timeout should probably be less than p50(avgTime).
> All handlers in RBF were waiting to acquire a permit for the busy ns:
> {code:java}
> "IPC Server handler 12 on default port 8888" #2370 daemon prio=5 os_prio=0 tid=? nid=?  waiting on condition [?]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>       at sun.misc.Unsafe.park(Native Method)
>       - parking to wait for  <?> (a java.util.concurrent.Semaphore$NonfairSync)
>       at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
>       at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
>       at java.util.concurrent.Semaphore.tryAcquire(Semaphore.java:409)
>       at org.apache.hadoop.hdfs.server.federation.fairness.AbstractRouterRpcFairnessPolicyController.acquirePermit(AbstractRouterRpcFairnessPolicyController.java:56)
>       at org.apache.hadoop.hdfs.server.federation.fairness.DynamicRouterRpcFairnessPolicyController.acquirePermit(DynamicRouterRpcFairnessPolicyController.java:123)
>       at org.apache.hadoop.hdfs.server.federation.router.RouterRpcClient.acquirePermit(RouterRpcClient.java:1500)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
