[ https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17916757#comment-17916757 ]

ASF GitHub Bot commented on HDFS-17531:
---------------------------------------

hadoop-yetus commented on PR #7308:
URL: https://github.com/apache/hadoop/pull/7308#issuecomment-2612809368

   :broken_heart: **-1 overall**
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 33s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  2s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +0 :ok: |  xmllint  |   0m  0s |  |  xmllint was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 25 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |   6m  8s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  31m 20s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  16m 46s |  |  trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04  |
   | +1 :green_heart: |  compile  |  15m 24s |  |  trunk passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~20.04-ga  |
   | +1 :green_heart: |  checkstyle  |   4m  9s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   5m  0s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   4m 10s |  |  trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   4m 11s |  |  trunk passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~20.04-ga  |
   | +1 :green_heart: |  spotbugs  |  10m  2s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  35m 15s |  |  branch has no errors when building and testing our client artifacts.  |
   | -0 :warning: |  patch  |  35m 41s |  |  Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 21s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  16m 32s |  |  the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04  |
   | +1 :green_heart: |  javac  |  16m 32s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  15m 41s |  |  the patch passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~20.04-ga  |
   | +1 :green_heart: |  javac  |  15m 41s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   4m 10s |  |  root: The patch generated 0 new + 95 unchanged - 73 fixed = 95 total (was 168)  |
   | +1 :green_heart: |  mvnsite  |   5m 17s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   4m 24s |  |  the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04  |
   | +1 :green_heart: |  javadoc  |   4m 26s |  |  the patch passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~20.04-ga  |
   | +1 :green_heart: |  spotbugs  |  10m 45s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  39m 56s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |  20m 45s | [/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch passed.  |
   | -1 :x: |  unit  |   0m 48s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-client.txt) |  hadoop-hdfs-client in the patch failed.  |
   | -1 :x: |  unit  |   1m 51s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch failed.  |
   | -1 :x: |  unit  |   0m 46s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) |  hadoop-hdfs-rbf in the patch failed.  |
   | +0 :ok: |  asflicense  |   0m 47s |  |  ASF License check generated no output?  |
   |  |   | 267m 27s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.ipc.TestIPC |
   |   | hadoop.metrics2.source.TestJvmMetrics |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/7308 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint markdownlint |
   | uname | Linux d9b44a0e37cb 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / b47e11e67dbfb450780f72f727475d1380d046b6 |
   | Default Java | Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~20.04-ga |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~20.04-ga |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/testReport/ |
   | Max. process+thread count | 970 (vs. ulimit of 5500) |
   | modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-rbf U: . |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> RBF: Asynchronous router RPC
> ----------------------------
>
>                 Key: HDFS-17531
>                 URL: https://issues.apache.org/jira/browse/HDFS-17531
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: rbf
>            Reporter: Jian Zhang
>            Assignee: Jian Zhang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: Async router single ns performance test.pdf, Aynchronous 
> router.pdf, Comparison of Async router & sync router performance.pdf, 
> HDFS-17531.001.patch, Router asynchronous rpc implementation .pdf, 
> image-2024-05-19-18-07-51-282.png
>
>
> *Description*
> Currently, the main function of the Router service is to accept client 
> requests, forward them to the corresponding downstream ns, and then return 
> the results from the downstream ns to the client. The rpc link is as follows:
> *!image-2024-05-19-18-07-51-282.png|width=900,height=300!*
> The main threads involved in the rpc link are:
> {*}Reader{*}: Gets the client request and puts it into the call queue *(1)*
> {*}Handler{*}:
> Takes a call from the call queue *(2)*, processes it, generates a new call, 
> places it in the calls of the connection thread, and waits for that call to 
> complete *(3)*
> After being woken by the connection thread, processes the response and puts 
> it into the response queue *(5)*
> *Connection:*
> Holds the connection to the downstream ns, sends the pending calls to the 
> downstream ns (via {*}rpcRequestThread{*}), and receives the responses from 
> the ns. Based on the call identified in each response, notifies that call 
> that processing is complete *(4)*
> *Responder:*
> Takes the response from the response queue *(6)* and returns it to the client
>  
> *Shortcoming*
> Even if the *connection* thread can send more requests to downstream 
> nameservices, *(3)* and *(4)* are synchronous: after the *handler* thread 
> adds the call to connection.calls, it must wait until the *connection* 
> thread notifies it that the call is complete. Only after the response is put 
> into the response queue can the handler take a new call from the call queue 
> and process it. Therefore, the concurrency of the router is limited by the 
> number of handlers. A simple example: if the number of handlers is 1 and the 
> maximum number of calls in the connection thread is 10, then even though the 
> connection thread could send 10 requests to the downstream ns, the single 
> handler means the router can only process requests one after another. 
>  
> Since the performance of router rpc is mainly limited by the number of 
> handlers, the most effective way to improve rpc performance currently is to 
> increase the number of handlers. However, letting the router create a large 
> number of handler threads also increases the number of thread context 
> switches and cannot make full use of the machine.
>  
> There are usually multiple ns downstream of the router. If a handler 
> forwards a request to an ns with poor performance, the handler waits for a 
> long time. With fewer handlers available, the router's ability to handle 
> requests for ns with normal performance is also reduced. From the 
> perspective of the client, the performance of the downstream ns appears to 
> have deteriorated. We often find that the call queue of the downstream ns is 
> not high, but the call queue of the router is very high.
>  
> Therefore, although the main function of the router is to federate and handle 
> requests from multiple NSs, the current synchronous RPC performance cannot 
> satisfy the scenario where there are many NSs downstream of the router. Even 
> if the concurrency of the router can be improved by increasing the number of 
> handlers, it is still relatively slow. More threads increase the CPU context 
> switching time, and in fact many of the handler threads are in a blocked 
> state, which wastes thread resources. When a request enters the router, 
> there is no guarantee that a handler is free to process it at that time.
>  
> Therefore, I propose asynchronous router rpc. Please see the *pdf* for the 
> complete solution.
>  
> Everyone is welcome to join the discussion!
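
The handler bottleneck described in the quoted issue can be illustrated with a small self-contained sketch. This is not the Router's implementation and uses no Hadoop APIs: the class `HandlerModelSketch`, the helper `callDownstream`, and the 50 ms latency are all hypothetical, and a scheduled executor stands in for the downstream ns. It only contrasts a handler that blocks on each downstream call with a handler that hands the call off and returns immediately.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class HandlerModelSketch {
  // Assumed latency of the simulated downstream ns (illustrative value).
  static final long NS_LATENCY_MS = 50;

  // Hypothetical stand-in for a downstream ns call: the returned future is
  // completed after NS_LATENCY_MS without blocking the calling thread.
  static CompletableFuture<String> callDownstream(ScheduledExecutorService ns, int id) {
    CompletableFuture<String> response = new CompletableFuture<>();
    ns.schedule(() -> response.complete("response-" + id),
        NS_LATENCY_MS, TimeUnit.MILLISECONDS);
    return response;
  }

  // Synchronous model: each handler thread blocks until its call completes,
  // so at most `handlers` calls are in flight at any time.
  static long runSync(int handlers, int requests) throws Exception {
    ScheduledExecutorService ns = Executors.newSingleThreadScheduledExecutor();
    ExecutorService handlerPool = Executors.newFixedThreadPool(handlers);
    try {
      long start = System.nanoTime();
      List<Future<String>> results = new ArrayList<>();
      for (int i = 0; i < requests; i++) {
        final int id = i;
        results.add(handlerPool.submit(() -> callDownstream(ns, id).get()));
      }
      for (Future<String> result : results) {
        result.get();
      }
      return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    } finally {
      handlerPool.shutdownNow();
      ns.shutdownNow();
    }
  }

  // Asynchronous model: the handler thread only issues the call and returns;
  // the response is delivered later by the thread that completes the future.
  static long runAsync(int handlers, int requests) throws Exception {
    ScheduledExecutorService ns = Executors.newSingleThreadScheduledExecutor();
    ExecutorService handlerPool = Executors.newFixedThreadPool(handlers);
    try {
      long start = System.nanoTime();
      List<CompletableFuture<String>> results = new ArrayList<>();
      for (int i = 0; i < requests; i++) {
        final int id = i;
        CompletableFuture<String> result = CompletableFuture
            .supplyAsync(() -> callDownstream(ns, id), handlerPool)
            .thenCompose(response -> response);
        results.add(result);
      }
      CompletableFuture.allOf(results.toArray(new CompletableFuture<?>[0])).get();
      return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    } finally {
      handlerPool.shutdownNow();
      ns.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    // One handler and ten requests, mirroring the example in the description.
    System.out.println("sync  elapsed ms: " + runSync(1, 10));
    System.out.println("async elapsed ms: " + runAsync(1, 10));
  }
}
```

With one handler and ten requests, the synchronous run should take roughly ten times the simulated latency because the calls are serialized, while the asynchronous run should finish in about one latency period. Exact timings depend on the machine.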



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

