[ https://issues.apache.org/jira/browse/HDFS-17531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17916757#comment-17916757 ]
ASF GitHub Bot commented on HDFS-17531:
---------------------------------------

hadoop-yetus commented on PR #7308:
URL: https://github.com/apache/hadoop/pull/7308#issuecomment-2612809368

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:--------:|:-------:|
| +0 :ok: | reexec | 0m 33s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 2s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
| +0 :ok: | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 25 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 6m 8s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 31m 20s | | trunk passed |
| +1 :green_heart: | compile | 16m 46s | | trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | compile | 15m 24s | | trunk passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~20.04-ga |
| +1 :green_heart: | checkstyle | 4m 9s | | trunk passed |
| +1 :green_heart: | mvnsite | 5m 0s | | trunk passed |
| +1 :green_heart: | javadoc | 4m 10s | | trunk passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javadoc | 4m 11s | | trunk passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~20.04-ga |
| +1 :green_heart: | spotbugs | 10m 2s | | trunk passed |
| +1 :green_heart: | shadedclient | 35m 15s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 35m 41s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 33s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 3m 21s | | the patch passed |
| +1 :green_heart: | compile | 16m 32s | | the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javac | 16m 32s | | the patch passed |
| +1 :green_heart: | compile | 15m 41s | | the patch passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~20.04-ga |
| +1 :green_heart: | javac | 15m 41s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 4m 10s | | root: The patch generated 0 new + 95 unchanged - 73 fixed = 95 total (was 168) |
| +1 :green_heart: | mvnsite | 5m 17s | | the patch passed |
| +1 :green_heart: | javadoc | 4m 24s | | the patch passed with JDK Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 |
| +1 :green_heart: | javadoc | 4m 26s | | the patch passed with JDK Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~20.04-ga |
| +1 :green_heart: | spotbugs | 10m 45s | | the patch passed |
| +1 :green_heart: | shadedclient | 39m 56s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| -1 :x: | unit | 20m 45s | [/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt) | hadoop-common in the patch passed. |
| -1 :x: | unit | 0m 48s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-client.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-client.txt) | hadoop-hdfs-client in the patch failed. |
| -1 :x: | unit | 1m 51s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) | hadoop-hdfs in the patch failed. |
| -1 :x: | unit | 0m 46s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt) | hadoop-hdfs-rbf in the patch failed. |
| +0 :ok: | asflicense | 0m 47s | | ASF License check generated no output? |
| | | 267m 27s | | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.ipc.TestIPC |
| | hadoop.metrics2.source.TestJvmMetrics |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.47 ServerAPI=1.47 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/7308 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint markdownlint |
| uname | Linux d9b44a0e37cb 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / b47e11e67dbfb450780f72f727475d1380d046b6 |
| Default Java | Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~20.04-ga |
| Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.25+9-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_432-8u432-ga~us1-0ubuntu2~20.04-ga |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/testReport/ |
| Max. process+thread count | 970 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs hadoop-hdfs-project/hadoop-hdfs-rbf U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7308/4/console |
| versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

> RBF: Asynchronous router RPC
> ----------------------------
>
>                 Key: HDFS-17531
>                 URL: https://issues.apache.org/jira/browse/HDFS-17531
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: rbf
>            Reporter: Jian Zhang
>            Assignee: Jian Zhang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: Async router single ns performance test.pdf, Aynchronous
> router.pdf, Comparison of Async router & sync router performance.pdf,
> HDFS-17531.001.patch, Router asynchronous rpc implementation .pdf,
> image-2024-05-19-18-07-51-282.png
>
>
> *Description*
> Currently, the main function of the Router service is to accept client
> requests, forward them to the corresponding downstream nameservice (ns), and
> then return the results from the downstream ns to the client. The RPC path is
> as follows:
> *!image-2024-05-19-18-07-51-282.png|width=900,height=300!*
> The main threads involved in the RPC path are:
> {*}Reader{*}: Get the client request and put it into the call queue *(1)*
> {*}Handler{*}:
> Take a call from the call queue *(2)*, process it, generate a new call, add
> it to the connection thread's calls (connection.calls), and wait for the call
> processing to complete *(3)*
> After being awakened by the connection thread, process the response and put
> it into the response queue *(5)*
> *Connection:*
> Hold the link with the downstream ns, send the calls in connection.calls to
> the downstream ns (via {*}rpcRequestThread{*}), and obtain a response from ns.
> Based on the call referenced in the response, notify that call that its
> processing is complete *(4)*
> *Responder:*
> Retrieve the response from the response queue *(6)* and return it to the
> client
>
> *Shortcoming*
> Even if the *connection* thread could send more requests to downstream
> nameservices, *(3)* and *(4)* are synchronous: after the *handler* thread adds
> a call to connection.calls, it must wait until the *connection* thread
> notifies it that the call is complete. Only after the response has been put
> into the response queue can the handler take a new call from the call queue
> and process it. Therefore, the concurrency of the router is limited by the
> number of handlers. A simple example: if the number of handlers is 1 and the
> maximum number of calls in the connection thread is 10, then even though the
> connection thread can send 10 requests to the downstream ns, the router can
> only process requests one at a time.
>
> Since the performance of router RPC is mainly limited by the number of
> handlers, the most effective way to improve RPC performance currently is to
> increase the number of handlers. However, creating a large number of handler
> threads also increases thread context switching and does not make full use of
> the machine.
>
> There are usually multiple ns downstream of the router. If a handler forwards
> a request to a slow ns, the handler waits for a long time. With fewer handlers
> available, the router's capacity to serve requests for the ns that are
> performing normally drops as well. From the client's perspective, the
> downstream ns behind the router appear to have degraded. We often find that
> the call queue of the downstream ns is short while the call queue of the
> router is very long.
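
Illustrative sketch (not part of the Jira issue or the PR): the handler-count bound described under *Shortcoming* can be reproduced with plain `java.util.concurrent`. The class `HandlerBoundDemo`, the method `peakInFlight`, and the constants `CALLS` and `NS_DELAY_MS` are hypothetical names chosen for this example. A single-thread pool stands in for the router's one handler, and a scheduled executor stands in for the connection thread plus the downstream ns. None of this is Hadoop's actual IPC code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class HandlerBoundDemo {
    static final int CALLS = 10;          // assumed max outstanding calls on the connection
    static final long NS_DELAY_MS = 100;  // assumed latency of the simulated downstream ns

    /** Sends CALLS requests through ONE handler; returns the peak number of calls in flight downstream. */
    static int peakInFlight(boolean async) throws Exception {
        ExecutorService handler = Executors.newFixedThreadPool(1);
        ScheduledExecutorService connection = Executors.newScheduledThreadPool(1);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        List<CompletableFuture<String>> responses = new ArrayList<>();
        try {
            for (int i = 0; i < CALLS; i++) {
                final int id = i;
                CompletableFuture<String> done = new CompletableFuture<>();
                responses.add(done);
                handler.submit(() -> {
                    // Step (3): the handler hands the call to the connection.
                    CompletableFuture<String> reply = new CompletableFuture<>();
                    peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                    connection.schedule(() -> {
                        inFlight.decrementAndGet();
                        reply.complete("response-" + id);  // step (4): notify the call
                    }, NS_DELAY_MS, TimeUnit.MILLISECONDS);
                    if (async) {
                        // Asynchronous: register a callback; the handler is free at once.
                        reply.thenAccept(done::complete);
                    } else {
                        // Synchronous: the handler blocks here until step (4) happens.
                        done.complete(reply.join());
                    }
                });
            }
            CompletableFuture.allOf(responses.toArray(new CompletableFuture[0]))
                    .get(30, TimeUnit.SECONDS);
            return peak.get();
        } finally {
            handler.shutdownNow();
            connection.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("sync  peak in-flight: " + peakInFlight(false));
        System.out.println("async peak in-flight: " + peakInFlight(true));
    }
}
```

In the synchronous run the peak is 1, matching the one-handler example in the description. In the asynchronous run the peak rises above 1 (typically up to `CALLS`), because the handler no longer waits for step (4) before taking the next call.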
>
> Therefore, although the main function of the router is to federate and handle
> requests from multiple ns, the current synchronous RPC cannot meet the needs
> of deployments with many ns downstream of the router. Even if the concurrency
> of the router can be improved by increasing the number of handlers, it is
> still relatively slow. More threads increase CPU context-switching time, and
> in fact many of the handler threads are in a blocked state, which is a waste
> of thread resources. When a request enters the router, there is no guarantee
> that an idle handler will be available to process it.
>
> Therefore, I propose an asynchronous router RPC. Please see the attached
> *pdf* for the complete solution.
>
> Everyone is welcome to discuss!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org