[
https://issues.apache.org/jira/browse/HADOOP-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523465#comment-17523465
]
Aihua Xu edited comment on HADOOP-13144 at 4/17/22 10:52 PM:
-------------------------------------------------------------
This change has not been merged into trunk, but we have applied it in our
environment and observed the performance improvement. However, when the routers
are heavily overloaded, they create too many connections against the NameNode,
and the NameNode degrades from holding too many file descriptors. We have tuned
the connection pool size (logical connections) within the routers, but that
does not give much control over the physical connections. I have attached a new
change that enables fine-grained configuration of the number of physical
connections per nameservice or per user. With such control/connection sharing,
we did not observe performance degradation with a connection size of 64.
{{dfs.federation.router.ipc.connection.size=64}}
{{dfs.federation.router.ipc.connection.size.ns0=32}}
{{dfs.federation.router.ipc.connection.size.ns0.ingestion=128}}
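As a minimal sketch of how such keys could be resolved (this is not the attached patch; the class and method names are hypothetical, and the default of 64 is just the example value above), the router can walk from the most specific key to the least specific one using the standard {{Configuration}} lookups:
{code:java}
import org.apache.hadoop.conf.Configuration;

/** Illustrative sketch only; not part of the attached patch. */
public class ConnectionSizeResolver {
  private static final String BASE_KEY =
      "dfs.federation.router.ipc.connection.size";
  private static final int DEFAULT_SIZE = 64; // example default from the comment above

  private final Configuration conf;

  public ConnectionSizeResolver(Configuration conf) {
    this.conf = conf;
  }

  /** Most specific key wins: base.nsId.user, then base.nsId, then base. */
  public int getMaxPhysicalConnections(String nsId, String user) {
    int global = conf.getInt(BASE_KEY, DEFAULT_SIZE);
    int perNs = conf.getInt(BASE_KEY + "." + nsId, global);
    return conf.getInt(BASE_KEY + "." + nsId + "." + user, perNs);
  }
}
{code}
With the example values above, ns0/ingestion would resolve to 128 physical connections, other ns0 users would share 32, and any other nameservice would fall back to the global 64.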
> Enhancing IPC client throughput via multiple connections per user
> -----------------------------------------------------------------
>
> Key: HADOOP-13144
> URL: https://issues.apache.org/jira/browse/HADOOP-13144
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Jason Kace
> Assignee: Íñigo Goiri
> Priority: Minor
> Attachments: HADOOP-13144-performance.patch, HADOOP-13144.000.patch,
> HADOOP-13144.001.patch, HADOOP-13144.002.patch, HADOOP-13144.003.patch,
> HADOOP-13144_overload_enhancement.patch
>
>
> The generic IPC client ({{org.apache.hadoop.ipc.Client}}) utilizes a single
> connection thread for each {{ConnectionId}}. The {{ConnectionId}} is unique
> to the connection's remote address, ticket and protocol. Each ConnectionId
> is 1:1 mapped to a connection thread by the client via a map cache.
> The result is that all IPC read/write activity is serialized through a
> single thread for each user/ticket + address. If a single user makes
> repeated calls (1k-100k/sec) to the same destination, the IPC client
> becomes a bottleneck.
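As a conceptual sketch of the enhancement this issue asks for (this is not the actual {{org.apache.hadoop.ipc.Client}} internals; all names here are hypothetical), the single-thread serialization can be relaxed by keeping a small pool of physical connections per connection key and picking one round-robin:
{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Conceptual sketch: N physical connections per connection key, chosen round-robin. */
public class MultiConnectionPool<C> {
  private final int connectionsPerKey;
  private final Map<String, List<C>> pools = new ConcurrentHashMap<>();
  private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

  public MultiConnectionPool(int connectionsPerKey) {
    this.connectionsPerKey = connectionsPerKey;
  }

  /** key stands in for the (remote address, ticket, protocol) ConnectionId. */
  public C getConnection(String key, Supplier<C> connectionFactory) {
    List<C> pool = pools.computeIfAbsent(key, k -> {
      List<C> list = new CopyOnWriteArrayList<>();
      for (int i = 0; i < connectionsPerKey; i++) {
        list.add(connectionFactory.get()); // open N connections instead of a single one
      }
      return list;
    });
    AtomicInteger next = counters.computeIfAbsent(key, k -> new AtomicInteger());
    // Round-robin so one hot user/ticket + address no longer serializes on a single thread.
    return pool.get(Math.floorMod(next.getAndIncrement(), pool.size()));
  }
}
{code}
With connectionsPerKey = 1 this reduces to the current 1:1 behavior described above; values above 1 trade extra sockets and threads for parallelism, which is exactly the tension the per-nameservice/per-user limits in the comment above are meant to manage.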