[
https://issues.apache.org/jira/browse/HADOOP-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17523465#comment-17523465
]
Aihua Xu edited comment on HADOOP-13144 at 4/17/22 10:52 PM:
-------------------------------------------------------------
This change has not been merged into trunk, but we have applied it in our
environment and observed the performance improvement. However, when the routers
are heavily overloaded, they create too many connections against the NameNode,
and the NameNode degrades from holding too many file descriptors. We have tuned
the connection pool size (logical connections) within the routers, but that
does not give much control over the physical connections. I have attached a new
change that enables fine-grained configuration of the number of physical
connections per nameservice or per user. With such control/connection sharing,
we did not observe performance degradation with a connection size of 64.
{{dfs.federation.router.ipc.connection.size=64}}
{{dfs.federation.router.ipc.connection.size.ns0=32}}
{{dfs.federation.router.ipc.connection.size.ns0.ingestion=128}}
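As a minimal sketch of how such keys could be resolved (this is not the attached patch; the class and method names are hypothetical, and the default of 64 is just the example value above), the router can walk from the most specific key to the least specific one using the standard {{Configuration}} lookups:
{code:java}
import org.apache.hadoop.conf.Configuration;

/** Illustrative sketch only; not part of the attached patch. */
public class ConnectionSizeResolver {
  private static final String BASE_KEY =
      "dfs.federation.router.ipc.connection.size";
  private static final int DEFAULT_SIZE = 64; // example default from the comment above

  private final Configuration conf;

  public ConnectionSizeResolver(Configuration conf) {
    this.conf = conf;
  }

  /** Most specific key wins: base.nsId.user, then base.nsId, then base. */
  public int getMaxPhysicalConnections(String nsId, String user) {
    int global = conf.getInt(BASE_KEY, DEFAULT_SIZE);
    int perNs = conf.getInt(BASE_KEY + "." + nsId, global);
    return conf.getInt(BASE_KEY + "." + nsId + "." + user, perNs);
  }
}
{code}
With the example values above, ns0/ingestion would resolve to 128 physical connections, other ns0 users would share 32, and any other nameservice would fall back to the global 64.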
> Enhancing IPC client throughput via multiple connections per user
> -----------------------------------------------------------------
>
> Key: HADOOP-13144
> URL: https://issues.apache.org/jira/browse/HADOOP-13144
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Jason Kace
> Assignee: Íñigo Goiri
> Priority: Minor
> Attachments: HADOOP-13144-performance.patch, HADOOP-13144.000.patch,
> HADOOP-13144.001.patch, HADOOP-13144.002.patch, HADOOP-13144.003.patch,
> HADOOP-13144_overload_enhancement.patch
>
>
> The generic IPC client ({{org.apache.hadoop.ipc.Client}}) utilizes a single
> connection thread for each {{ConnectionId}}. The {{ConnectionId}} is unique
> to the connection's remote address, ticket and protocol. Each ConnectionId
> is 1:1 mapped to a connection thread by the client via a map cache.
> The result is that all IPC read/write activity is serialized through a
> single thread for each user/ticket + address. If a single user makes
> repeated calls (1k-100k/sec) to the same destination, the IPC client
> becomes a bottleneck.
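As a conceptual sketch of the enhancement this issue asks for (this is not the actual {{org.apache.hadoop.ipc.Client}} internals; all names here are hypothetical), the single-thread serialization can be relaxed by keeping a small pool of physical connections per connection key and picking one round-robin:
{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Conceptual sketch: N physical connections per connection key, chosen round-robin. */
public class MultiConnectionPool<C> {
  private final int connectionsPerKey;
  private final Map<String, List<C>> pools = new ConcurrentHashMap<>();
  private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

  public MultiConnectionPool(int connectionsPerKey) {
    this.connectionsPerKey = connectionsPerKey;
  }

  /** key stands in for the (remote address, ticket, protocol) ConnectionId. */
  public C getConnection(String key, Supplier<C> connectionFactory) {
    List<C> pool = pools.computeIfAbsent(key, k -> {
      List<C> list = new CopyOnWriteArrayList<>();
      for (int i = 0; i < connectionsPerKey; i++) {
        list.add(connectionFactory.get()); // open N connections instead of a single one
      }
      return list;
    });
    AtomicInteger next = counters.computeIfAbsent(key, k -> new AtomicInteger());
    // Round-robin so one hot user/ticket + address no longer serializes on a single thread.
    return pool.get(Math.floorMod(next.getAndIncrement(), pool.size()));
  }
}
{code}
With connectionsPerKey = 1 this reduces to the current 1:1 behavior described above; values above 1 trade extra sockets and threads for parallelism, which is exactly the tension the per-nameservice/per-user limits in the comment above are meant to manage.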