[
https://issues.apache.org/jira/browse/HADOOP-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16814103#comment-16814103
]
Vinayakumar B commented on HADOOP-13144:
----------------------------------------
I agree that adding multiple connections from the same user to the same server
increases the throughput of RPCs sent from client to server.
But in the case of the Router, overall throughput depends on the NameNode's
ability to process RPCs.
I assume the verification done by [~ywskycn] consisted entirely of read RPC
calls, which don't wait for locks in the NameNode and get processed
immediately. What if the RPCs are mixed? Then total throughput will stay the
same, even if the number of connections from client to server is increased.
If the connection is the bottleneck, did you try increasing the number of
Reader threads at the NameNode?
Even allowing multiple connections from the client would also need an increase
in the Reader threads in the NameNode.
There is one more way for RPC requests to get blocked: the Reader threads on
the server side may be busy authenticating connections when security is
enabled.
HADOOP-15602 offloads connection establishment to separate handlers, which
in turn increases the RPC throughput for already established connections.
[~ywskycn], are you seeing the bottleneck (without this change) in
establishing the connection, or in writing requests over the connection?
To summarize:
My understanding is that the current bottleneck for RPC throughput is the
server's ability to process RPCs. With more connections, RPCs will reach the
server's queue faster, but will still take the same time to get processed.
An increase in throughput can be seen only if all the RPCs are reads that are
executed in parallel. Maybe this will help Standby NameNode reads.
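
The argument above can be illustrated with a toy model. This is only a
sketch under stated assumptions, not a measurement and not Hadoop code: the
class RpcBottleneckModel, its parameters and the numbers used are all
hypothetical. Each connection is assumed to deliver one call every
sendMicros, and "parallelism" stands for how many calls the server can
process at once (1 approximates calls serialized behind a lock, a large
value approximates reads running in parallel).

```java
import java.util.PriorityQueue;

// Toy model (hypothetical, not part of Hadoop): time to finish a batch of RPCs
// given the number of client connections and the server's processing capacity.
class RpcBottleneckModel {

    /**
     * @param calls         number of RPCs in the batch
     * @param connections   client connections; calls are spread round-robin
     * @param sendMicros    time one connection needs to deliver one call
     * @param parallelism   calls the server can process concurrently
     * @param serviceMicros processing time of one call on the server
     * @return time (in microseconds) at which the last call completes
     */
    static long makespanMicros(int calls, int connections, long sendMicros,
                               int parallelism, long serviceMicros) {
        // Each entry is the time at which one server processing slot becomes free.
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int i = 0; i < parallelism; i++) {
            freeAt.add(0L);
        }
        long last = 0;
        for (int i = 0; i < calls; i++) {
            // The k-th call on a connection reaches the server queue at (k + 1) * sendMicros.
            long arrival = (i / connections + 1) * sendMicros;
            // The call starts once it has arrived and a processing slot is free.
            long start = Math.max(arrival, freeAt.poll());
            long end = start + serviceMicros;
            freeAt.add(end);
            last = Math.max(last, end);
        }
        return last;
    }

    public static void main(String[] args) {
        // Serialized processing: extra connections do not change the result.
        System.out.println(makespanMicros(1000, 1, 10, 1, 100));   // 100010
        System.out.println(makespanMicros(1000, 4, 10, 1, 100));   // 100010
        // Highly parallel processing: extra connections shorten the batch.
        System.out.println(makespanMicros(1000, 1, 10, 100, 100)); // 10100
        System.out.println(makespanMicros(1000, 4, 10, 100, 100)); // 2600
    }
}
```

In this model, going from 1 to 4 connections only helps when the server can
process many calls in parallel; when processing is serialized, calls just wait
longer in the queue.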
> Enhancing IPC client throughput via multiple connections per user
> -----------------------------------------------------------------
>
> Key: HADOOP-13144
> URL: https://issues.apache.org/jira/browse/HADOOP-13144
> Project: Hadoop Common
> Issue Type: Improvement
> Components: ipc
> Reporter: Jason Kace
> Assignee: Íñigo Goiri
> Priority: Minor
> Attachments: HADOOP-13144.000.patch, HADOOP-13144.001.patch,
> HADOOP-13144.002.patch, HADOOP-13144.003.patch
>
>
> The generic IPC client ({{org.apache.hadoop.ipc.Client}}) utilizes a single
> connection thread for each {{ConnectionId}}. The {{ConnectionId}} is unique
> to the connection's remote address, ticket and protocol. Each ConnectionId
> is 1:1 mapped to a connection thread by the client via a map cache.
> The result is to serialize all IPC read/write activity through a single
> thread for each user/ticket + address. If a single user makes repeated
> calls (1k-100k/sec) to the same destination, the IPC client becomes a
> bottleneck.
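
The 1:1 mapping described in the issue can be sketched as follows. This is a
simplified illustration, not the code of org.apache.hadoop.ipc.Client or of
the attached patches: ConnectionCacheSketch, Key, Conn and the "slot" field
are hypothetical names. With one slot per key the cache behaves like the
description above (one connection per address/ticket/protocol). Adding a slot
index to the key is one assumed way to allow several connections for the same
user and destination.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a connection cache keyed by address, ticket and protocol.
class ConnectionCacheSketch {

    /** Stand-in for ConnectionId; "slot" is an assumed extra field, not from the source. */
    static final class Key {
        final String address;
        final String ticket;
        final String protocol;
        final int slot;

        Key(String address, String ticket, String protocol, int slot) {
            this.address = address;
            this.ticket = ticket;
            this.protocol = protocol;
            this.slot = slot;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key k = (Key) o;
            return slot == k.slot && address.equals(k.address)
                    && ticket.equals(k.ticket) && protocol.equals(k.protocol);
        }

        @Override
        public int hashCode() {
            return Objects.hash(address, ticket, protocol, slot);
        }
    }

    /** Stand-in for a connection thread; a real one would own a socket. */
    static final class Conn {
        final Key key;

        Conn(Key key) {
            this.key = key;
        }
    }

    private final Map<Key, Conn> cache = new ConcurrentHashMap<>();
    private final AtomicInteger next = new AtomicInteger();
    private final int slotsPerId;

    ConnectionCacheSketch(int slotsPerId) {
        this.slotsPerId = slotsPerId;
    }

    /** Returns the cached connection for the key, creating it on first use. */
    Conn get(String address, String ticket, String protocol) {
        // One counter shared by all keys picks the slot in round-robin order.
        int slot = Math.floorMod(next.getAndIncrement(), slotsPerId);
        return cache.computeIfAbsent(new Key(address, ticket, protocol, slot), Conn::new);
    }

    int size() {
        return cache.size();
    }
}
```

With slotsPerId set to 1, repeated calls to get() for the same user and
destination always return the same Conn, so all of that user's traffic goes
through one connection. With a larger value the calls are spread over that
many connections.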