[jira] [Commented] (HADOOP-13144) Enhancing IPC client throughput via multiple connections per user

Vinayakumar B (JIRA) Tue, 09 Apr 2019 23:31:24 -0700


    [ 
https://issues.apache.org/jira/browse/HADOOP-13144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16814103#comment-16814103
 ]


Vinayakumar B commented on HADOOP-13144:
----------------------------------------

I agree that opening multiple connections from the same user to the same 
server increases the throughput of RPCs sent from the client to the server.

But in the case of the Router, overall throughput depends on the NameNode's 
ability to process RPCs.

I assume the verification done by [~ywskycn] contained only read RPC calls, 
which do not wait for locks in the NameNode and get processed immediately. 
What if the RPCs are mixed? Then total throughput will stay the same, even if 
the number of client->server connections is increased.

 

If the connection is the bottleneck, did you try increasing the number of 
Reader threads at the NameNode?

Even allowing multiple connections from the client would also need an 
increase in the Reader threads in the NameNode.

 

There is one more way RPC requests can get blocked: when security is enabled, 
the Reader threads on the server side can be busy authenticating connections.

HADOOP-15602 offloads connection establishment to separate handlers, which 
in turn increases the RPC throughput for already established connections.

[~ywskycn], are you seeing the bottleneck (without this change) in 
establishing the connection, or in writing requests over the connection?

 

To summarize, 

My understanding is that the current bottleneck for RPC throughput is the 
server's ability to process RPCs. With more connections, RPCs will reach the 
server's queue faster, but they will still take the same time to be processed.

An increase in throughput can be seen only if all READ RPCs are executed in 
parallel. Maybe this will help Standby NameNode reads.
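
The summary above treats the client and the server as a two-stage pipeline in 
which the slower stage sets the rate. Below is a minimal Java sketch of that 
reasoning. The class name, the method and every number in it are hypothetical 
illustrations, not measurements from this issue and not Hadoop code.

```java
/** Back-of-the-envelope model: steady-state throughput is capped by the slower stage. */
final class RpcThroughputModel {
    private RpcThroughputModel() {}

    /**
     * @param connections           client connections to the server (assumed to scale send rate linearly)
     * @param sendRatePerConnection RPCs per second one connection can push (hypothetical)
     * @param serverParallelism     RPCs the server can process at the same time
     *                              (1 models calls serialized behind a lock)
     * @param serviceMicros         processing time of one RPC, in microseconds (hypothetical)
     * @return RPCs per second that actually complete
     */
    static long throughput(int connections, long sendRatePerConnection,
                           int serverParallelism, long serviceMicros) {
        long offered = connections * sendRatePerConnection;            // what the client can send
        long capacity = serverParallelism * 1_000_000L / serviceMicros; // what the server can finish
        return Math.min(offered, capacity);
    }

    public static void main(String[] args) {
        // Serialized processing: extra connections only fill the queue faster.
        System.out.println(throughput(1, 20_000, 1, 100)); // 10000
        System.out.println(throughput(4, 20_000, 1, 100)); // 10000
        // Reads processed 8 at a time: now the client side is the limit.
        System.out.println(throughput(1, 20_000, 8, 100)); // 20000
        System.out.println(throughput(4, 20_000, 8, 100)); // 80000
    }
}
```

In this model, going from one connection to four changes nothing while the 
server processes calls one at a time, and it helps only once the server can 
run the calls in parallel.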

> Enhancing IPC client throughput via multiple connections per user
> -----------------------------------------------------------------
>
>                 Key: HADOOP-13144
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13144
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc
>            Reporter: Jason Kace
>            Assignee: Íñigo Goiri
>            Priority: Minor
>         Attachments: HADOOP-13144.000.patch, HADOOP-13144.001.patch, 
> HADOOP-13144.002.patch, HADOOP-13144.003.patch
>
>
> The generic IPC client ({{org.apache.hadoop.ipc.Client}}) utilizes a single 
> connection thread for each {{ConnectionId}}.  The {{ConnectionId}} is unique 
> to the connection's remote address, ticket and protocol.  Each ConnectionId 
> is 1:1 mapped to a connection thread by the client via a map cache.
> The result is to serialize all IPC read/write activity through a single 
> thread for each user/ticket + address.  If a single user makes repeated 
> calls (1k-100k/sec) to the same destination, the IPC client becomes a 
> bottleneck.
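
The 1:1 mapping described in the issue can be pictured as a map cache keyed by 
address, ticket and protocol. The Java sketch below is a simplified stand-in, 
not the code of org.apache.hadoop.ipc.Client and not the attached patch: the 
`Key` and `Connection` classes, the extra `index` field and the round-robin 
counter are all assumptions made for illustration. With `connectionsPerKey` 
set to 1 it reproduces the single-connection behaviour; a larger value spreads 
calls over several cached connections.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class ConnectionCacheSketch {

    /** Simplified key: address + ticket + protocol, plus a hypothetical slot index. */
    static final class Key {
        final String address;
        final String ticket;
        final String protocol;
        final int index;

        Key(String address, String ticket, String protocol, int index) {
            this.address = address;
            this.ticket = ticket;
            this.protocol = protocol;
            this.index = index;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return index == k.index && address.equals(k.address)
                    && ticket.equals(k.ticket) && protocol.equals(k.protocol);
        }

        @Override
        public int hashCode() {
            return Objects.hash(address, ticket, protocol, index);
        }
    }

    /** Placeholder for a connection and the thread that serves it. */
    static final class Connection {
        final Key key;
        Connection(Key key) { this.key = key; }
    }

    private final Map<Key, Connection> cache = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();
    private final int connectionsPerKey;

    ConnectionCacheSketch(int connectionsPerKey) {
        if (connectionsPerKey < 1) {
            throw new IllegalArgumentException("connectionsPerKey must be >= 1");
        }
        this.connectionsPerKey = connectionsPerKey;
    }

    /** Picks a slot with a shared counter, then reuses or creates the cached connection. */
    Connection getConnection(String address, String ticket, String protocol) {
        int index = Math.floorMod(counter.getAndIncrement(), connectionsPerKey);
        return cache.computeIfAbsent(new Key(address, ticket, protocol, index), Connection::new);
    }

    int size() {
        return cache.size();
    }
}
```

For example, `new ConnectionCacheSketch(1)` returns the same `Connection` 
object for every call from one user to one address, while 
`new ConnectionCacheSketch(4)` ends up holding four connections for that 
user after enough calls.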



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

