[ 
https://issues.apache.org/jira/browse/HADOOP-16059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16810519#comment-16810519
 ] 

Vinayakumar B commented on HADOOP-16059:
----------------------------------------

Thanks [~ayushtkn] for the contribution.

The profiling screenshots above show a clear difference in the time consumed 
while loading the SaslFactory.

As [~jojochuang] mentioned, it may not add much value when a client interacts 
continuously with the same RPC server, because the same RPC connection is 
reused. The connection needs to be recreated only after the client has been 
idle for 10 seconds (the default).

However, there are other cases in which this patch will help.
 # The same client interacting with multiple RPC servers at infrequent 
intervals.
 ** In this case, the RPC connection to the second server will be faster, 
because the SASL factory is already cached and the time to load it will be zero.
 # Clients connecting to DataNodes to read/write data without using a cached 
connection.
 ** HDFS clients write data to DataNodes over a new TCP connection every time. 
There is NO connection cache for the writeBlock() op.
 ** For the ReadBlock() op, the connection can be cached only after a complete 
read of the intended bytes. Ex: in a sequential read, the client must consume 
the entire block data.
 ** Socket cache capacity is limited (16) and entries expire quickly (4 sec) by 
default.
 ** If the HDFS client is not data-local, it may get a different DataNode 
location for each block; in this case, cache hits will be fewer.
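
For illustration, the caching idea behind case #1 can be sketched roughly as 
below. This is only a minimal sketch and not the code from the attached 
patches: the class name CachedSaslClientFactories and its methods are 
hypothetical, and it uses only the standard javax.security.sasl API. The 
installed factories are enumerated once and every later connection reuses the 
cached list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import javax.security.auth.callback.CallbackHandler;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslClientFactory;
import javax.security.sasl.SaslException;

/** Hypothetical helper for illustration; not the class added by the patch. */
final class CachedSaslClientFactories {

  /** Holder idiom: the list is built once, lazily, under the JVM class-init lock. */
  private static final class Holder {
    static final List<SaslClientFactory> FACTORIES = load();
  }

  private CachedSaslClientFactories() {
  }

  /** Enumerates the installed SASL client factories (the expensive step). */
  private static List<SaslClientFactory> load() {
    List<SaslClientFactory> found = new ArrayList<>();
    Enumeration<SaslClientFactory> e = Sasl.getSaslClientFactories();
    while (e.hasMoreElements()) {
      found.add(e.nextElement());
    }
    return Collections.unmodifiableList(found);
  }

  /** Returns the cached list; the same instance is returned on every call. */
  static List<SaslClientFactory> factories() {
    return Holder.FACTORIES;
  }

  static int factoryCount() {
    return Holder.FACTORIES.size();
  }

  /**
   * Asks each cached factory in turn for a client, instead of enumerating
   * the factories again for every new connection. Returns null if no cached
   * factory supports any of the requested mechanisms.
   */
  static SaslClient createSaslClient(String[] mechanisms, String authzId,
      String protocol, String serverName, Map<String, ?> props,
      CallbackHandler cbh) throws SaslException {
    for (SaslClientFactory factory : Holder.FACTORIES) {
      SaslClient client = factory.createSaslClient(
          mechanisms, authzId, protocol, serverName, props, cbh);
      if (client != null) {
        return client;
      }
    }
    return null;
  }
}
```

With something like this, only the first connection pays the cost of 
enumerating the factories; a connection to a second server goes straight to 
the cached list. A server-side cache could follow the same shape using 
Sasl.getSaslServerFactories().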

[~elgoiri], I believe this change will help case #2 above more, as that case is 
more common. It's evident in the screenshots above that 
*SaslParticipant.createClientSaslParticipant()* and 
*SaslParticipant.createServerSaslParticipant()* take far less time for the same 
number of connections.

Hope it's clear.

> Use SASL Factories Cache to Improve Performance
> -----------------------------------------------
>
>                 Key: HADOOP-16059
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16059
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Ayush Saxena
>            Assignee: Ayush Saxena
>            Priority: Critical
>         Attachments: After-Dn.png, After-Read.png, After-Server.png, 
> After-write.png, Before-DN.png, Before-Read.png, Before-Server.png, 
> Before-Write.png, HADOOP-16059-01.patch, HADOOP-16059-02.patch, 
> HADOOP-16059-02.patch, HADOOP-16059-03.patch, HADOOP-16059-04.patch
>
>
> SASL Client factories can be cached and SASL Server Factories and SASL Client 
> Factories can be together extended at SaslParticipant  to improve performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
