[ 
https://issues.apache.org/jira/browse/HDFS-15757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17280404#comment-17280404
 ] 

Xiaoqiao He commented on HDFS-15757:
------------------------------------

{quote}proxy time, this is directly impacted since the change improves 
getConnection() a lot. I have taken some flame graphs of the Router to 
understand the performance bottleneck, and I often see getConnection() in the 
stack taking a lot of time. With this change, connections are kept active as 
much as possible, whereas previously connections were left not fully closed 
and hit the connection cap for the pool, so no more active connections could 
be created.{quote}
Thanks [~fengnanli], this is a very clear explanation. I would like to deploy 
this improvement in my production environment and will post the results when 
they are ready.
One more point (possibly not closely related to this issue): in my experience, 
(process time + queue time) alone is not enough to measure the end-to-end 
improvement on the Router side. The Router rejects many requests when the 
CallQueue piles up, which happens when process time increases or another 
bottleneck is hit. End-to-end request time then grows, IIRC.
So, in my opinion, it would be better to combine process time with the number 
of rejections to reflect the improvement. Thanks again for your work, 
[~fengnanli].
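
To illustrate the point about combining process time with reject numbers, 
here is a minimal sketch. It is not based on actual Router metric names: the 
RouterSample fields, the retry_backoff_ms parameter, and the assumption that a 
rejected call is retried after one fixed backoff with the same reject 
probability are all hypothetical and only serve to show why two runs with 
identical (process time + queue time) can differ end to end.

```python
from dataclasses import dataclass


@dataclass
class RouterSample:
    """One sampling window of Router RPC metrics (hypothetical field names)."""
    processed_calls: int    # calls the Router actually handled
    rejected_calls: int     # calls rejected because the call queue was full
    avg_process_ms: float   # average process time of handled calls
    avg_queue_ms: float     # average queue time of handled calls


def reject_ratio(s: RouterSample) -> float:
    """Fraction of all attempted calls that were rejected."""
    total = s.processed_calls + s.rejected_calls
    return s.rejected_calls / total if total else 0.0


def estimated_end_to_end_ms(s: RouterSample, retry_backoff_ms: float) -> float:
    """Rough client-side latency estimate.

    Assumes (for illustration only) that every rejected attempt costs one
    fixed backoff and is then retried with the same reject probability.
    """
    p = reject_ratio(s)
    if p >= 1.0:
        return float("inf")
    # Mean number of rejections before the first accepted attempt.
    expected_rejections = p / (1.0 - p)
    return s.avg_process_ms + s.avg_queue_ms + expected_rejections * retry_backoff_ms


# Same process and queue time, different reject counts (made-up numbers).
before = RouterSample(processed_calls=9000, rejected_calls=1000,
                      avg_process_ms=5.0, avg_queue_ms=3.0)
after = RouterSample(processed_calls=9900, rejected_calls=100,
                     avg_process_ms=5.0, avg_queue_ms=3.0)

for name, sample in (("before", before), ("after", after)):
    print(name, round(reject_ratio(sample), 4),
          round(estimated_end_to_end_ms(sample, retry_backoff_ms=100.0), 2))
```

In this sketch both samples report 8 ms of (process time + queue time), yet 
the estimated end-to-end time differs because of the reject ratio alone, 
which is why looking at both numbers together seems more informative.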

> RBF: Improving Router Connection Management
> -------------------------------------------
>
>                 Key: HDFS-15757
>                 URL: https://issues.apache.org/jira/browse/HDFS-15757
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: rbf
>            Reporter: Fengnan Li
>            Assignee: Fengnan Li
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: RBF_ Improving Router Connection Management_v2.pdf, RBF_ 
> Improving Router Connection Management_v3.pdf, RBF_ Router Connection 
> Management.pdf
>
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> We have seen a high number of connections from the Router to the namenodes, 
> leaving the namenodes unstable.
> This ticket tries to reduce the number of connections through several 
> changes. Please take a look at the design and leave comments.
> Thanks!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

