[ https://issues.apache.org/jira/browse/HDFS-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14271922#comment-14271922 ]
Daryn Sharp commented on HDFS-7597:
-----------------------------------
Good question, but synchronizing the whole operation would let a cache miss on
one lookup stall all other lookups, including those that would be cache hits.
While that might be tolerable on the client side, where the impact is minimal,
I don't think it's worth dragging down the performance of the server-side
connection handling.
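
To illustrate the trade-off, here is a minimal sketch of a cache whose hits
never wait behind another key's miss. It is not the code in the attached
patch: PerKeyCache, its loader, and loadCount() are hypothetical names made up
for this example. A single "synchronized" get() would instead hold one lock for
the duration of every load.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical cache, used only to illustrate lock granularity. */
final class PerKeyCache<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;              // the slow path on a miss
    private final AtomicInteger loads = new AtomicInteger();

    PerKeyCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // Cache hit: ConcurrentHashMap.get() does not block, so hits are not
        // stalled by a miss that is still loading some other key.
        V hit = entries.get(key);
        if (hit != null) {
            return hit;
        }
        // Cache miss: computeIfAbsent() runs the loader at most once per key
        // and only blocks callers that contend for the same hash bin.
        return entries.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    /** Number of times the loader actually ran. */
    int loadCount() {
        return loads.get();
    }
}
```

The cost of this design is that the loader must be safe to run concurrently
for different keys, which a single coarse lock would not require.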
> Clients seeking over webhdfs may crash the NN
> ---------------------------------------------
>
> Key: HDFS-7597
> URL: https://issues.apache.org/jira/browse/HDFS-7597
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: webhdfs
> Affects Versions: 2.0.0-alpha
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Priority: Critical
> Attachments: HDFS-7597.patch
>
>
> Webhdfs seeks involve closing the current connection, and reissuing a new
> open request with the new offset. The RPC layer caches connections so the DN
> keeps a lingering connection open to the NN. Connection caching is in part
> based on UGI. Although the client uses the same token for the new offset
> request, the UGI is different, which forces the DN to open another unnecessary
> connection to the NN.
> A job that performs many seeks will easily crash the NN due to fd exhaustion.
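
The failure mode described above can be modelled in a few lines. This is only
a sketch under stated assumptions, not Hadoop's RPC code: Caller stands in for
a per-request identity that compares by object identity, and SeekDemo and its
methods are hypothetical names. When the cache key is a fresh identity object
per request, every seek adds an entry; when the key is the token itself, all
seeks share one entry.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a per-request identity; not Hadoop's UGI class. */
final class Caller {
    final String token;

    Caller(String token) {
        this.token = token;
    }
    // equals()/hashCode() are deliberately not overridden, so two Caller
    // objects built from the same token are still different map keys.
}

final class SeekDemo {
    /** Cache keyed on the identity object: one cached connection per seek. */
    static int connectionsKeyedByCaller(int seeks, String token) {
        Map<Object, Object> cache = new HashMap<>();
        for (int i = 0; i < seeks; i++) {
            Caller caller = new Caller(token);   // new identity for each request
            cache.computeIfAbsent(caller, k -> new Object());
        }
        return cache.size();
    }

    /** Cache keyed on the token value: all seeks reuse one connection. */
    static int connectionsKeyedByToken(int seeks, String token) {
        Map<Object, Object> cache = new HashMap<>();
        for (int i = 0; i < seeks; i++) {
            Caller caller = new Caller(token);
            cache.computeIfAbsent(caller.token, k -> new Object());
        }
        return cache.size();
    }
}
```

In this model each map entry represents an open connection, and therefore a
file descriptor, which is why the first variant grows without bound under a
seek-heavy job.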