[ 
https://issues.apache.org/jira/browse/HDFS-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14277623#comment-14277623
 ] 

Colin Patrick McCabe commented on HDFS-7597:
--------------------------------------------

The cache builder code is only used once at startup, though, to build the cache 
object.  Being readable and developer-friendly is clearly the right thing to do 
for code that runs only once at startup.  If there are examples of 
inefficiencies in code paths that will actually be exercised at runtime, that 
would be more interesting.
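
To make the distinction concrete, here is a minimal sketch of the pattern (not 
the actual patch code; the cache type, key/value types, and tuning values are 
placeholders): the builder chain runs exactly once at startup, and only the 
resulting cache object is touched per request afterward.

{code:java}
import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

public class StartupCacheSketch {
  // Built exactly once at startup; readability matters more than speed here.
  private final LoadingCache<String, Long> cache = CacheBuilder.newBuilder()
      .maximumSize(1000)                         // illustrative bound
      .expireAfterAccess(10, TimeUnit.MINUTES)   // illustrative expiry
      .build(new CacheLoader<String, Long>() {
        @Override
        public Long load(String key) {
          return Long.valueOf(key.length());     // placeholder loader
        }
      });

  // This is the code that actually runs at runtime, once per lookup.
  public long lookup(String key) {
    return cache.getUnchecked(key);
  }
}
{code}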

bq. A CHM is neither useful nor performant unless you intend to cache many 
multiples of the number of accessing threads. Probably on the order of 
thousands which is overkill.

Can you go into more detail about when the performance of a 
{{ConcurrentHashMap}} would be worse than that of a regular {{HashMap}}?  The 
last time I looked at it, CHM was just using lock striping.  So basically each 
"get" or "put" takes a single lock, does its business, and then releases it.  
That is the same level of per-operation overhead as a hash map guarded by a 
single lock.  I don't think using multiple locks will be slower than using 
one: by definition, interlocked instructions bypass CPU caches... that's what 
they're designed to do and must do.
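
To illustrate what I mean by lock striping, here is a toy sketch (just the 
idea, not the real {{ConcurrentHashMap}} internals): the key hashes to one of 
N independently locked segments, so each get or put still acquires exactly one 
lock; it just isn't the same lock for every key.

{code:java}
import java.util.HashMap;
import java.util.Map;

public class StripedMap<K, V> {
  private static final int STRIPES = 16;

  @SuppressWarnings("unchecked")
  private final Map<K, V>[] segments = (Map<K, V>[]) new Map[STRIPES];

  public StripedMap() {
    for (int i = 0; i < STRIPES; i++) {
      segments[i] = new HashMap<K, V>();
    }
  }

  // Pick the one segment (and therefore the one lock) this key maps to.
  private Map<K, V> segmentFor(Object key) {
    return segments[(key.hashCode() & Integer.MAX_VALUE) % STRIPES];
  }

  public V get(K key) {
    Map<K, V> seg = segmentFor(key);
    synchronized (seg) {          // one lock acquire per get
      return seg.get(key);
    }
  }

  public V put(K key, V value) {
    Map<K, V> seg = segmentFor(key);
    synchronized (seg) {          // one lock acquire per put
      return seg.put(key, value);
    }
  }
}
{code}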

Like I said earlier, I am fine with this patch going in as-is (assuming the 
test failure is unrelated).  But I'd like to get a better understanding of the 
performance issues here so we can optimize in the future.

> Clients seeking over webhdfs may crash the NN
> ---------------------------------------------
>
>                 Key: HDFS-7597
>                 URL: https://issues.apache.org/jira/browse/HDFS-7597
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: webhdfs
>    Affects Versions: 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>            Priority: Critical
>         Attachments: HDFS-7597.patch
>
>
> Webhdfs seeks involve closing the current connection and issuing a new open 
> request at the new offset.  The RPC layer caches connections, so the DN 
> keeps a lingering connection open to the NN.  Connection caching is in part 
> based on UGI.  Although the client used the same token for the new offset 
> request, the UGI is different, which forces the DN to open another 
> unnecessary connection to the NN.
> A job that performs many seeks will easily crash the NN due to fd exhaustion.
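
As a rough sketch of the failure mode described above (the class and field 
names here are hypothetical, not Hadoop's actual RPC connection ID): because 
the UGI takes part in the cache key's equals/hashCode, a reopened stream that 
carries the same token but a freshly constructed UGI yields a distinct key, so 
the DN opens yet another connection to the NN instead of reusing the cached 
one.

{code:java}
import java.util.Objects;

// Hypothetical connection-cache key, for illustration only.  The UGI is part
// of equals/hashCode, so the same token with a new, non-equal UGI instance
// maps to a different key and the connection cache misses.
final class ConnKey {
  private final String nnAddress;
  private final Object ugi;  // stands in for UserGroupInformation

  ConnKey(String nnAddress, Object ugi) {
    this.nnAddress = nnAddress;
    this.ugi = ugi;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof ConnKey)) {
      return false;
    }
    ConnKey other = (ConnKey) o;
    return nnAddress.equals(other.nnAddress) && ugi.equals(other.ugi);
  }

  @Override
  public int hashCode() {
    return Objects.hash(nnAddress, ugi);
  }
}
{code}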



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
