[ https://issues.apache.org/jira/browse/HDFS-12533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16678603#comment-16678603 ]
Erik Krogen commented on HDFS-12533:
------------------------------------
An update on this: I reran the experiments I had run before HADOOP-9747,
using the following command:
{code}
$HADOOP_HOME/bin/hadoop jar \
  $HADOOP_HOME/share/hadoop/hdfs/hadoop-hdfs-3.3.0-SNAPSHOT-tests.jar \
  org.apache.hadoop.hdfs.server.namenode.NNThroughputBenchmark -op fileStatus \
  -threads 1000 -files 5000000 -filesPerDir 10 -useExisting -keepResults
{code}
I ran it 3 times on trunk and 3 times on a hacked build of trunk in which
{{Server#getRemoteUser()}} returns a statically defined UGI, avoiding the
{{getCurrentUser()}} lookup. The average throughput was 118 kop/s with the
hack and 101 kop/s without it, a difference of roughly 17%. So I think it's
still worth moving forward with this patch, so that the synchronization in
{{getCurrentUser()}} does not skew NNThroughputBenchmark results.
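For anyone skimming, here is a minimal, self-contained sketch of the pattern
being compared. It is not Hadoop code: {{RemoteUserSketch}}, {{RPC_USER}},
{{STATIC_USER}} and the string user names are hypothetical stand-ins, and the
synchronized method only imitates the locking behavior of
{{UserGroupInformation#getCurrentUser()}} as described in this issue.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in classes and names; none of this is the real Hadoop API.
class RemoteUserSketch {
    // Imitates the per-call RPC user: null when the thread is not serving an RPC.
    static final ThreadLocal<String> RPC_USER = new ThreadLocal<>();

    // Counts how often the synchronized fallback is taken.
    static final AtomicLong slowPathCalls = new AtomicLong();

    // Imitates a synchronized static lookup: all callers share one lock.
    static synchronized String getCurrentUser() {
        slowPathCalls.incrementAndGet();
        return "login-user";
    }

    // The described behavior: try the RPC user first, fall back to the
    // synchronized lookup only when there is no RPC call.
    static String getRemoteUser() {
        String rpcUser = RPC_USER.get();
        return rpcUser != null ? rpcUser : getCurrentUser();
    }

    // The "hacked build" idea (an assumption about its shape): return a
    // statically defined user so the synchronized lookup is never reached.
    static final String STATIC_USER = "login-user";

    static String getRemoteUserStatic() {
        return STATIC_USER;
    }

    public static void main(String[] args) {
        // No RPC context, as in the benchmark: every call takes the lock.
        getRemoteUser();
        System.out.println("slow-path calls without RPC user: " + slowPathCalls.get());

        // With an RPC user set, the synchronized method is skipped.
        RPC_USER.set("rpc-user");
        getRemoteUser();
        System.out.println("slow-path calls after RPC call:   " + slowPathCalls.get());

        // The static variant never touches the lock.
        System.out.println("static variant returns: " + getRemoteUserStatic());
    }
}
```

With many benchmark threads and no RPC context, every call to
{{getRemoteUser()}} in this sketch contends for the same class-level lock,
which is the effect the numbers above are measuring.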
> NNThroughputBenchmark threads get stuck on UGI.getCurrentUser()
> ---------------------------------------------------------------
>
> Key: HDFS-12533
> URL: https://issues.apache.org/jira/browse/HDFS-12533
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Erik Krogen
> Assignee: Erik Krogen
> Priority: Major
>
> In {{NameNode#getRemoteUser()}}, it first attempts to fetch from the RPC user
> (not a synchronized operation), and if there is no RPC call, it will call
> {{UserGroupInformation#getCurrentUser()}} (which is {{synchronized}}). This
> makes it efficient for RPC operations (the bulk) so that there is not too
> much contention.
> In NNThroughputBenchmark, however, there is no RPC call since we bypass that
> layer, so with a high thread count many of the threads get stuck. At one
> point I attached a profiler and found that quite a few threads had been
> waiting on {{#getCurrentUser()}} for 2 minutes (!). When I removed this
> call, the throughput numbers I was seeing improved somewhat. To more
> closely emulate a real NN, we should fix this.