[ 
https://issues.apache.org/jira/browse/HDFS-6982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167851#comment-14167851
 ] 

Andrew Wang commented on HDFS-6982:
-----------------------------------

Hi Maysam, thanks for sticking with this, and again apologies for the delay 
with reviewing this. I especially appreciate the effort to javadoc everything, 
it helped a lot with reviewing this.

One thing that'd be helpful are some basic setup instructions so I can try it 
out. Since I don't see any modifications to any existing files, I'm also 
wondering how this is exposed to JMX or on the webUI.

Anyway, I did a light pass for mostly code-style and nitty stuff, hopefully I 
can do a more high-level overview once the above is answered.

Nits:
* Need to put {{<p/>}} tags in javadoc to get line breaks. Can also use 
{{@link}} to refer to methods.
* Should use SLF4J logging in new classes, i.e. {{LoggerFactory.getLogger()}}, 
and use the "{}" template to avoid string construction overhead for debug prints
* These new classes should probably all be annotated @InterfaceAudience.Private
* Bucket could be private rather than package-private
* Code style: if/for blocks should always have braces
* A few lines longer than 80 chars
* I generally prefer doing Precondition checks to manually throwing 
IllegalStateException, since it's more concise.

RollingWindowManager:
* There's only a {{getDefaultRollingWindow}} class, no other ways of 
constructing a RollingWindow. With this in mind, could we rename it to just 
{{getRollingWindow}}? Same for params like {{defaultWindowLen}} and 
{{defaultBPW}}, which make me think of the configuration defaults.
* Would be good to do Precondition validation on the config params, e.g. 
greater than zero, numBuckets less than windowLen, evenly divisible, etc.

TopConfiguration:
* Could we put most of this into hdfs-site.xml, hdfs-default.xml, and 
DFSConfigKeys instead? It'd also be good to have some light docs in 
hdfs-default.xml about how to configure this.
* If so, propose we use a prefix like "dfs.namenode.top."
* How do we configure multiple reporting periods? That TopMetrics constructor 
isn't called right now. It'd also be good to unify getting and then validating 
all the configs in a single place, like this class. See NNConf for an example 
of what I'm thinking of.
* Typo in config key: BUKCETS
* WEB_PORT and DEFAULT_WEB_PORT seem to be unused, 
TopConfiguration(Configuration) constructor too
* getCmdTotal and getTopMetricsRecordPrefix static getters are only used in 
TopMetrics, that might be a better home.
* Rather than MIN_2_MS, could we have a long array with the default periods, 
i.e. DEFAULT_REPORTING_PERIODS? We already have a Configuration#getInts 
methods, could add a Configuration#getLongs method too.

TopMetrics:
* Typo: "TopMetic singlton"
* If you use SLF4J, you can oftentimes skip wrapping things in 
LOG.isDebugEnabled if you also do the string templating
* getMetrics does the isDebugEnabled check, but inside does LOG.info
* report, we construct the permStr, but don't actually use it.
* report, I don't think we need the catch for {{Throwable t}}, no checked 
exceptions are being thrown?
* getReportingPeriods is unused

TopUtil
* This stuff isn't shared much, seems like we could just move things to where 
they're used

TopMetricsCollector
* Is this used?

TestRWM:
* Let's do the init in a @BeforeClass block
* Rather than grabbing defaultWindowLen and defaultBPW directly, we should set 
them in the config, pass that in when making the RWManager, then we can verify 
that in the test that they've taken effect. Can then make these RWM variables 
private.

More tests:
* Could also test the JMX interface, e.g. in TestNameNodeMetrics

> nntop: top­-like tool for name node users
> -----------------------------------------
>
>                 Key: HDFS-6982
>                 URL: https://issues.apache.org/jira/browse/HDFS-6982
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Maysam Yabandeh
>            Assignee: Maysam Yabandeh
>         Attachments: HDFS-6982.patch, HDFS-6982.v2.patch, nntop-design-v1.pdf
>
>
> In this jira we motivate the need for nntop, a tool that, similarly to what 
> top does in Linux, gives the list of top users of the HDFS name node and 
> gives insight about which users are sending majority of each traffic type to 
> the name node. This information turns out to be the most critical when the 
> name node is under pressure and the HDFS admin needs to know which user is 
> hammering the name node and with what kind of requests. Here we present the 
> design of nntop which has been in production at Twitter in the past 10 
> months. nntop proved to have low cpu overhead (< 2% in a cluster of 4K 
> nodes), low memory footprint (less than a few MB), and quite efficient for 
> the write path (only two hash lookup for updating a metric).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to