[ https://issues.apache.org/jira/browse/HDFS-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16032154#comment-16032154 ]
Arpit Agarwal commented on HDFS-11907:
--------------------------------------

Thanks for this improvement Chen! A few comments:
# We should use Time#monotonicNow instead of System#currentTimeMillis in both files. Time#monotonicNow also returns a millisecond value, but it is guaranteed to be monotonically increasing.
# Instead of initializing availableSpeceTimeStamp to zero, we should initialize it to (Time#monotonicNow - 5000), since 0 can be a valid timestamp returned by nanoTime.
# You can also log the IP address of the client that issued the request to aid debugging. It can be retrieved in an RPC call by calling Server.getIpAddr().
# Typo: availabeSpeceTimeStamp --> availableSpaceTimeStamp.
# Let's replace 5000 and 3000 with static final ints.
# See if you can write an isolated unit test for NameNodeResourceChecker, e.g. the first call to isResourceAvailable should update availableSpaceTimeStamp, and calls made immediately afterwards should not. Then if you advance the timer (see FakeTimer) and call isResourceAvailable again, availableSpaceTimeStamp should be updated.

> NameNodeResourceChecker should avoid calling df.getAvailable too frequently
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-11907
>                 URL: https://issues.apache.org/jira/browse/HDFS-11907
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>         Attachments: HDFS-11907.001.patch, HDFS-11907.002.patch
>
>
> Currently, {{HealthMonitor#doHealthChecks}} invokes
> {{NameNode#monitorHealth}}, which ends up invoking
> {{NameNodeResourceChecker#isResourceAvailable}} once per second by default.
> {{NameNodeResourceChecker#isResourceAvailable}} in turn invokes
> {{df.getAvailable();}} every time it is called.
> Since the available space should rarely change dramatically from one second
> to the next, a cached value should be sufficient, i.e. only fetch an updated
> value when the cached value is too old; otherwise simply return the cached
> value. This way df.getAvailable() gets invoked less often.
> Thanks [~arpitagarwal] for the offline discussion.
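
For illustration, the caching idea and the review suggestions above (monotonic clock, named constant, initial timestamp of "now minus the interval", test with an advanceable timer) could look roughly like the sketch below. This is not the code from the attached patches: the class CachedAvailableSpace, its methods, and the two LongSupplier parameters are hypothetical stand-ins. The clock supplier plays the role of Time#monotonicNow, and the space supplier plays the role of df.getAvailable().

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Hypothetical sketch: caches an available-space reading for a fixed interval. */
final class CachedAvailableSpace {
  // Named constant rather than a bare 5000 in the code.
  static final long REFRESH_INTERVAL_MS = 5000L;

  private final LongSupplier monotonicMs;  // stand-in for Time#monotonicNow
  private final LongSupplier spaceReader;  // stand-in for df.getAvailable()

  private long cachedSpace;
  private long availableSpaceTimeStamp;

  CachedAvailableSpace(LongSupplier monotonicMs, LongSupplier spaceReader) {
    this.monotonicMs = monotonicMs;
    this.spaceReader = spaceReader;
    // Start one full interval in the past so the first call always refreshes,
    // instead of relying on 0 being an "impossible" timestamp.
    this.availableSpaceTimeStamp = monotonicMs.getAsLong() - REFRESH_INTERVAL_MS;
  }

  /** A monotonic millisecond clock built on System.nanoTime(). */
  static LongSupplier systemMonotonicMs() {
    return () -> TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
  }

  /** Returns the cached value, re-reading the source only when the cache is too old. */
  synchronized long getAvailable() {
    long now = monotonicMs.getAsLong();
    if (now - availableSpaceTimeStamp >= REFRESH_INTERVAL_MS) {
      cachedSpace = spaceReader.getAsLong();
      availableSpaceTimeStamp = now;
    }
    return cachedSpace;
  }

  /** Exposed so a test can check when the last refresh happened. */
  synchronized long lastRefreshMs() {
    return availableSpaceTimeStamp;
  }
}
```

Passing the clock in as a parameter is what makes the isolated test possible: a test can supply a counter it advances by hand, call getAvailable() twice in a row to check that the source is read only once, then move the counter forward by REFRESH_INTERVAL_MS and check that the next call reads it again.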