[ 
https://issues.apache.org/jira/browse/HDFS-11907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16032154#comment-16032154
 ] 

Arpit Agarwal commented on HDFS-11907:
--------------------------------------

Thanks for this improvement, Chen! A few comments:
# We should use Time#monotonicNow instead of System#currentTimeMillis in both 
files. Time#monotonicNow also returns a millisecond value, but it is guaranteed 
to be monotonically increasing.
# Instead of initializing availableSpaceTimeStamp to zero, we should initialize 
it to (Time#monotonicNow - 5000), since 0 can be a valid timestamp: 
Time#monotonicNow is derived from System#nanoTime, whose origin is arbitrary.
# You can also log the IP address of the client that issued the request to aid 
debugging. It can be retrieved in an RPC call by calling Server.getIpAddr().
# Typo: availabeSpeceTimeStamp --> availableSpaceTimeStamp.
# Let's replace 5000 and 3000 with static final ints.
# See if you can write an isolated unit test for NameNodeResourceChecker, e.g. 
the first call to isResourceAvailable should update availableSpaceTimeStamp, 
and calls made immediately afterwards should not. Then if you advance the timer 
(see FakeTimer) and call isResourceAvailable again, availableSpaceTimeStamp 
should be updated.
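
To make comments 1, 2, 5 and 6 concrete, here is a rough sketch of the shape I have in mind. It is not the patch itself: CachedSpaceChecker, ManualClock, REFRESH_INTERVAL_MS and the diskProbe supplier are hypothetical names used only for illustration, and the 5000 ms interval is assumed from the constant discussed above. In production the clock would be Time#monotonicNow (which, as far as I know, is System#nanoTime divided down to milliseconds); in a test it would be a fake timer.

```java
import java.util.function.LongSupplier;

/** Hypothetical stand-in for the checker; not the actual HDFS class. */
class CachedSpaceChecker {
    // Named constant instead of a bare 5000 (interval value is an assumption).
    static final long REFRESH_INTERVAL_MS = 5000L;

    private final LongSupplier clockMs;   // monotonic clock, in milliseconds
    private final LongSupplier diskProbe; // stand-in for df.getAvailable()
    private long availableSpaceTimeStamp;
    private long cachedAvailable;

    CachedSpaceChecker(LongSupplier clockMs, LongSupplier diskProbe) {
        this.clockMs = clockMs;
        this.diskProbe = diskProbe;
        // Start one full interval in the past rather than at 0, so the
        // first call always refreshes, whatever the clock's origin is.
        this.availableSpaceTimeStamp = clockMs.getAsLong() - REFRESH_INTERVAL_MS;
    }

    /** Returns the cached value, re-probing only when the cache is too old. */
    long getAvailable() {
        long now = clockMs.getAsLong();
        if (now - availableSpaceTimeStamp >= REFRESH_INTERVAL_MS) {
            cachedAvailable = diskProbe.getAsLong();
            availableSpaceTimeStamp = now;
        }
        return cachedAvailable;
    }

    long getAvailableSpaceTimeStamp() {
        return availableSpaceTimeStamp;
    }
}

/** Hand-rolled fake clock, playing the role FakeTimer plays in a unit test. */
class ManualClock implements LongSupplier {
    private long nowMs;

    ManualClock(long startMs) {
        this.nowMs = startMs;
    }

    @Override
    public long getAsLong() {
        return nowMs;
    }

    void advance(long ms) {
        nowMs += ms;
    }
}
```

A test along the lines of comment 6 would construct the checker with a ManualClock, call getAvailable() once and check that the timestamp equals the clock value, call it again immediately and check that the timestamp is unchanged, then advance the clock by REFRESH_INTERVAL_MS and check that the next call updates it.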


> NameNodeResourceChecker should avoid calling df.getAvailable too frequently
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-11907
>                 URL: https://issues.apache.org/jira/browse/HDFS-11907
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>         Attachments: HDFS-11907.001.patch, HDFS-11907.002.patch
>
>
> Currently, {{HealthMonitor#doHealthChecks}} invokes 
> {{NameNode#monitorHealth}}, which ends up invoking 
> {{NameNodeResourceChecker#isResourceAvailable}} once per second by default. 
> {{NameNodeResourceChecker#isResourceAvailable}} in turn invokes 
> {{df.getAvailable()}} every time it is called.
> Since available space rarely changes dramatically from one second to the 
> next, a cached value should be sufficient, i.e. only fetch an updated value 
> when the cached value is too old, and otherwise simply return the cached 
> value. This way df.getAvailable() gets invoked less often.
> Thanks [~arpitagarwal] for the offline discussion.
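
For a rough sense of the saving described above, the following back-of-the-envelope sketch counts how often the expensive call would run over one minute of once-per-second health checks. ProbeCountSketch and countProbes are hypothetical names, and the 5000 ms cache lifetime is an assumed value for illustration, not something taken from the patch.

```java
/** Hypothetical simulation; it only counts calls and touches no real disk. */
class ProbeCountSketch {

    /**
     * Counts how many of the given health checks would actually invoke the
     * disk probe when its result is cached for cacheTtlMs milliseconds.
     */
    static int countProbes(int checks, long checkIntervalMs, long cacheTtlMs) {
        long lastRefresh = -cacheTtlMs; // guarantees the first check refreshes
        int probes = 0;
        for (int i = 0; i < checks; i++) {
            long now = i * checkIntervalMs; // simulated monotonic time
            if (now - lastRefresh >= cacheTtlMs) {
                probes++;
                lastRefresh = now;
            }
        }
        return probes;
    }

    public static void main(String[] args) {
        // No caching: every check probes the disk.
        System.out.println(countProbes(60, 1000L, 0L));    // prints 60
        // Assumed 5-second cache: one probe per five checks.
        System.out.println(countProbes(60, 1000L, 5000L)); // prints 12
    }
}
```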



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

