[ 
https://issues.apache.org/jira/browse/HDFS-6093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13941108#comment-13941108
 ] 

Andrew Wang commented on HDFS-6093:
-----------------------------------

Hi Arpit and Colin, thanks for reviewing. New patch is up. Addressed your 
feedback except the following, and I also fixed a logging issue I found:

bq. update CentralizedCacheManagement.html in the docs?

Added a short blurb. A nice follow-on JIRA would be an FAQ for debugging 
caching, since it can be tricky right now.

bq. display the pending caching/uncaching counts in the output of 'dfsadmin 
-report'?

I think dfsadmin -report is more about usage statistics than replication work. 
Having the pending stats as a metric and on the webUI means it should still be 
easy enough to access.

bq. Was stillPendingUncached introduced to fix a bug?

This is required because cache reports only tell you what's cached, not what 
was uncached. So, we need to compute a diff to update pendingUncached 
correctly.
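
To illustrate the idea, here is a minimal sketch of that diff. It is not the 
code in the patch: the class name, the method signature, and the use of plain 
block ID sets are all assumptions made for the example. Anything we asked to 
uncache that still shows up in the latest cache report is treated as still 
pending.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; the names and types here do not come from the patch.
final class PendingUncachedDiff {
    /**
     * Returns the block IDs that are still waiting to be uncached: the ones in
     * pendingUncached that the latest cache report still lists as cached.
     * Neither input set is modified.
     */
    static Set<Long> stillPendingUncached(Set<Long> pendingUncached,
                                          Set<Long> reportedCached) {
        Set<Long> stillPending = new HashSet<>(pendingUncached);
        // Keep only the blocks the report says are still cached.
        stillPending.retainAll(reportedCached);
        return stillPending;
    }
}
```

Blocks that drop out of the result are the ones whose uncaching can be 
considered done, which is information the report alone does not give you.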

bq. <ternary statement code nit>

I prefer not to use ternary statements, so I'd like to leave it as is if that's 
okay.

bq. decouple the counter(s) that can be read from the CRM from the counters 
that the CRM uses internally

With the locking issues resolved, is it okay to just leave it with a single set 
of variables? I could switch it over to AtomicLongs or something, but I think 
it's all under the FSN lock anyway.
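
As a rough sketch of the reasoning (hypothetical names throughout, and a 
ReentrantReadWriteLock standing in for the FSN lock as an assumption): if 
every read and write of the counters happens while holding the same lock, 
plain long fields are enough and AtomicLongs would add nothing.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; this class does not exist in the patch.
final class PendingCacheStats {
    // Stand-in for the namesystem lock that guards both fields below.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private long pendingCached;
    private long pendingUncached;

    /** Writer side: replace both counters under the write lock. */
    void update(long cached, long uncached) {
        lock.writeLock().lock();
        try {
            pendingCached = cached;
            pendingUncached = uncached;
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Reader side: returns {pendingCached, pendingUncached} as one consistent pair. */
    long[] snapshot() {
        lock.readLock().lock();
        try {
            return new long[] {pendingCached, pendingUncached};
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Reading both values under one lock also means a reader never sees one counter 
from before an update and the other from after it, which two independent 
AtomicLongs would not guarantee.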

bq. <colin> Incrementally updating the pendingUncached list and stats is a nice 
idea, but it seems too ambitious for 2.4 at this point. 

I'm okay bumping this to 2.5 if you'd rather not put this in 2.4, but I think 
this all works now with the locking fixed.

> Expose more caching information for debugging by users
> ------------------------------------------------------
>
>                 Key: HDFS-6093
>                 URL: https://issues.apache.org/jira/browse/HDFS-6093
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: caching
>    Affects Versions: 2.4.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>         Attachments: hdfs-6093-1.patch, hdfs-6093-2.patch
>
>
> When users submit a new cache directive, it's unclear if the NN has 
> recognized it and is actively trying to cache it, or if it's hung for some 
> other reason. It'd be nice to expose a "pending caching/uncaching" count the 
> same way we expose pending replication work.
> It'd also be nice to display the aggregate cache capacity and usage in 
> dfsadmin -report, since we already have it as a metric and expose it 
> per-DN in report output.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
