[ https://issues.apache.org/jira/browse/HDFS-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13854163#comment-13854163 ]
Colin Patrick McCabe commented on HDFS-5659:
--------------------------------------------

bq. I refactored the PBHelper methods for DatanodeInfo in what I think is a compatible fashion. Ideally there'd only be one convert method, but I didn't want to tease out the behavior there.

This reflects the unfortunate fact that the same protobuf type is used for two rather different things. I think the convert functions need different names that describe what they're doing.

bq. The test blocksize (512) was less than the page size (4096), so it was getting automatically rounded up to the page size on the DN, leading to unexpected numbers. The same issue crops up on the namenode when it comes to quotas and stats; we won't hit our perceived capacity if we're caching a bunch of (n%PAGE_SIZE+1) files because of this fragmentation. I don't think this is a big deal (we're looking at worst case 4k waste per cached file), but it's worth keeping in mind.

The operating system can't allocate less than 4 KB, no matter how small the file is. So our bookkeeping reflects reality: a full 4 KB page of physical memory is used up by mlocking one 1-byte file. I deliberately made the block size smaller than the page size in the test to exercise some of those odd scenarios. We should probably add a comment to that effect to the test, but let's leave the block size the same so we keep that coverage.

> dfsadmin -report doesn't output cache information properly
> ----------------------------------------------------------
>
>                 Key: HDFS-5659
>                 URL: https://issues.apache.org/jira/browse/HDFS-5659
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: caching
>    Affects Versions: 3.0.0
>            Reporter: Akira AJISAKA
>            Assignee: Andrew Wang
>         Attachments: hdfs-5659-1.patch
>
>
> I tried to cache a file by "hdfs cacheadmin -addDirective".
> I thought the file was cached because "CacheUsed" at jmx was more than 0.
> {code}
> {
>   "name" : "Hadoop:service=DataNode,name=FSDatasetState-DS-1043926324-172.28.0.102-50010-1385087929296",
>   "modelerType" : "org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl",
>   "Remaining" : 5604772597760,
>   "StorageInfo" : "FSDataset{dirpath='[/hadoop/data1/dfs/data/current, /hadoop/data2/dfs/data/current, /hadoop/data3/dfs/data/current]'}",
>   "Capacity" : 5905374474240,
>   "DfsUsed" : 11628544,
>   "CacheCapacity" : 1073741824,
>   "CacheUsed" : 360448,
>   "NumFailedVolumes" : 0,
>   "NumBlocksCached" : 1,
>   "NumBlocksFailedToCache" : 0,
>   "NumBlocksFailedToUncache" : 0
> },
> {code}
> But "dfsadmin -report" didn't output the same value as jmx.
> {code}
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> {code}

--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
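As an aside, the page-size rounding discussed in the comment above can be sketched as follows. This is a hypothetical helper for illustration only (not Hadoop's actual code), assuming a 4096-byte OS page:

{code}
public class PageRounding {
    // Assumed page size for illustration; the real value comes from the OS.
    static final long PAGE_SIZE = 4096;

    // mlock pins whole pages, so the bytes charged against the cache are
    // the file length rounded up to the next page-size multiple.
    static long roundedCacheUsage(long fileLength) {
        return ((fileLength + PAGE_SIZE - 1) / PAGE_SIZE) * PAGE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(roundedCacheUsage(1));    // 4096: a 1-byte file still pins a full page
        System.out.println(roundedCacheUsage(512));  // 4096: one 512-byte block also pins a full page
        System.out.println(roundedCacheUsage(4097)); // 8192: one byte past a page boundary pins two pages
    }
}
{code}

This is why a test block size of 512 still shows up as 4096 bytes of cache used on the DN, and why the worst-case waste is one page (4 KB) per cached file.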