[ https://issues.apache.org/jira/browse/HDFS-5659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13854163#comment-13854163 ]
Colin Patrick McCabe commented on HDFS-5659:
--------------------------------------------

bq. I refactored the PBHelper methods for DatanodeInfo in what I think is a compatible fashion. Ideally there'd only be one convert method, but I didn't want to tease out the behavior there.

This reflects the unfortunate fact that the same protobuf type is used for two rather different things. I think the convert functions need different names that describe what they're doing.

bq. The test blocksize (512) was less than the page size (4096), so it was getting automatically rounded up to the page size on the DN, leading to unexpected numbers. The same issue crops up on the namenode when it comes to quotas and stats; we won't hit our perceived capacity if we're caching a bunch of (n%PAGE_SIZE+1) files because of this fragmentation. I don't think this is a big deal (we're looking at worst case 4k waste per cached file), but it's worth keeping in mind.

The operating system can't allocate less than 4 KB, no matter how small the file is. So our bookkeeping reflects reality: a full 4 KB page of physical memory is used up by mlocking one 1-byte file. I deliberately made the block size smaller than the page size in the test to exercise some of those odd scenarios. We should probably add a comment to that effect to the test, but let's leave the block size the same so we keep that coverage.

> dfsadmin -report doesn't output cache information properly
> ----------------------------------------------------------
>
>                 Key: HDFS-5659
>                 URL: https://issues.apache.org/jira/browse/HDFS-5659
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: caching
>    Affects Versions: 3.0.0
>            Reporter: Akira AJISAKA
>            Assignee: Andrew Wang
>         Attachments: hdfs-5659-1.patch
>
>
> I tried to cache a file by "hdfs cacheadmin -addDirective".
> I thought the file was cached because "CacheUsed" at jmx was more than 0.
> {code}
> {
>   "name" : "Hadoop:service=DataNode,name=FSDatasetState-DS-1043926324-172.28.0.102-50010-1385087929296",
>   "modelerType" : "org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl",
>   "Remaining" : 5604772597760,
>   "StorageInfo" : "FSDataset{dirpath='[/hadoop/data1/dfs/data/current, /hadoop/data2/dfs/data/current, /hadoop/data3/dfs/data/current]'}",
>   "Capacity" : 5905374474240,
>   "DfsUsed" : 11628544,
>   "CacheCapacity" : 1073741824,
>   "CacheUsed" : 360448,
>   "NumFailedVolumes" : 0,
>   "NumBlocksCached" : 1,
>   "NumBlocksFailedToCache" : 0,
>   "NumBlocksFailedToUncache" : 0
> },
> {code}
> But "dfsadmin -report" didn't output the same value as jmx.
> {code}
> Configured Cache Capacity: 0 (0 B)
> Cache Used: 0 (0 B)
> Cache Remaining: 0 (0 B)
> Cache Used%: 100.00%
> Cache Remaining%: 0.00%
> {code}

--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
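As an aside, the page-size rounding discussed in the comment above can be sketched as follows. This is a hypothetical helper for illustration only (not Hadoop's actual code), assuming a 4096-byte OS page:

{code}
public class PageRounding {
    // Assumed page size for illustration; the real value comes from the OS.
    static final long PAGE_SIZE = 4096;

    // mlock pins whole pages, so the bytes charged against the cache are
    // the file length rounded up to the next page-size multiple.
    static long roundedCacheUsage(long fileLength) {
        return ((fileLength + PAGE_SIZE - 1) / PAGE_SIZE) * PAGE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(roundedCacheUsage(1));    // 4096: a 1-byte file still pins a full page
        System.out.println(roundedCacheUsage(512));  // 4096: one 512-byte block also pins a full page
        System.out.println(roundedCacheUsage(4097)); // 8192: one byte past a page boundary pins two pages
    }
}
{code}

This is why a test block size of 512 still shows up as 4096 bytes of cache used on the DN, and why the worst-case waste is one page (4 KB) per cached file.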