[ 
https://issues.apache.org/jira/browse/HDFS-16047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17353359#comment-17353359
 ] 

philipse commented on HDFS-16047:
---------------------------------

[[email protected]] thanks for the heads-up.

I just reviewed the related code, though I am not sure this is the cause. From 
the code we can see that when we use `count`, the file count is 
`getFileCount + getSymlinkCount`, while when we use `list`, it shows the 
FileStatus results rather than getFileLinkStatus + FileStatus. If so, we could 
make the two consistent, so that both cover the same file types and avoid 
further misunderstanding.

The background is that we have a tool that verifies data has been copied 
completely. One check compares the file count of the destination with that of 
the original data. If they differ, for example because the original data 
changed during the copy, we treat the copy as failed.
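
A minimal sketch of that check, to make the failure mode concrete. The class 
and method names are hypothetical and not part of Hadoop; `Counts` only stands 
in for the DIR_COUNT, FILE_COUNT and CONTENT_SIZE columns printed by 
`hdfs dfs -count`, and the figures are the ones reported in the issue 
description.

```java
// Hypothetical helper, not part of Hadoop: models the file-count check only.
final class CopyCountCheck {

    /** Stand-in for the DIR_COUNT, FILE_COUNT and CONTENT_SIZE columns of `hdfs dfs -count`. */
    static final class Counts {
        final long dirCount;
        final long fileCount;
        final long contentSize;

        Counts(long dirCount, long fileCount, long contentSize) {
            this.dirCount = dirCount;
            this.fileCount = fileCount;
            this.contentSize = contentSize;
        }
    }

    /** The check described above: the copy is accepted only if the file counts agree. */
    static boolean fileCountsMatch(Counts source, Counts dest) {
        return source.fileCount == dest.fileCount;
    }

    /** True when the byte totals agree but the file counts do not. */
    static boolean sizeMatchesButCountDiffers(Counts source, Counts dest) {
        return source.contentSize == dest.contentSize
                && source.fileCount != dest.fileCount;
    }

    public static void main(String[] args) {
        // Figures taken from the issue description.
        Counts source = new Counts(1, 100, 9689234070L);
        Counts dest = new Counts(1, 98, 9689234070L);
        System.out.println("file counts match: " + fileCountsMatch(source, dest));
        System.out.println("size equal, count differs: "
                + sizeMatchesButCountDiffers(source, dest));
    }
}
```

With these figures the file-count check fails even though the content size is 
identical, which is why the copy gets reported as failed.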

I am new to Hadoop, so please correct me if I have misunderstood. 
{code:java}
public final ContentSummary computeAndConvertContentSummary(
        ContentSummaryComputationContext summary) {
    ContentCounts counts = this.computeContentSummary(summary).getCounts();
    QuotaCounts q = this.getQuotaCounts();
    return new Builder()
            .length(counts.getLength())
            .fileCount(counts.getFileCount() + counts.getSymlinkCount())
            .directoryCount(counts.getDirectoryCount())
            .quota(q.getNameSpace())
            .spaceConsumed(counts.getStoragespace())
            .spaceQuota(q.getStorageSpace())
            .typeConsumed(counts.getTypeSpaces())
            .typeQuota(q.getTypeSpaces().asArray())
            .build();
}
{code}
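
To illustrate the arithmetic in that `fileCount(...)` call, here is a toy 
model. It is not HDFS code: `CountVsListModel` and `Kind` are made-up names, 
and it only shows that a count defined as files plus symlinks can exceed the 
number of regular files. Whether that explains the 100 vs. 98 difference in 
this issue is an assumption I have not verified.

```java
import java.util.Arrays;
import java.util.List;

// Toy model, not HDFS code: entry kinds under one directory tree.
final class CountVsListModel {

    enum Kind { FILE, SYMLINK, DIRECTORY }

    /** Number of entries of the given kind. */
    static long countOf(List<Kind> entries, Kind kind) {
        return entries.stream().filter(k -> k == kind).count();
    }

    /** Mirrors the expression getFileCount() + getSymlinkCount(). */
    static long reportedFileCount(List<Kind> entries) {
        return countOf(entries, Kind.FILE) + countOf(entries, Kind.SYMLINK);
    }

    public static void main(String[] args) {
        List<Kind> entries = Arrays.asList(
                Kind.DIRECTORY, Kind.FILE, Kind.FILE, Kind.SYMLINK);
        System.out.println("regular files:       " + countOf(entries, Kind.FILE));
        System.out.println("reported file count: " + reportedFileCount(entries));
    }
}
```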

> getContentSummary return incorrect filecount
> --------------------------------------------
>
>                 Key: HDFS-16047
>                 URL: https://issues.apache.org/jira/browse/HDFS-16047
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: dfsclient
>    Affects Versions: 2.7.7
>         Environment: HDP:2.7.7
>            Reporter: philipse
>            Priority: Minor
>
> Hi team
> I got a strange result when gathering HDFS statistics; the test steps are 
> shown below. 
> Any advice would be appreciated. Thanks in advance.
> {code:java}
> 1. hdfs dfs -count /data/BaseData/Log/mq/2018/02/09/ shows 100 files:
> 1          100         9689234070 /data/BaseData/Log/mq/2018/02/09/
> 2. hdfs dfs -ls /data/BaseData/Log/mq/2018/02/09/ shows only 98 items:
> Found 98 items
> 3. hdfs dfs -cp /data/BaseData/Log/mq/2018/02/09/* /data/BaseData/Log/mq_test/2018/02/09/
> 4. hdfs dfs -count /data/BaseData/Log/mq_test/2018/02/09/ shows 98 items too:
> 1           98         9689234070 /data/dpdcadmin/gf13871/test20210528
> 5. hdfs dfs -ls /data/BaseData/Log/mq_test/2018/02/09/ shows 98 items:
> Found 98 items
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)