[ https://issues.apache.org/jira/browse/HDFS-8581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
tongshiquan updated HDFS-8581:
------------------------------
Description:
If one directory, such as "/result", contains about 200000 files, then running
"hdfs dfs -count /" returns a wrong result: the files in every directory whose
name sorts after "/result" are not included in the totals.
On my cluster, shown below, "/result_1433858936" is the directory that holds the
huge number of files, and the files in "/sparkJobHistory", "/tmp", and "/user"
are not counted.
vm-221:/export1/BigData/current # hdfs dfs -ls /
15/06/11 11:00:17 INFO hdfs.PeerCache: SocketCache disabled.
Found 9 items
-rw-r--r-- 3 hdfs supergroup 0 2015-06-08 12:10 /PRE_CREATE_DIR.SUCCESS
drwxr-x--- - flume hadoop 0 2015-06-08 12:08 /flume
drwx------ - hbase hadoop 0 2015-06-10 15:25 /hbase
drwxr-xr-x - hdfs supergroup 0 2015-06-10 17:19 /hyt
drwxrwxrwx - mapred hadoop 0 2015-06-08 12:08 /mr-history
drwxr-xr-x - hdfs supergroup 0 2015-06-09 22:10 /result_1433858936
drwxrwxrwx - spark supergroup 0 2015-06-10 19:15 /sparkJobHistory
drwxrwxrwx - hdfs hadoop 0 2015-06-08 12:14 /tmp
drwxrwxrwx - hdfs hadoop 0 2015-06-09 21:57 /user
vm-221:/export1/BigData/current #
vm-221:/export1/BigData/current # hdfs dfs -count /
15/06/11 11:00:24 INFO hdfs.PeerCache: SocketCache disabled.
1043 171536 1756375688 /
vm-221:/export1/BigData/current #
vm-221:/export1/BigData/current # hdfs dfs -count /PRE_CREATE_DIR.SUCCESS
15/06/11 11:00:30 INFO hdfs.PeerCache: SocketCache disabled.
0 1 0 /PRE_CREATE_DIR.SUCCESS
vm-221:/export1/BigData/current #
vm-221:/export1/BigData/current # hdfs dfs -count /flume
15/06/11 11:00:41 INFO hdfs.PeerCache: SocketCache disabled.
1 0 0 /flume
vm-221:/export1/BigData/current #
vm-221:/export1/BigData/current # hdfs dfs -count /hbase
15/06/11 11:00:49 INFO hdfs.PeerCache: SocketCache disabled.
36 18 14807 /hbase
vm-221:/export1/BigData/current #
vm-221:/export1/BigData/current # hdfs dfs -count /hyt
15/06/11 11:01:09 INFO hdfs.PeerCache: SocketCache disabled.
1 0 0 /hyt
vm-221:/export1/BigData/current #
vm-221:/export1/BigData/current # hdfs dfs -count /mr-history
15/06/11 11:01:18 INFO hdfs.PeerCache: SocketCache disabled.
3 0 0 /mr-history
vm-221:/export1/BigData/current #
vm-221:/export1/BigData/current # hdfs dfs -count /result_1433858936
15/06/11 11:01:29 INFO hdfs.PeerCache: SocketCache disabled.
1001 171517 1756360881 /result_1433858936
vm-221:/export1/BigData/current #
vm-221:/export1/BigData/current # hdfs dfs -count /sparkJobHistory
15/06/11 11:01:41 INFO hdfs.PeerCache: SocketCache disabled.
1 3 21785 /sparkJobHistory
vm-221:/export1/BigData/current #
vm-221:/export1/BigData/current # hdfs dfs -count /tmp
15/06/11 11:01:48 INFO hdfs.PeerCache: SocketCache disabled.
17 6 35958 /tmp
vm-221:/export1/BigData/current #
vm-221:/export1/BigData/current # hdfs dfs -count /user
15/06/11 11:01:55 INFO hdfs.PeerCache: SocketCache disabled.
12 1 19077 /user
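
The discrepancy can be checked by hand from the numbers above. The following is a minimal sketch in Python, not part of the original report, that sums the per-path "-count" output copied from the transcript and compares it with the totals reported for "/". The helper names (parse_count_line, sum_counts) are hypothetical, and the "+ 1" for the root directory is an assumption inferred from the transcript, where the empty directory "/flume" reports a directory count of 1.

```python
# Per-path output of `hdfs dfs -count`, copied from the transcript above.
# Columns: DIR_COUNT FILE_COUNT CONTENT_SIZE PATHNAME
ROOT_LINE = "1043 171536 1756375688 /"
CHILD_LINES = """\
0 1 0 /PRE_CREATE_DIR.SUCCESS
1 0 0 /flume
36 18 14807 /hbase
1 0 0 /hyt
3 0 0 /mr-history
1001 171517 1756360881 /result_1433858936
1 3 21785 /sparkJobHistory
17 6 35958 /tmp
12 1 19077 /user
""".splitlines()


def parse_count_line(line):
    """Split one -count output line into (dirs, files, size, path)."""
    dirs, files, size, path = line.split(None, 3)
    return int(dirs), int(files), int(size), path


def sum_counts(lines):
    """Add up the directory, file, and byte counts of several lines."""
    dirs = files = size = 0
    for line in lines:
        d, f, s, _ = parse_count_line(line)
        dirs += d
        files += f
        size += s
    return dirs, files, size


# Totals that "hdfs dfs -count /" actually reported.
reported = parse_count_line(ROOT_LINE)[:3]

# Totals expected from all nine children.
# Assumption: "/" counts itself as one directory, like "/flume" does.
all_dirs, all_files, all_size = sum_counts(CHILD_LINES)
expected = (all_dirs + 1, all_files, all_size)

# Totals from the children up to and including /result_1433858936 only.
part_dirs, part_files, part_size = sum_counts(CHILD_LINES[:6])
truncated = (part_dirs + 1, part_files, part_size)

# What is missing from the reported totals.
missing = tuple(e - r for e, r in zip(expected, reported))

print("reported :", reported)
print("expected :", expected)
print("truncated:", truncated)
print("missing  :", missing)
```

With these numbers, the totals reported for "/" equal the sum over the entries up to and including "/result_1433858936", and the missing 30 directories, 10 files, and 76820 bytes are exactly the combined counts of "/sparkJobHistory", "/tmp", and "/user".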
was:
If one directory such as "/result" exists about 200000 files, then when execute
"hdfs dfs -count /", the result will go wrong. For all directories whose name
after "/result", file num will not be included.
My cluster see as snapshot, "/result_1433858936" is the directory exist huge
files, and files in "/sparkJobHistory", "/tmp", "/user" are not included
> count cmd calculate wrong when huge files exist in one folder
> -------------------------------------------------------------
>
> Key: HDFS-8581
> URL: https://issues.apache.org/jira/browse/HDFS-8581
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: HDFS
> Reporter: tongshiquan
> Assignee: J.Andreina
> Priority: Minor
>