Ajmal Ahammed created HDFS-15597:
------------------------------------
Summary: ContentSummary.getSpaceConsumed does not consider replication
Key: HDFS-15597
URL: https://issues.apache.org/jira/browse/HDFS-15597
Project: Hadoop HDFS
Issue Type: Bug
Components: dfs
Affects Versions: 2.6.0
Reporter: Ajmal Ahammed
I am trying to get the disk space consumed by an HDFS directory using the
{{ContentSummary.getSpaceConsumed}} method, but the value it returns does not
account for the replication factor. The replication factor is 2, so I expected
the method to return twice the actual file size.
{code}
ubuntu@ubuntu:~/ht$ sudo -u hdfs hdfs dfs -ls /var/lib/ubuntu
Found 2 items
-rw-r--r-- 2 ubuntu ubuntu 3145728 2020-09-08 09:55 /var/lib/ubuntu/size-test
drwxrwxr-x - ubuntu ubuntu 0 2020-09-07 06:37 /var/lib/ubuntu/test
{code}
But when I run the following code,
{code}
String path = "/etc/hadoop/conf/";
conf.addResource(new Path(path + "core-site.xml"));
conf.addResource(new Path(path + "hdfs-site.xml"));
long size =
    FileContext.getFileContext(conf).util().getContentSummary(fileStatus).getSpaceConsumed();
System.out.println("Replication : " + fileStatus.getReplication());
System.out.println("File size : " + size);
{code}
The output is
{code}
Replication : 0
File size : 3145728
{code}
Both the file size and the replication factor seem to be incorrect.
/etc/hadoop/conf/hdfs-site.xml contains the following config:
{code}
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
{code}
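For reference, the expected figure can be sketched in plain Java. This is only an illustration of the arithmetic, assuming that space consumed is the file length multiplied by the replication factor for a fully replicated file. The class and the helper {{expectedSpaceConsumed}} are hypothetical names and are not part of the Hadoop API.

```java
public class ExpectedSpaceConsumed {

    // Hypothetical helper (not a Hadoop API): raw bytes expected on disk
    // for one file, assuming every block is fully replicated.
    static long expectedSpaceConsumed(long fileLength, short replication) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping.
        return Math.multiplyExact(fileLength, (long) replication);
    }

    public static void main(String[] args) {
        long fileLength = 3145728L; // size of size-test in the listing above
        short replication = 2;      // dfs.replication from hdfs-site.xml

        System.out.println("Expected space consumed : "
                + expectedSpaceConsumed(fileLength, replication));
    }
}
```

With the values from this report the sketch prints 6291456, whereas the code above reports 3145728.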