[ 
https://issues.apache.org/jira/browse/HDDS-7260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Teng updated HDDS-7260:
----------------------------
    Description: 
The du command does not report correct disk usage for Ozone. The value is
wrong for both Ratis and EC data, so end users cannot get the real usage
from the command and have to work it out manually.

Some examples of the issue follow. The SIZE and
DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS values are identical for both EC and
Ratis data:
{noformat}
ozone fs -du  -v  ofs://ozone1/vol2/buck2/
SIZE        DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS  FULL_PATH_NAME
0           0                                      ofs://ozone1/vol2/buck2/.Trash
1073741824  1073741824                             ofs://ozone1/vol2/buck2/ECdata
1073741824  1073741824                             ofs://ozone1/vol2/buck2/ratisdata
{noformat}
{noformat}
ozone sh key info /vol2/buck2/ECdata
{
  "volumeName" : "vol2",
  "bucketName" : "buck2",
  "name" : "ECdata",
  "dataSize" : 1073741824,
  "creationTime" : "2022-06-15T08:04:59.547Z",
  "modificationTime" : "2022-06-15T08:05:13.826Z",
  "replicationConfig" : {
    "data" : 3,
    "parity" : 2,
    "ecChunkSize" : 1048576,
    "codec" : "RS",
    "replicationType" : "EC",
    "requiredNodes" : 5
  },
  "ozoneKeyLocations" : [ {
    "containerID" : 3001,
    "localID" : 109611004723203001,
    "length" : 805306368,
    "offset" : 0,
    "keyOffset" : 0
  }, {
    "containerID" : 3002,
    "localID" : 109611004723203002,
    "length" : 268435456,
    "offset" : 0,
    "keyOffset" : 805306368
  } ],
  "metadata" : { }
}
[root]# ozone sh key info /vol2/buck2/ratisdata
{
  "volumeName" : "vol2",
  "bucketName" : "buck2",
  "name" : "ratisdata",
  "dataSize" : 1073741824,
  "creationTime" : "2022-06-15T08:07:00.040Z",
  "modificationTime" : "2022-06-15T08:07:05.551Z",
  "replicationConfig" : {
    "replicationFactor" : "THREE",
    "requiredNodes" : 3,
    "replicationType" : "RATIS"
  },
  "ozoneKeyLocations" : [ {
    "containerID" : 3003,
    "localID" : 109611004723203003,
    "length" : 268435456,
    "offset" : 0,
    "keyOffset" : 0
  }, {
    "containerID" : 3004,
    "localID" : 109611004723203004,
    "length" : 268435456,
    "offset" : 0,
    "keyOffset" : 268435456
  }, {
    "containerID" : 3005,
    "localID" : 109611004723203005,
    "length" : 268435456,
    "offset" : 0,
    "keyOffset" : 536870912
  }, {
    "containerID" : 3006,
    "localID" : 109611004723203006,
    "length" : 268435456,
    "offset" : 0,
    "keyOffset" : 805306368
  } ],
  "metadata" : { }
}
{noformat}
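For reference, here is a rough sketch of the figures one might expect in the DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS column for the two keys above. The helper name expected_replicated_size and the factor mapping are hypothetical, and the EC formula assumes each parity chunk of a partial stripe is as long as that stripe's first data chunk. It is not taken from the Ozone code base.

```python
# Assumed mapping for the "replicationFactor" strings printed by `ozone sh key info`.
RATIS_FACTORS = {"ONE": 1, "THREE": 3}


def expected_replicated_size(data_size: int, replication_config: dict) -> int:
    """Estimate the bytes one key consumes across all replicas (hypothetical helper)."""
    rtype = replication_config["replicationType"]
    if rtype == "RATIS":
        # Every byte is stored once per replica.
        return data_size * RATIS_FACTORS[replication_config["replicationFactor"]]
    if rtype == "EC":
        data = replication_config["data"]
        parity = replication_config["parity"]
        chunk = replication_config["ecChunkSize"]
        full_stripes, remainder = divmod(data_size, data * chunk)
        # Assumption: parity chunks of a partial stripe match its first data chunk.
        partial_chunk = min(chunk, remainder)
        return data_size + parity * (full_stripes * chunk + partial_chunk)
    raise ValueError(f"unsupported replicationType: {rtype}")


# replicationConfig values copied from the key info output above.
ec_config = {"data": 3, "parity": 2, "ecChunkSize": 1048576,
             "codec": "RS", "replicationType": "EC", "requiredNodes": 5}
ratis_config = {"replicationFactor": "THREE", "requiredNodes": 3,
                "replicationType": "RATIS"}

print(expected_replicated_size(1073741824, ec_config))     # 1790967808
print(expected_replicated_size(1073741824, ratis_config))  # 3221225472
```

Under these assumptions the EC key would occupy about 1.67 GiB and the Ratis key 3 GiB, rather than the 1 GiB that du reports for both.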
For HDFS, the same command returns the correct disk space consumed across all
replicas:
{noformat}
hdfs  dfs -du -v  hdfs://ns1/tmp/
SIZE         DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS  FULL_PATH_NAME
0            0                                      hdfs://ns1/tmp/.cloudera_health_monitoring_canary_files
10737418240  32212254720                            hdfs://ns1/tmp/10GB
183719092    551157276                              hdfs://ns1/tmp/hive
508027       1524081                                hdfs://ns1/tmp/logs
{noformat}
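As a quick sanity check on the HDFS figures, the snippet below confirms that each non-empty row reports exactly three times its SIZE. That is consistent with a replication factor of 3, which is an assumption here, since the cluster configuration is not shown in this report.

```python
# (SIZE, DISK_SPACE_CONSUMED_WITH_ALL_REPLICAS, FULL_PATH_NAME) rows copied
# from the hdfs dfs -du -v output above; the empty canary directory is omitted.
hdfs_rows = [
    (10737418240, 32212254720, "hdfs://ns1/tmp/10GB"),
    (183719092, 551157276, "hdfs://ns1/tmp/hive"),
    (508027, 1524081, "hdfs://ns1/tmp/logs"),
]

ASSUMED_REPLICATION = 3  # assumption: not stated anywhere in the report

for size, consumed, path in hdfs_rows:
    # True when the consumed column equals SIZE times the assumed factor.
    print(path, consumed == size * ASSUMED_REPLICATION)
```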

  was:
du command does not return the correct information about the disk usage for 
ozone. The information is incorrect for both ratis and EC data.
The end user would not be able to get the correct usage information and needs 
to figure out manually.


> du command does not return correct disk consumed with replica for both ratis 
> and EC
> -----------------------------------------------------------------------------------
>
>                 Key: HDDS-7260
>                 URL: https://issues.apache.org/jira/browse/HDDS-7260
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Dave Teng
>            Assignee: Dave Teng
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
