[ https://issues.apache.org/jira/browse/HDFS-10797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Mackrory updated HDFS-10797:
---------------------------------
Release Note: Disk usage summaries previously incorrectly counted files
twice if they had been renamed (including files moved to Trash) since being
snapshotted. Summaries now include current data plus snapshotted data that is
no longer under the directory either due to deletion or being moved outside of
the directory. (was: Disk usage summaries previously incorrectly counted files
twice if they had been renamed since being snapshotted. Summaries now include
current data plus snapshotted data that is no longer under in the directory
either due to deletion or being moved outside of the directory.)
> Disk usage summary of snapshots causes renamed blocks to get counted twice
> --------------------------------------------------------------------------
>
> Key: HDFS-10797
> URL: https://issues.apache.org/jira/browse/HDFS-10797
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 2.8.0
> Reporter: Sean Mackrory
> Assignee: Sean Mackrory
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HDFS-10797.001.patch, HDFS-10797.002.patch,
> HDFS-10797.003.patch, HDFS-10797.004.patch, HDFS-10797.005.patch,
> HDFS-10797.006.patch, HDFS-10797.007.patch, HDFS-10797.008.patch,
> HDFS-10797.009.patch, HDFS-10797.010.patch, HDFS-10797.010.patch
>
>
> DirectoryWithSnapshotFeature.computeContentSummary4Snapshot calculates the
> disk usage of a snapshot by tallying the files in the snapshot that have
> since been deleted (so that the tally does not overlap with regular files,
> whose disk usage is computed separately). However, the deleted files are
> determined from a diff that shows a moved (to Trash or otherwise) or renamed
> file as a deletion and a creation operation, and the blocks involved may
> overlap with those already tallied. Only the deletion operation is taken
> into consideration, so those blocks are counted twice in the disk usage
> tally.
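
The double counting described above can be illustrated with a small model.
This is only a sketch under stated assumptions, not the code from the
attached patches: FileEntry, naive_usage, and deduplicated_usage are
hypothetical names, and a file's identity is modelled as an inode id that
survives a rename. The second function follows the behaviour described in
the release note (current data plus snapshotted data that is no longer under
the directory).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileEntry:
    inode_id: int  # hypothetical identity of the file's blocks; survives a rename
    path: str
    size: int      # bytes


def naive_usage(current, deleted_in_diff):
    """Tally current files plus every entry the snapshot diff lists as deleted."""
    return sum(f.size for f in current) + sum(f.size for f in deleted_in_diff)


def deduplicated_usage(current, deleted_in_diff):
    """Tally current files plus snapshotted files no longer under the directory."""
    live = {f.inode_id for f in current}
    # A renamed file appears in the diff as a deletion, but its blocks are
    # still reachable through the current tree, so it is skipped here.
    snapshot_only = [f for f in deleted_in_diff if f.inode_id not in live]
    return sum(f.size for f in current) + sum(f.size for f in snapshot_only)


# Snapshot taken with a.txt (100 bytes) and b.txt (50 bytes).
# Afterwards a.txt is renamed to c.txt and b.txt is deleted.
current = [FileEntry(1, "/dir/c.txt", 100)]
deleted = [FileEntry(1, "/dir/a.txt", 100), FileEntry(2, "/dir/b.txt", 50)]

naive = naive_usage(current, deleted)         # 100 + 100 + 50 = 250
fixed = deduplicated_usage(current, deleted)  # 100 + 50 = 150
```

In this model the renamed file contributes its 100 bytes twice to the naive
tally, while the deduplicated tally counts it once and still includes the
50 bytes that exist only in the snapshot.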
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)