[
https://issues.apache.org/jira/browse/HDDS-15455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sreeja updated HDDS-15455:
--------------------------
Description:
Implement logic to traverse all storage volumes configured in
*{{hdds.datanode.dir}}* and discover container directories present under the
DataNode container storage hierarchy.
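The traversal step could be sketched as follows. This is only an illustration under assumptions the issue does not confirm: that {{hdds.datanode.dir}} holds a comma-separated list of volume roots, and that a container directory is any directory whose name consists only of digits. {{VolumeScanner}} and its methods are hypothetical names. The sketch uses plain {{java.nio.file}} rather than Ozone's own volume classes.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of Ozone. */
final class VolumeScanner {
  private VolumeScanner() { }

  /** Splits the config value into volume roots (assumed comma-separated). */
  static List<Path> parseVolumes(String configValue) {
    List<Path> volumes = new ArrayList<>();
    for (String entry : configValue.split(",")) {
      String trimmed = entry.trim();
      if (!trimmed.isEmpty()) {
        volumes.add(Paths.get(trimmed));
      }
    }
    return volumes;
  }

  /** Returns every directory below the volume root whose name is all digits. */
  static List<Path> findContainerDirs(Path volumeRoot) throws IOException {
    final List<Path> found = new ArrayList<>();
    if (!Files.isDirectory(volumeRoot)) {
      return found; // missing or unmounted volume: nothing to scan
    }
    Files.walkFileTree(volumeRoot, new SimpleFileVisitor<Path>() {
      @Override
      public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
        Path name = dir.getFileName();
        // Assumption: a purely numeric directory name marks a container directory.
        if (!dir.equals(volumeRoot) && name != null && name.toString().matches("\\d+")) {
          found.add(dir);
          return FileVisitResult.SKIP_SUBTREE; // do not descend into the container
        }
        return FileVisitResult.CONTINUE;
      }
    });
    return found;
  }
}
```

Skipping the subtree once a container directory is found keeps numerically named entries inside a container from being reported as containers themselves.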
For each discovered container directory:
* Extract the container ID from the directory name.
* Collect the container directory path, storage volume, and directory size.
* Determine the metadata status:
** {{*MISSING_METADATA*}} if {{metadata/\{containerId}.container}} does not
exist.
** {{*INVALID_METADATA*}} if the metadata file exists but cannot be parsed, or
if the container ID stored in the metadata does not match the directory-name
container ID.
** *{{VALID}}* otherwise.
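The per-directory checks above might look like the sketch below. It is not the actual implementation: {{ContainerDirInspector}} is a hypothetical name, and the parser assumes the {{.container}} file contains a line of the form {{containerID: <number>}}. A real implementation would presumably use Ozone's own container-file reader instead of this line scan.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

/** The three states named in the description. */
enum MetadataStatus { VALID, MISSING_METADATA, INVALID_METADATA }

/** Hypothetical helper for illustration; not part of Ozone. */
final class ContainerDirInspector {
  // Assumed key of the container ID field inside the metadata file.
  private static final String ID_KEY = "containerID:";

  private ContainerDirInspector() { }

  /** Extracts the container ID from the directory name. */
  static long containerId(Path containerDir) {
    return Long.parseLong(containerDir.getFileName().toString());
  }

  /** Sums the sizes of all regular files below the directory. */
  static long directorySize(Path containerDir) throws IOException {
    try (Stream<Path> paths = Files.walk(containerDir)) {
      return paths.filter(Files::isRegularFile)
          .mapToLong(p -> p.toFile().length())
          .sum();
    }
  }

  /** Classifies metadata/{containerId}.container as described above. */
  static MetadataStatus metadataStatus(Path containerDir, long containerId) {
    Path file = containerDir.resolve("metadata").resolve(containerId + ".container");
    if (!Files.exists(file)) {
      return MetadataStatus.MISSING_METADATA;
    }
    try {
      for (String line : Files.readAllLines(file)) {
        String trimmed = line.trim();
        if (trimmed.startsWith(ID_KEY)) {
          long stored = Long.parseLong(trimmed.substring(ID_KEY.length()).trim());
          return stored == containerId
              ? MetadataStatus.VALID
              : MetadataStatus.INVALID_METADATA; // ID mismatch with directory name
        }
      }
    } catch (IOException | NumberFormatException e) {
      return MetadataStatus.INVALID_METADATA; // unreadable or unparsable file
    }
    return MetadataStatus.INVALID_METADATA; // no container ID field found
  }
}
```

Treating a file without any ID field as {{INVALID_METADATA}} is a choice made for this sketch; the issue only names parse failure and ID mismatch.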
Store the results as a mapping:
{{containerId -> List<ContainerOccurrence>}}
where each occurrence contains the container directory path, volume, size, and
metadata status.
Use this mapping to identify duplicate container directories by detecting
container IDs associated with more than one on-disk occurrence across storage
volumes on the same DataNode.
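A minimal sketch of the mapping and the duplicate check, assuming a simple value class for each occurrence. {{ContainerIndex}} and the field names of {{ContainerOccurrence}} are hypothetical; only the mapping shape {{containerId -> List<ContainerOccurrence>}} comes from the description.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical value class; the fields follow the description. */
final class ContainerOccurrence {
  enum Status { VALID, MISSING_METADATA, INVALID_METADATA }

  final long containerId;
  final String path;
  final String volume;
  final long sizeBytes;
  final Status status;

  ContainerOccurrence(long containerId, String path, String volume,
      long sizeBytes, Status status) {
    this.containerId = containerId;
    this.path = path;
    this.volume = volume;
    this.sizeBytes = sizeBytes;
    this.status = status;
  }
}

/** Hypothetical index: containerId -> all on-disk occurrences. */
final class ContainerIndex {
  private final Map<Long, List<ContainerOccurrence>> byId = new TreeMap<>();

  /** Records one discovered directory, keeping earlier copies of the same ID. */
  void add(ContainerOccurrence occurrence) {
    byId.computeIfAbsent(occurrence.containerId, id -> new ArrayList<>())
        .add(occurrence);
  }

  /** Returns only the container IDs that have more than one occurrence. */
  Map<Long, List<ContainerOccurrence>> duplicates() {
    Map<Long, List<ContainerOccurrence>> result = new TreeMap<>();
    for (Map.Entry<Long, List<ContainerOccurrence>> e : byId.entrySet()) {
      if (e.getValue().size() > 1) {
        result.put(e.getKey(), Collections.unmodifiableList(e.getValue()));
      }
    }
    return result;
  }
}
```

Because every occurrence is kept in the list, a report on a duplicated ID can show the volume, size, and metadata status of each copy side by side.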
was:
Implement logic to traverse all storage volumes configured in
*{{hdds.datanode.dir}}* and discover container directories present on disk. For
each container directory, collect the container ID, directory path, volume,
size, and compute the metadata status:
if metadata/\{containerId}.container missing => *MISSING_METADATA*
if metadata/\{containerId}.container exists but parse fails or containerId does
not match with directory-name containerId => *INVALID_METADATA*
otherwise => *VALID*
Store the results as a mapping of {{containerId -> list of disk occurrences}},
preserving all discovered copies of a container. This
information will also be used to identify duplicate container directories by
detecting container IDs with multiple on-disk occurrences across storage
volumes on the same DataNode.
> Implement Custom DataNode Container Directory Discovery and Duplicate
> Detection
> -------------------------------------------------------------------------------
>
> Key: HDDS-15455
> URL: https://issues.apache.org/jira/browse/HDDS-15455
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Sreeja
> Assignee: Sreeja
> Priority: Major