[ 
https://issues.apache.org/jira/browse/HDDS-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarveksha Yeshavantha Raju updated HDDS-14103:
----------------------------------------------
    Description: 
Ozone currently has no way to clear missing containers from the system. Even if 
all the data is deleted from the OM, the block deletes will never leave SCM 
because it has no replicas to send them to. As a short-term mitigation, we can 
add a CLI to SCM that supports “acking” missing containers by ID if the admin 
confirms they are not a problem, so they do not mask future issues. This would 
remove them from ozone admin container report output and the missing container 
count metric. This would need to be persisted in the ContainerInfo in SCM, and 
we probably would want to show this property in ozone admin container info 
--json. There should also be a CLI to raise containers as an issue again and to 
query the list of acked missing containers. Recon can then use this API to keep 
its missing container alerts in sync with SCM. This is purely cosmetic and 
would be applied as a mask on top of existing container reporting to the user. 
We do not need to change how replication manager handles these containers 
internally.
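
To illustrate the "mask on top of reporting" idea, here is a minimal sketch in Python. It is not SCM code: the class and method names are hypothetical, and the only behaviour taken from the description above is that acked containers disappear from the report and from the missing count metric while the underlying missing set (what the replication manager sees) is left untouched.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class ContainerHealthView:
    """Hypothetical stand-in for SCM's view of missing containers."""

    # IDs the replication manager currently considers missing (never modified here).
    missing: Set[int] = field(default_factory=set)
    # IDs an admin has acked; in the proposal this would be persisted in ContainerInfo.
    suppressed: Set[int] = field(default_factory=set)

    def suppress(self, container_id: int) -> None:
        """Ack a container so it no longer shows up in user-facing output."""
        self.suppressed.add(container_id)

    def unsuppress(self, container_id: int) -> None:
        """Raise the container as an issue again."""
        self.suppressed.discard(container_id)

    def reported_missing(self) -> List[int]:
        """Missing containers as shown to the user: the mask is applied at report time only."""
        return sorted(self.missing - self.suppressed)

    def missing_count_metric(self) -> int:
        """Missing container count with acked containers excluded."""
        return len(self.missing - self.suppressed)
```

Because the mask is applied only when the report or metric is read, unsuppressing a container makes it visible again without any change to replication state.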

 

New Proposed Change:

Implement {{suppress}} and {{unsuppress}} flags in {{ozone admin container 
report}} (mutually exclusive), supporting multiple container IDs from command 
line, stdin, or files.
Once the command is executed, the container report will be updated after the 
next Replication Manager cycle.

Add a {{--suppressed}} filtering option to {{ozone admin container list}} to show 
suppressed/unsuppressed containers.
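
A rough sketch of the input handling described above (mutually exclusive modes, IDs from the command line, stdin, or files), written in Python with argparse purely for illustration. The Ozone CLI itself is not implemented this way; the option spellings, the {{--ids-file}} option, the use of {{-}} to mean stdin, and the rule that IDs must be positive are all assumptions of this sketch.

```python
import argparse
import sys
from typing import Iterable, List, TextIO


def parse_ids(tokens: Iterable[str]) -> List[int]:
    """Turn whitespace- or comma-separated tokens into container IDs."""
    ids: List[int] = []
    for token in tokens:
        for part in token.replace(",", " ").split():
            value = int(part)  # raises ValueError for non-numeric input
            if value <= 0:
                # Assumption of this sketch: container IDs are positive.
                raise ValueError(f"invalid container ID: {part}")
            ids.append(value)
    return ids


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="container-report-sketch")
    # The two modes cannot be combined in a single invocation.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--suppress", action="store_true")
    mode.add_argument("--unsuppress", action="store_true")
    # Positional IDs; a lone "-" is treated here as "read IDs from stdin".
    parser.add_argument("ids", nargs="*")
    # Hypothetical option: may be repeated to read IDs from several files.
    parser.add_argument("--ids-file", action="append", default=[])
    return parser


def collect_ids(args: argparse.Namespace, stdin: TextIO = sys.stdin) -> List[int]:
    """Merge IDs from all three sources, dropping duplicates but keeping order."""
    tokens: List[str] = []
    for item in args.ids:
        if item == "-":
            tokens.extend(stdin.read().split())
        else:
            tokens.append(item)
    for path in args.ids_file:
        with open(path, encoding="utf-8") as handle:
            tokens.extend(handle.read().split())
    return list(dict.fromkeys(parse_ids(tokens)))
```

Passing both modes at once is rejected by the parser itself, so the code that applies the change only ever sees one of them set.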

  was:
Ozone currently has no way to clear missing containers from the system. Even if 
all the data is deleted from the OM, the block deletes will never leave SCM 
because it has no replicas to send them to. As a short-term mitigation, we can 
add a CLI to SCM that supports “acking” missing containers by ID if the admin 
confirms they are not a problem, so they do not mask future issues. This would 
remove them from ozone admin container report output and the missing container 
count metric. This would need to be persisted in the ContainerInfo in SCM, and 
we probably would want to show this property in ozone admin container info 
--json. There should also be a CLI to raise containers as an issue again and to 
query the list of acked missing containers. Recon can then use this API to keep 
its missing container alerts in sync with SCM. This is purely cosmetic and 
would be applied as a mask on top of existing container reporting to the user. 
We do not need to change how replication manager handles these containers 
internally.

 

New Proposed Change:

Implement {{--suppress}} and {{--unsuppress}} flags in {{ozone admin container 
report}} (mutually exclusive), supporting multiple container IDs from command 
line, stdin, or files.
Once the command is executed, the container report will be updated after the 
next Replication Manager cycle.

Add a {{--suppressed}} filtering option to {{ozone admin container list}} to show 
suppressed/unsuppressed containers.


> Create an option to suppress/unsuppress containers from report
> --------------------------------------------------------------
>
>                 Key: HDDS-14103
>                 URL: https://issues.apache.org/jira/browse/HDDS-14103
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Sarveksha Yeshavantha Raju
>            Assignee: Sarveksha Yeshavantha Raju
>            Priority: Major
>              Labels: pull-request-available
>
> Ozone currently has no way to clear missing containers from the system. Even 
> if all the data is deleted from the OM, the block deletes will never leave 
> SCM because it has no replicas to send them to. As a short-term mitigation, 
> we can add a CLI to SCM that supports “acking” missing containers by ID if 
> the admin confirms they are not a problem, so they do not mask future issues. 
> This would remove them from ozone admin container report output and the 
> missing container count metric. This would need to be persisted in the 
> ContainerInfo in SCM, and we probably would want to show this property in 
> ozone admin container info --json. There should also be a CLI to raise 
> containers as an issue again and to query the list of acked missing 
> containers. Recon can then use this API to keep its missing container alerts 
> in sync with SCM. This is purely cosmetic and would be applied as a mask on 
> top of existing container reporting to the user. We do not need to change how 
> replication manager handles these containers internally.
>  
> New Proposed Change:
> Implement {{suppress}} and {{unsuppress}} flags in {{ozone admin container 
> report}} (mutually exclusive), supporting multiple container IDs from command 
> line, stdin, or files.
> Once the command is executed, the container report will be updated after the 
> next Replication Manager cycle.
> Add a {{--suppressed}} filtering option to {{ozone admin container list}} to 
> show suppressed/unsuppressed containers.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
