[ 
https://issues.apache.org/jira/browse/HDDS-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17698020#comment-17698020
 ] 

Mladjan Gadzic commented on HDDS-7098:
--------------------------------------

[~erose] It looks like the Recon API shows that. I followed [~NeilJoshi]'s 
instructions to verify this on an unsecured Ozone cluster with 3 DNs.
{code:java}
bash-4.2$ ozone admin container info 1
Container id: 1
Pipeline id: 0edb486b-4bdc-478b-bc73-e4dc69e422e5
Container State: OPEN
Datanodes: [d2ee3d1f-10f9-452d-b692-576edc6b6096/ozone-datanode-3.ozone_default,
4bac7a54-31ec-4795-b904-4ba5ff3f7343/ozone-datanode-1.ozone_default,
4ebf5f8e-7ea7-409e-a4eb-c1d0ef6c57ed/ozone-datanode-2.ozone_default]
Replicas: [State: OPEN; ReplicaIndex: 0; Origin: 
4ebf5f8e-7ea7-409e-a4eb-c1d0ef6c57ed; Location: 
4ebf5f8e-7ea7-409e-a4eb-c1d0ef6c57ed/ozone-datanode-2.ozone_default,
State: OPEN; ReplicaIndex: 0; Origin: 4bac7a54-31ec-4795-b904-4ba5ff3f7343; 
Location: 4bac7a54-31ec-4795-b904-4ba5ff3f7343/ozone-datanode-1.ozone_default,
State: OPEN; ReplicaIndex: 0; Origin: d2ee3d1f-10f9-452d-b692-576edc6b6096; 
Location: d2ee3d1f-10f9-452d-b692-576edc6b6096/ozone-datanode-3.ozone_default]

bash-4.2$ ozone admin container close 1

bash-4.2$ ozone admin container info 1
Container id: 1
Pipeline id: 0edb486b-4bdc-478b-bc73-e4dc69e422e5
Container State: CLOSING
Datanodes: [d2ee3d1f-10f9-452d-b692-576edc6b6096/ozone-datanode-3.ozone_default,
4bac7a54-31ec-4795-b904-4ba5ff3f7343/ozone-datanode-1.ozone_default,
4ebf5f8e-7ea7-409e-a4eb-c1d0ef6c57ed/ozone-datanode-2.ozone_default]
Replicas: [State: OPEN; ReplicaIndex: 0; Origin: 
4ebf5f8e-7ea7-409e-a4eb-c1d0ef6c57ed; Location: 
4ebf5f8e-7ea7-409e-a4eb-c1d0ef6c57ed/ozone-datanode-2.ozone_default,
State: OPEN; ReplicaIndex: 0; Origin: d2ee3d1f-10f9-452d-b692-576edc6b6096; 
Location: d2ee3d1f-10f9-452d-b692-576edc6b6096/ozone-datanode-3.ozone_default,
State: OPEN; ReplicaIndex: 0; Origin: 4bac7a54-31ec-4795-b904-4ba5ff3f7343; 
Location: 4bac7a54-31ec-4795-b904-4ba5ff3f7343/ozone-datanode-1.ozone_default]

bash-4.2$ ozone admin container info 1
Container id: 1
Pipeline id: e203da07-5c9e-44c3-96d7-31d960dbb4c1
Container State: CLOSED
Datanodes: [4ebf5f8e-7ea7-409e-a4eb-c1d0ef6c57ed/ozone-datanode-2.ozone_default,
d2ee3d1f-10f9-452d-b692-576edc6b6096/ozone-datanode-3.ozone_default,
4bac7a54-31ec-4795-b904-4ba5ff3f7343/ozone-datanode-1.ozone_default]
Replicas: [State: CLOSED; ReplicaIndex: 0; Origin: 
4ebf5f8e-7ea7-409e-a4eb-c1d0ef6c57ed; Location: 
4ebf5f8e-7ea7-409e-a4eb-c1d0ef6c57ed/ozone-datanode-2.ozone_default,
State: CLOSED; ReplicaIndex: 0; Origin: d2ee3d1f-10f9-452d-b692-576edc6b6096; 
Location: d2ee3d1f-10f9-452d-b692-576edc6b6096/ozone-datanode-3.ozone_default,
State: CLOSED; ReplicaIndex: 0; Origin: 4bac7a54-31ec-4795-b904-4ba5ff3f7343; 
Location: 4bac7a54-31ec-4795-b904-4ba5ff3f7343/ozone-datanode-1.ozone_default] 
{code}
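For reference, the replica states in the transcript above can be pulled out with a short script. This is only a sketch: it assumes the text layout of {{ozone admin container info}} is exactly what is shown above (which is not a stable interface), and the names {{REPLICA_RE}}, {{replica_states}} and {{count_states}} are hypothetical helpers, not part of Ozone.

```python
import re
from collections import Counter

# Assumption: each replica is printed as
#   "State: X; ReplicaIndex: N; Origin: <uuid>; Location: <uuid>/<host>"
# possibly wrapped across lines, as in the transcript above.
# The "Container State: ..." line is not followed by ';', so it is not matched.
REPLICA_RE = re.compile(
    r"State:\s*(?P<state>\w+);\s*ReplicaIndex:\s*(?P<index>\d+);\s*"
    r"Origin:\s*(?P<origin>[0-9a-f-]+);\s*"
    r"Location:\s*(?P<uuid>[0-9a-f-]+)/(?P<host>[\w.-]+)"
)


def replica_states(info_output):
    """Return (datanode host, replica state) pairs in the order printed."""
    return [(m["host"], m["state"]) for m in REPLICA_RE.finditer(info_output)]


def count_states(info_output):
    """Count replicas per state, e.g. to spot UNHEALTHY replicas."""
    return Counter(state for _, state in replica_states(info_output))
```

Feeding it the captured output of {{ozone admin container info 1}} gives one pair per replica, which makes the OPEN to CLOSED transition (or an UNHEALTHY replica) easier to spot than reading the wrapped text.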
The Recon API response from [http://localhost:9888/api/v1/containers/unhealthy] was:
{code:java}
{"missingCount": 1, "underReplicatedCount": 0, "overReplicatedCount": 0, "misReplicatedCount": 0, "containers": []}
{code}
After removing the Docker DN containers, the Recon API response for the same 
endpoint was:
{code:java}
{"missingCount": 1, "underReplicatedCount": 0, "overReplicatedCount": 0, "misReplicatedCount": 0,
 "containers": [
  {"containerID": 1, "containerState": "MISSING", "unhealthySince": 1678295590336,
   "expectedReplicaCount": 3, "actualReplicaCount": 0, "replicaDeltaCount": 3,
   "reason": null, "keys": 1000, "pipelineID": "0edb486b-4bdc-478b-bc73-e4dc69e422e5",
   "replicas": [
    {"containerId": 1, "datanodeUuid": "4ebf5f8e-7ea7-409e-a4eb-c1d0ef6c57ed", "datanodeHost": "ozone-datanode-2.ozone_default", "firstSeenTime": 1678295188710, "lastSeenTime": 1678295529741, "lastBcsId": 0},
    {"containerId": 1, "datanodeUuid": "4bac7a54-31ec-4795-b904-4ba5ff3f7343", "datanodeHost": "ozone-datanode-1.ozone_default", "firstSeenTime": 1678295188687, "lastSeenTime": 1678295464727, "lastBcsId": 0},
    {"containerId": 1, "datanodeUuid": "d2ee3d1f-10f9-452d-b692-576edc6b6096", "datanodeHost": "ozone-datanode-3.ozone_default", "firstSeenTime": 1678295188710, "lastSeenTime": 1678295464721, "lastBcsId": 0}
   ]}
 ]}
{code}
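A rough sketch of how the response above could be summarized from a script. The field names are taken from the JSON shown in this comment. The helper names ({{fetch_unhealthy}}, {{to_utc}}, {{summarize_unhealthy}}) are hypothetical, and treating the timestamps as milliseconds since the Unix epoch is an assumption based on their magnitude.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Endpoint used in the comment above (local, unsecured Docker cluster).
RECON_URL = "http://localhost:9888/api/v1/containers/unhealthy"


def fetch_unhealthy(url=RECON_URL):
    """Fetch and decode the Recon response. Assumes no authentication is needed."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def to_utc(epoch_ms):
    """Assumption: Recon timestamps are milliseconds since the Unix epoch."""
    return datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc).isoformat()


def summarize_unhealthy(response):
    """Reduce each unhealthy container to its state, replica counts and last-seen times."""
    rows = []
    for c in response.get("containers", []):
        rows.append({
            "containerID": c["containerID"],
            "state": c["containerState"],
            "expected": c["expectedReplicaCount"],
            "actual": c["actualReplicaCount"],
            # datanode host -> last time Recon saw this replica
            "lastSeen": {
                r["datanodeHost"]: to_utc(r["lastSeenTime"])
                for r in c.get("replicas", [])
            },
        })
    return rows
```

Calling {{summarize_unhealthy(fetch_unhealthy())}} against a running Recon would list each container reported as unhealthy together with the datanodes that last held a replica.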

> Provide a way for admin to identify all unhealthy container replicas
> --------------------------------------------------------------------
>
>                 Key: HDDS-7098
>                 URL: https://issues.apache.org/jira/browse/HDDS-7098
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Ethan Rose
>            Assignee: Devesh Kumar Singh
>            Priority: Major
>         Attachments: MissingContainers.png, image-2023-03-02-16-01-07-814.png
>
>
> Currently UNHEALTHY is a state that a container replica can be in 
> (ContainerReplicaProto#State), but not a state that the container can be in 
> overall (LifeCycleState). This means {{ozone admin container list}} has no 
> info about unhealthy containers, because it currently does not print replica 
> information. [Recon's 
> API|https://ozone.apache.org/docs/current/interface/reconapi.html] and UI 
> do not expose replica information either. The only way to determine 
> unhealthy containers is to run {{ozone admin container info <ID>}} for a 
> container that is already suspected to have unhealthy replicas. This jira 
> aims to provide a way to identify and filter container replica states, 
> through either Recon's UI, Recon's REST API, or client CLI.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
