[jira] [Updated] (HDDS-7327) Recon to note down replica states

Siyao Meng (Jira) Thu, 03 Nov 2022 11:10:26 -0700


     [ https://issues.apache.org/jira/browse/HDDS-7327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Siyao Meng updated HDDS-7327:
-----------------------------
    Description: 
Related previous discussion: HDDS-7098

Currently, Recon appears to record only the overall container health state 
in the Recon SQL DB:

{code:bash}
ij version 10.14
ij> connect 'jdbc:derby:ozone_recon_derby.db';
ij> show tables;
TABLE_SCHEM         |TABLE_NAME                    |REMARKS
------------------------------------------------------------------------
...
SYSIBM              |SYSDUMMY1                     |
RECON               |CLUSTER_GROWTH_DAILY          |
RECON               |FILE_COUNT_BY_SIZE            |
RECON               |GLOBAL_STATS                  |
RECON               |RECON_TASK_STATUS             |
RECON               |UNHEALTHY_CONTAINERS          |

28 rows selected
ij> select * from RECON.UNHEALTHY_CONTAINERS;
container_id        |container_state |in_state_since      |expected_r&|actual_rep&|replica_de&|reason
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1                   |UNDER_REPLICATED|1665692819704       |3          |2          |1          |NULL
{code}

However, Recon does not record the [health state of individual replicas|https://github.com/apache/ozone/blob/1e546103f0650dadc29cc5b6c931c0040e2d1d9c/hadoop-hdds/interface-server/src/main/proto/ScmServerDatanodeHeartbeatProtocol.proto#L209-L220]
in the container. Recording it would let users check replica states in 
Recon.

We might want to persist this information to the Recon SQL DB only when 
datanodes report that a replica is unhealthy. Healthy replicas should not be 
persisted, to avoid excessive writes that could lead to performance issues.
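
A rough sketch of the "persist only unhealthy replicas" idea, for illustration 
only. It is not Recon code: the table name unhealthy_replicas, its columns, the 
function persist_unhealthy_replicas, and the report format are all hypothetical, 
an in-memory SQLite DB stands in for the Recon SQL DB, and the replica state 
names are assumed rather than taken from the proto definition.

```python
import sqlite3

# Assumption: a single state name marks a replica as unhealthy.
# The real set of states is defined in the proto linked above.
UNHEALTHY_STATES = {"UNHEALTHY"}


def make_db():
    """Create an in-memory stand-in for the Recon SQL DB (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE unhealthy_replicas ("
        "container_id INTEGER NOT NULL, "
        "datanode TEXT NOT NULL, "
        "replica_state TEXT NOT NULL, "
        "PRIMARY KEY (container_id, datanode))"
    )
    return conn


def persist_unhealthy_replicas(conn, reports):
    """Write only unhealthy replicas; healthy ones cause no write at all."""
    rows = [
        (r["container_id"], r["datanode"], r["state"])
        for r in reports
        if r["state"] in UNHEALTHY_STATES
    ]
    conn.executemany(
        "INSERT OR REPLACE INTO unhealthy_replicas "
        "(container_id, datanode, replica_state) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


# Made-up replica reports from two datanodes.
conn = make_db()
reports = [
    {"container_id": 1, "datanode": "dn1", "state": "CLOSED"},
    {"container_id": 1, "datanode": "dn2", "state": "UNHEALTHY"},
    {"container_id": 2, "datanode": "dn1", "state": "CLOSED"},
]
written = persist_unhealthy_replicas(conn, reports)
```

With this filter the number of writes scales with the number of unhealthy 
replicas rather than with the total replica count. The sketch does not handle 
a replica that later becomes healthy again; clearing its stale row would need 
a separate step.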

  was:
Related previous discussion: HDDS-7098

Right now it seems that Recon only takes note of the overall container health 
state in the Recon SQL DB:

{code:bash}
ij version 10.14
ij> connect 'jdbc:derby:ozone_recon_derby.db';
ij> show tables;
TABLE_SCHEM         |TABLE_NAME                    |REMARKS
------------------------------------------------------------------------
...
SYSIBM              |SYSDUMMY1                     |
RECON               |CLUSTER_GROWTH_DAILY          |
RECON               |FILE_COUNT_BY_SIZE            |
RECON               |GLOBAL_STATS                  |
RECON               |RECON_TASK_STATUS             |
RECON               |UNHEALTHY_CONTAINERS          |

28 rows selected
ij> select * from RECON.UNHEALTHY_CONTAINERS;
container_id        |container_state |in_state_since      |expected_r&|actual_rep&|replica_de&|reason
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1                   |UNDER_REPLICATED|1665692819704       |3          |2          |1          |NULL
{code}

but Recon does not record the [health state of individual 
replicas|https://github.com/apache/ozone/blob/1e546103f0650dadc29cc5b6c931c0040e2d1d9c/hadoop-hdds/interface-server/src/main/proto/ScmServerDatanodeHeartbeatProtocol.proto#L209-L220]
 in the container. This will be useful for users to check replica states in 
Recon.

We might want to persist the info to Recon SQL DB when datanodes report that a 
replica is unhealthy.


> Recon to note down replica states
> ---------------------------------
>
>                 Key: HDDS-7327
>                 URL: https://issues.apache.org/jira/browse/HDDS-7327
>             Project: Apache Ozone
>          Issue Type: Task
>          Components: Ozone Recon
>            Reporter: Siyao Meng
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
