[ 
https://issues.apache.org/jira/browse/HDDS-9883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devesh Kumar Singh updated HDDS-9883:
-------------------------------------
    Description: 
In *ReconIncrementalContainerReportHandler* (Recon's counterpart of the SCM 
*IncrementalContainerReportHandler*), Recon always connects to SCM and 
verifies each container before adding a new container to its own 
containerStateManager cache. This can become a bottleneck: if SCM responds 
slowly, frequent ICR requests may pile up in the queue. As part of this JIRA 
we will improve the following:
 * Recon verifies containers with SCM in batches when it receives ICR requests 
from DNs.
 * Lower the SCM client settings that Recon uses when connecting to SCM:
 ** hdds.scmclient.rpc.timeout - 1 min (default is 15 min)
 ** hdds.scmclient.failover.max.retry - 3 (the default is computed 
dynamically; with default settings it evaluates to 15)

These two SCM client configs will be set to the new values listed above when 
Recon connects to SCM. They will be exposed through new Recon configs so that 
they can be tuned independently in Recon.
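
A minimal sketch of how the Recon-specific keys (ozone.recon.scmclient.rpc.timeout 
and ozone.recon.scmclient.failover.max.retry) could be mapped onto the SCM 
client keys. It is illustrative only: the class and method names are 
hypothetical, a plain Map stands in for the Ozone configuration object, and 
the "1m" duration string format is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; not part of Ozone. */
final class ReconScmClientConfigMapper {
  // Recon-side keys proposed in this JIRA.
  static final String RECON_RPC_TIMEOUT = "ozone.recon.scmclient.rpc.timeout";
  static final String RECON_MAX_RETRY =
      "ozone.recon.scmclient.failover.max.retry";
  // Existing SCM client keys that Recon wants to override for itself.
  static final String SCM_RPC_TIMEOUT = "hdds.scmclient.rpc.timeout";
  static final String SCM_MAX_RETRY = "hdds.scmclient.failover.max.retry";
  // Values proposed in the description; the string encoding is assumed.
  static final String DEFAULT_RPC_TIMEOUT = "1m";
  static final String DEFAULT_MAX_RETRY = "3";

  private ReconScmClientConfigMapper() { }

  /**
   * Returns a copy of conf in which the SCM client keys carry the Recon
   * values (or the Recon defaults when the Recon keys are unset).
   * The input map is left untouched so other components keep their settings.
   */
  static Map<String, String> forScmClient(Map<String, String> conf) {
    Map<String, String> copy = new HashMap<>(conf);
    copy.put(SCM_RPC_TIMEOUT,
        conf.getOrDefault(RECON_RPC_TIMEOUT, DEFAULT_RPC_TIMEOUT));
    copy.put(SCM_MAX_RETRY,
        conf.getOrDefault(RECON_MAX_RETRY, DEFAULT_MAX_RETRY));
    return copy;
  }
}
```

Working on a copy keeps the override local to the client that Recon builds 
for talking to SCM, which is one way to make the two values adjustable 
independently in Recon.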

*New configs in Recon:*
 ** ozone.recon.scmclient.rpc.timeout
 ** ozone.recon.scmclient.failover.max.retry
 * Merge the incremental container report (ICR) into the existing list of ICR 
reports.
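
The batch verification proposed in the first bullet could look roughly like 
the sketch below. It is illustrative only: the class name, the method name 
and the scmLookup function are hypothetical, and scmLookup stands in for 
whatever batched SCM call Recon would actually use.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch; none of these names are Ozone APIs. */
final class BatchContainerVerifier {
  private BatchContainerVerifier() { }

  /**
   * Returns the container IDs that SCM confirms, asking SCM once per batch
   * of at most batchSize IDs instead of once per container.
   *
   * @param unknownIds container IDs from ICRs that Recon does not know yet
   * @param batchSize  maximum number of IDs sent to SCM in one call
   * @param scmLookup  stand-in for a batched SCM RPC: takes a batch of IDs
   *                   and returns the subset that SCM knows about
   */
  static Set<Long> verifyInBatches(Collection<Long> unknownIds, int batchSize,
      Function<List<Long>, Set<Long>> scmLookup) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }
    // De-duplicate so a container reported by several DNs is checked once.
    List<Long> pending = new ArrayList<>(new LinkedHashSet<>(unknownIds));
    Set<Long> confirmed = new LinkedHashSet<>();
    for (int start = 0; start < pending.size(); start += batchSize) {
      int end = Math.min(start + batchSize, pending.size());
      // One round trip to SCM covers the whole slice.
      confirmed.addAll(scmLookup.apply(pending.subList(start, end)));
    }
    return confirmed;
  }
}
```

With a batch size of N, the number of round trips to SCM drops from one per 
new container to roughly one per N new containers, which is the effect the 
batching bullet is aiming for.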

 

  was:In *ReconIncrementalContainerReportHandler* (Recon's counterpart of the 
SCM *IncrementalContainerReportHandler*), Recon always connects to SCM and 
verifies each container before adding a new container to its own 
containerStateManager cache. This can become a bottleneck: if SCM responds 
slowly, frequent ICR requests may pile up in the queue. For now, we will 
improve this part of Recon by implementing queue-based logic that verifies 
each container with SCM asynchronously. On receiving an ICR request from a DN, 
Recon will first simply add the new container if it does not already exist and 
add the container info to a queue; an async task in Recon will later process 
that queue to verify each container with SCM. This decouples ICR request 
processing in Recon from SCM verification, reduces the chance of a bottleneck 
in ICR processing, and eventually reduces the chance of getting "capacity not 
available", because the ICR request queue will be processed quickly.


> Recon - Improve the performance of processing of IncrementalContainerReport 
> requests from DN
> --------------------------------------------------------------------------------------------
>
>                 Key: HDDS-9883
>                 URL: https://issues.apache.org/jira/browse/HDDS-9883
>             Project: Apache Ozone
>          Issue Type: Task
>          Components: Ozone Recon
>            Reporter: Devesh Kumar Singh
>            Assignee: Devesh Kumar Singh
>            Priority: Major
>


