[ 
https://issues.apache.org/jira/browse/HDDS-13890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated HDDS-13890:
-------------------------------
    Description: 
Currently, if we would like to migrate the SCM from (scm1, scm2, scm3) to
(scm4, scm5, scm6), all the datanodes need to be restarted 2 times with an
updated ozone.scm.nodes.<service>:
 # Before migration: Update ozone.scm.nodes to (scm1, scm2, scm3, scm4, scm5,
scm6)
 # After (scm1, scm2, scm3) are decommissioned: Update ozone.scm.nodes to
(scm4, scm5, scm6)

As mentioned in HDDS-12391, a rolling restart of all the datanodes takes time.
For a large datanode fleet, it might take days or even weeks.

It might be good to support dynamic reconfiguration of SCM endpoints in DN to
avoid these restarts. A possible flow:
 * Admin updates "ozone.scm.nodes" to the new value (with some nodes added and
some nodes removed)
 * DN compares the new and previous configuration to find the SCM endpoints to
add and remove
 * DN adds the new SCM endpoints (e.g. SCMConnectionManager#addSCMServer) and
then removes the old SCM endpoints (e.g. SCMConnectionManager#removeSCMServer).

  was:
Currently if we would like to migrate the SCM from (scm1, scm2, scm3) to (scm4, 
scm5, scm6), all the datanodes need to be restarted 2 times with updated
ozone.scm.nodes.<service> # Before migration: Update ozone.scm.nodes to (scm1, 
scm2, scm3, scm4, scm5, scm6)
 # After (scm1, scm2, scm3) are decommissioned: Update ozone.scm.nodes to 
(scm4, scm5, scm6)

As mentioned in HDDS-12391, rolling restarting all the datanodes might take a 
while. For large datanodes fleet this might take a lot of time (days or event 
weeks).

It might be good to support dynamic reconfiguration of SCM endpoints in DN to 
prevent restarts. A possible flow
 * Admin update the "ozone.scm.nodes" to the new value (with some new nodes and 
removed nodes)
 * DN will compare the new and previous configuration and find the SCM 
endpoints to add and remove
 * DN will add the SCM endpoints (e.g. SCMConnectionManager#addSCMServer) and 
then remove the SCM endpoints (e.g. SCMConnectionManager#removeSCMServer).


> Datanode supports dynamic configuration of SCM
> ----------------------------------------------
>
>                 Key: HDDS-13890
>                 URL: https://issues.apache.org/jira/browse/HDDS-13890
>             Project: Apache Ozone
>          Issue Type: Task
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major
>
> Currently, if we would like to migrate the SCM from (scm1, scm2, scm3) to
> (scm4, scm5, scm6), all the datanodes need to be restarted 2 times with an
> updated ozone.scm.nodes.<service>:
>  # Before migration: Update ozone.scm.nodes to (scm1, scm2, scm3, scm4,
> scm5, scm6)
>  # After (scm1, scm2, scm3) are decommissioned: Update ozone.scm.nodes to
> (scm4, scm5, scm6)
>
> As mentioned in HDDS-12391, a rolling restart of all the datanodes takes
> time. For a large datanode fleet, it might take days or even weeks.
>
> It might be good to support dynamic reconfiguration of SCM endpoints in DN to
> avoid these restarts. A possible flow:
>  * Admin updates "ozone.scm.nodes" to the new value (with some nodes added
> and some nodes removed)
>  * DN compares the new and previous configuration to find the SCM endpoints
> to add and remove
>  * DN adds the new SCM endpoints (e.g. SCMConnectionManager#addSCMServer) and
> then removes the old SCM endpoints (e.g. SCMConnectionManager#removeSCMServer).
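
The proposed flow (diff the old and new "ozone.scm.nodes" values, add the new
endpoints, then remove the old ones) could be sketched roughly as below. This
is only an illustration under assumptions, not the Ozone implementation: the
ScmConnections interface is a hypothetical stand-in for SCMConnectionManager,
endpoints are plain strings rather than resolved addresses, and the class and
method names (ScmEndpointReconfigSketch, reconfigure, difference) are made up
for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class ScmEndpointReconfigSketch {

  /** Hypothetical stand-in for the datanode's SCM connection manager. */
  interface ScmConnections {
    void addSCMServer(String address);
    void removeSCMServer(String address);
  }

  /** Records calls in order so the add-before-remove ordering is visible. */
  static final class RecordingConnections implements ScmConnections {
    final List<String> calls = new ArrayList<>();

    @Override
    public void addSCMServer(String address) {
      calls.add("add " + address);
    }

    @Override
    public void removeSCMServer(String address) {
      calls.add("remove " + address);
    }
  }

  /** Returns the elements of a that are not in b, keeping a's order. */
  static Set<String> difference(Set<String> a, Set<String> b) {
    Set<String> result = new LinkedHashSet<>(a);
    result.removeAll(b);
    return result;
  }

  /**
   * Applies a configuration change: endpoints that appear only in the new
   * value are added first, then endpoints that appear only in the previous
   * value are removed. Endpoints present in both are left untouched.
   */
  static void reconfigure(List<String> previous, List<String> updated,
      ScmConnections connections) {
    Set<String> oldSet = new LinkedHashSet<>(previous);
    Set<String> newSet = new LinkedHashSet<>(updated);
    for (String scm : difference(newSet, oldSet)) {
      connections.addSCMServer(scm);
    }
    for (String scm : difference(oldSet, newSet)) {
      connections.removeSCMServer(scm);
    }
  }

  public static void main(String[] args) {
    RecordingConnections connections = new RecordingConnections();
    reconfigure(Arrays.asList("scm1", "scm2", "scm3"),
        Arrays.asList("scm4", "scm5", "scm6"), connections);
    System.out.println(connections.calls);
  }
}
```

Run as-is, the sketch prints
[add scm4, add scm5, add scm6, remove scm1, remove scm2, remove scm3].
Adding before removing means the datanode always has at least one configured
SCM endpoint during the change.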



--
This message was sent by Atlassian Jira
(v8.20.10#820010)