bharatviswa504 commented on pull request #2294:
URL: https://github.com/apache/ozone/pull/2294#issuecomment-852885323


   With this approach we see an issue:
   
   
   Before the restart, 2 pipelines were closed; let's say SCM removed them and 
created a new pipeline. But the SCM pipeline table still has the 2 old 
pipelines, because the remove/create operations were not persisted to the DB 
when SCM was force killed.
   
   Because we call refresh and validate, we exit safe mode as soon as the 2nd 
pipeline is removed: the pipeline rules are validated without waiting for all 
the pending transactions to be applied.
   
   This causes problems such as reads/writes failing even after SCM is out of 
safe mode.
   ```
   2021-06-02 05:51:04,208 INFO 
org.apache.hadoop.hdds.scm.safemode.HealthyPipelineSafeModeRule: Refreshed 
total pipeline count is 1, healthy pipeline threshold count is 1
   2021-06-02 05:51:04,208 INFO 
org.apache.hadoop.hdds.scm.safemode.OneReplicaPipelineSafeModeRule: Total 
pipeline count is 1, pipeline's with at least one datanode reported threshold 
count is 1
   2021-06-02 05:51:04,209 INFO 
org.apache.hadoop.hdds.scm.safemode.HealthyPipelineSafeModeRule: Refreshed 
total pipeline count is 0, healthy pipeline threshold count is 0
   2021-06-02 05:51:04,209 INFO 
org.apache.hadoop.hdds.scm.safemode.SCMSafeModeManager: 
HealthyPipelineSafeModeRule rule is successfully validated
   2021-06-02 05:51:04,209 INFO 
org.apache.hadoop.hdds.scm.safemode.OneReplicaPipelineSafeModeRule: Total 
pipeline count is 0, pipeline's with at least one datanode reported threshold 
count is 0
   ```
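
The log above is consistent with a threshold that scales with the refreshed pipeline count, so a count of 0 gives a threshold of 0 and the rule passes with no healthy pipeline reported. A minimal Java sketch of that effect, assuming the threshold is the ceiling of a fraction of the total count. The class, the method names, and the 0.10 fraction are hypothetical and are not taken from the Ozone code:

```java
// Hypothetical sketch; NOT the actual HealthyPipelineSafeModeRule code.
final class ThresholdSketch {
    private ThresholdSketch() { }

    /** Assumed formula: threshold is a fraction of the refreshed pipeline count, rounded up. */
    static int threshold(int totalPipelines, double fraction) {
        return (int) Math.ceil(totalPipelines * fraction);
    }

    /** The rule is considered validated once the reported count reaches the threshold. */
    static boolean validate(int healthyReported, int threshold) {
        return healthyReported >= threshold;
    }
}
```

Under these assumptions, `ThresholdSketch.validate(0, ThresholdSketch.threshold(0, 0.10))` is true, so the rule passes before any datanode has reported, while with 1 pipeline in the table the same call is false.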
   
   After an offline discussion with @bshashikant:
   1. We should refresh the SCM safe mode rules once, after leader ready, on 
all SCMs.
   2. We should start the DN RPC port only after leader ready, so that SCM does 
not come out of safe mode early based on a DB that is not up to date.
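
The two points above can be sketched as a one-shot gate for a single SCM instance. This is only an illustration under stated assumptions: `LeaderReadyGate`, `onLeaderReady`, and the two `Runnable` hooks are hypothetical names, not existing Ozone APIs, and the leader-ready signal is modeled as a plain method call.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the proposal; none of these names exist in Ozone.
final class LeaderReadyGate {
    private final AtomicBoolean leaderReady = new AtomicBoolean(false);
    private final Runnable refreshSafeModeRules; // assumed hook: re-read pipeline counts from the DB
    private final Runnable startDatanodeRpc;     // assumed hook: open the DN RPC port

    LeaderReadyGate(Runnable refreshSafeModeRules, Runnable startDatanodeRpc) {
        this.refreshSafeModeRules = refreshSafeModeRules;
        this.startDatanodeRpc = startDatanodeRpc;
    }

    /**
     * Called when the leader-ready signal arrives. Only the first call runs the hooks,
     * so the rules are refreshed exactly once, and always before the DN RPC port starts.
     * Returns true if this call performed the work.
     */
    boolean onLeaderReady() {
        if (!leaderReady.compareAndSet(false, true)) {
            return false; // already handled; do not refresh again
        }
        refreshSafeModeRules.run();
        startDatanodeRpc.run();
        return true;
    }

    boolean isLeaderReady() {
        return leaderReady.get();
    }
}
```

Because datanode reports cannot arrive before the port is open, the safe mode rules in this sketch are only ever evaluated against counts refreshed after leader ready.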
   

