ivandika3 opened a new pull request, #5717:
URL: https://github.com/apache/ozone/pull/5717

   ## What changes were proposed in this pull request?
   
   In XceiverServerRatis#newRaftProperties, setSyncTimeoutRetry is set twice.
   
   First, it is set to
   
       (int) nodeFailureTimeoutMs / dataSyncTimeout.toIntExact(TimeUnit.MILLISECONDS)
   
   which by default equals 300_000 ms / 10_000 ms = 30 retries.
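
The arithmetic above can be checked with a small standalone sketch. The class and method names here are hypothetical and only mirror the division described; the default values are the ones quoted in this description, not read from Ozone's configuration.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Ozone or Ratis.
final class SyncRetryMath {
  // Number of sync timeout retries that fit within the node failure timeout.
  static int syncTimeoutRetries(long nodeFailureTimeoutMs, long dataSyncTimeoutMs) {
    return (int) (nodeFailureTimeoutMs / dataSyncTimeoutMs);
  }

  public static void main(String[] args) {
    long nodeFailureTimeoutMs = TimeUnit.SECONDS.toMillis(300); // 300_000 ms
    long dataSyncTimeoutMs = TimeUnit.SECONDS.toMillis(10);     // 10_000 ms
    System.out.println(syncTimeoutRetries(nodeFailureTimeoutMs, dataSyncTimeoutMs));
  }
}
```

With the defaults quoted above, the result is 30, so a leader stuck on writeStateMachine would give up after roughly the node failure timeout.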
   
   From the comment, the intention of setting a finite number of retries is:
   
   Even if the leader is not able to complete write calls within the timeout 
seconds, it should just fail the operation and trigger pipeline close. failing 
the writeStateMachine call with limited retries will ensure even the leader 
initiates a pipeline close if its not able to complete write in the timeout 
configured.
   
   However, it is later overridden by
   
   int numSyncRetries = conf.getInt(
       OzoneConfigKeys.DFS_CONTAINER_RATIS_STATEMACHINEDATA_SYNC_RETRIES,
       OzoneConfigKeys.
           DFS_CONTAINER_RATIS_STATEMACHINEDATA_SYNC_RETRIES_DEFAULT);
   RaftServerConfigKeys.Log.StateMachineData.setSyncTimeoutRetry(properties,
       numSyncRetries); 
   which sets it to the default value of -1 (retry indefinitely).
   
   This might cause the leader to never initiate a pipeline close when its 
writeStateMachine calls time out (e.g. due to I/O issues).
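
The override is a plain last-write-wins effect on the properties object. The sketch below reproduces it with java.util.Properties; the key name and the setSyncTimeoutRetry helper are hypothetical stand-ins for illustration, not the Ratis API.

```java
import java.util.Properties;

// Hypothetical stand-in for the two setter calls described above.
final class LastWriteWins {
  // Hypothetical key; the real key is whatever the Ratis setter writes.
  static final String KEY = "example.statemachinedata.sync.timeout.retry";

  static void setSyncTimeoutRetry(Properties properties, int retries) {
    properties.setProperty(KEY, Integer.toString(retries));
  }

  // Applies the finite value first, then the configured value, as in the current code.
  static int effectiveRetries(int finiteRetries, int configuredRetries) {
    Properties properties = new Properties();
    setSyncTimeoutRetry(properties, finiteRetries);
    setSyncTimeoutRetry(properties, configuredRetries); // overwrites the first call
    return Integer.parseInt(properties.getProperty(KEY));
  }

  public static void main(String[] args) {
    System.out.println(effectiveRetries(30, -1));
  }
}
```

Because the second call always wins, the finite value of 30 never takes effect unless the configured value is changed from its default of -1.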
   
   I propose we keep the finite retry count and drop the second, 
configuration-driven setSyncTimeoutRetry call.
   
   I also added related comments about state machine data cache for better 
understanding. Please let me know if the explanation can be improved.
   
   This is also a good opportunity to re-evaluate the state machine data 
policy in Container State Machine.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-9821
   
   ## How was this patch tested?
   
   Existing tests. Only configuration changes.
   

