siddhantsangwan commented on code in PR #4415:
URL: https://github.com/apache/ozone/pull/4415#discussion_r1143386451
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/balancer/MoveManager.java:
##########
@@ -109,8 +109,11 @@ public enum MoveResult {
// TODO - Should pending ops notify under lock to allow MM to schedule a
// delete after the move, but before anything else can, eg RM?
- // TODO - these need to be config defined somewhere, probably in the balancer
- private static final long MOVE_DEADLINE = 1000 * 60 * 60; // 1 hour
+ /*
+ moveTimeout and replicationTimeout are set by ContainerBalancer.
+ */
+ private long moveTimeout = 1000 * 65 * 60;
+ private long replicationTimeout = 1000 * 50 * 60;
private static final double MOVE_DEADLINE_FACTOR = 0.95;
Review Comment:
> If we have a replication timeout of 50 mins, then 95% of that is 47.5, so
> the DN will give up 2.5 mins before SCM does. Feels like the DN is then giving
> up too early. If we have a 10 minute timeout or a 60 minute timeout, the DN
> doesn't need to give up earlier for the longer timeout.
That's a good point. I've created HDDS-8230 to fix this.
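
To make the arithmetic in the quoted comment concrete, here is a minimal sketch comparing the proportional deadline (timeout multiplied by `MOVE_DEADLINE_FACTOR`) with a fixed-margin deadline. The class `DeadlineSketch`, both helper methods, and the fixed-margin approach are hypothetical illustrations; they are not the MoveManager code and not necessarily what HDDS-8230 implements.

```java
import java.time.Duration;

public class DeadlineSketch {
  // Same value as MOVE_DEADLINE_FACTOR in the diff above.
  static final double MOVE_DEADLINE_FACTOR = 0.95;

  /** Datanode deadline computed as a fixed fraction of the SCM timeout. */
  static Duration factorDeadline(Duration scmTimeout, double factor) {
    return Duration.ofMillis(Math.round(scmTimeout.toMillis() * factor));
  }

  /** Hypothetical alternative: subtract a constant margin from the SCM timeout. */
  static Duration marginDeadline(Duration scmTimeout, Duration margin) {
    return scmTimeout.minus(margin);
  }

  public static void main(String[] args) {
    for (long minutes : new long[] {10, 50, 60}) {
      Duration timeout = Duration.ofMinutes(minutes);
      // Gap between the SCM timeout and the datanode deadline.
      Duration gap =
          timeout.minus(factorDeadline(timeout, MOVE_DEADLINE_FACTOR));
      System.out.println(minutes + " min timeout: datanode gives up "
          + gap.getSeconds() + " s before SCM");
    }
  }
}
```

With the factor, the gap grows with the timeout: 30 seconds for a 10 minute timeout, 150 seconds (2.5 minutes) for 50 minutes, and 180 seconds for 60 minutes. A fixed margin would keep the gap constant regardless of the timeout length.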
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]