snleee commented on code in PR #13232:
URL: https://github.com/apache/pinot/pull/13232#discussion_r1623143039
##########
pinot-controller/src/main/java/org/apache/pinot/controller/helix/core/retention/RetentionManager.java:
##########
@@ -55,7 +55,7 @@
*/
public class RetentionManager extends ControllerPeriodicTask<Void> {
public static final long OLD_LLC_SEGMENTS_RETENTION_IN_MILLIS =
TimeUnit.DAYS.toMillis(5L);
- private static final RetryPolicy DEFAULT_RETRY_POLICY =
RetryPolicies.exponentialBackoffRetryPolicy(5, 1000L, 2.0f);
+ private static final RetryPolicy DEFAULT_RETRY_POLICY =
RetryPolicies.randomDelayRetryPolicy(20, 100L, 200L);
Review Comment:
Main motivation:
- Both retention manager & segment upload were both trying to update
idealstate
- Retention Manager was trying to update the idealstate without lock and
used a simple exponential back-off
- Segment upload side idealstate update is using
`DEFAULT_TABLE_IDEALSTATES_UPDATE_RETRY_POLICY =
RetryPolicies.randomDelayRetryPolicy(20, 100L, 200L);`
- Segment upload side were updating idealstate after holding the lock
Observation
- Retention Manager failed a lot of time after 5 attempts while most of
idealstate updates via segment upload went through.
I thought about this briefly. I feel that as long as we hold the lock, retry
logic may not matter too much. @Jackie-Jiang how do you think?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]