rmatharu commented on a change in pull request #1104: SAMZA-2266: Introduce a
backoff when there are repeated failures for host-affinity allocations
URL: https://github.com/apache/samza/pull/1104#discussion_r303545962
##########
File path:
samza-core/src/main/java/org/apache/samza/clustermanager/AbstractContainerAllocator.java
##########
@@ -198,12 +202,20 @@ protected final SamzaResourceRequest peekPendingRequest() {
* @param preferredHost name of the host that you prefer to run the processor on
*/
public final void requestResource(String processorId, String preferredHost) {
- SamzaResourceRequest request = getResourceRequest(processorId, preferredHost);
+ requestResourceWithDelay(processorId, preferredHost, Duration.ZERO);
+ }
+
+ public final void requestResourceWithDelay(String processorId, String preferredHost, Duration delay) {
+ SamzaResourceRequest request = getResourceRequestWithDelay(processorId, preferredHost, delay);
Review comment:
Adding a delay to the timestamp on the request only delays the
"matching/assignment" of the request to a resource. However, the request will
still be sent to the RM immediately, since invoking
`issueResourceRequest(request)`
will call
`manager.requestResources(request)`
which will cause the AMRMClient to issue the request to the RM.
At that instant, the RM has not yet marked the NM that hosted the failed
container as unhealthy, so when the request lands at the RM it may allocate a
resource on the same NM again.
Whenever this request is matched with this "bad" resource, it will fail
again, and we can only hope that the second (or a subsequent) retry of this
request happens after the RM has marked the NM as unhealthy (due to the
exponential backoff).
So we should either space out the actual requests to the RM in time
(exponentially), or at the very least document that only the "matching" of
requests to resources is spaced out exponentially.
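
To illustrate the first option, here is a minimal, standalone sketch of
deferring the actual send rather than only the matching. It is not Samza code:
`DelayedRequestSketch`, `backoffDelay`, `issueAfterDelay`, and the base/max
values are all hypothetical, and the `Runnable` merely stands in for whatever
call ends up reaching the RM.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class DelayedRequestSketch {

  // Hypothetical helper: no delay for the first attempt, then base, 2*base,
  // 4*base, ... for each repeated failure, capped at max.
  static Duration backoffDelay(int failureCount, Duration base, Duration max) {
    if (failureCount <= 0) {
      return Duration.ZERO;
    }
    Duration delay = base;
    // Stop doubling once the cap is reached so the value cannot grow unbounded.
    for (int i = 1; i < failureCount && delay.compareTo(max) < 0; i++) {
      delay = delay.multipliedBy(2);
    }
    return delay.compareTo(max) > 0 ? max : delay;
  }

  // Hypothetical helper: runs sendRequest only after the delay has elapsed.
  // sendRequest is a placeholder for the call that actually contacts the RM.
  static ScheduledFuture<?> issueAfterDelay(
      ScheduledExecutorService scheduler, Runnable sendRequest, Duration delay) {
    return scheduler.schedule(sendRequest, delay.toMillis(), TimeUnit.MILLISECONDS);
  }

  public static void main(String[] args) throws Exception {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    Duration base = Duration.ofMillis(100); // assumed value, for illustration only
    Duration max = Duration.ofSeconds(1);   // assumed value, for illustration only

    for (int failures = 0; failures <= 5; failures++) {
      System.out.println("failures=" + failures
          + " delay=" + backoffDelay(failures, base, max).toMillis() + "ms");
    }

    // The send itself is deferred, so nothing reaches the RM before the delay.
    issueAfterDelay(scheduler, () -> System.out.println("request sent"),
        backoffDelay(2, base, max)).get();
    scheduler.shutdown();
  }
}
```

With this shape the RM does not see the retry until the delay has passed, which
gives it time to mark the NM as unhealthy. Whether that fits the allocator's
threading model is something the PR would need to confirm.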