rmatharu commented on a change in pull request #1170: Samza-2330: Handle
expired resource request for Container allocator when host affinity is disabled
URL: https://github.com/apache/samza/pull/1170#discussion_r330824697
##########
File path: docs/learn/documentation/versioned/jobs/samza-configurations.md
##########
@@ -299,7 +299,7 @@ Samza supports both standalone and clustered
([YARN](yarn-jobs.html)) [deploymen
|cluster-manager.container.fail.job.after.retries|true|This configuration sets
the behavior of the job after all `cluster-manager.container.retry.count`s are
exhausted and each retry is within the
`cluster-manager.container.retry.window.ms` period on any single container. If
set to true, the whole job will fail if any container fails after the last
retry. If set to false, the job will continue to run without the failed
container. The typical use cases of setting this to false is to aid in
debugging the cluster manager when containers fail unexpectedly and also to
allow other healthy containers to continue to run so that lag does not
accumulate across all containers. Samza job operators should diligent in
monitoring the `job-healthy` and `failed-containers` metrics when setting this
configuration to false. A full restart of the job is required if another
attempt to restart the container is needed after the container failure.|
|cluster-manager.jobcoordinator.jmx.enabled|true|This is deprecated in favor
of `job.jmx.enabled`|
|cluster-manager.allocator.sleep.ms|3600|The container allocator thread is
responsible for matching requests to allocated containers. The sleep interval
for this thread is configured using this property.|
-|cluster-manager.container.request.timeout.ms|5000|The allocator thread
periodically checks the state of the container requests and allocated
containers to determine the assignment of a container to an allocated resource.
This property determines the number of milliseconds before a container request
is considered to have expired / timed-out. When a request expires, it gets
allocated to any available container that was returned by the cluster manager.|
+|cluster-manager.container.request.timeout.ms|5000|The allocator thread
periodically checks the state of the container requests and allocated
containers to determine the assignment of a container to an allocated resource.
This property determines the number of milliseconds before a container request
is considered to have expired / timed-out. When a request expires, it gets
allocated to any available container that was returned by the cluster manager
in either of the case of `job.host-affinity.enabled` is set to true or false.|
Review comment:
Nit:
When a request expires, it gets allocated to any available container that
was returned by the cluster manager. If none is available, the existing resource
request is cancelled and a new ANY-HOST resource request is issued.
This behavior holds regardless of whether host affinity is enabled.
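
To make the suggested wording concrete, the expiry handling could be sketched roughly as below. This is only an illustration, not the actual Samza allocator code: the class `ExpiredRequestSketch`, the `Request` type, the helper names, and the string actions are all hypothetical, and the `ANY_HOST` constant is assumed here to stand for the ANY-HOST preference mentioned above.

```java
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of the expiry handling; not the real Samza classes.
public class ExpiredRequestSketch {
    // Assumed marker for "no host preference".
    static final String ANY_HOST = "ANY_HOST";

    // Minimal stand-in for a pending container request.
    static final class Request {
        final String processorId;
        final String preferredHost;
        final long createdAtMs;

        Request(String processorId, String preferredHost, long createdAtMs) {
            this.processorId = processorId;
            this.preferredHost = preferredHost;
            this.createdAtMs = createdAtMs;
        }
    }

    // A request is expired once it is older than the configured timeout
    // (cluster-manager.container.request.timeout.ms).
    static boolean isExpired(Request request, long nowMs, long timeoutMs) {
        return nowMs - request.createdAtMs > timeoutMs;
    }

    // Handles one expired request and returns a description of the action taken.
    static String handleExpired(Request request,
                                Deque<String> availableContainers,
                                List<Request> pendingRequests,
                                long nowMs) {
        // The old request is dropped in both branches.
        pendingRequests.remove(request);

        String container = availableContainers.poll();
        if (container != null) {
            // Case 1: some container is available, so use it regardless of host.
            return "run " + request.processorId + " on " + container;
        }

        // Case 2: nothing is available, so the old request counts as cancelled
        // and a fresh request without a host preference is issued.
        pendingRequests.add(new Request(request.processorId, ANY_HOST, nowMs));
        return "re-request " + request.processorId + " on " + ANY_HOST;
    }
}
```

Neither branch consults `job.host-affinity.enabled`, which is the point of the proposed wording.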