Jagadish created SAMZA-886:
------------------------------
Summary: Investigate 'relax locality' to improve Host Affinity
Key: SAMZA-886
URL: https://issues.apache.org/jira/browse/SAMZA-886
Project: Samza
Issue Type: Bug
Reporter: Jagadish
I ran several tests of Samza on a 36-node cluster and have the following
observations:
1. On a cluster with about 50% utilization, the percentage of requests that
are mapped to preferred hosts seems to depend on yarn.container.count. The
percentage is higher when yarn.container.count is comparable to the size of
the cluster. For example, on the 36-node cluster, about 50% of requests are
matched when yarn.container.count is 30, but only 27% are matched when
yarn.container.count is 10.
One reason is that, when a large number of containers is spawned initially,
many requests are issued in quick succession, so there is a good chance that
any random host in the cluster will match one of the preferred-host requests.
However, when a single container is respawned after a failure, there is only
one request for the failed container, and it has a lower chance of a match.
The results are averaged across 20 runs in each scenario.
2. On a cluster with close to zero utilization, 100% of requests are matched
to preferred hosts irrespective of yarn.container.count.
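
The reasoning under observation 1 can be illustrated with a toy Python
sketch. It assumes that every outstanding request prefers a distinct host and
that the resource manager hands back one host chosen uniformly at random;
neither assumption comes from the actual YARN scheduler, and the function name
match_rate is made up for illustration. It is not meant to reproduce the
measured percentages, only to show why one lone request matches less often
than one of many outstanding requests.

```python
import random


def match_rate(cluster_size, outstanding_requests, trials=10_000, seed=0):
    """Estimate how often a randomly offered host matches any outstanding
    preferred-host request (toy model, not the YARN scheduler)."""
    rng = random.Random(seed)
    hosts = range(cluster_size)
    matches = 0
    for _ in range(trials):
        # Assumption: each outstanding request prefers a different host.
        preferred = set(rng.sample(hosts, outstanding_requests))
        # Assumption: the offered host is uniformly random over the cluster.
        offered = rng.randrange(cluster_size)
        if offered in preferred:
            matches += 1
    return matches / trials


if __name__ == "__main__":
    # Many requests in flight (initial start-up) vs. one (single failure).
    print("bulk:  ", match_rate(36, 30))
    print("single:", match_rate(36, 1))
```

In this model the match rate is roughly outstanding_requests / cluster_size,
so it shrinks as fewer preferred-host requests are in flight at once.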
This ticket is to explore alternatives that might improve the percentage of
requests matched to preferred hosts.
I believe these ideas are worth trying:
1. YARN supports a 'relaxed locality' setting on each request. We could set
'relaxed locality' to false, which ensures that the container is allocated on
the exact host we ask for. If no such container is allocated within a timeout,
we can re-issue the same request with 'relaxed locality' set to true (as we
do now).
2. Re-issue the same preferred-host request if the host returned doesn't
match the request.
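
Idea 1 could be sketched roughly as follows. This is a minimal Python model of
the control flow only, not Samza or YARN code: ContainerRequest,
request_with_fallback, make_fake_allocator and the allocate callback are all
hypothetical names, and the fake allocator stands in for the resource manager.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerRequest:
    """Hypothetical stand-in for a container request."""
    preferred_host: str
    relax_locality: bool


def request_with_fallback(allocate, preferred_host, strict_timeout_s):
    """Try a strict request first, then fall back to a relaxed one.

    allocate(request, timeout_s) is an assumed callback that returns the
    allocated host name, or None if nothing was allocated in time.
    Returns (host, matched_preferred_host).
    """
    # Step 1: insist on the preferred host, but only for a bounded time.
    strict = ContainerRequest(preferred_host, relax_locality=False)
    host = allocate(strict, strict_timeout_s)
    if host is not None:
        return host, True

    # Step 2: timed out, so accept any host (the current behaviour).
    relaxed = ContainerRequest(preferred_host, relax_locality=True)
    host = allocate(relaxed, None)
    return host, host == preferred_host


def make_fake_allocator(free_hosts):
    """Fake resource manager for illustration: knows which hosts are free."""
    def allocate(request, timeout_s):
        if request.preferred_host in free_hosts:
            return request.preferred_host
        if request.relax_locality and free_hosts:
            return sorted(free_hosts)[0]
        return None
    return allocate


if __name__ == "__main__":
    allocate = make_fake_allocator({"host-b", "host-c"})
    print(request_with_fallback(allocate, "host-c", 5.0))
    print(request_with_fallback(allocate, "host-a", 5.0))
```

The timeout value is the main tuning knob here: a longer strict phase should
raise the match rate at the cost of slower container start-up, which is the
trade-off this ticket would need to measure.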