-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/44820/
-----------------------------------------------------------
Review request for samza.
Repository: samza
Description
-------
The ContainerRequestState class is currently not thread-safe. The class's
methods and state variables are called from both the ContainerAllocator and the
AMRMCallback handler.
Here's an analysis I summarized by looking at the current implementation. From
the below table, getContainersOnAHost() returns the entire list of containers
to the AbstractContainerAllocator which reads it. However, the Callback thread
invokes methods that write to the allocatedContainerQueue.
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| Method | requestQueue | hostsToCountMap |
allocatedContainerQueue | ThreadsAccesedFrom |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| updateRequestState | write | write/read | write
| Callback, Allocator |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| addContainer | | read | write
| Callback |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| updateStateAfterAssignment | write | write | read
| Allocator |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| releaseExtraContainers | write | write | write
| Allocator |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| releaseUnstartableContainer | | |
| Allocator |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| getContainersOnAHost | | | read
| Allocator |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
| getRequestsQueue | read | |
| Allocator |
+-----------------------------+--------------+-----------------+-------------------------+---------------------+
I'm proposing to make the ContainerRequestState class thread-safe by:
i) Returning a copy of the list (if callers need the entire list - which they
should not) for getContainersOnAHost, getRequestsQueue
ii) Expose more granular synchronized APIs for things like:
1. Getting num containers on a host.
2. Getting the number of pending requests.
3. Checking if the queue is empty etc. / peeking a queue etc.
The other methods are already synchronized.
This improvement is beneficial because:
1. This allows us to reason about concurrency more effectively when using the
request-state in multi-threaded contexts.
2. It's cleaner for the ContainerRequestState class to expose higher level
methods and not expose the guts of the class in its entirety.
Diffs
-----
samza-yarn/src/main/java/org/apache/samza/job/yarn/AbstractContainerAllocator.java
b4789e62beb1120f11a8101664b10c34ae930e58
samza-yarn/src/main/java/org/apache/samza/job/yarn/ContainerRequestState.java
3e3f48ce2b5c0802e8c7c4f09b632df2d2265c12
samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerAllocator.java
2b1bdab3c8de3184e930c244a8cae55813c33565
samza-yarn/src/test/java/org/apache/samza/job/yarn/TestContainerRequestState.java
402fe784120ac40ed542d9fa60d6a6d7df9c8cda
samza-yarn/src/test/java/org/apache/samza/job/yarn/TestHostAwareContainerAllocator.java
0c7a09f3e4c4c2ce6788be729d0bf4a294243c68
samza-yarn/src/test/java/org/apache/samza/job/yarn/TestSamzaTaskManager.java
9da1edf6ff165ef0306de8730853ad30551a9831
Diff: https://reviews.apache.org/r/44820/diff/
Testing
-------
All unit tests pass.
Thanks,
Jagadish Venkatraman