vvcephei commented on a change in pull request #8541: URL: https://github.com/apache/kafka/pull/8541#discussion_r414208673
########## File path: streams/src/test/java/org/apache/kafka/streams/processor/internals/assignment/HighAvailabilityTaskAssignorTest.java ########## @@ -41,132 +54,107 @@ import static org.easymock.EasyMock.replay; import static org.hamcrest.CoreMatchers.equalTo; import static org.hamcrest.MatcherAssert.assertThat; -import static org.junit.Assert.assertFalse; -import static org.junit.Assert.assertTrue; - -import java.util.HashMap; -import java.util.HashSet; -import java.util.Map; -import java.util.Set; -import java.util.UUID; -import org.apache.kafka.streams.processor.TaskId; -import org.apache.kafka.streams.processor.internals.assignment.AssignorConfiguration.AssignmentConfigs; -import org.easymock.EasyMock; -import org.junit.Test; +import static org.hamcrest.Matchers.empty; +import static org.hamcrest.Matchers.is; +import static org.hamcrest.Matchers.not; public class HighAvailabilityTaskAssignorTest { Review comment: I made a bunch of changes to this test, because it was pretty brittle with respect to changes in the HighAvailabilityTaskAssignor. For context, this is the second time I've touched the assignment code since we introduced the HATA, and it's the second time I've had to deal with irrelevant test failures in this class. First, I replaced the ClientState mocks with "real" ClientStates, constructed to represent the desired scenario for each test. Mocks are really more appropriate for isolating a component from _external_ components (like mocking a remote service). Mocking data types leads to verifying that a specific set of queries happens against the data type, which is likely to break any time the logic under test changes in any way. Another problem with data-type mocks is that they can violate the invariants of the data type itself. For example, you can mock a list that both `isEmpty` and contains items. In our case, we threw NPEs in the assignor that could never happen in production when the mocked assigned/standby tasks didn't agree with the assigned tasks or the stateful assigned tasks weren't mocked to agree with the lags. Now, we just construct a ClientState for each client, representing the desired scenario and make assertions on the resulting assignment. Second, the tests as written rely heavily on shared mutable fields inserted into shared mutable collections to build the assignor. This can be a good way to minimize the text inside the test method, which lets readers focus on the proper logic of the test itself. However, it makes it harder to understand the full context of a test, and it also raises the possibility of tests polluting each others' environments. Since in this particular case, localizing all the setup code is about as compact as factoring it out, I went ahead and minimized the shared fields, and eliminated the mutability, the tests are self-contained. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org