Antony Mayi created BEAM-2398: --------------------------------- Summary: Increasing latency within DirectRunner caused by cumulated TransformWatermarks Key: BEAM-2398 URL: https://issues.apache.org/jira/browse/BEAM-2398 Project: Beam Issue Type: Bug Components: runner-direct Affects Versions: 2.0.0 Reporter: Antony Mayi Assignee: Thomas Groh Attachments: LatencyTest.java
Over the time the end-to-end latency of a pipeline running on DirectRunner is significantly increasing. This is caused by ever growing sets of: * {{WatermarkManager.TransformWatermarks.inputWatermark.pendingElements}} * {{WatermarkManager.TransformWatermarks.synchronizedProcessingInputWatermark.pendingBundles}} That means calls to {{WatermarkManager.TransformWatermarks.refresh()}} which need to iterate through that collections take longer and longer and the latency is growing. I believe it is the line {{WaterMark.updatePending()}} line: {quote} if (input != null) { // Add the unprocessed inputs completedTransform.addPending(result.getUnprocessedInputs()); {quote} that's adding the items that are never removed. See attached demo code showing the increasing latency. -- This message was sent by Atlassian JIRA (v6.3.15#6346)