[ https://issues.apache.org/jira/browse/BEAM-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Antony Mayi updated BEAM-2398:
------------------------------
    Attachment: LatencyTest.java

> Increasing latency within DirectRunner caused by cumulated TransformWatermarks
> ------------------------------------------------------------------------------
>
>                 Key: BEAM-2398
>                 URL: https://issues.apache.org/jira/browse/BEAM-2398
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-direct
>    Affects Versions: 2.0.0
>            Reporter: Antony Mayi
>            Assignee: Thomas Groh
>         Attachments: LatencyTest.java
>
>
> Over time, the end-to-end latency of a pipeline running on DirectRunner increases significantly.
> This is caused by ever-growing sets of:
> * {{WatermarkManager.TransformWatermarks.inputWatermark.pendingElements}}
> * {{WatermarkManager.TransformWatermarks.synchronizedProcessingInputWatermark.pendingBundles}}
> As a result, calls to {{WatermarkManager.TransformWatermarks.refresh()}}, which need to iterate through those collections, take longer and longer, and the latency grows.
> I believe the cause is this code in {{WaterMark.updatePending()}}:
> {quote}
> if (input != null) {
>   // Add the unprocessed inputs
>   completedTransform.addPending(result.getUnprocessedInputs());
> {quote}
> which adds items that are never removed.
> See the attached demo code showing the increasing latency.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
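
The mechanism described in the report (a pending collection that only receives additions, scanned in full on every refresh) can be illustrated with a minimal, self-contained sketch. This is not Beam code: the class {{PendingSetSketch}} and its methods are hypothetical stand-ins, and the "remove on retry" branch is only an assumed contrast case, not the fix applied to DirectRunner.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a watermark that tracks pending timestamps.
// It does not use or mirror Beam's actual WatermarkManager classes.
final class PendingSetSketch {
    private final List<Long> pendingTimestamps = new ArrayList<>();
    private long earliestPending = Long.MAX_VALUE;

    // Mirrors the shape of the reported call: unprocessed inputs become pending.
    void addPending(List<Long> unprocessed) {
        pendingTimestamps.addAll(unprocessed);
    }

    // Assumed cleanup step; the report says nothing like this ever runs.
    void removePending(List<Long> handled) {
        pendingTimestamps.removeAll(handled);
    }

    // Scans every pending entry to find the earliest one and
    // returns how many entries had to be visited.
    int refresh() {
        int visited = 0;
        earliestPending = Long.MAX_VALUE;
        for (long ts : pendingTimestamps) {
            earliestPending = Math.min(earliestPending, ts);
            visited++;
        }
        return visited;
    }

    // Runs the given number of bundles and returns the cost of the last refresh.
    static int refreshCostAfter(int bundles, boolean removeHandled) {
        PendingSetSketch watermark = new PendingSetSketch();
        int cost = 0;
        for (int i = 0; i < bundles; i++) {
            List<Long> unprocessed = Collections.singletonList((long) i);
            watermark.addPending(unprocessed);
            if (removeHandled) {
                watermark.removePending(unprocessed);
            }
            cost = watermark.refresh();
        }
        return cost;
    }

    public static void main(String[] args) {
        // Without removal, the per-refresh cost equals the number of bundles seen so far.
        System.out.println("never removed: " + refreshCostAfter(1000, false));
        // With removal, the per-refresh cost stays flat.
        System.out.println("removed:       " + refreshCostAfter(1000, true));
    }
}
```

In this sketch the cost of a single refresh grows linearly with the number of bundles processed when nothing is removed, which matches the steadily increasing latency the report describes.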