[
https://issues.apache.org/jira/browse/BEAM-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Davor Bonaci updated BEAM-2359:
-------------------------------
Fix Version/s: 2.1.0
> SparkTimerInternals inputWatermarkTime does not get updated in cluster mode
> ---------------------------------------------------------------------------
>
> Key: BEAM-2359
> URL: https://issues.apache.org/jira/browse/BEAM-2359
> Project: Beam
> Issue Type: Bug
> Components: runner-spark
> Reporter: Aviem Zur
> Assignee: Amit Sela
> Fix For: 2.1.0
>
>
> {{SparkTimerInternals#inputWatermarkTime}} does not get updated in cluster
> mode.
> As a result, windows are never closed, so state accumulates in memory and
> processing time grows until the application eventually crashes. Triggers
> based on the watermark also never fire.
> The root cause is a call from within the {{updateStateByKey}} operation in
> [SparkGroupAlsoByWindowViaWindowSet|https://github.com/apache/beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L241-L242]
> that tries to access a static reference to a {{GlobalWatermarkHolder}}
> broadcast variable. In cluster mode, the executor's JVM has its own copy of
> this static reference, and it is null. (This works in local mode because
> the executor and the driver share the same JVM.)
> The fix is not trivial: even if we use the broadcast correctly, broadcast
> variables can't be used from within {{updateStateByKey}}. Because
> {{updateStateByKey}} is a {{DStream}} operator and not an {{RDD}} operator,
> the broadcast will not be updated every micro-batch but will instead retain
> its initial value.
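
The "retains the same initial value" behaviour described above can be illustrated with a plain-Java analogy. This is only a sketch and uses no Spark or Beam classes: every name in it (WatermarkCaptureSketch, capturedOnce, readPerBatch, observe) is hypothetical, and the AtomicLong merely stands in for the driver-side watermark. It assumes the update function behaves like a closure that is built once and then reused for every micro-batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Plain-Java analogy only; all names are hypothetical and nothing here is Spark or Beam API. */
final class WatermarkCaptureSketch {

    /** Simulates micro-batches and records the watermark that each batch observes. */
    static List<Long> observe(LongSupplier watermarkView, AtomicLong driverWatermark, int batches) {
        List<Long> seen = new ArrayList<>();
        for (int i = 0; i < batches; i++) {
            // The "driver" advances the watermark before each micro-batch.
            driverWatermark.addAndGet(1000L);
            seen.add(watermarkView.getAsLong());
        }
        return seen;
    }

    /** Value is read once, when the function is built (the problematic pattern). */
    static LongSupplier capturedOnce(AtomicLong driverWatermark) {
        final long snapshot = driverWatermark.get();
        return () -> snapshot;
    }

    /** Value is re-read every time the function runs (what a fix would need to achieve). */
    static LongSupplier readPerBatch(AtomicLong driverWatermark) {
        return driverWatermark::get;
    }

    public static void main(String[] args) {
        AtomicLong wm1 = new AtomicLong(0L);
        System.out.println("captured once:  " + observe(capturedOnce(wm1), wm1, 3));

        AtomicLong wm2 = new AtomicLong(0L);
        System.out.println("read per batch: " + observe(readPerBatch(wm2), wm2, 3));
    }
}
```

Running main prints [0, 0, 0] for the captured-once view and [1000, 2000, 3000] for the per-batch view. The first case mirrors the symptom in this issue: the watermark seen by the stateful operation never advances, so windows stay open. How the real runner should obtain a fresh value per micro-batch is not shown here.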