[ 
https://issues.apache.org/jira/browse/BEAM-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Davor Bonaci updated BEAM-2359:
-------------------------------
    Fix Version/s: 2.1.0

> SparkTimerInternals inputWatermarkTime does not get updated in cluster mode
> ---------------------------------------------------------------------------
>
>                 Key: BEAM-2359
>                 URL: https://issues.apache.org/jira/browse/BEAM-2359
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-spark
>            Reporter: Aviem Zur
>            Assignee: Amit Sela
>             Fix For: 2.1.0
>
>
> {{SparkTimerInternals#inputWatermarkTime}} does not get updated in cluster 
> mode.
> This causes windows to not get closed and state to increase forever in memory 
> and processing time to increase leading to eventual application crash (also, 
> triggers based on the watermark do not fire).
> The root cause is 
> a call from within the {{updateStateByKey}} operation in 
> [SparkGroupAlsoByWindowViaWindowSet|https://github.com/apache/beam/blob/master/runners/spark/src/main/java/org/apache/beam/runners/spark/stateful/SparkGroupAlsoByWindowViaWindowSet.java#L241-L242]
>  which tries to access a static reference to a {{GlobalWatermarkHolder}} 
> broadcast variable, however, in cluster mode this static reference would be a 
> different one in the executor's JVM and is null (this works in local mode 
> since the executor and driver are on the same JVM).
> The fix is not trivial since even if we use the broadcast correctly, 
> broadcast variables can't be used in this case (from within 
> {{updateStateByKey}}) since  {{updateStateByKey}} is a {{DStream}} operator 
> and not an {{RDD}} operator so it will not be updated every micro-batch but 
> rather will retain the same initial value.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to