global events
num = events.value
print num
events.unpersist()
events = sc.broadcast(num + 1)
alert_stream.foreachRDD(test)
# Comment this line and no error occurs
ssc.checkpoint('dir')
ssc.start()
ssc.awaitTermination()
On Fri, Jul 22, 2016 at 1:50 PM, Joe Panciera
wrot
Hi,
I'm attempting to use broadcast variables to update stateful values used
across the cluster for processing. Essentially, I have a function that is
executed in .foreachRDD that updates the broadcast variable by calling
unpersist() and then rebroadcasting. This works without issues when I
execut
Hi,
I have a rather complicated situation thats raised an issue regarding
consuming multiple data sources for processing. Unlike the use cases I've
found, I have 3 sources of different formats. There's one 'main' stream A
that does the processing, and 2 sources B and C that provide elements
requir
Wed, Jun 8, 2016 at 1:27 PM, Joe Panciera wrote:
> I've run into an issue where a global variable used within an
> UpdateStateByKey function isn't being assigned after the application
> restarts from a checkpoint. Using ForEachRDD I have a global variable 'A'
>
I've run into an issue where a global variable used within an
UpdateStateByKey function isn't being assigned after the application
restarts from a checkpoint. Using ForEachRDD I have a global variable 'A'
that is propagated from a file every time a batch runs, and A is then used
in an UpdateStateBy