[
https://issues.apache.org/jira/browse/SPARK-6404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14368283#comment-14368283
]
Saisai Shao commented on SPARK-6404:
------------------------------------
Hi [[email protected]], with the current broadcast implementation you cannot update
data that has already been broadcasted. If you need to refresh the data in each
interval, perhaps the snippet below works for you:
{code}
stream.foreachRDD { rdd =>
  // Broadcast a fresh copy of the data at the start of every batch.
  val foo = sc.broadcast(Foo)
  rdd.foreach { record =>
    foo.value.xxx
  }
}
{code}
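A follow-up sketch (not from the original comment, just an illustration): to avoid accumulating stale broadcast blocks on the executors, the driver can keep a mutable reference to the current broadcast, unpersist the previous one, and re-broadcast the refreshed data at the start of each batch. Foo and the loadLatestFoo() helper are hypothetical placeholders for the application's own data type and refresh logic.
{code}
import org.apache.spark.broadcast.Broadcast

// Driver-side reference to the broadcast that is currently in use.
// loadLatestFoo() is a hypothetical helper that reloads the data to share.
var current: Broadcast[Foo] = sc.broadcast(loadLatestFoo())

stream.foreachRDD { rdd =>
  // Release the previous broadcast's blocks on the executors,
  // then broadcast the refreshed data for this batch.
  current.unpersist(blocking = false)
  current = sc.broadcast(loadLatestFoo())
  val foo = current // capture in a local val so only the broadcast handle is serialized
  rdd.foreach { record =>
    foo.value.xxx
  }
}
{code}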
> Call broadcast() in each interval for spark streaming programs.
> ---------------------------------------------------------------
>
> Key: SPARK-6404
> URL: https://issues.apache.org/jira/browse/SPARK-6404
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Reporter: Yifan Wang
>
> If I understand it correctly, Spark’s broadcast() function is called only
> once, at the beginning of the batch. For streaming applications that need to
> run 24/7, it is often necessary to update variables shared via broadcast()
> dynamically. It would be ideal if broadcast() could be called at the
> beginning of each interval.