Hi,

A broadcast variable is shipped the first time it is accessed in a transformation, and only to the executors used by that transformation. It will NOT be updated subsequently, even if the value has changed. However, the new value will be shipped to any new executor that comes into play after the value has changed. For this reason, changing the value of a broadcast variable is not a good idea, as it can create inconsistency within the cluster. From the documentation:
"In addition, the object v should not be modified after it is broadcast in order to ensure that all nodes get the same value of the broadcast variable."

On Sat, May 16, 2015 at 10:39 AM, N B <nb.nos...@gmail.com> wrote:

> Thanks Ilya. Does one have to call broadcast again once the underlying
> data is updated in order to get the changes visible on all nodes?
>
> Thanks
> NB
>
> On Fri, May 15, 2015 at 5:29 PM, Ilya Ganelin <ilgan...@gmail.com> wrote:
>
>> The broadcast variable is like a pointer. If the underlying data changes
>> then the changes will be visible throughout the cluster.
>>
>> On Fri, May 15, 2015 at 5:18 PM NB <nb.nos...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> Once a broadcast variable is created using sparkContext.broadcast(), can it
>>> ever be updated again? The use case is for something like the underlying
>>> lookup data changing over time.
>>>
>>> Thanks
>>> NB
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Broadcast-variables-can-be-rebroadcast-tp22908.html

--
Best Regards,
Ayan Guha