You could do something like this:

def rddTransformationUsingBroadcast(rdd: RDD[...]): RDD[...] = {
  val broadcastToUse = getBroadcast()  // get the reference to a broadcast variable, new or existing
  rdd.map { ... }                      // use the broadcast variable
}


dstream.transform(rddTransformationUsingBroadcast)


The function `rddTransformationUsingBroadcast` will be called at every batch
interval, and it will call `getBroadcast` each time. Since `getBroadcast` is an
arbitrary function returning a broadcast variable, you can swap in a new
broadcast variable whenever you want.
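A minimal sketch of what `getBroadcast` could look like: a singleton holder that
lazily creates a broadcast variable on the driver and re-broadcasts when the
value needs refreshing. The names `BroadcastHolder`, `shouldRefresh`, and the
`Set[String]` payload are illustrative assumptions, not part of Spark's API;
only `SparkContext.broadcast` and `Broadcast.unpersist` are real Spark calls.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.broadcast.Broadcast

// Hypothetical holder for the "current" broadcast variable.
object BroadcastHolder {
  @volatile private var instance: Broadcast[Set[String]] = _

  // Returns the current broadcast variable, re-broadcasting when asked.
  // `newValue` is evaluated on the driver only when a refresh happens.
  def getBroadcast(sc: SparkContext,
                   newValue: => Set[String],
                   shouldRefresh: Boolean): Broadcast[Set[String]] = synchronized {
    if (instance == null || shouldRefresh) {
      if (instance != null) instance.unpersist()  // release the stale copy on executors
      instance = sc.broadcast(newValue)
    }
    instance
  }
}
```

Because `transform` runs its function on the driver at every batch interval,
calling `BroadcastHolder.getBroadcast(...)` inside it picks up the refreshed
broadcast on the next batch, which is what makes the periodic-update pattern
work.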

TD



On Wed, Feb 18, 2015 at 6:11 AM, aanilpala <aanilp...@gmail.com> wrote:

> I am implementing a stream learner for text classification. There are some
> single-valued parameters in my implementation that need to be updated as
> new stream items arrive. For example, I want to change the learning rate as
> new predictions are made. However, I doubt that there is a way to update
> broadcast variables after the initial broadcast. So what happens if I need
> to re-broadcast a variable every time I update it? If there is a way to do
> it, or a workaround for what I want to accomplish in Spark Streaming, I'd
> be happy to hear about it.
>
> Thanks in advance.
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Periodic-Broadcast-in-Apache-Spark-Streaming-tp21703.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
