Hi,
One of our use cases relies on objects that are instantiated by name
to do the data processing. This means that we are not able to
pass the broadcast variable directly to the method that uses it.
The workaround we found by reading the code was to request the variable
from the SparkEnv. The downside is that we need to know the
internal name of the broadcast variable, which is an implementation detail
of the system that we cannot rely on:
val mMap = org.apache.spark.SparkEnv.get.blockManager
  .getSingle("broadcast_0").get.asInstanceOf[Map[String, String]]
The question is, would it be possible to access the broadcast variables by
name using something like this?
// In the main method
val mMap = sc.broadcast(getMap(...))
val bname = mMap.name()
...
// In the external resource (the object instantiated by name)
val mMap = sc.broadcastVariable(bname)
Thanks,
Elmer