I've figured out what triggers the problem, but I don't understand why it
happens. I'm hoping somebody can explain the difference between these two
snippets (both run in the Spark shell):
val lb = sc.broadcast( (1 to 10000000).toSet)
val breakMe = sc.parallelize(1 to 250).mapPartitions { it =>
  val serializedSet = lb.value.toString
  Array(0).iterator
}.count // works great

val ll = (1 to 10000000).toSet
val lb = sc.broadcast(ll)
val breakMe = sc.parallelize(1 to 250).mapPartitions { it =>
  val serializedSet = lb.value.toString
  Array(0).iterator
}.count // crashes ignominiously