Hi guys,

When I read a file in the Spark shell, the console shows that broadcast_0 is stored in memory. I guess it is related to the file, but broadcast_0 is not the file itself, because the two have different sizes. What does broadcast_0 stand for?
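On the size difference: as a quick sanity check, the figures in the MemoryStore log lines quoted below are at least consistent with each other. This is a minimal sketch that can be pasted into the Scala shell. It assumes the log prints sizes in 1024-based units and that "free" means maxMem minus curMem minus the newly stored block; I have not confirmed either from the Spark source. The value names are my own.

```scala
// Values copied from the MemoryStore log lines in this post.
val blockBytes = 138811L    // argument of ensureFreeSpace(...)
val curMem     = 138763L    // curMem before the block was stored
val maxMem     = 311387750L // maxMem

// Assumption: KB and MB in the log are 1024-based.
val blockKB = blockBytes / 1024.0
// Assumption: "free" is what remains after the new block is counted.
val freeMB  = (maxMem - curMem - blockBytes) / (1024.0 * 1024.0)

// Round to one decimal place, as the log appears to do.
val blockKBRounded = math.round(blockKB * 10) / 10.0
val freeMBRounded  = math.round(freeMB * 10) / 10.0

println(s"estimated size: $blockKBRounded KB") // estimated size: 135.6 KB
println(s"free: $freeMBRounded MB")            // free: 296.7 MB
```

So the 135.6 KB is just the 138811 bytes passed to ensureFreeSpace, which is why I do not think the block is the file contents.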
Logs:

14/04/28 18:02:50 INFO MemoryStore: ensureFreeSpace(138811) called with curMem=138763, maxMem=311387750
14/04/28 18:02:50 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 135.6 KB, free 296.7 MB)
a: org.apache.spark.rdd.RDD[String] = MappedRDD[4] at textFile at <console>:12

Also, when I run actions on a transformed RDD after a shuffle, the console shows:

14/04/28 16:36:15.106 INFO CoarseGrainedExecutorBackend: Got assigned task 56
14/04/28 16:36:15.106 INFO Executor: Running task ID 56
14/04/28 16:36:15.123 INFO BlockManager: Found block broadcast_0 locally....

I thought that after a shuffle, actions have no relation to the original RDD, so why does the console show "Found block broadcast_0 locally...."?

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/what-does-broadcast-0-stand-for-tp4934.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.