I notice that Spark serializes each task together with its dependencies (the files and JARs added to the SparkContext):

def serializeWithDependencies(
    task: Task[_],
    currentFiles: HashMap[String, Long],
    currentJars: HashMap[String, Long],
    serializer: SerializerInstance)
  : ByteBuffer = {

  val out = new ByteArrayOutputStream(4096)
  val dataOut = new DataOutputStream(out)

  // Write currentFiles
  dataOut.writeInt(currentFiles.size)
  for ((name, timestamp) <- currentFiles) {
    dataOut.writeUTF(name)
    dataOut.writeLong(timestamp)
  }

  // Write currentJars
  dataOut.writeInt(currentJars.size)
  for ((name, timestamp) <- currentJars) {
    dataOut.writeUTF(name)
    dataOut.writeLong(timestamp)
  }

  // Write the task itself and finish
  dataOut.flush()
  val taskBytes = serializer.serialize(task).array()
  out.write(taskBytes)
  ByteBuffer.wrap(out.toByteArray)
}

Why not send currentJars and currentFiles to the executor with an actor instead? It seems unnecessary to serialize them for every task.
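For context on the receiving side, here is a minimal sketch (not the actual Spark source; the method name, signature, and buffer handling are my own assumptions) of how an executor could read the two (name, timestamp) maps back out of the task buffer before deserializing the task itself. Note that only names and timestamps are serialized, not the file contents, so the per-task overhead is small but repeated for every task:

import java.io.{ByteArrayInputStream, DataInputStream}
import java.nio.ByteBuffer
import scala.collection.mutable.HashMap

// Sketch: read back the maps in the same order they were written above.
def deserializeWithDependencies(serializedTask: ByteBuffer)
  : (HashMap[String, Long], HashMap[String, Long], ByteBuffer) = {

  val in = new ByteArrayInputStream(
    serializedTask.array(),
    serializedTask.arrayOffset() + serializedTask.position(),
    serializedTask.remaining())
  val dataIn = new DataInputStream(in)

  // Read currentFiles: (name, timestamp) pairs
  val taskFiles = new HashMap[String, Long]()
  val numFiles = dataIn.readInt()
  for (_ <- 0 until numFiles) {
    taskFiles(dataIn.readUTF()) = dataIn.readLong()
  }

  // Read currentJars: (name, timestamp) pairs
  val taskJars = new HashMap[String, Long]()
  val numJars = dataIn.readInt()
  for (_ <- 0 until numJars) {
    taskJars(dataIn.readUTF()) = dataIn.readLong()
  }

  // The remaining bytes are the serialized task body.
  // An executor could compare the timestamps against its local copies
  // to decide whether a file or JAR needs re-fetching (assumption about
  // the intended use).
  val remaining = new Array[Byte](dataIn.available())
  dataIn.readFully(remaining)
  (taskFiles, taskJars, ByteBuffer.wrap(remaining))
}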