You should register each class you plan to use within your RDDs. In your case you only have an RDD of Strings, so you don’t even really need a registrator (strings are registered by default). But if you made custom objects you would use one.
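
For the custom-object case, a minimal sketch of a registrator might look like the following. The Trade class and the TradeRegistrator name are hypothetical, made up purely for illustration; only KryoRegistrator and kryo.register come from the code in this thread.

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

// Hypothetical record type that would be stored in an RDD (an assumption for this example).
case class Trade(symbol: String, quantity: Int, price: Double)

// Registers the custom class, plus its array type in case arrays of it are serialized.
class TradeRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[Trade])
    kryo.register(classOf[Array[Trade]])
  }
}
```

You would then point Spark at it the same way as in your code, e.g. System.setProperty("spark.kryo.registrator", "TradeRegistrator"), using the fully qualified name if the class lives in a package.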
To speed up Kryo, you can also add kryo.setReferences(false) or set spark.kryo.referenceTracking = false. This disables tracking of circular references, so only use it if your object graphs contain no cycles. In general, though, when benchmarking on this small an amount of data, you'll probably see noise from the JVM starting up.

Matei

On Jan 22, 2014, at 12:27 PM, suman bharadwaj <[email protected]> wrote:

> Hi,
>
> I'm using the below SPARK Code. Currently i have a file of size 25 MB. And
> I'm trying to do a comparative study on Kryo and Java serialization.
>
> I had couple of questions:
>
> 1. How do you know which classes to register in Kryo ? [ highlighted in
> yellow ]
> 2. When data is small, I'm seeing Java Serialization has better performance
> than Kryo. so was wondering whether the below code represents the correct
> usage of Kryo ?
>
> import org.apache.spark._
> import com.esotericsoftware.kryo.Kryo
> import org.apache.spark.serializer.KryoRegistrator
> import org.apache.hadoop.io.LongWritable
> import org.apache.hadoop.io.Text
> import org.apache.spark.storage.StorageLevel
>
> class MyRegistrator extends KryoRegistrator {
>   override def registerClasses(kryo: Kryo) {
>     kryo.register(classOf[LongWritable])
>     kryo.register(classOf[Text])
>     kryo.register(classOf[Integer])
>     kryo.register(classOf[Array[String]])
>   }
> }
>
> object HTest {
>
>   def main(args: Array[String]) {
>     System.setProperty("spark.serializer",
>       "org.apache.spark.serializer.KryoSerializer")
>     System.setProperty("spark.kryo.registrator", "MyRegistrator")
>     val sc = new SparkContext("local[4]","Test")
>     val input =
>       sc.textFile("/home/Test/DataSet/cd7a58dc-2053-4811-8463-b144781352ac_000004.csv").persist(StorageLevel.MEMORY_ONLY_SER)
>     println(input.count())
>     Thread.sleep(30000L)
>     println(input.count())
>     Thread.sleep(30000L)
>   }
> }
>
> Your Help is Highly appreciated.
>
> Regards,
> SB
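
On the benchmarking point in the reply: one common way to reduce JVM start-up noise is to run the operation a few times untimed before measuring it. The helper below is only a sketch of that idea; WarmupTiming and timeAfterWarmup are hypothetical names, not part of Spark or Kryo, and the warm-up and run counts are arbitrary.

```scala
object WarmupTiming {
  // Evaluates `body` `warmups` times without timing it (to let the JVM load
  // classes and JIT-compile hot paths), then evaluates it `runs` more times
  // and returns the wall-clock duration of each timed run in milliseconds.
  def timeAfterWarmup[A](warmups: Int, runs: Int)(body: => A): Seq[Double] = {
    for (_ <- 1 to warmups) body
    (1 to runs).map { _ =>
      val start = System.nanoTime()
      body
      (System.nanoTime() - start) / 1e6
    }
  }
}
```

With the code from the original message, this could be called as WarmupTiming.timeAfterWarmup(3, 5) { input.count() }, once with spark.serializer set to the Kryo serializer (optionally with spark.kryo.referenceTracking set to false) and once with the default Java serializer, comparing the timed runs rather than the first call.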
