You are only required to register classes with Kryo if you enable a
specific setting:

    // require registration of all classes with Kryo
    .set("spark.kryo.registrationRequired", "true")

Here's an example of my setup. I think this is the best approach because
it forces me to really think about what I am serializing:

    // the Kryo serializer wants every class that needs to be
    // serialized to be registered
    Class<?>[] kryoClassArray = new Class<?>[]{DropResult.class,
        DropEvaluation.class, PrintHetSharing.class};

    SparkConf sparkConf = new SparkConf()
        .setAppName("MyAppName")
        .setMaster("spark://ipaddress:7077")
        // now for the Kryo stuff
        .set("spark.serializer",
             "org.apache.spark.serializer.KryoSerializer")
        // require registration of all classes with Kryo
        .set("spark.kryo.registrationRequired", "true")
        // don't forget to register ALL classes or you will get an error
        .registerKryoClasses(kryoClassArray);

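As for the spark.kryo.classesToRegister property from the original
question: it takes a comma-separated list of fully qualified class names.
Here is a minimal sketch of building that value from a class array. The
class name KryoClassList and the helper toClassesToRegister are made up
for illustration, and JDK classes stand in for application classes so the
snippet runs without Spark on the classpath.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class KryoClassList {
    // Hypothetical helper: joins fully qualified class names with commas,
    // the format expected by the spark.kryo.classesToRegister property.
    static String toClassesToRegister(Class<?>... classes) {
        return Arrays.stream(classes)
                .map(Class::getName)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // Placeholder JDK classes; substitute your own serialized classes.
        String value = toClassesToRegister(
                java.util.ArrayList.class, java.util.HashMap.class);
        System.out.println("spark.kryo.classesToRegister=" + value);
    }
}
```

The resulting string could then be passed to
conf.set("spark.kryo.classesToRegister", value), or put in
spark-defaults.conf, instead of calling registerKryoClasses in code. As
far as I know the two routes end up in the same place.
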
On 01/27/2016 12:58 PM, Shixiong(Ryan) Zhu wrote:
It depends. The default Kryo serializer cannot handle all cases. If
you encounter any issues, you can follow the Kryo docs to set up a
custom serializer:
https://github.com/EsotericSoftware/kryo/blob/master/README.md
On Wed, Jan 27, 2016 at 3:13 AM, amit tewari <amittewar...@gmail.com>
wrote:
This is what I have added in my code:

    rdd.persist(StorageLevel.MEMORY_ONLY_SER());
    conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");

Am I also required to do anything via spark.kryo.classesToRegister?
Or is the above code sufficient to get a performance gain from
Kryo serialization?
Thanks
Amit