I'm looking at the Tuning Guide's suggestion to use Kryo instead of the default serialization. My questions:
Does PySpark use Java serialization by default, as Scala Spark does? If so, can I use Kryo with PySpark instead? The instructions say I should register my classes with the Kryo serializer, but that's done from Java/Scala. If I simply set the spark.serializer property when creating my SparkContext (roughly as in the sketch below), will it at least use Kryo for Spark's own classes, even if I can't register any of my own?
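For concreteness, this is what I mean by "simply set it" — a minimal sketch, assuming the spark.serializer property and the org.apache.spark.serializer.KryoSerializer class name from the Tuning Guide (the app name is just a placeholder):

    from pyspark import SparkConf, SparkContext

    # Ask the JVM side of Spark to use Kryo via the spark.serializer
    # property; note that no custom classes are registered here.
    conf = (SparkConf()
            .setAppName("kryo-question")
            .set("spark.serializer",
                 "org.apache.spark.serializer.KryoSerializer"))
    sc = SparkContext(conf=conf)

Thanks, Diana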