Hello All,

I implemented an algorithm using both the RDD and the Dataset APIs (in Spark 1.6). The Dataset version uses a lot more memory than the RDD version. Is this normal? Even for very small input data it runs out of memory, and I get a Java heap space exception.
I tried the Kryo serializer by registering the classes, and I set spark.kryo.registrationRequired to true. Now I get the following exception:

com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Class is not registered: org.apache.spark.sql.types.StructField[]
Note: To register this class use: kryo.register(org.apache.spark.sql.types.StructField[].class);

I tried registering it with

conf.registerKryoClasses(Array(classOf[StructField[]]))

but StructField[] does not compile in Scala. Is there another way to register the array class? I have already registered StructField itself.

Regards,
Raghava.
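For context, a sketch of what I think the registration would have to look like: `StructField[]` is Java array syntax, and in Scala an array class is written `Array[T]`, so presumably `classOf[Array[StructField]]` is the way to name it (this is my assumption about the fix, not something I have confirmed works end to end):

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.types.StructField

val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrationRequired", "true")

// In Scala there is no `classOf[StructField[]]`; the array class
// corresponding to StructField[] in the error message is written
// classOf[Array[StructField]].
conf.registerKryoClasses(Array(
  classOf[StructField],
  classOf[Array[StructField]]
))
```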