That's the right place. Maybe try the HDP1 properties:

http://stackoverflow.com/questions/17241185/spark-standalone-mode-how-to-compress-spark-output-written-to-hdfs
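
For reference, here is a minimal sketch of what the HDP1 variant might look like. It assumes the old-style mapred.* key names from Hadoop 1 and the same spark.hadoop. prefix used in the HDP2 snippet quoted further down. The choice of Gzip is only an example, and I have not checked this against your cluster:

```scala
// Assumption: Hadoop 1 (HDP1) uses the old "mapred.*" output-compression keys.
// The "spark.hadoop." prefix asks Spark to copy each property into the
// Hadoop Configuration it builds, so set these before creating the SparkContext.
val codec = "org.apache.hadoop.io.compress.GzipCodec" // example choice only

System.setProperty("spark.hadoop.mapred.output.compress", "true")
System.setProperty("spark.hadoop.mapred.output.compression.codec", codec)
// The compression type (RECORD or BLOCK) only matters for SequenceFile output.
System.setProperty("spark.hadoop.mapred.output.compression.type", "BLOCK")
```

If the output size still does not change, it may be worth checking that the properties really are set before the SparkContext is constructed.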

About your Kryo error: if you want coverage of the Scala types, you can use this library: https://github.com/romix/scala-kryo-serialization

Guillaume
Thanks for clarifying this.

I tried setting hadoop properties before constructing SparkContext, but it had no effect.

Where is the right place to set these properties?


On Fri, Jan 3, 2014 at 4:56 PM, Guillaume Pitel <[email protected]> wrote:
Hi,

I believe Kryo is only used for RDD serialization (i.e. communication between nodes), not for saving. If you want to compress the output, you can use the GZip or Snappy codec like this:

// Pick one codec (defining both vals would not compile):
val codec = "org.apache.hadoop.io.compress.SnappyCodec" // for snappy
// val codec = "org.apache.hadoop.io.compress.GzipCodec" // for gzip

System.setProperty("spark.hadoop.mapreduce.output.fileoutputformat.compress", "true")
System.setProperty("spark.hadoop.mapreduce.output.fileoutputformat.compress.codec", codec)
System.setProperty("spark.hadoop.mapreduce.output.fileoutputformat.compress.type", "BLOCK")

(That's for HDP2; for HDP1, the keys are different.)

Regards,
Guillaume
Hi,

I'm trying to call saveAsObjectFile() on an RDD[(Int, Int, Double, Double)] with Kryo enabled, expecting the output binary to be smaller, but it is exactly the same size as when Kryo is not enabled.

I've checked the log, and there is no trace of Kryo-related errors.

The code looks something like:

import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo) {
    // Fail fast on any class that has not been registered explicitly
    kryo.setRegistrationRequired(true)
    kryo.register(classOf[(Int, Int, Double, Double)])
  }
}
System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
System.setProperty("spark.kryo.registrator", "MyRegistrator")


At the end, I tried to call:

kryo.setRegistrationRequired(true)

to make sure my class gets registered. But I found errors like:

Exception in thread "DAGScheduler" com.esotericsoftware.kryo.KryoException: java.lang.IllegalArgumentException: Class is not registered: scala.math.Numeric$IntIsIntegral$
Note: To register this class use: kryo.register(scala.math.Numeric$IntIsIntegral$.class);


It appears that many Scala-internal types have to be registered in order to have full Kryo support.

Any idea why my simple tuple type does not get the benefits of Kryo?



--
eXenSa
Guillaume PITEL, Président
+33(0)6 25 48 86 80 / +33(0)9 70 44 67 53

eXenSa S.A.S.
41, rue Périer - 92120 Montrouge - FRANCE
Tel +33(0)1 84 16 36 77 / Fax +33(0)9 72 28 37 05


