http://www.scala-lang.org/api/2.10.3/index.html#scala.Option
The signature is 'def saveAsSequenceFile(path: String, codec: Option[Class[_ <: CompressionCodec]] = None)', but you are providing a Class, not an Option[Class].

Try counts.saveAsSequenceFile(output, Some(classOf[org.apache.hadoop.io.compress.SnappyCodec]))


On Wed, Apr 2, 2014 at 12:18 PM, Kostiantyn Kudriavtsev <kudryavtsev.konstan...@gmail.com> wrote:

> Hi there,
>
> I've started using Spark recently and am evaluating possible use cases in
> our company.
>
> I'm trying to save an RDD as a compressed SequenceFile. I'm able to save an
> uncompressed file by calling:
>
> counts.saveAsSequenceFile(output)
>
> where counts is my RDD (IntWritable, Text). However, I didn't manage to
> compress the output. I tried several configurations and always got an
> exception:
>
> counts.saveAsSequenceFile(output, classOf[org.apache.hadoop.io.compress.SnappyCodec])
> <console>:21: error: type mismatch;
>  found   : Class[org.apache.hadoop.io.compress.SnappyCodec](classOf[org.apache.hadoop.io.compress.SnappyCodec])
>  required: Option[Class[_ <: org.apache.hadoop.io.compress.CompressionCodec]]
>               counts.saveAsSequenceFile(output, classOf[org.apache.hadoop.io.compress.SnappyCodec])
>
> counts.saveAsSequenceFile(output, classOf[org.apache.spark.io.SnappyCompressionCodec])
> <console>:21: error: type mismatch;
>  found   : Class[org.apache.spark.io.SnappyCompressionCodec](classOf[org.apache.spark.io.SnappyCompressionCodec])
>  required: Option[Class[_ <: org.apache.hadoop.io.compress.CompressionCodec]]
>               counts.saveAsSequenceFile(output, classOf[org.apache.spark.io.SnappyCompressionCodec])
>
> and it doesn't work even for Gzip:
>
> counts.saveAsSequenceFile(output, classOf[org.apache.hadoop.io.compress.GzipCodec])
> <console>:21: error: type mismatch;
>  found   : Class[org.apache.hadoop.io.compress.GzipCodec](classOf[org.apache.hadoop.io.compress.GzipCodec])
>  required: Option[Class[_ <: org.apache.hadoop.io.compress.CompressionCodec]]
>               counts.saveAsSequenceFile(output, classOf[org.apache.hadoop.io.compress.GzipCodec])
>
> Could you please suggest a solution? Also, I couldn't find how to specify
> compression parameters (i.e. the compression type for Snappy). Could you
> share code snippets for writing/reading an RDD with compression?
>
> Thank you in advance,
> Konstantin Kudryavtsev
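
As for the write/read snippet requested in the quoted message, here is a minimal sketch of a round trip. It is not taken from the thread, and several things in it are assumptions: the object name, the local master, the sample data, and the /tmp output path are all made up for illustration. It uses DefaultCodec because that codec can fall back to pure Java, so the sketch should run without Hadoop native libraries; where the native Snappy library is installed, SnappyCodec should be a drop-in replacement. The pairs are plain Int/String rather than IntWritable/Text, which Spark converts to Writables on save.

```scala
import org.apache.hadoop.io.compress.{CompressionCodec, DefaultCodec}
import org.apache.spark.{SparkConf, SparkContext}
// Implicit conversions required on Spark 1.2 and earlier; later versions
// find them automatically, so the import is only needed for old releases.
import org.apache.spark.SparkContext._

// Hypothetical object name, used only for this sketch.
object CompressedSequenceFileSketch {

  // Writes a small pair RDD as a compressed SequenceFile, then reads it back.
  // `output` must be a directory path that does not exist yet.
  def roundTrip(output: String): Array[(Int, String)] = {
    val conf = new SparkConf()
      .setAppName("compressed-seqfile-sketch")
      .setMaster("local[2]") // assumption: local run for illustration
    val sc = new SparkContext(conf)
    try {
      // Stand-in for the `counts` RDD from the thread.
      val counts = sc.parallelize(Seq((1, "one"), (2, "two"), (3, "three")))

      // The codec parameter is an Option, so the Class is wrapped in Some(...).
      val codec: Option[Class[_ <: CompressionCodec]] = Some(classOf[DefaultCodec])
      counts.saveAsSequenceFile(output, codec)

      // No codec is passed when reading: the SequenceFile header records it.
      sc.sequenceFile[Int, String](output).collect().sorted
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical default path; pass a different one as the first argument.
    val output = args.headOption.getOrElse("/tmp/counts-seq-compressed")
    roundTrip(output).foreach(println)
  }
}
```

Reading needs no codec argument because the file itself names the codec it was written with, so the same read call works for compressed and uncompressed output.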