Below works for me:

        import org.apache.avro.Schema
        import org.apache.avro.mapred.AvroKey
        import org.apache.avro.mapreduce.{AvroJob, AvroKeyOutputFormat}
        import org.apache.hadoop.io.NullWritable
        import org.apache.hadoop.mapreduce.Job

        // Declare the writer schema for the output key before saving.
        val job = Job.getInstance()
        val schema = Schema.create(Schema.Type.STRING)
        AvroJob.setOutputKeySchema(job, schema)

        records.map(item => (new AvroKey[String](item.getGridsumId), NullWritable.get()))
               .saveAsNewAPIHadoopFile(args(1),
                                       classOf[AvroKey[String]],
                                       classOf[NullWritable],
                                       classOf[AvroKeyOutputFormat[String]],
                                       job.getConfiguration)
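On the quoted question of setting the schema from a file rather than building it in code: a minimal sketch using Avro's Schema.Parser, which accepts either a String or an InputStream. The record name "Rec", field "id", and the HDFS path in the comment are hypothetical placeholders, not anything from this thread.

```scala
import org.apache.avro.Schema

// Hypothetical .avsc content; on HDFS you would instead open the file, e.g.
//   val in = FileSystem.get(job.getConfiguration).open(new Path("/path/to/schema.avsc"))
// and pass that stream to the same parse() call (closing it afterwards).
val avsc = """{"type":"record","name":"Rec","fields":[{"name":"id","type":"string"}]}"""
val schema = new Schema.Parser().parse(avsc)
// schema can now be handed to AvroJob.setOutputKeySchema(job, schema)
println(schema.getName)  // prints "Rec"
```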


2014-08-02 13:49 GMT+08:00 touchdown <yut...@gmail.com>:

> Yes, I saw that after I looked at it closer. Thanks! But I am running into
> a schema not set error:
> Writer schema for output key was not set. Use AvroJob.setOutputKeySchema()
>
> I am in the process of figuring out how to set schema for an AvroJob from a
> HDFS file, but any pointer is much appreciated! Thanks again!
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Is-there-a-way-to-write-spark-RDD-to-Avro-files-tp10947p11241.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
