We use Avro objects in our project, and have a Kryo serializer for generic Avro SpecificRecords. Take a look at:

https://github.com/bigdatagenomics/adam/blob/master/adam-core/src/main/scala/edu/berkeley/cs/amplab/adam/serialization/ADAMKryoRegistrator.scala

Matt Massie also has a good blog post about this at http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/.

Frank Austin Nothaft
fnoth...@berkeley.edu
fnoth...@eecs.berkeley.edu
202-340-0466

On Thu, Apr 3, 2014 at 7:16 AM, Ian O'Connell <i...@ianoconnell.com> wrote:

> Objects being transformed need to be one of these in flight. Source data
> can just use the MapReduce input formats, so anything you can do with
> mapred works. For an Avro one you probably want one of:
>
> https://github.com/kevinweil/elephant-bird/blob/master/core/src/main/java/com/twitter/elephantbird/mapreduce/input/*ProtoBuf*
>
> Or whatever you're using at the moment to open them in an MR job could
> probably be repurposed.
>
> On Thu, Apr 3, 2014 at 7:11 AM, Ron Gonzalez <zlgonza...@yahoo.com> wrote:
>
>> Hi,
>> I know that sources need to either be Java serializable or use Kryo
>> serialization.
>> Does anyone have sample code that reads, transforms, and writes Avro
>> files in Spark?
>>
>> Thanks,
>> Ron
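A minimal sketch of the approach Frank and Ian describe: wrap Avro's own binary encoding in a Kryo `Serializer` and register it for each generated class, then read the files through a Hadoop input format. This is modeled on the linked ADAM registrator but is not that code; `MyRecord` is a placeholder for whatever SpecificRecord your schema compiles to, and `AvroKeyInputFormat` is assumed to come from the avro-mapred artifact.

```scala
import scala.reflect.ClassTag

import com.esotericsoftware.kryo.io.{Input, Output}
import com.esotericsoftware.kryo.{Kryo, Serializer}
import org.apache.avro.io.{DecoderFactory, EncoderFactory}
import org.apache.avro.specific.{SpecificDatumReader, SpecificDatumWriter, SpecificRecord}
import org.apache.spark.serializer.KryoRegistrator

// Serialize an Avro SpecificRecord using Avro's binary encoding,
// wrapped in a Kryo Serializer so Spark can ship records between nodes.
class AvroSerializer[T <: SpecificRecord : ClassTag] extends Serializer[T] {
  private val clazz  = implicitly[ClassTag[T]].runtimeClass.asInstanceOf[Class[T]]
  private val writer = new SpecificDatumWriter[T](clazz)
  private val reader = new SpecificDatumReader[T](clazz)

  override def write(kryo: Kryo, output: Output, record: T): Unit = {
    val encoder = EncoderFactory.get.binaryEncoder(output.getOutputStream, null)
    writer.write(record, encoder)
    encoder.flush()
  }

  override def read(kryo: Kryo, input: Input, klass: Class[T]): T = {
    val decoder = DecoderFactory.get.directBinaryDecoder(input.getInputStream, null)
    reader.read(null.asInstanceOf[T], decoder)
  }
}

// Register each generated Avro class with its serializer, then point
// spark.kryo.registrator at this class in your Spark configuration.
class MyKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[MyRecord], new AvroSerializer[MyRecord])
  }
}
```

Reading the files can then reuse the MapReduce input format, as Ian suggests, e.g. `sc.newAPIHadoopFile(path, classOf[AvroKeyInputFormat[MyRecord]], classOf[AvroKey[MyRecord]], classOf[NullWritable]).map(_._1.datum)`, with transforms and writes as ordinary RDD operations once the records are Kryo-registered.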