Hi, I would like to store some data as a sequence of protobuf objects. I would of course need to be able to read that into an RDD and write the RDD back out in some binary format.
First of all, is this supported natively (or through some download)? If not, are there examples of how I might write my own RDDs?

I was hoping I would be able to accomplish this using some invocation of sparkContext.newAPIHadoopFile, but the comments there are just too terse. Are there more verbose examples out there, either on how to write new RDD input formats or on how to make use of newAPIHadoopFile?

Thanks,
Shay
