I'm in the process of writing/debugging a MapReduce job, and the Avro MapRed API seems to require that the input file be a proper Avro container file.
I was hoping to be able to use the AvroMapper interface, feeding it a JSON file just as a debugging step. That way I can use vi to modify values in the JSON structure. However, if the Avro file format has binary delimiters, then this is probably not a viable approach.

Thanks,
Karthik

On Wed, Feb 8, 2012 at 12:57 PM, Scott Carey <[email protected]> wrote:
>
> On 2/8/12 7:14 AM, "karthik ramachandran" <[email protected]> wrote:
>
> Hi,
>
> I'm trying to figure out if it's possible to create an Avro container file
> with JSON encoding. It doesn't appear to be:
> org.apache.avro.file.DataFileWriter seems to use a binary encoder by
> default.
>
>
> One thing to note is that if you write it to an Avro container file in
> binary it will be significantly smaller. You can extract the contents as
> JSON using either the C command line tools or the Java 'tojson' tool. If
> the reason you want it in JSON is for human readability, this is all you
> need.
>
> For example, I often do the following:
>
> java -jar avro-tools.jar tojson my_avro_file.avro | grep ….
>
> or pipe it to other tools to view or interpret as JSON.
>
>
> Is there another FileWriter class that I should be using?
>
>
> See Doug's comments. It doesn't make sense to store JSON in an Avro Data
> File because it is delimited with binary markers and contains binary
> metadata.
>
>
>
> Karthik
>
> --
> Karthik Ramachandran
>

--
Karthik Ramachandran
Mobile: 412-606-8981
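
As a small debugging aid for the point about binary markers: an Avro object container file begins with the four bytes 'O', 'b', 'j', 0x01, so a quick check of a file's first bytes can tell a container file apart from a plain JSON text file before it is handed to a job. The sketch below uses only the Java standard library; the class name AvroMagicCheck and the method looksLikeAvroContainer are hypothetical names made up for illustration and are not part of the Avro API. It inspects only the header magic and does not validate the metadata, sync markers, or data blocks.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Avro: checks only the 4-byte header magic.
public class AvroMagicCheck {
    // Avro object container files start with 'O', 'b', 'j', followed by the byte 1.
    private static final byte[] MAGIC = {'O', 'b', 'j', 1};

    /** Returns true if the given leading bytes begin with the container magic. */
    public static boolean looksLikeAvroContainer(byte[] head) {
        if (head == null || head.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (head[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads the first four bytes of a file and checks them against the magic. */
    public static boolean looksLikeAvroContainer(Path file) throws IOException {
        byte[] head = new byte[MAGIC.length];
        int off = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (off < head.length) {
                int n = in.read(head, off, head.length - off);
                if (n < 0) {
                    break;
                }
                off += n;
            }
        }
        return off == head.length && looksLikeAvroContainer(head);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java AvroMagicCheck <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        System.out.println(file + ": "
                + (looksLikeAvroContainer(file) ? "Avro container magic found"
                                                : "no Avro container magic"));
    }
}
```

Running it against a hand-edited JSON file should report that no magic was found, which would be consistent with the MapRed input format rejecting such a file.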
