An Avro data file always stores the schema in JSON form in its header.
There may be an old JIRA ticket considering the possibility of writing the
schema in a more compact form. The data in the file is always encoded in
Avro binary form, optionally compressed with snappy or deflate (the algorithm
used by gzip), and written in blocks of variable size. So a file written by
DataFileWriter is already binary-encoded with the schema embedded.
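
As a minimal sketch of what that looks like: the one-field record schema
named "Message" below is made up for illustration (only the "to" field comes
from your snippet), the class and method names are hypothetical, and the
try-with-resources syntax assumes Java 7 or later. It writes one record with
DataFileWriter, then reads the file back without supplying a schema to show
that the reader picks it up from the header.

```java
import java.io.File;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.CodecFactory;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class AvroFileExample {
    // Hypothetical schema; only the "to" field appears in the thread.
    static final String SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"Message\",\"fields\":["
        + "{\"name\":\"to\",\"type\":\"string\"}]}";

    public static String writeAndReadBack(File file) throws IOException {
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);

        GenericRecord message1 = new GenericData.Record(schema);
        message1.put("to", "Alyssa");

        // DataFileWriter puts the schema (as JSON) in the file header and
        // appends the records in Avro binary form.
        try (DataFileWriter<GenericRecord> writer =
                 new DataFileWriter<>(new GenericDatumWriter<GenericRecord>(schema))) {
            writer.setCodec(CodecFactory.deflateCodec(6)); // optional; must precede create()
            writer.create(schema, file);
            writer.append(message1);
        }

        // No schema is passed to the reader; it uses the one in the header.
        try (DataFileReader<GenericRecord> reader =
                 new DataFileReader<>(file, new GenericDatumReader<GenericRecord>())) {
            Schema embedded = reader.getSchema();
            GenericRecord first = reader.next();
            return embedded.getName() + ":" + first.get("to");
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeAndReadBack(new File("messages.avro")));
    }
}
```

The setCodec() call can simply be dropped if you do not want compression.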

On 1/8/13 1:49 AM, "Pratyush Chandra" <[email protected]> wrote:

> Hi Scott,
> 
> I am able to find an example of JSON encoding with DataFileWriter, which
> embeds the schema, but I am unable to find a DataFileWriter example for
> binary encoding with the schema.
> 
> Thanks
> Pratyush
> 
> On Tue, Jan 8, 2013 at 2:56 PM, Scott Carey <[email protected]> wrote:
>> Calling toString() on a Schema will print it in JSON form. However, you most
>> likely do not want to invent your own file format for Avro data.
>> 
>> Use DataFileWriter, which will manage the schema for you, along with
>> compression, metadata, and the ability to seek to the middle of the file.
>> Additionally, the resulting file is readable by several other languages and
>> tools.
>> 
>> On 1/7/13 4:42 AM, "Pratyush Chandra" <[email protected]> wrote:
>> 
>>> I am able to serialize with binary encoding to a file using the following:
>>>         FileOutputStream outputStream = new FileOutputStream(file);
>>>         Encoder e = EncoderFactory.get().binaryEncoder(outputStream, null);
>>>         DatumWriter<GenericRecord> datumWriter =
>>>             new GenericDatumWriter<GenericRecord>(schema);
>>>         GenericRecord message1 = new GenericData.Record(schema);
>>>         message1.put("to", "Alyssa");
>>>         datumWriter.write(message1, e);
>>>         e.flush();
>>>         outputStream.close();
>>> 
>>> But the output file contains only the serialized data and not the schema.
>>> How can I add the schema as well?
>>> 
>>> Thanks
>>> Pratyush Chandra
> 
> 
> 
> -- 
> Pratyush Chandra

