It has been filed as AVRO-738. Thanks for the links.
Dev

On Wed, Jan 19, 2011 at 12:00 AM, Scott Carey <[email protected]> wrote:

> Please open a bug report in JIRA. I don't have time to look at this now,
> but someone else might.
>
> On the topic of per record versioning and how to design a system that does
> not store schemas per record, there have been useful topics on this
> mailing list in the past:
>
> http://search-hadoop.com/m/66jvQoopYw/HAvroBase&subj=Re+question+about+completely+untagged+data+
>
> http://search-hadoop.com/m/q7lLU1GVhHd2/HAvroBase&subj=Re+Versioning+of+an+array+of+a+record
>
> On 1/18/11 10:08 AM, "David Rosenstrauch" <[email protected]> wrote:
>
> >I've also found this to be the case, and was wondering about it. I also
> >had thought that I could just re-init an existing BinaryEncoder, but
> >found that I had to create a new one each time. I didn't really think
> >much of it at the time, but in retrospect it does sound like it might be
> >a bug. Perhaps one of the devs can comment more. (And/or perhaps you
> >might want to open a bug report about this.)
> >
> >DR
> >
> >On 01/18/2011 03:17 AM, Devajyoti Sarkar wrote:
> >> Let me first give some context, I would like to store a datum serialized
> >> with a BinaryEncoder without having to place a schema with it (as the
> >> DataFileWriter does). Instead I have created a container record that
> >> stores a unique id for the schema version and a payload field of type
> >> "bytes". This allows me to have a self-describing data object (for
> >> example, to place in a cell in HBase) without the overhead of a schema
> >> per object. (Perhaps there is a better way to do this, if so please let
> >> me know).
> >>
> >> The code looks something like this:
> >>
> >> GenericRecord container = new GenericData.Record(containerSchema);
> >> writer.setSchema(containerSchema);
> >> container.put(CONTAINER_SCHEMA_ID_FIELD,
> >>     datum.getSchema().getProp(SCHEMA_ID_PROPERTY));
> >> container.put(CONTAINER_PAYLOAD_FIELD,
> >>     ByteBuffer.wrap(datumBits.toByteArray()));
> >> ByteArrayOutputStream containerBits = new ByteArrayOutputStream();
> >> encoder.init(containerBits);
> >> writer.write(container, encoder);
> >> encoder.flush();
> >> containerBits.flush();
> >> containerBits.close();
> >>
> >> I am trying to reuse an encoder by calling init() to re-initialize it.
> >> Perhaps this is what creates the problem. If I create a new encoder each
> >> time everything works fine. However, if I just use init, then the
> >> OutputStream for the encoder is reset but the OutputStream for the
> >> SimpleByteWriter within the encoder is not. This seems to be causing the
> >> problem because when the encoder is flushed, it does not write the bytes
> >> in the ByteWriter. Perhaps the init() method is not supposed to be used
> >> this way. But it would be nice to not have to create a new encoder each
> >> time.
> >>
> >> Can you please let me know if the above looks right and advise me as to
> >> what is the best way to do the serialization.
> >>
> >> Thanks,
> >> Dev
> >>
> >> On Tue, Jan 18, 2011 at 4:14 AM, Scott Carey <[email protected]> wrote:
> >>
> >>> BinaryEncoder buffers data, you may have to call flush() to see it in
> >>> the output stream.
> >>>
> >>> On 1/17/11 4:53 AM, "Devajyoti Sarkar" <[email protected]> wrote:
> >>>
> >>> Hi,
> >>>
> >>> I am just beginning to use Avro, so I apologize if this is a silly
> >>> question.
> >>>
> >>> I would like to set a field of type "bytes" in Java. I am assuming that
> >>> all I need to do is wrap a byte[] in a ByteBuffer to set the value.
> >>> Unfortunately that does not seem to work. I am using a BinaryEncoder
> >>> and looking at its output, it has not written any the bytes that were
> >>> in the array. The first four values of the array are 0, -128, -128,
> >>> -128.
> >>>
> >>> Is it because Java uses 8-bit signed bytes while the Avro spec calls
> >>> for 8-bit unsigned bytes in a field of type "bytes"? If so, how does
> >>> one convert Java bytes to the kind accepted by Avro?
> >>>
> >>> Thanks in advance.
> >>>
> >>> Dev
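
On the signed-versus-unsigned question in the original message: a Java byte and an unsigned 8-bit value share the same bit pattern, so wrapping a byte[] in a ByteBuffer should not need any conversion first. The sketch below uses only the JDK and makes no Avro calls. The class and method names are made up for illustration, and the input is the four values quoted in the message.

```java
import java.nio.ByteBuffer;

public class SignedByteDemo {
    // Reinterpret a signed Java byte (-128..127) as its unsigned value (0..255).
    // The underlying 8 bits are unchanged; only the interpretation differs.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        // The four values quoted in the original message.
        byte[] raw = {0, -128, -128, -128};
        ByteBuffer wrapped = ByteBuffer.wrap(raw);
        while (wrapped.hasRemaining()) {
            byte b = wrapped.get();
            // Prints "0 -> 0" once, then "-128 -> 128" three times.
            System.out.println(b + " -> " + toUnsigned(b));
        }
    }
}
```

This fits the way the thread resolved: the missing bytes were traced to buffering and encoder reuse, not to the sign of the values.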

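
The remark that BinaryEncoder buffers data and may need a flush() can be illustrated with a plain JDK analogy. This sketch uses BufferedOutputStream as a stand-in for the encoder; it is not Avro's BinaryEncoder and says nothing about the init() behaviour reported in AVRO-738. The class name, method name, and 8192-byte buffer size are assumptions chosen for the example.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class FlushDemo {
    // Returns {bytes visible in the sink before flush, bytes visible after flush}.
    static int[] sizesAroundFlush(byte[] payload) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Stand-in for a buffering encoder: small writes stay in its internal buffer.
        BufferedOutputStream buffered = new BufferedOutputStream(sink, 8192);
        try {
            buffered.write(payload);
            int before = sink.size(); // payload is still held in the buffer
            buffered.flush();
            int after = sink.size();  // payload has now reached the sink
            return new int[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = sizesAroundFlush(new byte[] {0, -128, -128, -128});
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

With a payload smaller than the buffer, the sink is empty until flush() is called, which matches the symptom of an output stream that appears to contain none of the written bytes.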