[
https://issues.apache.org/jira/browse/AVRO-118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thiruvalluvan M. G. updated AVRO-118:
-------------------------------------
Attachment: AVRO-118.patch
The basic problem with JSON encoding is that the encoder may need to write a
few symbols at the end of the object even though there won't be any further call
from the client (as per the current Encoder interface). For example, if the
schema is a record, the last call the encoder gets from the client is for the
last field of the record. We could take two approaches:
- After every write, we check whether any implicit actions need to be
completed.
- Let the client indicate that the object has ended (by calling, say, reset())
so that the encoder can do the cleanup.
The former may have a performance impact and the latter means an additional call
for the clients. Also, it doesn't look natural to call reset() after an object,
even if there is only a single object. I found another way to achieve this:
- When a new object is written, any pending implicit actions can be
performed. That leaves only the trailing implicit action for the very last object.
I added support for that in flush(). Thus, one can write as many objects as
desired, but needs to call flush() after the very last object. (This
requirement existed even for BlockingBinaryEncoder and so isn't new.) This
approach takes care of the decoder part automatically. (There is no equivalent of
flush() in the decoder, which may leave some trailing symbols in the stream, which
is usually not a problem. If some other data is added after the Avro stream, this
could be a problem. But even without this, we have buffer management problems
since JsonParser may read ahead, etc.) There is one problem, however (this is
not new; it existed even before this patch): flush() can be called only
at object boundaries.
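
To make the idea concrete, here is a minimal sketch of the deferred-close approach. It is not the code in the patch: ToyJsonEncoder, writeInt() and output() are hypothetical names, and the "schema" is assumed to be a record with a fixed list of int fields. The closing brace is written either when the next record starts or in flush().

```java
// Hypothetical toy encoder for illustration only; not Avro's JsonEncoder.
// Assumes every object is a record with the same fixed list of int fields.
final class ToyJsonEncoder {
    private final String[] fieldNames;
    private final StringBuilder out = new StringBuilder();
    private int nextField = 0;            // index of the next field to write
    private boolean closePending = false; // true while a "}" is still owed

    ToyJsonEncoder(String... fieldNames) {
        if (fieldNames.length == 0) {
            throw new IllegalArgumentException("at least one field is required");
        }
        this.fieldNames = fieldNames.clone();
    }

    /** Writes the next field of the current record. */
    void writeInt(int value) {
        if (nextField == 0) {
            finishPending();   // a new object starts: close the previous one
            out.append('{');
        } else {
            out.append(',');
        }
        out.append('"').append(fieldNames[nextField]).append("\":").append(value);
        nextField++;
        if (nextField == fieldNames.length) {
            // Last field: no further call is guaranteed, so only record the debt.
            nextField = 0;
            closePending = true;
        }
    }

    /** Writes the trailing symbol of the very last object. */
    void flush() {
        if (nextField != 0) {
            throw new IllegalStateException("flush() is only valid at an object boundary");
        }
        finishPending();
    }

    private void finishPending() {
        if (closePending) {
            out.append('}');
            closePending = false;
        }
    }

    /** Returns everything written so far. */
    String output() {
        return out.toString();
    }
}
```

With fields "a" and "b", writing 1, 2, 3, 4 and then calling flush() yields two complete records. Before flush() the last closing brace is still missing from the output, which is why the call after the very last object is required.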
In summary, the new implementation simplifies the API a bit. The writer can
write a sequence of objects and needs to call flush() after writing all the
objects. The reader doesn't need to do anything special to read a sequence of
objects.
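
The reader side can be sketched the same way. Again, this is only an illustration under assumptions, not the patch: ToyJsonDecoder, readInt() and unreadChars() are hypothetical names, the input is assumed to be a concatenation of records with a fixed list of int fields, and no whitespace is allowed. The closing brace of one record is consumed lazily when the next record is started, so the caller just keeps reading.

```java
// Hypothetical toy decoder for illustration only; not Avro's JsonDecoder.
// Assumes input such as {"a":1,"b":2}{"a":3,"b":4} with no whitespace.
final class ToyJsonDecoder {
    private final String in;
    private final String[] fieldNames;
    private int pos = 0;                  // offset of the next unread character
    private int nextField = 0;            // index of the next field to read
    private boolean closePending = false; // true while a "}" is still unread

    ToyJsonDecoder(String in, String... fieldNames) {
        if (fieldNames.length == 0) {
            throw new IllegalArgumentException("at least one field is required");
        }
        this.in = in;
        this.fieldNames = fieldNames.clone();
    }

    /** Reads the next field of the current record. */
    int readInt() {
        if (nextField == 0) {
            if (closePending) {
                expect("}");   // finish the previous object implicitly
                closePending = false;
            }
            expect("{");
        } else {
            expect(",");
        }
        expect("\"" + fieldNames[nextField] + "\":");
        int start = pos;
        if (pos < in.length() && in.charAt(pos) == '-') {
            pos++;
        }
        while (pos < in.length() && Character.isDigit(in.charAt(pos))) {
            pos++;
        }
        int value = Integer.parseInt(in.substring(start, pos));
        nextField++;
        if (nextField == fieldNames.length) {
            nextField = 0;
            closePending = true; // the "}" is consumed only if another read follows
        }
        return value;
    }

    /** Number of characters not yet consumed from the input. */
    int unreadChars() {
        return in.length() - pos;
    }

    private void expect(String token) {
        if (!in.startsWith(token, pos)) {
            throw new IllegalStateException("expected " + token + " at offset " + pos);
        }
        pos += token.length();
    }
}
```

After the last field of the last record has been read, the final closing brace is still unread in this sketch. That is the trailing-symbol situation mentioned above: harmless on its own, but relevant if other data follows in the same stream.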
> JSON encoder does not easily permit multiple reads or writes
> ------------------------------------------------------------
>
> Key: AVRO-118
> URL: https://issues.apache.org/jira/browse/AVRO-118
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.1.0
> Reporter: Doug Cutting
> Attachments: AVRO-118.patch, AVRO-118.patch
>
>
> The JSON encoder does not permit one to easily write or read sequences of
> instances from a single stream.