[ 
https://issues.apache.org/jira/browse/AVRO-245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790857#action_12790857
 ] 

Doug Cutting commented on AVRO-245:
-----------------------------------

> What's your TODO/FIXME convention?

I don't have a strong convention.  Do you?

What I meant here is that creating a new encoder per datum is unacceptable.  A 
JsonEncoder compiles the schema as a grammar and is meant to be reused.  
Jackson's JsonGenerator is reusable, but unfortunately it inserts a space 
before every item but the first.  I don't know why, but for our purposes that 
is at least a misfeature.  The thrift-protobuf-compare benchmarks create a new 
JsonGenerator per datum, so generators must be lightweight.  But we still need 
to avoid re-compiling the grammar per datum.

> The core issue is that we've got two different things going on: we're both 
> line-oriented and JSON-oriented.

You're right.  I munged these together.

So we perhaps should consider lines the container, parsing them first, then 
parsing json within them, as your patch did.  But we should not create a new 
Decoder per line, since it also compiles the grammar.

To address both of these, perhaps we should add methods:

static Parser JsonEncoder#parse(Schema);
JsonEncoder(Parser, OutputStream);
static Parser JsonDecoder#parse(Schema);
JsonDecoder(Parser, InputStream);

Then we could create the parser once outside the loop, construct lightweight 
encoders and decoders within the loop, and hope that doesn't hurt performance 
much.  My first choice would be to make encoders and decoders reusable, but 
that does not appear possible currently with Jackson.
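
A rough sketch of the loop shape I have in mind, written against stand-in 
classes rather than Avro itself.  Grammar and LineEncoder are hypothetical 
names that play the roles of the proposed Parser and JsonEncoder; none of this 
is existing Avro or Jackson API, and the "schema" is reduced to a single field 
name to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class ParserReuseSketch {

    /** Hypothetical stand-in for the compiled grammar (the proposed Parser). */
    static final class Grammar {
        // Counts how often the (notionally expensive) compile step runs.
        static int compileCount = 0;
        final String fieldName;

        private Grammar(String fieldName) {
            this.fieldName = fieldName;
        }

        /** Plays the role of the proposed static parse(Schema): call it once. */
        static Grammar parse(String schemaFieldName) {
            compileCount++;
            return new Grammar(schemaFieldName);
        }
    }

    /** Hypothetical lightweight encoder: construction only stores two references. */
    static final class LineEncoder {
        private final Grammar grammar;
        private final OutputStream out;

        LineEncoder(Grammar grammar, OutputStream out) {
            this.grammar = grammar;
            this.out = out;
        }

        /** Writes one datum as one line of JSON (no string escaping in this sketch). */
        void write(String value) throws IOException {
            String json = "{\"" + grammar.fieldName + "\":\"" + value + "\"}\n";
            out.write(json.getBytes(StandardCharsets.UTF_8));
        }
    }

    /** Compiles the grammar once, then builds a cheap encoder per datum. */
    static String encodeAll(List<String> data) {
        Grammar grammar = Grammar.parse("name"); // once, outside the loop
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            for (String datum : data) {
                new LineEncoder(grammar, out).write(datum); // cheap, per datum
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.print(encodeAll(Arrays.asList("a", "b", "c")));
        System.out.println("grammar compiled " + Grammar.compileCount + " time(s)");
    }
}
```

The decoder side would have the same shape: split the input into lines first, 
then hand each line to a freshly constructed decoder that shares the one 
compiled grammar.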

> It doesn't seem that using the ValidatingDecoder makes it check that, but i 
> could be wrong.

I believe that the JSON tokens are actually strictly checked.


> Commandline utility for converting to and from Avro's binary format.
> --------------------------------------------------------------------
>
>                 Key: AVRO-245
>                 URL: https://issues.apache.org/jira/browse/AVRO-245
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Philip Zeyliger
>            Assignee: Philip Zeyliger
>            Priority: Minor
>         Attachments: AVRO-245.patch, AVRO-245.patch.txt, AVRO-245.patch.txt, 
> AVRO-245.patch.txt, AVRO-245.patch.txt
>
>
> A utility for avrotool that can convert between Avro binary data and the JSON 
> textual form.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
