[ 
https://issues.apache.org/jira/browse/AVRO-860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070718#comment-13070718
 ] 

Doug Cutting commented on AVRO-860:
-----------------------------------

So we have two different patches for this, one here and one in AVRO-851.  This 
one has the advantage that it uses Jackson, and is thus more likely to produce 
valid JSON.  However, it makes a deep copy of the data structures, which 
probably hurts performance, and performance is probably important here.

We could develop an implementation that, instead of Jackson's ObjectMapper, 
uses Jackson's lower-level JsonGenerator API, as is done in Schema.java.  That 
might both perform well and delegate JSON details to Jackson.  On the other 
hand, JSON is simple enough that the approach in AVRO-851 might be less code 
and work well enough.
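
To give a sense of how little code hand-written escaping needs, here is a 
minimal sketch of the general idea.  It is not the AVRO-851 patch itself, and 
the names JsonEscapeSketch and quote are hypothetical.  It escapes only what 
JSON requires (the double quote, the backslash, and control characters below 
U+0020) and passes everything else, including characters such as ™, through 
unchanged.

```java
// Hypothetical helper, for illustration only; not taken from either patch.
class JsonEscapeSketch {

    /** Returns the given text as a quoted JSON string literal. */
    static String quote(CharSequence s) {
        StringBuilder sb = new StringBuilder(s.length() + 2);
        sb.append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\b': sb.append("\\b");  break;
                case '\f': sb.append("\\f");  break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters need a 4-digit hex escape.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        // Everything else, including non-ASCII, is legal as-is.
                        sb.append(c);
                    }
            }
        }
        sb.append('"');
        return sb.toString();
    }

    public static void main(String[] args) {
        // The trademark sign is emitted unchanged; the quotes and newline are escaped.
        System.out.println(quote("Move\u2122 offers \"advertisers\"\n"));
    }
}
```

Because the loop appends directly to a StringBuilder, nothing is copied beyond 
the output string itself.  The trade-off is that correctness of the escaping 
rests on this code rather than on Jackson.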

Thoughts?

> Invalid JSON when printing out records with unicode
> ---------------------------------------------------
>
>                 Key: AVRO-860
>                 URL: https://issues.apache.org/jira/browse/AVRO-860
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.5.1
>            Reporter: Miki Tebeka
>              Labels: java, json, unicode
>             Fix For: 1.6.0
>
>         Attachments: AVRO-860.diff, AVRO-860.diff, m.avro
>
>
> I have an Avro file that, when printed, produces invalid JSON.
> The code for iterating and printing is:
> {code}
>             DatumReader<GenericRecord> reader =
>                 new GenericDatumReader<GenericRecord>();
>             DataFileReader<GenericRecord> dataFileReader =
>                 new DataFileReader<GenericRecord>(data, reader);
>             while (dataFileReader.hasNext()) {
>                 System.out.println(dataFileReader.next().toString());
>             }
> {code}
> and the relevant JSON snippet is
> {code}
>     "description": "Move™ offers advertisers the opportunity to deliver 
> messages to consumers at a time when consumers are making the biggest 
> purchases of their lives\uMOVE™ OFFERS ADVERTISERS THE OPPORTUNITY TO DELIVER 
> MESSAGES TO CONSUMERS AT A TIME WHEN CONSUMERS ARE MAKING THE BIGGEST 
> PURCHASES OF THEIR LIVES—OR REMODELING, REDECORATING AND MAINTAINING THEIR 
> MOST IMPORTANT ASSETS.or remodeling, redecorating and maintaining their most 
> important assets.",
> {code}
> (The \uMOVE is the problematic part).
> However, if I do:
> {code}
>                 GenericRecord record = dataFileReader.next();
>                 Utf8 desc = (Utf8)record.get("description");
>                 System.out.println(desc);
> {code}
> Then I get
> {code}
> Move™ offers advertisers the opportunity to deliver messages to consumers at 
> a time when consumers are making the biggest purchases of their lives—or 
> remodeling, redecorating and maintaining their most important assets.
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira