[ 
https://issues.apache.org/jira/browse/AVRO-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15306898#comment-15306898
 ] 

Mikko Kupsu commented on AVRO-1855:
-----------------------------------

[~rdblue], I have to admit that I don't have the slightest clue. I only have
guesses.

I know that the values in field _headers_ are *definitely* Strings, since I'm
storing them with a class which is generated from the Avro schema and it has
the above field schema defined. I'm also converting them explicitly to
Strings while reading them as a _Map_ from a _GenericRecord_.
{code:java}
setHeaders((Map<CharSequence, CharSequence>) record.get("headers"))
{code}
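
A side note on the cast above, as a minimal sketch using only the JDK and no Avro classes: a cast to a parameterized _Map_ type is unchecked because of type erasure, so on its own it neither converts nor verifies the values. The class and method names below are hypothetical, and the Integer value is an assumption made purely for illustration. It is not a claim about what the real _headers_ map contains.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class UncheckedCastSketch {

    // Mimics the cast in the comment: it compiles with an "unchecked" warning
    // and performs no per-value check or conversion at runtime.
    @SuppressWarnings("unchecked")
    static Map<CharSequence, CharSequence> castOnly(Object raw) {
        return (Map<CharSequence, CharSequence>) raw;
    }

    // Explicit conversion: every key and value becomes a java.lang.String.
    static Map<CharSequence, CharSequence> toStringMap(Map<?, ?> raw) {
        Map<CharSequence, CharSequence> out = new LinkedHashMap<>();
        for (Map.Entry<?, ?> e : raw.entrySet()) {
            out.put(String.valueOf(e.getKey()), String.valueOf(e.getValue()));
        }
        return out;
    }

    // Returns true only if every value in the map is a CharSequence.
    static boolean allValuesAreCharSequences(Map<?, ?> map) {
        for (Object v : map.values()) {
            if (!(v instanceof CharSequence)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<Object, Object> raw = new HashMap<>();
        raw.put("host", "108.175.39.172");
        raw.put("response_status_code", 206); // hypothetical non-String value

        // The cast succeeds even though one value is an Integer.
        System.out.println(allValuesAreCharSequences(castOnly(raw)));    // false
        // After explicit conversion every value really is a String.
        System.out.println(allValuesAreCharSequences(toStringMap(raw))); // true
    }
}
```

If a check like _allValuesAreCharSequences_ returned false on the real data, that would point at the data rather than at the union resolution. If it returned true, the guess about the Map type being resolved wrongly would remain open.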

I think something gets the type of the Map wrong, since whenever I see this
there is *always* a key-value pair where the value can be interpreted as a Number.
In the example from the issue description it's _response_status_code_. However, such a
pair does not always trigger the error: some datums go through OK even though
they have similar fields.

I don't know how Avro deduces the type of a Map, but could the first value it
gets have an impact on the resulting type?

> Avro-mapred not evaluating map schema correctly when values are expected to 
> be strings
> --------------------------------------------------------------------------------------
>
>                 Key: AVRO-1855
>                 URL: https://issues.apache.org/jira/browse/AVRO-1855
>             Project: Avro
>          Issue Type: Bug
>          Components: java
>    Affects Versions: 1.8.0
>            Reporter: Mikko Kupsu
>            Priority: Critical
>         Attachments: 20160530_AVRO-1855.patch
>
>
> When reading a bunch of Avro files and concatenating them using avro-mapred,
> there is an issue with the following schema definition line:
> {code}
> {"name": "headers", "type": ["null", {"type": "map", "values": "string"}]},
> {code}
> The following exception is thrown:
> {code}
> Caused by: org.apache.avro.UnresolvedUnionException: Not in union ["null",{"type":"map","values":"string"}]: {range=bytes=91553252-91557347, accept=*/*, response_status_code=206, host=108.175.39.172}
>       at org.apache.avro.generic.GenericData.resolveUnion(GenericData.java:709)
>       at org.apache.avro.generic.GenericDatumWriter.resolveUnion(GenericDatumWriter.java:192)
>       at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:110)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>       at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:150)
>       at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:153)
>       at org.apache.avro.reflect.ReflectDatumWriter.writeField(ReflectDatumWriter.java:182)
>       at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:143)
>       at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:105)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:73)
>       at org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:150)
>       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:60)
>       at org.apache.avro.file.DataFileWriter.append(DataFileWriter.java:302)
> {code}
> I've fixed this in my own [GitHub fork|https://github.com/mikkokupsu/avro/tree/hotfix/20160530/avro-schema-map-string-problem]
> and I've attached the patch too.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
