[
https://issues.apache.org/jira/browse/AVRO-793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Doug Cutting updated AVRO-793:
------------------------------
Resolution: Fixed
Hadoop Flags: [Reviewed]
Status: Resolved (was: Patch Available)
I committed this. Thanks, Thiru!
> A strange problem when I am trying to read avro record with a subset of the
> schema.
> -----------------------------------------------------------------------------------
>
> Key: AVRO-793
> URL: https://issues.apache.org/jira/browse/AVRO-793
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.5.0
> Environment: Avro1.5,Windows xp/Ubuntu 10.0.4
> Reporter: Yingzhong Xu
> Assignee: Thiruvalluvan M. G.
> Priority: Critical
> Labels: Avro, Reading, Schema, Write
> Fix For: 1.5.1
>
> Attachments: AVRO-793-test.patch, AVRO-793.patch
>
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> Hi, all. When I try to read an Avro file with a subset of the writer's
> schema (because I do not need all the details), I hit a strange problem.
> 1. I write data using this schema:
> {
>   "name": "relation",
>   "type": "record",
>   "fields": [
>     {
>       "name": "timestamp",
>       "type": "long"
>     },
>     {
>       "name": "type",
>       "type": {
>         "type": "map",
>         "values": {
>           "type": "array",
>           "items": {
>             "type": "record",
>             "name": "sdf",
>             "fields": [
>               {
>                 "name": "device",
>                 "type": "string"
>               },
>               {
>                 "name": "children",
>                 "type": {
>                   "type": "array",
>                   "items": "string"
>                 }
>               }
>             ]
>           }
>         }
>       }
>     }
>   ]
> }
> 2. Here is a JSONObject for that schema:
> {
>   "timestamp": 1234567890,
>   "type": {
>     "WMA": [
>       {
>         "device": "WMA1",
>         "children": ["WMB1", "WMB2"]
>       },
>       {
>         "device": "WMA2",
>         "children": ["WMB1", "WMB2"]
>       }
>     ]
>   }
> }
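For orientation, here is a minimal sketch of how one "sdf" item from the record above would be laid out in Avro's binary encoding, using only the JDK. It assumes the encoding rules from the Avro specification (zigzag variable-length integers, length-prefixed UTF-8 strings, arrays written as counted blocks ended by a zero count). The class and method names are hypothetical and this is not the Avro library's own encoder, which may choose different block boundaries.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Avro: hand-encodes one "sdf" record.
public class SdfEncodingSketch {

    // Avro writes int/long values as zigzag-encoded variable-length integers.
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);            // zigzag: small magnitudes get small codes
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // low 7 bits with the continuation bit set
            z >>>= 7;
        }
        out.write((int) z);                       // final byte, continuation bit clear
    }

    // A string is its UTF-8 byte length (as a long) followed by the bytes.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
    }

    // An array is a series of blocks (item count, then items), ended by a zero count.
    // This sketch writes all items as a single block.
    static void writeStringArray(ByteArrayOutputStream out, List<String> items) {
        if (!items.isEmpty()) {
            writeLong(out, items.size());
            for (String item : items) {
                writeString(out, item);
            }
        }
        writeLong(out, 0);
    }

    // Record fields are written in writer-schema order: "device", then "children".
    static byte[] encodeSdf(String device, List<String> children) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, device);
        writeStringArray(out, children);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeSdf("WMA1", Arrays.asList("WMB1", "WMB2"));
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x ", b));
        }
        // prints: 08 57 4d 41 31 04 08 57 4d 42 31 08 57 4d 42 32 00
        System.out.println(hex.toString().trim());
    }
}
```

A record carries no field names or per-field markers on the wire, so a reader that wants only "device" still has to walk past the bytes of "children" using the writer's schema.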
> 3. I write that record successfully, and it is okay if I use this schema for
> reading:
> {
>   "name": "relation",
>   "type": "record",
>   "fields": [
>     {
>       "name": "timestamp",
>       "type": "long"
>     },
>     {
>       "name": "type",
>       "type": {
>         "type": "map",
>         "values": {
>           "type": "array",
>           "items": {
>             "type": "record",
>             "name": "sdf",
>             "fields": [
>               {
>                 "name": "children",
>                 "type": {
>                   "type": "array",
>                   "items": "string"
>                 }
>               }
>             ]
>           }
>         }
>       }
>     }
>   ]
> }
> The result is:
> {
>   "timestamp": 1234567890,
>   "type": {
>     "WMA": [
>       {
>         "children": ["WMB1", "WMB2"]
>       },
>       {
>         "children": ["WMB1", "WMB2"]
>       }
>     ]
>   }
> }
> 4. But if I want to ignore the "children" part instead of "device", I use
> this schema for reading:
> {
>   "name": "relation",
>   "type": "record",
>   "fields": [
>     {
>       "name": "timestamp",
>       "type": "long"
>     },
>     {
>       "name": "type",
>       "type": {
>         "type": "map",
>         "values": {
>           "type": "array",
>           "items": {
>             "type": "record",
>             "name": "sdf",
>             "fields": [
>               {
>                 "name": "device",
>                 "type": "string"
>               }
>             ]
>           }
>         }
>       }
>     }
>   ]
> }
> Unfortunately, I get an exception:
> java.lang.ArrayIndexOutOfBoundsException: -8
> cause:java.lang.ArrayIndexOutOfBoundsException
> at org.apache.avro.io.BinaryDecoder.readInt(BinaryDecoder.java:122)
> at org.apache.avro.io.BinaryDecoder.skipString(BinaryDecoder.java:262)
> at org.apache.avro.io.ValidatingDecoder.skipString(ValidatingDecoder.java:113)
> at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:60)
> at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
> at org.apache.avro.io.parsing.SkipParser.skipRepeater(SkipParser.java:83)
> at org.apache.avro.io.ValidatingDecoder.skipArray(ValidatingDecoder.java:195)
> at org.apache.avro.io.ParsingDecoder.skipTopSymbol(ParsingDecoder.java:70)
> at org.apache.avro.io.parsing.SkipParser.skipTo(SkipParser.java:71)
> at org.apache.avro.io.parsing.SkipParser.skipSymbol(SkipParser.java:93)
> at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:226)
> at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
> at org.apache.avro.io.ResolvingDecoder.readFieldOrder(ResolvingDecoder.java:127)
> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:162)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> at org.apache.avro.generic.GenericDatumReader.readArray(GenericDatumReader.java:196)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:140)
> at org.apache.avro.generic.GenericDatumReader.readMap(GenericDatumReader.java:233)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:141)
> at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:167)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:138)
> at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:129)
> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:236)
> at org.apache.avro.file.DataFileStream.next(DataFileStream.java:223)
> at AvroUtilTest.read(AvroUtilTest.java:77)
> at AvroUtilTest.main(AvroUtilTest.java:61)
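The top frames of the trace (skipSymbol, skipArray, skipRepeater, skipString, readInt) show the failure occurring while the decoder skips an array of strings, which matches the writer-only "children" field. The following is a minimal JDK-only sketch of what such a skip has to do under the Avro specification's binary encoding. It is an illustration with hypothetical names, not the code in BinaryDecoder or SkipParser, and it says nothing about the actual defect that was patched.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical reader, not part of Avro: reads "device" and skips "children".
public class SkipChildrenSketch {
    private final byte[] buf;
    private int pos;

    SkipChildrenSketch(byte[] buf) {
        this.buf = buf;
    }

    // Reads one zigzag-encoded variable-length long.
    long readLong() {
        long z = 0;
        int shift = 0;
        int b;
        do {
            b = buf[pos++] & 0xFF;
            z |= (long) (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return (z >>> 1) ^ -(z & 1);   // undo zigzag
    }

    // Reads a length-prefixed UTF-8 string.
    String readString() {
        int len = (int) readLong();
        String s = new String(buf, pos, len, StandardCharsets.UTF_8);
        pos += len;
        return s;
    }

    // Skips a string: read its length, then jump over that many bytes.
    void skipString() {
        pos += (int) readLong();
    }

    // Skips an array of strings block by block until the zero-count terminator.
    void skipStringArray() {
        for (long count = readLong(); count != 0; count = readLong()) {
            if (count < 0) {
                // A negative count is followed by the block's size in bytes,
                // so the whole block can be jumped over at once.
                pos += (int) readLong();
            } else {
                for (long i = 0; i < count; i++) {
                    skipString();
                }
            }
        }
    }

    int position() {
        return pos;
    }

    // Hand-encoded {"device":"WMA1","children":["WMB1","WMB2"]}:
    // string length 4 -> 0x08, block count 2 -> 0x04, terminator 0x00.
    static final byte[] SAMPLE = {
        0x08, 'W', 'M', 'A', '1',
        0x04,
        0x08, 'W', 'M', 'B', '1',
        0x08, 'W', 'M', 'B', '2',
        0x00
    };

    public static void main(String[] args) {
        SkipChildrenSketch in = new SkipChildrenSketch(SAMPLE);
        String device = in.readString();
        in.skipStringArray();
        // prints: WMA1 17/17
        System.out.println(device + " " + in.position() + "/" + SAMPLE.length);
    }
}
```

If the skip leaves the position anywhere other than the end of the record, the next read interprets payload bytes as a length, which is one way a negative or out-of-range index can surface in a decoder of this kind.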
> As Scott Carey suggested, I tried this and it worked. How can this bug be fixed?
> Scott Carey:
> 2: If you change the schema you write with by reversing the order of
> the fields of "sdf" (array, then string), are the results the same?