[ https://issues.apache.org/jira/browse/AVRO-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16009871#comment-16009871 ]
Todd Nine commented on AVRO-2034:
---------------------------------
This is the flow I've followed through the code. My test is in this PR.
https://github.com/apache/avro/pull/224/files#diff-2a2af8f5525407454364290cc45f5cd9R1
# Caller initiates the record read.
# {{GenericDatumReader.readRecord}} is invoked on the root record.
# The first field, "parentField1", is read via {{readString(Object old, Decoder in)}}.
# {{GenericDatumReader.readRecord}} is called recursively on the nested record "child1".
# Its first field, "childField", is read via {{readString(Object old, Decoder in)}}.
# {{GenericDatumReader.readRecord}} returns and the result is set into the field "child1".
# {{GenericDatumReader}} then invokes {{protected Object readString(Object old, Decoder in)}} to read the field "parentField2".
# That in turn invokes {{ResolvingDecoder.readString(Utf8 old)}}.
# Inside this method the encapsulated JsonDecoder's current state is incorrect, so the call to {{in.readString(old);}} (line 201) fails.
# It fails because the next string symbol is in fact the unexpected field nested inside the "child1" record.
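
The flow above can be modelled with a small toy token stream. This is only a sketch under stated assumptions: the {{Token}} class, the {{NestedReadSketch}} class, and the field name "unexpectedField" are hypothetical and are not part of Avro. The record and field names "parentField1", "child1", "childField" and "parentField2" come from the steps above. The point is just to show how returning from the nested record without consuming its remaining fields leaves the wrong token in front of the parent's next read.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Toy model only: these classes are hypothetical and are NOT Avro's decoder API.
class NestedReadSketch {
    // Stand-in for a JSON parser event: a string field, a record start, or a record end.
    static final class Token {
        final String kind;  // "FIELD", "START" or "END"
        final String name;  // field or record name, null for END
        final String value; // string value, null unless kind is FIELD

        Token(String kind, String name, String value) {
            this.kind = kind;
            this.name = name;
            this.value = value;
        }
    }

    // Reads the next token as a string field and fails if it is not the expected one.
    static String readString(Deque<Token> in, String expectedName) {
        Token t = in.poll();
        if (t == null || !"FIELD".equals(t.kind) || !expectedName.equals(t.name)) {
            String found = (t == null) ? "end of input" : t.kind + " " + t.name;
            throw new IllegalStateException(
                "Expected string field " + expectedName + " but found " + found);
        }
        return t.value;
    }

    // Reads only the field the schema knows about, then returns without
    // consuming the rest of the nested record (the situation described above).
    static String readChild(Deque<Token> in) {
        in.poll(); // consume START child1
        return readString(in, "childField");
    }

    static String readRoot(Deque<Token> in) {
        String parentField1 = readString(in, "parentField1");
        String child1 = readChild(in);
        String parentField2 = readString(in, "parentField2"); // fails here
        return parentField1 + "," + child1 + "," + parentField2;
    }

    // Tokens for a document whose nested record carries one field the schema lacks.
    static Deque<Token> sampleInput() {
        return new ArrayDeque<>(Arrays.asList(
            new Token("FIELD", "parentField1", "a"),
            new Token("START", "child1", null),
            new Token("FIELD", "childField", "b"),
            new Token("FIELD", "unexpectedField", "c"),
            new Token("END", null, null),
            new Token("FIELD", "parentField2", "d")));
    }

    public static void main(String[] args) {
        try {
            System.out.println(readRoot(sampleInput()));
        } catch (IllegalStateException e) {
            System.out.println("Read failed: " + e.getMessage());
        }
    }
}
```

In this model the read of "parentField2" fails because the token at the head of the stream is still the leftover field from "child1".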
I've attempted to add "skipToEndOfRecord" functionality to the Decoder class and
its subclasses. It would be invoked before {{GenericDatumReader.readRecord}}
returns, with the intention of skipping over any remaining fields in the record
that have not been read. However, this seems to contradict the following logic
I have found in JsonDecoder:500:
{code:java}
if (top == Symbol.RECORD_END) {
  if (currentReorderBuffer != null &&
      !currentReorderBuffer.savedFields.isEmpty()) {
    throw error("Unknown fields: " +
        currentReorderBuffer.savedFields.keySet());
  }
  currentReorderBuffer = reorderBuffers.pop();
}
{code}
This check throws an error if a record contains fields that were encountered
but never read according to the schema.
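
For illustration, here is roughly what the "skipToEndOfRecord" idea looks like on a toy token stream. This is a sketch under stated assumptions, not the change in the PR: the {{Token}} and {{SkipToEndSketch}} classes, the field name "unexpectedField", and this particular {{skipToEndOfRecord}} signature are all hypothetical and are not Avro's Decoder API. The toy model also has no equivalent of the "Unknown fields" check quoted above, so it shows only the lenient behaviour that check appears to rule out.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Toy model only: these classes are hypothetical and are NOT Avro's decoder API.
class SkipToEndSketch {
    // Stand-in for a JSON parser event: a string field, a record start, or a record end.
    static final class Token {
        final String kind;  // "FIELD", "START" or "END"
        final String name;  // field or record name, null for END
        final String value; // string value, null unless kind is FIELD

        Token(String kind, String name, String value) {
            this.kind = kind;
            this.name = name;
            this.value = value;
        }
    }

    // Reads the next token as a string field and fails if it is not the expected one.
    static String readString(Deque<Token> in, String expectedName) {
        Token t = in.poll();
        if (t == null || !"FIELD".equals(t.kind) || !expectedName.equals(t.name)) {
            String found = (t == null) ? "end of input" : t.kind + " " + t.name;
            throw new IllegalStateException(
                "Expected string field " + expectedName + " but found " + found);
        }
        return t.value;
    }

    // Hypothetical helper: discards tokens up to and including the END that
    // closes the current record, stepping over any nested records on the way.
    static void skipToEndOfRecord(Deque<Token> in) {
        int depth = 0;
        while (!in.isEmpty()) {
            Token t = in.poll();
            if ("START".equals(t.kind)) {
                depth++;
            } else if ("END".equals(t.kind)) {
                if (depth == 0) {
                    return;
                }
                depth--;
            }
        }
        throw new IllegalStateException("Record was never closed");
    }

    static String readRoot(Deque<Token> in) {
        String parentField1 = readString(in, "parentField1");
        in.poll(); // consume START child1
        String childField = readString(in, "childField");
        skipToEndOfRecord(in); // drop unread fields before returning to the parent
        String parentField2 = readString(in, "parentField2");
        return parentField1 + "," + childField + "," + parentField2;
    }

    // Tokens for a document whose nested record carries one field the schema lacks.
    static Deque<Token> sampleInput() {
        return new ArrayDeque<>(Arrays.asList(
            new Token("FIELD", "parentField1", "a"),
            new Token("START", "child1", null),
            new Token("FIELD", "childField", "b"),
            new Token("FIELD", "unexpectedField", "c"),
            new Token("END", null, null),
            new Token("FIELD", "parentField2", "d")));
    }

    public static void main(String[] args) {
        System.out.println(readRoot(sampleInput()));
    }
}
```

With the skip in place, the parent's next read sees "parentField2" at the head of the stream, at the cost of silently ignoring fields the schema does not declare.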
> Nested schema types with unexpected fields causes json parse failure
> --------------------------------------------------------------------
>
> Key: AVRO-2034
> URL: https://issues.apache.org/jira/browse/AVRO-2034
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.8.1
> Reporter: Todd Nine
>
> Parsing a nested type that contains an unexpected field with the JSON parser
> results in an error. To reproduce, see the class {{TestNestedRecords}}
> in the referenced PR.
> https://github.com/apache/avro/pull/224
> Note that this only occurs when the following pattern exists in the schema.
> # regular field
> # nested record with additional field
> # Any subsequent field following the nested record with an unexpected field
> appears to reproduce the problem.