[ 
https://issues.apache.org/jira/browse/AVRO-29?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721329#action_12721329
 ] 

Thiruvalluvan M. G. commented on AVRO-29:
-----------------------------------------

  - ValidatingValueReader/Writer exists only for validation. It can be used to 
test a new class that uses ValueReader/Writer objects directly. Because it is 
designed as a filter, it can be inserted into the chain to catch corner-case 
bugs, even in production environments. In short, it is a diagnostic tool.
  - There are two versions of readRecord because ResolvingValueReader behaves 
differently from ValueReader. ValueReader returns fields in the order they are 
declared in the reader's schema. ResolvingValueReader may return them in a 
different order, depending on the writer's schema. If we can reorder the 
fields (which is possible with some more effort), then we can get rid of the 
second version of readRecord(). In fact, if the reader can expect its contents 
in the order of its own schema and support for default values is added, all of 
the resolution becomes internal to ResolvingValueReader. Any reader can then 
simply read as if the data had been serialized according to its own schema.
  - The parsing table can be considered a binary version of the schema. (Some 
information is lost at present, but that can be fixed.) One can define an Avro 
schema that serializes the parsing table itself. With that, an RPC can send 
data along with its schema in a form that the receiver can readily resolve 
against its own schema. This is functionally equivalent to sending the JSON 
version of the schema, but more efficient. It is particularly useful for 
scatter/gather RPCs, where many receivers receive the same request. The time 
saved could be significant.
  - Once we agree on the usefulness of these classes, we can move them to 
appropriate locations.
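
To make the filter idea in the first point concrete, here is a minimal 
sketch. It assumes a hypothetical two-method Writer interface and a flat list 
of expected types in place of a real schema; none of these names come from 
the patch, and the real classes work from parsing tables rather than a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ValidatingFilterSketch {
    /** Hypothetical stand-in for the writer interface; not the Avro API. */
    interface Writer {
        void writeLong(long v);
        void writeDouble(double v);
    }

    /** Terminal writer that simply records what it receives. */
    static class RecordingWriter implements Writer {
        final List<Object> values = new ArrayList<>();
        public void writeLong(long v) { values.add(v); }
        public void writeDouble(double v) { values.add(v); }
    }

    /** Assumed simplification: the "schema" is a flat sequence of kinds. */
    enum Kind { LONG, DOUBLE }

    /** Filter: checks each call against the expected sequence, then forwards it. */
    static class ValidatingWriter implements Writer {
        private final Writer next;
        private final List<Kind> expected;
        private int pos = 0;

        ValidatingWriter(Writer next, Kind... expected) {
            this.next = next;
            this.expected = Arrays.asList(expected);
        }

        private void check(Kind actual) {
            if (pos >= expected.size() || expected.get(pos) != actual) {
                throw new IllegalStateException(
                    "unexpected " + actual + " at position " + pos);
            }
            pos++;
        }

        public void writeLong(long v) { check(Kind.LONG); next.writeLong(v); }
        public void writeDouble(double v) { check(Kind.DOUBLE); next.writeDouble(v); }
    }

    public static void main(String[] args) {
        RecordingWriter sink = new RecordingWriter();
        // Record of two longs and a double, wrapped around the real sink.
        Writer w = new ValidatingWriter(sink, Kind.LONG, Kind.LONG, Kind.DOUBLE);
        w.writeLong(1L);
        w.writeLong(2L);
        w.writeDouble(3.0);
        System.out.println("forwarded: " + sink.values);
        try {
            w.writeLong(4L); // one call too many for this record
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Because the validating layer implements the same interface as the writer it 
wraps, it can be added to or removed from the chain without touching the 
calling code.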

> Validation and resolution for ValueInput/ValueOutput
> ----------------------------------------------------
>
>                 Key: AVRO-29
>                 URL: https://issues.apache.org/jira/browse/AVRO-29
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>            Reporter: Raymie Stata
>            Assignee: Thiruvalluvan M. G.
>         Attachments: AVRO-29.patch, AVRO-29.patch
>
>
> This is a companion to AVRO-25, which introduced the classes ValueOutput and 
> ValueInput.  This patch adds two capabilities: validation of 
> ValueInput/Output calls against a schema, and schema-resolution implemented 
> in the context of ValueInput.
> ValidatingValueInput and ValidatingValueOutput take a schema and validate 
> calls against it.  For example, if the schema calls for a record consisting 
> of two longs and a double, then ValidatingValueInput will allow the 
> call-sequence readLong, readLong, readDouble and throw an error otherwise.
> ResolvingValueInput takes two schemas, the writer's and the reader's schema, 
> and automatically performs Avro's schema-resolution logic on behalf of the 
> reader.  For example, if the writer's schema calls for a long, and the 
> reader's calls for a double, then the reader can call readDouble, and 
> ResolvingValueInput will automatically decode the long sent by the writer and 
> convert it into the double expected by the reader.
> ResolvingValueInput is an alternative to Avro's current GenericDatumReader, 
> which also implements Avro's resolution logic.  In many use-cases, the 
> programmer has their own data structures into which they want to store data 
> read from an Avro stream, data structures that cannot easily be put into the 
> GenericRecord/Array class hierarchy.  With ResolvingValueInput, programmers 
> get the benefit of this resolution logic without being forced into the 
> GenericRecord/Array class hierarchy.
> We recommend that ResolvingValueInput become the standard implementation of 
> the resolution logic, and that GenericDatumReader be implemented in terms of 
> ResolvingValueInput.  However, we haven't implemented this change pending 
> feedback from others.
> We haven't implemented default values, but can add that feature.
> Implementation note: this patch is implemented by translating Avro schemas to 
> LL(1) parsing tables.  This translation is straightforward, but tedious.  If 
> you want to understand how the code works, we recommend that you look in the 
> file "parsing.html" (included in the patch), which explains the translation.
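
The long-to-double example in the description above can be sketched roughly 
as follows. This is only an illustration under stated assumptions: the 
ResolvingReader class, its put() method and the Type enum are hypothetical, 
and values are queued already decoded instead of being read from bytes or 
driven by a parsing table. The promotion rules it applies (int and long may 
be read as double, but double may not be read as long) are those of the Avro 
specification.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ResolvingSketch {
    /** Numeric types as declared in the writer's schema. */
    enum Type { INT, LONG, FLOAT, DOUBLE }

    /** Hypothetical resolving reader; not the class from the patch. */
    static class ResolvingReader {
        private final Deque<Type> writerTypes = new ArrayDeque<>();
        private final Deque<Number> values = new ArrayDeque<>();

        /** Queue one value as the writer produced it (stands in for decoding). */
        ResolvingReader put(Type writerType, Number value) {
            writerTypes.add(writerType);
            values.add(value);
            return this;
        }

        /** The reader's schema asks for a long: only int and long qualify. */
        long readLong() {
            Type w = writerTypes.remove();
            Number v = values.remove();
            if (w == Type.INT || w == Type.LONG) {
                return v.longValue();
            }
            throw new IllegalStateException("cannot read " + w + " as long");
        }

        /** The reader's schema asks for a double: every numeric type promotes. */
        double readDouble() {
            writerTypes.remove();
            return values.remove().doubleValue();
        }
    }

    public static void main(String[] args) {
        // The writer sent a long; the reader asks for a double.
        ResolvingReader in = new ResolvingReader().put(Type.LONG, 42L);
        System.out.println(in.readDouble());
    }
}
```

The calling code only ever issues the calls its own schema implies; the 
conversion from the writer's type happens inside the reader.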

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
