[ 
https://issues.apache.org/jira/browse/AVRO-228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997254#comment-12997254
 ] 

Scott Carey commented on AVRO-228:
----------------------------------

It's been a while on this one.  
This separation, or something similar, is something I was looking at recently. 
 The way our DatumReaders are tied to Decoders, and DatumWriters to Encoders, 
is cumbersome for many use cases.  
ResolvingDecoder should not be a Decoder at all, but its own concern that is 
more generally accessible.  We should not have to traverse both the parser 
(ResolvingDecoder) AND the schema (GenericDatumReader) to read.
We might take an approach similar to the new C code and change resolution from 
a "pull" to a "push" model.  The actual/expected schemas define what a record 
reader will read vs. skip, and from there it issues callbacks to the code that 
marshals data into objects.  This would make it a lot easier to build a custom 
in-memory representation right from the stream.  In the current design, every 
implementer of an object representation of a schema has to duplicate parser 
and/or schema traversal logic.  
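
To make the "push" idea concrete, here is a rough sketch in plain Java.  It 
does not use any Avro classes: RecordHandler, ResolvingDriver and MapHandler 
are hypothetical names invented for illustration, field lists stand in for the 
actual/expected schemas, and an Iterator stands in for a Decoder.  Default 
values for fields that only the reader knows about are left out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class PushResolutionSketch {

    /** Hypothetical callback interface; a data model implements only this. */
    interface RecordHandler {
        void startRecord(String recordName);
        void field(String fieldName, Object value);
        void endRecord();
    }

    /** Hypothetical driver that owns the actual-vs-expected resolution logic. */
    static final class ResolvingDriver {
        private final String recordName;
        private final List<String> actualFields;   // field order in the written data
        private final Set<String> expectedFields;  // fields the reader wants

        ResolvingDriver(String recordName, List<String> actualFields,
                        Set<String> expectedFields) {
            this.recordName = recordName;
            this.actualFields = actualFields;
            this.expectedFields = expectedFields;
        }

        /** Pushes each expected field to the handler and skips the rest. */
        void read(Iterator<Object> decoded, RecordHandler handler) {
            handler.startRecord(recordName);
            for (String name : actualFields) {
                Object value = decoded.next(); // stand-in for decoding one value
                if (expectedFields.contains(name)) {
                    handler.field(name, value);
                }
                // Otherwise the value is dropped, standing in for a skip.
            }
            handler.endRecord();
        }
    }

    /** Example data model: fills a plain Map, with no traversal logic of its own. */
    static final class MapHandler implements RecordHandler {
        final Map<String, Object> result = new LinkedHashMap<>();

        @Override public void startRecord(String recordName) { result.clear(); }
        @Override public void field(String fieldName, Object value) {
            result.put(fieldName, value);
        }
        @Override public void endRecord() { }
    }

    /** Reads a three-field record while expecting only two of the fields. */
    static Map<String, Object> demo() {
        ResolvingDriver driver = new ResolvingDriver(
                "User",
                Arrays.asList("id", "password", "name"),
                new HashSet<>(Arrays.asList("id", "name")));
        MapHandler handler = new MapHandler();
        List<Object> values = Arrays.asList(1L, "secret", "alice");
        driver.read(values.iterator(), handler);
        return handler.result;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the shape is that a new in-memory representation would only need 
another RecordHandler; the read-vs-skip decisions stay in one place.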

> Refactor to separate concerns: (de)serialization vs marshalling to data model
> -----------------------------------------------------------------------------
>
>                 Key: AVRO-228
>                 URL: https://issues.apache.org/jira/browse/AVRO-228
>             Project: Avro
>          Issue Type: Improvement
>          Components: java
>    Affects Versions: 1.3.0
>            Reporter: Justin SB
>         Attachments: 
> 0001-Separated-serialization-code-from-marshalling-code.patch
>
>
> I've attached a patch that separates out the code that deals with marshalling 
> to java data models (Generic, Specific, Reflect) from the code that deals 
> with (de)serialization (DatumReader, DatumWriter).  This means that, e.g., 
> GenericDatumReader, SpecificDatumReader and ReflectDatumReader are 
> unnecessary.  Instead, a single AvroDatumReader is parameterized with a 
> ClassMapping.  The class mapping interface is currently implemented by the 
> existing GenericData, SpecificData, ReflectData mappers.
> The patch is large, so might be quite hard to follow.  Essentially I renamed 
> GenericDatumReader to AvroDatumReader, and moved any model-specific 
> functionality to the ClassMapping interface.  I did the same for DatumWriter 
> & Requestor & Responder.
> There remain some oddities for the ProxyingResponder / Requestor, but I 
> wanted to keep the patch as a straight re-organization without significant 
> code changes.
> I believe this patch will set us up for some refactors down the road - 
> ClassMapping is really 3 interfaces in one - the mapping of Java types to 
> Avro types, some utility functions (like hashCode), and field accessors.  
> Splitting this interface into 3 interfaces would allow reuse of the parts 
> separately.  Then it would be easy to use Avro with arbitrary data models (by 
> implementing the accessor functions) without needing to implement 
> DatumWriters etc.
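
As a rough illustration of the three-way split proposed in the description 
above, the sketch below uses plain Java only.  TypeMapping, DatumUtilities, 
FieldAccessor and MapFieldAccessor are hypothetical names, not the interfaces 
from the attached patch, and copyFields is only a stand-in for writer code that 
depends on the accessor part alone.

```java
import java.util.HashMap;
import java.util.Map;

final class ClassMappingSplitSketch {

    /** Hypothetical: maps a Java datum to an Avro type name. */
    interface TypeMapping {
        String avroTypeOf(Object datum);
    }

    /** Hypothetical: utility functions such as hashCode. */
    interface DatumUtilities {
        int hashCode(Object datum);
    }

    /** Hypothetical: field accessors, the only part a new data model must supply. */
    interface FieldAccessor {
        Object getField(Object record, String name);
        void setField(Object record, String name, Object value);
    }

    /** An arbitrary data model (records as Maps) that implements only accessors. */
    static final class MapFieldAccessor implements FieldAccessor {
        @SuppressWarnings("unchecked")
        private static Map<String, Object> asMap(Object record) {
            return (Map<String, Object>) record;
        }

        @Override public Object getField(Object record, String name) {
            return asMap(record).get(name);
        }

        @Override public void setField(Object record, String name, Object value) {
            asMap(record).put(name, value);
        }
    }

    /** Stand-in for reader/writer code that needs nothing but a FieldAccessor. */
    static void copyFields(FieldAccessor accessor, Object from, Object to,
                           String... names) {
        for (String name : names) {
            accessor.setField(to, name, accessor.getField(from, name));
        }
    }

    /** Copies two fields between Map-backed records through the accessor. */
    static Map<String, Object> demo() {
        Map<String, Object> source = new HashMap<>();
        source.put("id", 7);
        source.put("name", "bob");
        Map<String, Object> target = new HashMap<>();
        copyFields(new MapFieldAccessor(), source, target, "id", "name");
        return target;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```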


        
