[ https://issues.apache.org/jira/browse/AVRO-859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069091#comment-13069091 ]

Douglas Creager commented on AVRO-859:
--------------------------------------
Awesome stuff. Whenever we decide to implement the Haskell Avro library, this
will be a good definition of the inevitable monad that we'll have to write. :-)

I've also been working on something similar in the C library. Hopefully we can
have some cross-pollination of ideas here.

It started off with the “consumer” interface that I introduced in AVRO-762. I
think this corresponds to the Target in your description above. In addition to
the generic consumer interface, I wrote an implementation of it that performs
schema resolution, plus a generic function that consumes binary Avro data and
passes the results into a consumer.
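
As a rough illustration of the push-style consumer idea (a Python sketch, not
the C library's actual API): the names Consumer, CollectingConsumer, read_long
and consume_binary are hypothetical, and the "schema" is simplified to a list
of (field name, type) pairs covering only long and string. The zigzag varint
and length-prefixed string layouts follow the Avro binary encoding.

```python
from typing import List, Tuple


class Consumer:
    """Hypothetical push-style interface: one callback per kind of value."""

    def on_long(self, name: str, value: int) -> None:
        raise NotImplementedError

    def on_string(self, name: str, value: str) -> None:
        raise NotImplementedError


class CollectingConsumer(Consumer):
    """Example consumer that simply records everything pushed into it."""

    def __init__(self) -> None:
        self.seen: List[tuple] = []

    def on_long(self, name: str, value: int) -> None:
        self.seen.append((name, value))

    def on_string(self, name: str, value: str) -> None:
        self.seen.append((name, value))


def read_long(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one zigzag varint starting at pos; return (value, new pos)."""
    shift = 0
    raw = 0
    while True:
        byte = buf[pos]
        pos += 1
        raw |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1), pos  # undo the zigzag mapping


def consume_binary(fields: List[Tuple[str, str]], buf: bytes,
                   consumer: Consumer) -> None:
    """Walk the fields in order and push each decoded value to the consumer."""
    pos = 0
    for name, kind in fields:
        if kind == "long":
            value, pos = read_long(buf, pos)
            consumer.on_long(name, value)
        elif kind == "string":
            length, pos = read_long(buf, pos)
            consumer.on_string(name, buf[pos:pos + length].decode("utf-8"))
            pos += length
        else:
            raise ValueError("unsupported type in this sketch: " + kind)
```

The decoder never builds a value of its own here; where the data ends up is
entirely up to whichever consumer is passed in.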

The natural next step would've been to add a “producer” interface, which
would've corresponded to the Source in your model. However, the main issue I
had with this approach is that you'd have two competing models: one where you
push data through a chain of consumers, and one where you pull data through a
chain of producers. Neither pushing nor pulling seemed like it could be the
“one true way”.
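
To make the push/pull split concrete, here is a small Python sketch (purely
illustrative; push_upper and pull_upper are hypothetical names) of the same
trivial transformation written once per model. The logic is identical, but
the two versions plug into different kinds of chains.

```python
from typing import Callable, Iterable, Iterator


def push_upper(downstream: Callable[[str], None]) -> Callable[[str], None]:
    """Push style: wrap a consumer; whoever has the data drives the chain."""
    def consumer(value: str) -> None:
        downstream(value.upper())
    return consumer


def pull_upper(upstream: Iterable[str]) -> Iterator[str]:
    """Pull style: wrap a producer; whoever wants the data drives the chain."""
    for value in upstream:
        yield value.upper()
```

A stage written as push_upper cannot be dropped into a pull chain (or the
other way around) without an adapter, so a library offering both models tends
to end up with each stage implemented twice.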

To get around this, I decided to go with a new “value” interface (AVRO-837),
rather than separate consumer and producer interfaces. In this model, an
{{avro_value_t}} is anything that can mimic an Avro value. It's basically a
big collection of getter and setter methods for the content of an Avro value of
a particular schema. Binary decoding doesn't have its own value
implementation, but it can use the setter methods to fill in any value
implementation, including one that just immediately serializes the contents
into a JSON encoding, for instance.
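
A minimal Python sketch of that idea, under loud assumptions: DictValue,
JsonStreamValue and decode_into are hypothetical names, the interface is cut
down to a single set_field/get_field pair, and decode_into is only a stand-in
for a real binary decoder. It is not the avro_value_t API.

```python
import io
import json


class DictValue:
    """In-memory value: setters store content, getters read it back."""

    def __init__(self) -> None:
        self._fields = {}

    def set_field(self, name, content) -> None:
        self._fields[name] = content

    def get_field(self, name):
        return self._fields[name]

    def finish(self) -> None:
        pass  # nothing to flush for an in-memory value


class JsonStreamValue:
    """Write-only value: every setter call is serialized to JSON right away."""

    def __init__(self, out) -> None:
        self._out = out
        self._first = True
        out.write("{")

    def set_field(self, name, content) -> None:
        if not self._first:
            self._out.write(", ")
        self._first = False
        self._out.write(json.dumps(name) + ": " + json.dumps(content))

    def finish(self) -> None:
        self._out.write("}")


def decode_into(value, decoded_fields) -> None:
    """Stand-in for a decoder: it only ever calls setters on the value."""
    for name, content in decoded_fields:
        value.set_field(name, content)
    value.finish()
```

The same decode_into call can fill a DictValue for later reads or stream
straight to JSON through a JsonStreamValue, without the decoder knowing which
one it was handed.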

Schema resolution can then be implemented as two separate value
implementations. (I have this one coded up, but I don't have an issue open for
it yet. I should get on that.) The schema resolution classes provide a “view”
into an existing Avro value, allowing you to treat it as if it were an instance
of a different schema. You need two classes because the wrapped value might be
on either the “writer schema” or “reader schema” end of the resolution process.

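
One way to picture the two view classes, as a heavily simplified Python
sketch: ReaderView and WriterView are hypothetical names, values are plain
dicts, and the reader schema is reduced to a dict of field defaults (assuming
every reader field has one). Only record-field resolution is shown; type
promotion and the rest of the resolution rules are left out.

```python
class ReaderView:
    """Wraps a writer-schema value; getters present it in reader-schema form."""

    def __init__(self, wrapped: dict, reader_defaults: dict) -> None:
        self._wrapped = wrapped
        self._defaults = reader_defaults

    def get_field(self, name):
        if name not in self._defaults:
            raise KeyError(name)  # not part of the reader schema
        if name in self._wrapped:
            return self._wrapped[name]
        return self._defaults[name]  # writer never wrote it: use the default


class WriterView:
    """Wraps a reader-schema value; setters accept writer-schema fields."""

    def __init__(self, wrapped: dict, reader_defaults: dict) -> None:
        self._wrapped = wrapped
        self._defaults = reader_defaults
        wrapped.update(reader_defaults)  # start from the reader's defaults

    def set_field(self, name, content) -> None:
        if name in self._defaults:
            self._wrapped[name] = content
        # fields the reader schema does not know about are silently skipped
```

ReaderView suits the case where the data already exists in writer-schema form
and is read through the view; WriterView suits the case where a decoder is
filling a reader-schema value through setter calls.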
> Java: Data Flow Overhaul -- Composition and Symmetry
> ----------------------------------------------------
>
> Key: AVRO-859
> URL: https://issues.apache.org/jira/browse/AVRO-859
> Project: Avro
> Issue Type: New Feature
> Components: java
> Reporter: Scott Carey
> Assignee: Scott Carey
>
> Data flow in Avro is currently broken into two parts: Read and Write. These
> share many common patterns but almost no common code.
> Additionally, the APIs for this are DatumReader and DatumWriter, which
> requires that implementations know how to traverse Schemas and use the
> Resolver.
> This is a proposal to overhaul the inner workings of Avro Java between the
> Decoder/Encoder APIs and DatumReader/DatumWriter such that there is
> significantly more code re-use and much greater opportunity for new features
> that can all share in general optimizations and dynamic code generation.
> The two primary concepts involved are:
> * _*Functional Composition*_
> * _*Symmetry*_
> h4. Functional Composition
> All read and write operations can be broken into functional bits and composed
> rather than writing monolithic classes. This allows a "DatumWriter2" to be a
> graph of functions that pre-compute all state required from a schema rather
> than traverse a schema for each write.
> h4. Symmetry
> Avro's data flow can be made symmetric. Rather than thinking in terms of
> Read and Write, think in terms of:
> * _*Source*_: Where data that is represented by an Avro schema comes from --
> this may be a Decoder, or an Object graph.
> * _*Target*_: Where data that represents an Avro schema is sent -- this may
> be an Encoder or an Object graph.
> (More detail in the comments)