Jakob Spörk wrote:
Hello,
I just want to give my thoughts on the unified pipeline and data conversion
topic. In my opinion, the pipeline can't do the data conversion, because it
has no information about how to do it. Let's take a simple example: we
have a pipeline processing XML documents that describe images. The first
components process this XML data while the rest of the components do
operations on the actual image. The question is: who will transform the
XML data into image data in the middle of the pipeline?
I believe the pipeline cannot do this, because it simply does not know how
to transform the data; that's a custom operation. You would need a component
that is on the one hand an XML consumer and on the other hand an image
producer. Providing some automatic data conversions directly in the pipeline
may help developers who need exactly these default cases, but I believe it
would make things harder for people requiring custom data conversions (and
those are most of the cases).
Absolutely. The discussion was about having the pipeline automate the
connection of components that deal with the same data, but with
different representations of it. Think XML data represented as SAX,
StAX, DOM or even text, and binary data represented as byte[],
InputStream, OutputStream or NIO buffers.
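For the XML representations at least, the standard JAXP identity transform can already bridge several of them. A minimal sketch of the idea (the class and method names are mine, not part of any pipeline API), converting the text representation into a DOM:

```java
import java.io.StringReader;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;

public class XmlRepresentationBridge {

    // Bridges the text representation of XML to the DOM representation
    // using a JAXP identity transform. The same Transformer could just as
    // well bridge SAXSource to DOMResult, DOMSource to SAXResult, etc.
    public static Document textToDom(String xml) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        DOMResult result = new DOMResult();
        identity.transform(new StreamSource(new StringReader(xml)), result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws Exception {
        Document doc = textToDom("<image width='640' height='480'/>");
        System.out.println(doc.getDocumentElement().getTagName()); // prints "image"
    }
}
```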
Let's consider your example. We can have:
- an XML producer that outputs SAX events
- an XML transformer that pulls StAX events and writes SVG as StAX events
to an XMLStreamWriter
- an SVG serializer that takes a DOM and renders it as a JPEG image on
an output stream
- and finally an image transformer that adds a watermark to the image,
reading an input stream and writing on an output stream.
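The shape of the last component in that list could look like this. Everything below is a hypothetical sketch (the interface name and the toy "watermark" logic are assumptions), just to show a component that pulls its input and pushes its output:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class PipelineSketch {

    // Hypothetical shape of the image transformer component: it pulls
    // bytes from an input stream and pushes bytes to an output stream.
    interface ImageTransformer {
        void transform(InputStream in, OutputStream out) throws IOException;
    }

    // A toy "watermark" transformer: copies the image bytes through and
    // appends a marker, standing in for real JPEG processing.
    static String watermark(String imageBytes) throws IOException {
        ImageTransformer t = (in, out) -> {
            in.transferTo(out);                       // pull input, push output
            out.write(" [watermarked]".getBytes());
        };
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new ByteArrayInputStream(imageBytes.getBytes()), out);
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(watermark("jpeg-bytes")); // prints "jpeg-bytes [watermarked]"
    }
}
```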
The pipeline must not have the responsibility of transforming data from
one paradigm to another (e.g. an XML document to a JPEG image), because
the way to do that highly depends on the application. But the pipeline
should allow component developers to use whatever representation of
that data best fits their needs, and allow the user not to care about
the actual data representation, as long as the components that are added
to the pipeline are "compatible" (e.g. StAX, SAX and DOM are
compatible). This can be achieved by adding the necessary transcoding
bridges between components. And if such a bridge does not exist, then we
can throw an exception because the pipeline is obviously incorrect.
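A bridge lookup of this kind could be sketched as follows. All names here are hypothetical, and a real registry would hold actual transcoding components rather than strings; the point is only the compatible/incompatible decision:

```java
import java.util.HashMap;
import java.util.Map;

public class BridgeRegistry {

    // Maps "from->to" representation pairs to a bridge name. In a real
    // pipeline the values would be transcoding components, not strings.
    private final Map<String, String> bridges = new HashMap<>();

    public BridgeRegistry() {
        // A few plausible default bridges between XML representations.
        register("SAX", "DOM");
        register("DOM", "SAX");
        register("SAX", "StAX");
        register("StAX", "SAX");
    }

    public void register(String from, String to) {
        bridges.put(from + "->" + to, from + "To" + to + "Bridge");
    }

    // Returns the bridge to insert between two components, or throws
    // because the pipeline is obviously incorrect.
    public String lookup(String from, String to) {
        if (from.equals(to)) return "identity";
        String bridge = bridges.get(from + "->" + to);
        if (bridge == null) {
            throw new IllegalStateException(
                "No transcoding bridge from " + from + " to " + to);
        }
        return bridge;
    }

    public static void main(String[] args) {
        BridgeRegistry r = new BridgeRegistry();
        System.out.println(r.lookup("SAX", "DOM")); // prints "SAXToDOMBridge"
        try {
            r.lookup("SAX", "InputStream"); // XML to binary: not bridgeable
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```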
Note that XML is quite a unique area in that components can let data
flow in a single direction through them (e.g. a SAX consumer producing
SAX events). Most components that deal with binary data pull their input
and push their output, which is actually exactly what Unix pipes do
(read from stdin, write to stdout). So wanting a universal pipeline API
that also works with binary data requires addressing the push/pull
conversion problem.
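One common way to adapt push to pull in Java is a piped stream pair: the pushing side writes from its own thread while the pulling side reads. A minimal sketch (the class and method names are mine; a real pipeline would hide this behind the bridge mechanism):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public class PushPullBridge {

    // Exposes push-produced data as a pull-style InputStream. The producer
    // pushes into a PipedOutputStream from its own thread; the consumer
    // pulls from the connected PipedInputStream at its own pace.
    public static InputStream asPullSource(byte[] data) throws IOException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);
        Thread producer = new Thread(() -> {
            try (out) {
                out.write(data); // the "push" side
            } catch (IOException ignored) {
                // Consumer closed early; nothing useful to do in a sketch.
            }
        });
        producer.start();
        return in;
    }

    public static void main(String[] args) throws Exception {
        InputStream in = asPullSource("jpeg-bytes".getBytes());
        System.out.println(new String(in.readAllBytes())); // prints "jpeg-bytes"
    }
}
```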
Sylvain
--
Sylvain Wallez - http://bluxte.net