Sylvain Wallez wrote:
<snip/>
Steven Dolg wrote:
Basically you're providing a buffer between every pair of components
and fill it as needed.
Yes. Now this buffer will always contain a very limited number of
events, corresponding to the result of processing an amount of input
data that is convenient to process at once to avoid complex state
management (e.g. an <i18:text> tag with all its children). And so most
often, this buffer will contain just one event.
Think of it as being just a bridge between the writer view used by a
producer and the reader view used by its consumer. These are in my
opinion the most convenient views to write StAX components.
But you need to implement both XMLStreamWriter and XMLStreamReader
and optimize that for any possible thing a transformer might do.
In order to buffer all the data from the components you will have to
create some objects as well - I guess you will end up with something
like the XMLEvent and maintaining a list of them in the StaxFIFO.
That's why I think an efficient (as in faster than the Event API)
implementation of the StaxFIFO is difficult to make.
It's certainly less trivial than maintaining a list of events, but
should be doable quite efficiently by using an int FIFO (to store
event types and attribute counts) and a String FIFO (for everything
else). I'll try to find a couple of hours to prototype this.
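To make the int FIFO / String FIFO idea a bit more concrete, here is a rough sketch of how such a buffer could look. It is not a prototype of the real StaxFIFO: the class name StaxFifoSketch and all method names are made up for illustration, only three event types are covered, namespaces are ignored, and null strings are not supported. The only thing taken from the JDK is XMLStreamConstants.

```java
import java.util.ArrayDeque;
import javax.xml.stream.XMLStreamConstants;

// Hypothetical sketch, not Cocoon code: every name in this class is invented.
class StaxFifoSketch {
    // Growable ring buffer of primitive ints: event types and attribute counts.
    private int[] ints = new int[8];
    private int head = 0;
    private int size = 0;
    // Everything else (names, attribute values, text), in write order.
    private final ArrayDeque<String> strings = new ArrayDeque<>();

    // State of the event most recently returned by next().
    private String localName;
    private String text;
    private String[] attrs = new String[0];

    private void pushInt(int value) {
        if (size == ints.length) { // full: copy into a buffer twice the size
            int[] bigger = new int[ints.length * 2];
            for (int i = 0; i < size; i++) {
                bigger[i] = ints[(head + i) % ints.length];
            }
            ints = bigger;
            head = 0;
        }
        ints[(head + size) % ints.length] = value;
        size++;
    }

    private int popInt() {
        if (size == 0) throw new IllegalStateException("FIFO is empty");
        int value = ints[head];
        head = (head + 1) % ints.length;
        size--;
        return value;
    }

    // ---- writer view, used by the producer ----
    void writeStartElement(String name, String... attrNameValuePairs) {
        if (attrNameValuePairs.length % 2 != 0) {
            throw new IllegalArgumentException("attributes must be name/value pairs");
        }
        pushInt(XMLStreamConstants.START_ELEMENT);
        pushInt(attrNameValuePairs.length / 2); // attribute count
        strings.add(name);
        for (String s : attrNameValuePairs) strings.add(s);
    }

    void writeCharacters(String chars) {
        pushInt(XMLStreamConstants.CHARACTERS);
        strings.add(chars);
    }

    void writeEndElement() {
        pushInt(XMLStreamConstants.END_ELEMENT);
    }

    // ---- reader view, used by the consumer ----
    boolean hasNext() { return size > 0; }

    int next() {
        int type = popInt();
        if (type == XMLStreamConstants.START_ELEMENT) {
            int count = popInt();
            localName = strings.remove();
            attrs = new String[count * 2];
            for (int i = 0; i < attrs.length; i++) attrs[i] = strings.remove();
        } else if (type == XMLStreamConstants.CHARACTERS) {
            text = strings.remove();
        }
        return type;
    }

    String getLocalName() { return localName; }
    String getText() { return text; }
    int getAttributeCount() { return attrs.length / 2; }
    String getAttributeLocalName(int i) { return attrs[2 * i]; }
    String getAttributeValue(int i) { return attrs[2 * i + 1]; }

    public static void main(String[] args) {
        StaxFifoSketch fifo = new StaxFifoSketch();
        fifo.writeStartElement("p", "class", "x");
        fifo.writeCharacters("hi");
        fifo.writeEndElement();
        while (fifo.hasNext()) {
            System.out.println("event type " + fifo.next());
        }
    }
}
```

Note that the reader side still allocates a String[] per start element here; a real implementation would presumably reuse that array, which is where the optimization effort mentioned above would go.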
On the other hand I do think that the cursor API is quite a bit
harder to use.
As stated in the Javadoc of XMLStreamReader, it is the lowest level
for reading XML data - which usually means that more logic is needed
in the code using the API and more knowledge is required of the
developer reading/writing that code.
So I second Andreas' statement that we will sacrifice simplicity for
(a small amount of?) performance.
I understand your point, even if I don't totally agree :-) Now it
should be mentioned that even with events, my proposal still
stands: just replace XMLStream{Reader|Writer} with
XMLEvent{Reader|Writer}.
The other thing is that - at least the way you suggested - we would
need a special implementation of the Pipeline interface.
That is something that compromises the intention behind having a
Pipeline API.
Right now we can use the new StAX components and simply put them into
any of the Pipeline implementations we already have.
Sacrificing this is completely out of the question IMO.
Actually, I'm wondering if wanting a single API is not wishful
thinking and will in the end lead to something that is overly abstract
and hence difficult to understand and use, or where underlying
implementations will leak in the high-level abstraction.
There is already some impedance mismatch appearing between pull and
push in the code:
- a StAXGenerator has to call initiatePullProcessing() on its
consumer, which in turn will have to call it on its own consumer, etc.
until we reach the Finisher that will finally start pulling events.
This moves a responsibility that belongs to the pipeline down to its
components.
Well I don't see the problem with that.
From the pipeline's point of view those are normal components just like
all the others.
The pipeline was never intended to "care" about the internals of the
components - so why should it matter that the StAXGenerator calls
"initiatePullProcessing" on its consumer instead of some other
method such as "startDocument"?
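For readers who have not looked at the code, the delegation chain discussed here can be pictured roughly as follows. This is a simplified sketch and not the actual Cocoon interfaces: PullProducer, PullConsumer and the three component classes are hypothetical names, and events are plain Strings instead of XMLEvents. Only the method name initiatePullProcessing() is taken from the discussion.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: none of these types are the real Cocoon API.
class PullChainDemo {
    static List<String> run(List<String> input) {
        ListGenerator generator = new ListGenerator(input);
        UpperCaseTransformer transformer = new UpperCaseTransformer();
        CollectingFinisher finisher = new CollectingFinisher();
        generator.setConsumer(transformer);
        transformer.setConsumer(finisher);
        generator.execute(); // the pipeline only ever talks to the first component
        return finisher.collected;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("start", "text", "end")));
    }
}

interface PullProducer {
    boolean hasNext();
    String nextEvent();
}

interface PullConsumer {
    void setParent(PullProducer parent);
    void initiatePullProcessing();
}

class ListGenerator implements PullProducer {
    private final Iterator<String> events;
    private PullConsumer consumer;

    ListGenerator(List<String> events) { this.events = events.iterator(); }

    void setConsumer(PullConsumer consumer) {
        this.consumer = consumer;
        consumer.setParent(this);
    }

    // The generator does not push anything; it asks downstream to start pulling.
    void execute() { consumer.initiatePullProcessing(); }

    public boolean hasNext() { return events.hasNext(); }
    public String nextEvent() { return events.next(); }
}

class UpperCaseTransformer implements PullProducer, PullConsumer {
    private PullProducer parent;
    private PullConsumer consumer;

    void setConsumer(PullConsumer consumer) {
        this.consumer = consumer;
        consumer.setParent(this);
    }

    public void setParent(PullProducer parent) { this.parent = parent; }

    // A transformer just forwards the request to its own consumer.
    public void initiatePullProcessing() { consumer.initiatePullProcessing(); }

    public boolean hasNext() { return parent.hasNext(); }
    public String nextEvent() { return parent.nextEvent().toUpperCase(); }
}

class CollectingFinisher implements PullConsumer {
    final List<String> collected = new ArrayList<>();
    private PullProducer parent;

    public void setParent(PullProducer parent) { this.parent = parent; }

    // The last component is the one that actually drives the pull loop.
    public void initiatePullProcessing() {
        while (parent.hasNext()) collected.add(parent.nextEvent());
    }
}
```

In this picture the pipeline itself stays unaware of pull versus push; whether that is a feature or a leaked responsibility is exactly the point under debate.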
- an AbstractStAXProducer only accepts a StAXConsumer, defeating the
idea of a unified pipeline implementation that will accept everything.
The idea was to have pipelines being capable of processing virtually any
data.
But that is not the same as combining components in an arbitrary way,
e.g. there is no sense in linking a FileGenerator with a (not yet
existing) ImageTransformer based on Java's Imaging API.
The components must be "compatible" - that is they must understand the
data they exchange with each other.
We may however provide some adapters/converters to make certain "types"
of components compatible, e.g. SAX <--> StAX.
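As an illustration of what one direction of such an adapter might look like, here is a minimal sketch that receives SAX callbacks and forwards them to an XMLStreamWriter. SaxToStaxBridge and convert() are hypothetical names, not existing Cocoon classes; only standard JDK APIs are used. It assumes the default, non-namespace-aware SAX parser and ignores comments, processing instructions and the XML declaration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical adapter: SAX (push) events in, StAX writer calls out.
class SaxToStaxBridge extends DefaultHandler {
    private final XMLStreamWriter out;

    SaxToStaxBridge(XMLStreamWriter out) { this.out = out; }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        try {
            // qName is used because the parser is not namespace aware here.
            out.writeStartElement(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                out.writeAttribute(atts.getQName(i), atts.getValue(i));
            }
        } catch (XMLStreamException e) {
            throw new SAXException(e);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        try {
            out.writeEndElement();
        } catch (XMLStreamException e) {
            throw new SAXException(e);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        try {
            out.writeCharacters(ch, start, length);
        } catch (XMLStreamException e) {
            throw new SAXException(e);
        }
    }

    @Override
    public void endDocument() throws SAXException {
        try {
            out.flush();
        } catch (XMLStreamException e) {
            throw new SAXException(e);
        }
    }

    // Convenience method for the demo: parse with SAX, serialize through StAX.
    static String convert(String xml) {
        try {
            StringWriter result = new StringWriter();
            XMLStreamWriter writer =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(result);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)),
                           new SaxToStaxBridge(writer));
            writer.close();
            return result.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("<p class='x'>hi</p>"));
    }
}
```

The opposite direction (reading from an XMLStreamReader and calling a ContentHandler) would follow the same pattern with the roles swapped.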
So we should either have several APIs specifically tailored to the
underlying push or pull model, or make sure the unified API and its
implementations accept any kind of component and set the appropriate
conversion bridges between them.
As I tried to state above: that will not be possible for every
conceivable combination of components.
At least not when thinking beyond XML - which I do.
Sylvain