RE: Pipe-aware Selectors [was Re: XML-Based Selection (Redirect Serializer?)]

Daniel Fagerstrom Thu, 07 Mar 2002 16:02:55 -0800

Stefano Mazzocchi wrote:
> Daniel Fagerstrom wrote:
>
> > The pipe-selector (ideas about a better name?) would look something like
> > this:
> > <pipe-selector type="xpath">
> >   <when test="expr1">
> >     <!-- pipeline fragment -->
> >   </when>
> >   <when test="expr2">
> >     <!-- pipeline fragment -->
> >   </when>
> >     ...
> >   <otherwise>
> >     <!-- pipeline fragment -->
> >   </otherwise>
> > </pipe-selector>
> >
> > The general idea is that the pipe-selector buffers its input in e.g. a
> > DOM-tree, then the tests can be applied to the buffered input. The
> > pipeline fragment in the first when clause where the test succeeds is
> > then fed with the buffered input, and its output is sent to the
> > pipeline component after the pipe selector.
>
> Ok, I see your point clearly.
Good :)


> > How can this be implemented?
I have done some more thinking and have started to build a prototype. I
decided that it would be easier to build it in the treeprocessor, so I will
describe the design so far in terms of the treeprocessor interfaces and
classes.

<background-info>
For those who have not yet studied the treeprocessor, there are two main
interfaces: ProcessingNodeBuilder, implemented by e.g. SelectNodeBuilder,
and ProcessingNode, implemented by e.g. SelectNode. The node builders are
used to construct a processing tree from the sitemap, and a FooNodeBuilder
typically puts an instance of FooNode in the tree. The tree typically gets
the same structure as the elements in the sitemap. The ProcessingNodes
implement:

  boolean invoke(Environment env, InvokeContext context)

where the InvokeContext among other things contains the current
EventPipeline and StreamPipeline. The invoke method drives the execution of
a request: Generators and Transformers are pushed into the EventPipeline,
and a Serializer is put in the StreamPipeline and starts the execution of
the pipeline. Sitemap elements with children decide how to process the
children.
</background-info>

To implement pipe aware selection as in the above sitemap example we need 4
classes: PipeSelectNodeBuilder, PipeSelectNode, DOMGenerator and
XPathSelector.

PipeSelectNodeBuilder: is like SelectNodeBuilder but puts PipeSelectNodes
instead of SelectNodes in the tree. It should possibly also implement
LinkedProcessingNodeBuilder to enable view-labels on a pipe-aware selector.

PipeSelectNode: extracts the current EventPipeline from the InvokeContext,
connects a DOMBuilder to the EventPipeline and executes it. The resulting
dom-tree is stored in the objectModel and a new InvokeContext is created
with a newly created EventPipeline that starts with a DOMGenerator, and the
StreamPipeline from the incoming InvokeContext. After these steps the invoke
method in PipeSelectNode will do exactly the same things as in the
SelectNode but with the new InvokeContext as input.

DOMGenerator: takes the stored dom-tree from the objectModel and applies a
DOMStreamer on it.

XPathSelector: implements the Selector interface, and its select method
takes the dom-tree from the object model and returns the (boolean) result of
the application of the XPath on it.
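
The buffer / test / replay flow described above can be sketched with plain
JDK classes. This is only an illustration under assumptions, not the
prototype: it does not use the Cocoon classes (DOMBuilder, DOMStreamer,
InvokeContext), the class and method names are hypothetical, the upstream
pipeline is stood in for by an XML string, and each "when" clause is reduced
to a name plus an XPath test.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the PipeSelectNode / DOMGenerator / XPathSelector trio.
class PipeSelectSketch {

    // Buffer the upstream output into a DOM tree (the role of the DOMBuilder step).
    static Document buffer(String upstreamXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(upstreamXml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not buffer input", e);
        }
    }

    // Return the name of the first when clause whose XPath test is true,
    // or "otherwise" if none matches (the role of the XPathSelector).
    static String select(Document dom, Map<String, String> whenClauses) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            for (Map.Entry<String, String> when : whenClauses.entrySet()) {
                Boolean hit = (Boolean) xpath.evaluate(
                        when.getValue(), dom, XPathConstants.BOOLEAN);
                if (hit) {
                    return when.getKey();
                }
            }
        } catch (Exception e) {
            throw new IllegalStateException("bad XPath test", e);
        }
        return "otherwise";
    }

    // Replay the buffered tree as SAX events into the chosen pipeline
    // fragment (the role of the DOMGenerator), via an identity transform.
    static void replay(Document dom, ContentHandler chosenFragment) {
        try {
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(dom), new SAXResult(chosenFragment));
        } catch (Exception e) {
            throw new IllegalStateException("could not replay buffered input", e);
        }
    }

    // Tiny downstream consumer used for demonstration: counts element starts.
    static int countElements(Document dom) {
        final int[] count = {0};
        replay(dom, new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName,
                                     Attributes atts) {
                count[0]++;
            }
        });
        return count[0];
    }

    public static void main(String[] args) {
        Document dom = buffer("<order priority='high'><item/><item/></order>");
        Map<String, String> whens = new LinkedHashMap<>(); // keeps sitemap order
        whens.put("bulk", "count(/order/item) > 10");
        whens.put("express", "/order/@priority = 'high'");
        System.out.println("chosen: " + select(dom, whens));
        System.out.println("elements replayed: " + countElements(dom));
    }
}
```

The tests are tried in document order and the first match wins, as in the
ordinary SelectNode; only the chosen fragment ever sees the replayed events.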

Given that the design above actually works, it seems possible to implement
pipe-aware selection without messing with any interfaces at all in Cocoon.
The somewhat implicit transport of the dom-tree is maybe a kludgy solution;
it might be better to have a new interface that makes that communication
more explicit. But in any case we should _not_ change the Selector
interface.

Of course all this should have been said in Java-code and not words ;) I
hope to be able to finish a first prototype soon.

<snip/>
> Hmmm, I don't think this behavior should extend Transformer's, smells
> like a bad design choice to me.
The above design avoids that.

<snip type="some incomprehensible thoughts on caching"/>
> Sorry, I think I lost you here. :/

A new attempt:
With the design above I think the existing cache mechanisms should be usable
as is: the EventPipeline before the selection could be cached, then the
selection mechanism could be applied on the cached data, and the content in
the chosen "when clause" can be cached in turn. But this scheme would be
unnecessarily inefficient: it would be better to also store a mapping
between the cache key for the "input pipeline" and the "when clause" that is
chosen for that input pipeline, so that the tests do not have to be
recalculated.
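
A minimal sketch of that extra mapping, assuming the cache key of the input
pipeline is available as a string and ignoring invalidation entirely. The
class and method names are hypothetical and nothing here is part of the
existing Cocoon cache:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper: remembers which when clause was chosen for a given
// input-pipeline cache key, so the tests run only once per key.
class SelectionCacheSketch {

    private final Map<String, String> chosenClause = new ConcurrentHashMap<>();
    private final AtomicInteger evaluations = new AtomicInteger();

    // runTests stands in for "apply the XPath tests to the buffered input";
    // it is only called when no clause is recorded for this key yet.
    String clauseFor(String inputCacheKey, Supplier<String> runTests) {
        return chosenClause.computeIfAbsent(inputCacheKey, key -> {
            evaluations.incrementAndGet();
            return runTests.get();
        });
    }

    // How many times the tests were actually evaluated.
    int evaluations() {
        return evaluations.get();
    }

    public static void main(String[] args) {
        SelectionCacheSketch cache = new SelectionCacheSketch();
        String first = cache.clauseFor("input-key-1", () -> "when-1");
        String second = cache.clauseFor("input-key-1", () -> "when-2"); // not run
        System.out.println(first + " " + second + " " + cache.evaluations());
    }
}
```

A real version would have to drop the entry whenever the cached input
pipeline for that key becomes invalid.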

<snip/>

> > Performance: it would of course be better to not be forced to store the
> > SAX-events in a DOM-tree, but I do not see much choice.
>
> Well, this is not a problem since XSLT processing works this way anyway,
> but an XPath engine can be made much more incremental than a XSLT-one.
Might be doable in principle, but I would guess that it could be hard to
reuse this functionality from an existing XSLT implementation. I have some
previous (very bad) experience from trying to reuse low-level mechanisms in
Xalan for an extension element, and I will have to forget about that
experience before I repeat that mistake ;)

> Also, it might be possible that not much load is given to these pipes
> since they are mostly used in data INPUT which is normally much less
> than data OUTPUT in any site.
Yes, that is what I believe. IMHO it seems unnecessary to implement
complicated optimizations before there are clear use cases for them. It
should also be noted that selection based on validation _can not_ be done in
streaming mode: the whole document must be validated before we know that it
is valid, i.e. buffering is necessary.

What do you think?

/Daniel Fagerstrom



