Pipe-aware Selectors [was Re: XML-Based Selection (Redirect Serializer?)]

Stefano Mazzocchi Tue, 05 Mar 2002 12:16:52 -0800

Daniel Fagerstrom wrote:

> The pipe-selector (ideas about a better name?) would look something like
> this:
> <pipe-selector type="xpath">
>   <when test="expr1">
>     <!-- pipeline fragment -->
>   </when>
>   <when test="expr2">
>     <!-- pipeline fragment -->
>   </when>
>     ...
>   <otherwise>
>     <!-- pipeline fragment -->
>   </otherwise>
> </pipe-selector>
> 
> The general idea is that the pipe-selector buffers its input in e.g. a
> DOM-tree, then the tests can be applied to the buffered input. The pipeline
> fragment in the first when clause whose test succeeds is then fed with
> the buffered input, and its output is sent to the pipeline component after
> the pipe selector.
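
To make the selection rule concrete, here is a minimal sketch in plain Java
(JAXP only, no Cocoon classes). It assumes the input is already buffered as
a DOM Document and that each when clause is an XPath boolean expression
mapped to a label standing in for a pipeline fragment. The class name, the
method names and the fragment labels are all hypothetical.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class PipeSelectorSketch {

    /** Buffers the whole input as a DOM tree (hypothetical helper). */
    public static Document buffer(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /**
     * Evaluates the when clauses in declaration order against the buffered
     * tree and returns the fragment label of the first test that is true,
     * or the otherwise label if none matches.
     */
    public static String select(Document buffered,
                                Map<String, String> whenClauses,
                                String otherwise) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        for (Map.Entry<String, String> when : whenClauses.entrySet()) {
            Boolean matched = (Boolean) xpath.evaluate(
                    when.getKey(), buffered, XPathConstants.BOOLEAN);
            if (matched) {
                return when.getValue();
            }
        }
        return otherwise;
    }

    public static void main(String[] args) throws Exception {
        // LinkedHashMap keeps the clauses in the order they were declared.
        Map<String, String> clauses = new LinkedHashMap<>();
        clauses.put("/order/@priority = 'high'", "express-fragment");
        clauses.put("count(/order/item) > 1", "bulk-fragment");

        Document doc = buffer("<order priority='low'><item/><item/></order>");
        System.out.println(select(doc, clauses, "default-fragment"));
    }
}
```

Only the selected fragment would then be fed with the buffered tree; the
sketch stops at the decision itself.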


Ok, I see your point clearly.

> How can this be implemented?
> 
> The pipe-selector is a transformer (i.e. implements the transformer
> interface) extended with a method that lets the sitemap constructor send an
> object to the pipe-selector that takes care of the tests and the pipeline
> fragment construction for the selected when clause in the pipe-selector.
> 
> The actual tests can implement the selector interface if it is ok to put the
> DOM-tree with buffered input in the objectModel. A possible issue with this
> is that the tests in the selector are performed after the whole
> pipeline is constructed. This might give the unintuitive effect that
> components later in the pipeline that affect the objectModel and are
> executed during sitemap construction time are executed before the test in
> the pipe-selector.

Hmmm, I don't think the pipe-selector should extend Transformer; that
smells like a bad design choice to me.

> The sitemap stylesheet constructs a class for each pipe-selector instance in
> the sitemap. This class contains a method that returns an EventPipeline. The
> method executes the tests and constructs and returns an EventPipeline for
> the first when clause that succeeds. The EventPipeline starts with a generator
> that puts the DOM-tree in the objectModel into a DOMStreamer.
> 
> class PipeSelectorInternalPipeN01234 {
>   public EventPipeline constructPipe(SitemapRedirector redirector,
>                                      Environment environment,
>                                      List listOfMaps) { ... }
> }
> 
> The code in sitemap_xmap for constructing the pipeline that contains a
> pipe-selector is like that for any pipeline that contains a transformer,
> with the difference that the "pipe-selector transformer" is given an
> instance of its PipeSelectorInternalPipe class.
> 
> The algorithm for the PipeSelector is:
> * Its input is connected to a DOMBuilder.
> * The DOM-tree is put in the objectModel.
> * The constructPipe(...) method of the PipeSelector's
> PipeSelectorInternalPipe class is executed.
> * The returned EventPipeline is connected to the output of the PipeSelector
> and the EventPipeline's process method is called.
> 
> I hope that the description above is comprehensible.
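
The four steps of the algorithm can be sketched with standard JAXP classes
standing in for Cocoon's DOMBuilder and DOMStreamer. This is only an
illustration under assumptions: the InternalPipe interface, the BUFFER_KEY
constant and the finish(...) method are hypothetical, a plain Map plays the
role of the objectModel, and a SAX ContentHandler plays the role of the
EventPipeline.

```java
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import org.w3c.dom.Document;
import org.xml.sax.ContentHandler;

public class BufferingSelectorSketch {

    /** Hypothetical stand-in for the generated PipeSelectorInternalPipe class. */
    public interface InternalPipe {
        ContentHandler constructPipe(Map<String, Object> objectModel,
                                     ContentHandler next) throws Exception;
    }

    /** Hypothetical objectModel key under which the buffered tree is published. */
    public static final String BUFFER_KEY = "pipe-selector-dom";

    private final SAXTransformerFactory factory =
            (SAXTransformerFactory) TransformerFactory.newInstance();
    private final Document buffer;
    private final TransformerHandler builder;

    public BufferingSelectorSketch() throws Exception {
        buffer = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        // Step 1: an identity handler that turns incoming SAX events into DOM nodes.
        builder = factory.newTransformerHandler();
        builder.setResult(new DOMResult(buffer));
    }

    /** The upstream component sends its SAX events here. */
    public ContentHandler input() {
        return builder;
    }

    /** Call once the upstream component has sent endDocument(). */
    public void finish(InternalPipe pipe,
                       Map<String, Object> objectModel,
                       ContentHandler next) throws Exception {
        // Step 2: publish the buffered tree so the tests can reach it.
        objectModel.put(BUFFER_KEY, buffer);
        // Step 3: let the generated class run the tests and build the fragment.
        ContentHandler selected = pipe.constructPipe(objectModel, next);
        // Step 4: replay the buffered tree as SAX events into the fragment.
        factory.newTransformer()
                .transform(new DOMSource(buffer), new SAXResult(selected));
    }
}
```

A caller would push events into input(), then call finish(...) with an
InternalPipe whose constructPipe wraps `next` in whatever fragment its
tests select.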

I'm *seriously* worried about the need to buffer the input.

> >  2) how does this impact performance? how does this impact caching?
> >
> > [the first impacts the system usage, the second the interface of
> > Selectors or PipeSelectors]
> 
> The simplest way to implement cacheability is to base the cache key
> generation on _all_ of the pipeline fragments in the when clauses. This is
> fairly unsatisfying as it implies the construction of all of the pipeline
> fragments instead of only the one that is selected. It will also decrease
> the possibility to cache the PipeSelector and lead to unnecessary
> recalculation based on changes in "when clauses" that were not selected. The
> problem is that when the generateKey() method is called it is not known
> which "when clause" will be chosen. If we however had a method like
> generateKey(key), where "key" is the hash key for the pipeline fragment
> before the PipeSelector, then "key" uniquely determines the input to the
> PipeSelector and thus which "when clause" will be selected. This
> information, the map from key to "when clause", could be stored and then
> used to compute the cache key for the PipeSelector based only on the
> pipeline fragment in the _selected_ when clause.
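
A small sketch may make the generateKey(key) idea easier to follow. It
assumes cache keys are plain long values, that each when clause's fragment
already has a key of its own, and that the selector records which clause
was chosen the first time it sees a given upstream key. The class, its
methods and the key-combining formula are hypothetical, not an existing
Cocoon interface.

```java
import java.util.HashMap;
import java.util.Map;

public class SelectorKeyCache {

    // Cache key of the pipeline fragment in each when clause, by clause index.
    private final long[] branchKeys;

    // Remembered decisions: upstream key -> index of the selected when clause.
    private final Map<Long, Integer> selectedBranch = new HashMap<>();

    public SelectorKeyCache(long... branchKeys) {
        this.branchKeys = branchKeys.clone();
    }

    /** Records which clause was chosen for input identified by upstreamKey. */
    public void remember(long upstreamKey, int branch) {
        selectedBranch.put(upstreamKey, branch);
    }

    /**
     * Returns a key built only from the upstream key and the selected
     * clause's key, or null if this input has not been seen yet and the
     * tests therefore still have to run.
     */
    public Long generateKey(long upstreamKey) {
        Integer branch = selectedBranch.get(upstreamKey);
        if (branch == null) {
            return null;
        }
        return 31L * upstreamKey + branchKeys[branch];
    }
}
```

With this shape, a change in a clause that was not selected leaves the
returned key untouched, which is the property the quoted paragraph is
after.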

Sorry, I think I lost you here. :/

> Performance: it would of course be better not to be forced to store the
> SAX-events in a DOM-tree, but I do not see much choice.

Well, this is not a problem since XSLT processing works this way anyway,
but an XPath engine can be made much more incremental than an XSLT one.

Also, it might be possible that not much load is given to these pipes
since they are mostly used in data INPUT which is normally much less
than data OUTPUT in any site.

> The main use for the
> PipeSelector will probably be to make selection based on XML input to Cocoon
> and on the output from transformers with side effects. I would guess that
> in these cases the documents will be quite small most of the time. Besides
> the need for buffering I do not think there should be any sources of
> performance bottlenecks in a PipeSelector, but I do not know enough about the
> internals of Cocoon to know for sure.
> 
> What do you think? Would something along these lines work?

Yes, I think it might work, but I still can't see if this requires a
change in the Selector interface or if another sitemap component must be
added.

What do you guys think?

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<[EMAIL PROTECTED]>                             Friedrich Nietzsche
--------------------------------------------------------------------


