Re: PROPOSAL: tag pipelining

James Strachan Tue, 20 Mar 2001 13:28:35 -0800
Pierre

Sorry for the delay getting back to you, I've been a tad busy snowboarding
in the Swiss alps ;-)

This is a little long I'm afraid. I've made a small amendment to my last
proposal and I then try to compare our two approaches...

> Finally got a minute to review your last proposal.
> There are some differences with what I had last
> submitted which I believe are worth some more discussions.
> Not clear to me what's the best approach (probably yours :-)),
> but I'd like us to discuss these differences to make sure
> we clearly understand them.

Yes I think we've diverged a bit.

If I understand it correctly, your version of the proposal has moved towards
a more generic object passing mechanism between tags - an inner tag setting
a property of an outer tag. This is a useful mechanism, particularly getting
around the 'nasty scriptlet expressions for an attribute value' problem.

The original problem I was trying to tackle was that of trying to avoid the
use of double buffering via a BodyContent to do text based transformations
more efficiently.

In many ways I was trying to rework your original "iotransformer" taglib
proposal such that tags themselves could become the "transformer".
Particularly as a BodyTag already has the concept of 'input' via the
BodyContent and all tags have a natural output via pageContext.getOut().
I was trying to have an option for other tags to override these defaults for
the 'transformer' tags. I'm briefly going to refer to an interface you used
in your original iotransformer proposal:

  interface Transform {
      public void transform(java.io.Reader input, java.io.Writer output)
          throws JspException;
  }

In my last proposal I'd gotten more generic by passing Object instances around
for "input" and "output", following your lead of making it more generic. I'm
now more tempted to go back the other way and make things more typesafe and
less generic by making the 2 interfaces based on Reader and Writer classes.
Something like:-

public interface PipeReader {
    public void setReader( Reader reader );
}

public interface PipeWriter {
    public void setWriter( Writer writer );
}

Such that any tag can become a transformer (like your Transformer interface)
and a child tag may set its reader or writer in a type-safe manner, using
either these 2 simple interfaces or introspection.

So an abstract base class for all transformer tags doing some kind of text
based transformation (search & replace, scrape, regexp, xml-rpc, soap call,
XSLT or whatever) could be something like the following:-

import java.io.Reader;
import java.io.Writer;
import javax.servlet.jsp.JspException;
import javax.servlet.jsp.tagext.BodyTagSupport;

public abstract class TransformerTagSupport
        extends BodyTagSupport
        implements PipeReader, PipeWriter {

    private Reader reader;
    private Writer writer;

    // PipeReader interface
    public void setReader(Reader reader) {
        this.reader = reader;
    }

    // PipeWriter interface
    public void setWriter(Writer writer) {
        this.writer = writer;
    }

    public int doEndTag() throws JspException {
        if ( reader == null ) {
            // a child tag has not set my input source
            // so I'll read from my BodyContent
            reader = bodyContent.getReader();
        }
        if ( writer == null ) {
            // a child tag has not set my output destination
            // so I'll write to my enclosing writer
            writer = pageContext.getOut();
        }
        transform( reader, writer );
        // carry on evaluating the rest of the page
        return EVAL_PAGE;
    }

    // actually do the transformation,
    // reading from reader & writing to writer
    protected abstract void transform(
        Reader reader, Writer writer
    ) throws JspException;
}
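
To make the transform() step concrete, here is a rough standalone sketch of what a search & replace transformer might do with its Reader and Writer. It is only an illustration: the class name SearchAndReplaceSketch and the run() helper are made up, it runs outside a JSP container (so it reports IOException rather than JspException), and it works line by line, which means it normalises line endings to '\n' and will not match text that spans a line break.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

public class SearchAndReplaceSketch {

    // Hypothetical stand-in for the abstract transform() method:
    // streams from reader to writer one line at a time, so the whole
    // input never has to be buffered in memory.
    static void transform(Reader reader, Writer writer, String from, String to)
            throws IOException {
        BufferedReader in = new BufferedReader(reader);
        String line;
        boolean first = true;
        while ((line = in.readLine()) != null) {
            if (!first) {
                writer.write('\n');
            }
            writer.write(line.replace(from, to));
            first = false;
        }
        writer.flush();
    }

    // Convenience wrapper (an assumption, for trying the sketch out):
    // uses in-memory strings in place of BodyContent and JspWriter.
    static String run(String input, String from, String to) {
        StringWriter out = new StringWriter();
        try {
            transform(new StringReader(input), out, from, to);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "bah bah black sheep"
        System.out.println(run("foo foo black sheep", "foo", "bah"));
    }
}
```

In a real tag the same loop would sit inside a subclass of TransformerTagSupport, with "from" and "to" supplied as tag attributes.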


The above tag could have an inner tag set its writer or reader either using
the PipeReader / PipeWriter interfaces or via introspection. If no values
are passed in then it would use the JSP defaults of the BodyContent or the
outer JspWriter. e.g.

<foo:searchAndReplace from="foo" to="bah">
    foo foo black sheep
</foo:searchAndReplace>

Or an inner tag could set its input and output

<foo:searchAndReplace from="foo" to="bah">
    <file:read filename="foo.txt"/>
    <file:write filename="bar.txt"/>
</foo:searchAndReplace>

The pseudo code in the <file:read> tag would be as follows (<file:write> is
analogous, using a Writer and PipeWriter)

// file:read
Reader reader = new FileReader( name );
Tag parent = getParent();
if ( parent instanceof PipeReader ) {
    ((PipeReader)parent).setReader( reader );
}
else {
    // pipe the input to my current writer
}
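
As a rough illustration of the "else" branch above, the following standalone sketch hands the Reader to the parent when it implements PipeReader and otherwise copies the characters to the current writer. The names FileReadSketch, RecordingParent and pipeOrCopy are hypothetical, the parent is typed as Object rather than Tag so that the sketch runs without the JSP API, and a StringReader stands in for the FileReader.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

public class FileReadSketch {

    // Same shape as the PipeReader interface proposed above.
    public interface PipeReader {
        void setReader(Reader reader);
    }

    // Hypothetical parent that simply remembers the reader it was given.
    static class RecordingParent implements PipeReader {
        Reader reader;

        public void setReader(Reader reader) {
            this.reader = reader;
        }
    }

    // Pass the reader to the parent if it accepts one; otherwise copy
    // its characters to the current writer. Returns true when piped.
    static boolean pipeOrCopy(Object parent, Reader reader, Writer out) {
        if (parent instanceof PipeReader) {
            ((PipeReader) parent).setReader(reader);
            return true;
        }
        try {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            reader.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return false;
    }

    // Helper for trying out the fallback path with a parent that is
    // not a PipeReader.
    static String copyWithPlainParent(String text) {
        StringWriter out = new StringWriter();
        pipeOrCopy(new Object(), new StringReader(text), out);
        return out.toString();
    }

    public static void main(String[] args) {
        RecordingParent parent = new RecordingParent();
        boolean piped = pipeOrCopy(parent, new StringReader("x"), new StringWriter());
        System.out.println(piped);                        // prints "true"
        System.out.println(copyWithPlainParent("hello")); // prints "hello"
    }
}
```

In the piped case the reader is left open, since the parent is now responsible for consuming and closing it.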

> With the above example, the regexp:replace tag would then be used as
> follows:
>
>   <regexp:replace from="foo" to="bar">
>       <file:read name="/tmp/input.txt" pipeAttribute="in"/>
>       <file:write name="/tmp/output.txt" pipeAttribute="out"/>
>   </regexp:replace>
>
>   pseudoCode for file:read() is:
>      Pipe.setPipe(new FileReader(name), pipeAttribute)
>
>   and similarly, pseudoCode for file:write() is:
>      Pipe.setPipe(new FileWriter(name), pipeAttribute)

In your example the <file:read> and <file:write> could just use
introspection to set the properties "in" and "out" on the regexp tag, they
could maybe call an introspection helper class:-

public class BeanHelper {
    public static void setProperty( Object bean, String property, Object value ) {
            ...
    }
}
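
The body of setProperty() is left open above. One way it might be done, purely as a sketch, is with plain java.lang.reflect: look for a public one-argument setter whose name matches the property and whose parameter type accepts the value. The class names BeanHelperSketch and RegexpLikeTag are invented for the example, and the error handling (unchecked exceptions, no support for null values or primitive parameters) is an assumption rather than anything agreed in this thread.

```java
import java.io.Reader;
import java.io.StringReader;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class BeanHelperSketch {

    // Hypothetical tag-like bean exposing an "in" property.
    public static class RegexpLikeTag {
        private Reader in;

        public void setIn(Reader in) {
            this.in = in;
        }

        public Reader getIn() {
            return in;
        }
    }

    // Find a public setter named set<Property> that takes one argument
    // compatible with the value, and call it on the bean.
    public static void setProperty(Object bean, String property, Object value) {
        String setter = "set"
                + Character.toUpperCase(property.charAt(0))
                + property.substring(1);
        for (Method m : bean.getClass().getMethods()) {
            if (m.getName().equals(setter)
                    && m.getParameterCount() == 1
                    && m.getParameterTypes()[0].isInstance(value)) {
                try {
                    m.invoke(bean, value);
                    return;
                } catch (IllegalAccessException | InvocationTargetException e) {
                    throw new IllegalStateException("Could not call " + setter, e);
                }
            }
        }
        throw new IllegalArgumentException(
                "No setter " + setter + " accepting " + value.getClass().getName());
    }

    public static void main(String[] args) {
        RegexpLikeTag tag = new RegexpLikeTag();
        Reader reader = new StringReader("some input");
        setProperty(tag, "in", reader);
        System.out.println(tag.getIn() == reader); // prints "true"
    }
}
```

A fuller version would probably use java.beans.Introspector so that BeanInfo customisations are honoured; the lookup above is just the shortest thing that shows the idea.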

I think it's quite common for tags to behave like your original 'Transformer'
interface, having a single input and output. That's certainly true of regexp
/ search & replace / io tags / scrape and so on. It's also true of BodyTags
in general. It could be true also of the xsl tag library if the 'stylesheet'
and 'apply stylesheet' were split into separate tags.

Though using a helper tag, handling tags with multiple inputs (which I think
is the rarer use case) could be achieved via indirection:-

<xsl:apply>
    <io:set property="xslReader">
        <file:read file="foo.xsl"/>
    </io:set>
    <io:set property="xmlReader">
        <file:read file="bar.xml"/>
    </io:set>
</xsl:apply>

In the above example the <io:set> tag implements PipeReader. The method is
something like:-

public class IoSetTag extends BodyTagSupport implements PipeReader {
    ...
    public void setReader(Reader reader ) {
        // use introspection to pass this reader to my parent
        BeanHelper.setProperty( getParent(), property, reader );
    }
}

> With my proposal, the <io:set> tag is not required anymore.
> The code would be as follows:
>
>   <xsl:apply>
>     <file:read name="/tmp/a.xml" pipeProperty="xml"/>
>     <file:read name="/tmp/style.xsl" pipeProperty="xsl"/>
>     <file:write name="/tmp/bar.txt" pipeProperty="out"/>
>   </xsl:apply>

Agreed.

Though the converse is that each tag which is capable of providing a Reader
or Writer instance to a transformer tag must support the "pipeProperty"
property to know which property to use, plus the JSP must use this too which
makes it a little more bulky.

I'd rather multiple inputs / outputs be treated as a special case. I prefer
the simplicity of:-

   <xsl:apply xsl="foo.xsl">
     <file:read name="/tmp/style.xsl"/>
     <file:write name="/tmp/bar.txt"/>
   </xsl:apply>

By nesting tags together we get a similar effect to Unix pipelining without
the need to name the inputs and outputs.

For those special cases where multiple inputs or multiple outputs are
required, or where redirection to a different property is required,
introspection or a special helper tag <io:set> can be used. Think of
<io:set> as the Unix "tee" utility: it's there for those less common cases
where the pipe / nesting doesn't work by default.


> So I guess the major difference is that my proposal does not make
> any difference between input or output. All it does is setup
> the pipe between a data source (the producer tag), and a specific
> attribute in the consumer tag. If no 'pipeAttribute' is specified,
> then a default attribute in the consumer can be used.
>
> Again, just want to make sure we both are in sync on the differences.


----
Analysis

* The interfaces are now typesafe, text pipelining is based on Reader and
Writer. This mirrors the JSP tags structure where Reader and Writer are
prevalent via BodyContent.getReader() and JspWriter instances.

* either introspection or the PipeReader / PipeWriter interfaces can be used
to set the Reader or Writer instances

* standard transformers that take input and generate output (kinda like a
unix process) just implement the 2 interfaces/properties. All transformers
could derive from a base TransformerTagSupport class which does all of this,
they just need to implement the transform() method (which is like your
Transformer interface).

* the passing of Reader / Writer to the transformer tags could be wrapped in
a helper class to use both the interfaces or introspection:-

public class PipeHelper {

    public static void setReader( Tag tag, Reader reader ) {
        if ( tag instanceof PipeReader ) {
            ((PipeReader)tag).setReader( reader );
        }
        else {
            // use introspection instead
            BeanHelper.setProperty( tag, "reader", reader );
        }
    }
}


----
Comparison of approaches

Pierre's version:

Pros:
* more generic, can be used to pass any objects around. So more of an 'object
bus' than 'unix-like text-based pipelining'.

* solves the 'messy scriptlet expression for attribute value' problem

Cons:
* providers of objects need to implement "pipeProperty".

* since objects are passed around some type conversion problems may occur.
e.g. problems with character encodings could occur if tags pass around
InputStream / OutputStream objects.

* could we just use introspection to implement the 'object bus'?


James's version:

Pros:

* typesafe, optimised for text based transformers

* cleaner separation of responsibility:

- transformers are based just on a Reader and Writer, they are just
responsible for transforming a Reader to a Writer and do not need to concern
themselves with converting an Object to a Reader/Writer.

- providers of the Reader and Writer objects have the responsibility of
creating the Reader and Writer objects, taking care of issues like "String
as content" versus "String as a URI/URL/filename", handling of
InputStream / OutputStreams, character encodings and the like.
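
To illustrate the character encoding point, here is a small sketch of a provider turning an InputStream into a Reader with an explicit charset, so that the transformer downstream only ever sees characters. The class ReaderProviderSketch and its helper methods are made up for the example; the sample bytes are just "café" encoded as UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ReaderProviderSketch {

    // The provider decides the encoding once, when it creates the Reader.
    static Reader toReader(InputStream bytes, Charset charset) {
        return new InputStreamReader(bytes, charset);
    }

    // Drain a Reader into a String (helper for the demonstration only).
    static String readAll(Reader reader) {
        StringBuilder text = new StringBuilder();
        try {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    static String decode(byte[] bytes, Charset charset) {
        return readAll(toReader(new ByteArrayInputStream(bytes), charset));
    }

    public static void main(String[] args) {
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoded with the right charset the text has 4 characters.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).length());      // prints 4

        // Decoded with the wrong charset the 2-byte e-acute becomes
        // 2 separate characters, giving 5 in total.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).length()); // prints 5
    }
}
```

If raw streams were passed between tags instead, every transformer would have to make this choice itself, which is the type conversion problem listed under the cons of the object-passing approach.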

Cons:

* not as generic, only useful for text pipelining between tags and text
transformations, so not as reusable to other problems

* Introduces 2 new interfaces, PipeReader and PipeWriter which are possibly
optional, could use introspection instead.

* more complexity for us poor old taglib developers to deal with? :)

* how do we share these interfaces across taglibs - it's an added dependency.


Maybe we should split the two versions of the same proposal into 2 different
proposals?

(i) James: Text based pipelining: Transformations without double buffering
using Reader and Writer

(ii) Pierre: generic tag communication: setting attribute values cleanly


Comments?

<James/>


James Strachan
=============
email: [EMAIL PROTECTED]
web: http://www.metastuff.com
