Re: [RT] Cocoon Blocks

Sylvain Wallez Sat, 29 Jun 2002 14:25:53 -0700

Stefano Mazzocchi wrote:

>Even if the flowscript discussion isn't finished, I think we have
>reached an important conclusion on that side: a flowscript isn't a
>sitemap replacer, but a sitemap augmenter.
>


A minor remark : could we avoid the use of "flowscript" or "flowmap" 
which imply the way they're implemented and speak simply of "flow" which 
only designates the functionality ?

<large-snip/>

>                                   - o -
>
>I'm pretty sure that if I stopped here and went on describing the schema
>of the COB descriptor file and so on, people would love it, thank me,
>run to their boss to tell them and blah blah..
>
>Sure, we could stop here, we could clone the WAR concept inside Cocoon,
>allow you to deploy your stuff and you won't be missing anything.
>
>But there is one thing that the servlet API architects didn't consider
>(not even myself at that time since I was part of that group):
>polymorphism.
>
>Ok, a few blank lines so you think about it....
>  
>

Oh, my, what a suspense ;-)

>
>
>
>
>
>
>
>
>what does it mean to have a polymorphic package?
>  
>

Please, please, explain it to us ;-))

>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>Applying Avalon COP philosophy over again
>-----------------------------------------
>  
>

Phewww... Here comes the answer ;-D

>If you ever worked with Avalon, you know the feeling: at first it doesn't
>make any sense at all. It's a mess of stupid and very abstract
>interfaces... but after a while, a pattern emerges and it sticks.
>

The same can be said about Cocoon : the first time I encountered it (it 
was 1.7.4), I didn't understand, but had the feeling there was something 
interesting in it, and gave it a try... Two years later, I'm still here :)

>Some might think that Avalon (probably Cocoon itself) includes infecting
>'memes' and I agree. [Look up the name on google if you don't know what
>I'm talking about]
>
>Once you start using COP (component oriented programming), it's very
>hard to go back (so much so that many abuse it and over-componentize
>their systems... even Cocoon itself suffers from this problem on some
>parts).
>
>COP is based on IoC (Inversion of Control) and SoC (Separation of
>Concerns) [for those who still don't know about them!] and while the
>servlet API makes extensive use of the IoC metapattern, SoC doesn't play
>a clear and defined role (they tried to patch it with RequestDispatcher,
>which is the biggest hack I have ever seen, I even voted against it but I was
>overruled).
>
>Anyway, if the servlet API, internally, shows use of IoC and SoC,
>externally, from the WAR point of view, there is *absolutely* no notion
>of it: a WAR is a package that includes a single and isolated
>application.
>
>Period. That's it. There are many mechanisms that enforce the clear
>separation between different WARs. So, they implement monolithic web
>applications and this is *by design*.
>
>A step up: Blocks as cocoon application components
>--------------------------------------------------
>
>If we design cocoon blocks as 'isolated units of application deployment'
>we fall back in the good old web trap: making web applications
>interoperate in the same URI space is a MAJOR PITA with *ANY* web
>technology.
>
>I'm talking about making Bugzilla and Horde IMP share the same look and
>feel. Try it!
>
>'coherence' is a value, especially on professional sites, but coherence
>shouldn't mean that everything has to be written by the same team!
>
>Sure, I want Cocoon blocks to ease deployment of cocoon-based web
>applications, but this is a secondary byproduct: what I really want is
>to make it possible to *share* cocoon web applications as we currently
>do with Avalon components.
>
>                                   - o -
>
>Ok, enough introduction, let's get to the meat.
>
>WARNING: what follows is the result of ideas collected from many people,
>and was cleared in all parts with real-life discussion between Giacomo
>and myself a few weeks ago. Anyway, what follows is still part of the RT
>flow, so it must only be considered as a proposal and not something
>carved in stone.
>
>Cocoon Blocks
>-------------
>
>A Cocoon block is a zipped archive, just like JARs and WARs.
>
>The extension of a cocoon block is .cob (for COcoon Block). The MIME
>type is yet to be determined (might be required for over-the-net block
>download).
>

Why should COBs be downloaded from a remote location ? Do you really 
think people will allow remote code to run on their production servers ?

>A Cocoon Block (COB from now on) includes a directory called
>
> /BLOCK-INF
>
>which contains all the block metadata and the resources that must not be
>directly referenceable from other blocks (for example, jars, classes or
>file resources made available thru the classloader). The directories
>
> /BLOCK-INF/classes
> /BLOCK-INF/jar
>
>are used for classes and jar files. This follows the WAR paradigm.
>
>The main COB descriptor file is found at
>
> /BLOCK-INF/block.info
>
>[FIXME: can this create conflicts with Avalon blocks?]
>
>This file MUST be an XML file, containing markup with a cob-specific
>namespace and will include the following information:
>
> 1) block implementation metadata (name, author, license, URL of the
>project and so on)
> 2) role(s): the URI(s) of the behavioral role(s) this block implements
>and exposes [optional]
> 3) dependencies: the URI(s) of the behavioral roles this block expects,
>along with the prefixes used by the block as shortcuts in protocol
>resolving (see below for the meaning of this)
>[optional]
> 4) sitemap: the location inside the block file space of the sitemap
>[optional]
>
>Visually, the block metadata can be pictured like this:
>
> 
>                    implementation metadata
>                               ^ 
>                               |
> (exposed behaviors)? <---- [block] ----> (required behaviors)?
>
>Also, the /BLOCK-INF/ directory contains the 'roles' file for Avalon
>components:
>
> /BLOCK-INF/block.roles
>
>
>What is a 'block behavior'?
>---------------------------
>
>If you are familiar with Avalon, you probably understood the idea (it's
>very similar to the concept of Avalon roles), but if not it might be a
>little difficult, so let me write you an example of this:
>
>let's take Forrest and let it decouple in two blocks:
>
> 1) one block provides the document production
> 2) another block provides the skinning and presentation layers
>
>Currently, it is already done like this, but the change of the skin
>(this is how the second block is currently called) must be done by hand:
>there is no cocoon machinery in place to make this possible.
>
>So, let us assume the machinery is now in place:
>
>forrest itself becomes a block, but in order to function, it needs
>access to the stylesheets contained in the skin, which, in order to
>simplify decoupling, we want to implement as another block.
>
>Result: 
>
> forrest.cob/BLOCK-INF/block.info
>
>is something like
>
> <block>
>  <metadata>
>   <name>Forrest</name>
>   <organization>ASF</organization>
>   ...
>  </metadata>
>  <dependencies>
>   <block behavior="http://xml.apache.org/forrest/skin/1.0" prefix="skin"/>
>  </dependencies>
>  <sitemap location="sitemap.xmap"/>
> </block>
>  
>

It looks a lot like a Phoenix block .xinfo file. Couldn't the same 
syntax be used ? I don't know much about Phoenix services so this may 
be a silly question, but what could be the relationship between a Cocoon 
block and a Phoenix service ?
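
Whatever syntax is finally chosen, reading such a descriptor is cheap. Here is a rough sketch in Python (chosen only for brevity, it is not a proposal for the implementation language) that extracts the four kinds of information from the Forrest descriptor quoted above. The function name and the returned dictionary layout are my own assumptions, and the cob-specific namespace mentioned in the proposal is left out because the quoted example doesn't declare one.

```python
import xml.etree.ElementTree as ET

# The Forrest descriptor as quoted above; the elided metadata ("...") is
# simply omitted here rather than guessed.
FORREST_INFO = """
<block>
 <metadata>
  <name>Forrest</name>
  <organization>ASF</organization>
 </metadata>
 <dependencies>
  <block behavior="http://xml.apache.org/forrest/skin/1.0" prefix="skin"/>
 </dependencies>
 <sitemap location="sitemap.xmap"/>
</block>
"""

def read_block_info(xml_text):
    """Hypothetical helper: pull name, behaviors, dependencies and sitemap."""
    root = ET.fromstring(xml_text)
    sitemap = root.find("sitemap")
    return {
        "name": root.findtext("metadata/name"),
        # behavior URIs this block exposes (optional section)
        "behaviors": [b.get("uri") for b in root.findall("behaviors/behavior")],
        # prefix -> required behavior URI (optional section)
        "dependencies": {d.get("prefix"): d.get("behavior")
                         for d in root.findall("dependencies/block")},
        # sitemap location inside the block file space (optional)
        "sitemap": sitemap.get("location") if sitemap is not None else None,
    }

info = read_block_info(FORREST_INFO)
print(info["name"], info["dependencies"], info["sitemap"])
```

All three optional sections degrade to an empty list, an empty dict or None when absent, which matches the "optional COP" idea further down.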

>while:
>
> skin.cob/BLOCK-INF/block.info
>
>is something like
>
> <block>
>  <metadata>
>   <name>Xmas Skin</name>
>   ...
>  </metadata>
>  <behaviors>
>   <behavior uri="http://xml.apache.org/forrest/skin/1.0"/>
>  </behaviors>
> </block>
>
>Now: suppose you have your naked cocoon running in your favorite servlet
>container, and you want to deploy forrest.cob, here is a possible
>sequence of actions on a hypothetical web interface on top of Cocoon
>(a-la Tomcat Manager)
>
> 1) upload the forrest.cob to Cocoon
> 2) Cocoon scans /BLOCK-INF/, reads block.info and finds out that
>Forrest depends on a block with the given role
> 3) then it connects to the uber "Cocoon Block Librarian" web service
>(hosted somewhere around *.apache.org) and asks for the list of blocks
>that exhibit that required behavior.
> 4) the librarian returns a list of those blocks, so the user chooses,
>or the manager allows the user to deploy its own block that implements
>the required behavior.
>

Ok, here's the need for remote code. Although a catalog is good to help 
resolving dependencies and discover existing implementations, I'm still 
not sure about auto-deployment of downloaded code on production systems.
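
To illustrate the "catalog only" variant I have in mind, here is a small Python sketch: it looks up which known blocks claim a required behavior and reports the gaps, but it downloads and deploys nothing. The catalog contents, the block names other than the skin URI, and both function names are made up for the example.

```python
SKIN = "http://xml.apache.org/forrest/skin/1.0"

# Hypothetical local catalog: block name -> behavior URIs it claims to expose.
CATALOG = {
    "xmas-skin": [SKIN],
    "plain-skin": [SKIN],
    "forrest": [],
}

def find_candidates(required, catalog):
    """For each required behavior URI, list the blocks claiming to expose it."""
    return {
        uri: sorted(name for name, exposed in catalog.items() if uri in exposed)
        for uri in required
    }

def unresolved(candidates):
    """Required behaviors for which no implementation is known."""
    return [uri for uri, names in candidates.items() if not names]

# The second URI is deliberately unknown to show the "gap" case.
candidates = find_candidates([SKIN, "urn:example:missing/1.0"], CATALOG)
for uri, names in candidates.items():
    print(uri, "->", names or "no implementation known")
```

The choice among several candidates, and the decision to install anything at all, stays with the administrator.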

> 5) Cocoon checks that all dependencies are met, then unpacks the blocks
> 6) Since 'forrest.cob' exposes a sitemap, the deployment manager asks
>the deploying user where he/she wants to *mount* that block in the
>managed URI space.
>

It seems to me block mounting is in fact the configuration of a "root" 
Processor component (the interface implemented by the sitemap) that 
performs first-level dispatching between mounted blocks. This allows the 
architecture that exists today to be kept unchanged.

Now I don't think all blocks should be mounted on an 
externally-available URI : some blocks will provide only services to 
other blocks and no external service. The sitemap for such blocks is 
likely to contain only internal-only="true" pipelines, but then why 
should we have to mount them on an external URI ?
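
A minimal sketch of what I mean by a root Processor, written in Python for brevity. The class and method names are hypothetical and have nothing to do with the real Processor interface; treating overlapping prefixes as a collision is also just my assumption about what "collision" would mean.

```python
class RootProcessor:
    """First-level dispatcher between mounted blocks (illustrative only)."""

    def __init__(self):
        self.mounts = {}       # normalized URI prefix -> block name
        self.internal = set()  # blocks that only serve other blocks

    @staticmethod
    def _normalize(prefix):
        clean = prefix.strip("/")
        return "/" + clean + "/" if clean else "/"

    def mount(self, prefix, block):
        """Mount a block on an external URI prefix, refusing collisions."""
        prefix = self._normalize(prefix)
        for existing in self.mounts:
            # Assumption: any overlap between prefixes counts as a collision.
            if existing.startswith(prefix) or prefix.startswith(existing):
                raise ValueError(f"{prefix} collides with {existing}")
        self.mounts[prefix] = block

    def deploy_internal(self, block):
        """Deploy a block without giving it any external URI."""
        self.internal.add(block)

    def dispatch(self, uri):
        """Return (block, remaining path), or None if nothing is mounted there."""
        for prefix, block in self.mounts.items():
            if uri.startswith(prefix):
                return block, uri[len(prefix):]
        return None

root = RootProcessor()
root.mount("/forrest", "forrest")
root.deploy_internal("xmas-skin")
print(root.dispatch("/forrest/index.html"))
```

The skin block is deployed and usable by Forrest, yet no external request can ever reach it.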

> 7) If no collisions in the URI spaces are found, the blocks are made
>available for servicing. [note: the skin block doesn't expose a sitemap
>so it's not mounted on the URI space]
>
>A big issue: resource dereferencing
>-----------------------------------
>
>Security concerns aside, the above scenario shows one major issue:
>blocks are managed, deployed and mounted by the container. There is no
>(and there should not be any) way for a block to directly access another
>block because this would ruin IoC.
>
>So, one block doesn't know where the blocks it depends on are located,
>both on disk *and* on the URI space as well.
>
>The proposed solution is to use block-specific protocols to identify the
>dereferenced resources.
>
>For example, the forrest.cob/sitemap.xmap file could contain a global
>matcher which works like this:
>
>   <map:match pattern="**/*.html">
>    <map:aggregate element="site">
>     <map:part src="cocoon:/{1}/book-{1}/{2}.xml"/>
>     <map:part src="cocoon:/{1}/tab-{1}/{2}.xml"/>
>     <map:part src="cocoon:/body-{1}/{2}.xml" label="content"/>
>    </map:aggregate>
>    <map:transform src="block:skin:/stylesheets/site2xhtml.xslt"/>
>    <map:serialize/>
>   </map:match>
>
>please note the
>
> block:skin:/stylesheets/site2xhtml.xslt
>

IMHO, this example goes strongly against the benefits that blocks want 
to bring. The functionality brought by the 'skin' block is... skinning. 
It's not an XSL stylesheet at a particular location. What if someone has 
written the killer skin for his site, but this skin requires a 
multi-stage pipeline that cannot be represented by a single stylesheet ?

The contract of a block should be services identified by their URI, and 
not files at well-known locations (even if these 'files' are in fact 
produced by a pipeline).

So what about something like :
    ...
  </map:aggregate>
  <map:call resource="block:skin:/site2xhtml"/>
</map:match>

This call "jumps" to a service provided by the block and its URI is part 
of the block's contract. We don't care (because we don't have to) if the 
service is implemented by an XSL or by the next-generation transformer.

What the "jump" does is feed a pipeline in the block with the result of 
the current pipeline. The whole pipeline is terminated in the called block.

But just as a pipeline can serialize or not depending on if it's an 
internal request or not (see SitemapSource), the same service could be 
used as a transformation. We could then write something like :
    ...
  </map:aggregate>
  <map:transform type="pipeline" src="block:skin:/site2xhtml"/>
  <map:transform type="urlencoder"/>
  <map:serialize/>
</map:match>

By considering blocks as pipeline services, we really achieve true 
polymorphism for blocks, because we totally abstract the way their 
contracts are implemented.

[note that all the above isn't in fact block-specific and can be made 
today inside a single sitemap]
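
To show why the service view gives polymorphism, here is a toy Python model. Two skins fulfil the same "/site2xhtml" contract, one with a single step and one with several stages, and the caller cannot tell the difference. Everything here (the registry, the "killer-skin" name, the string-based "documents") is invented for the illustration and stands in for real pipelines and SAX streams.

```python
def single_stylesheet(doc):
    """A skin implemented as one transformation step."""
    return "<html>" + doc + "</html>"

def multi_stage(doc):
    """A skin that needs several stages to do the same job."""
    for stage in (str.strip, str.upper, lambda d: "<html>" + d + "</html>"):
        doc = stage(doc)
    return doc

# Hypothetical registry: (block name, service URI) -> implementation.
SERVICES = {
    ("xmas-skin", "/site2xhtml"): single_stylesheet,
    ("killer-skin", "/site2xhtml"): multi_stage,
}

def call_service(uri, wiring, doc):
    """Resolve block:<prefix>:<path> through the wiring and feed it the doc."""
    scheme, prefix, path = uri.split(":", 2)
    if scheme != "block":
        raise ValueError("not a block URI: " + uri)
    block = wiring[prefix]  # prefix -> block chosen at deployment time
    return SERVICES[(block, path)](doc)

print(call_service("block:skin:/site2xhtml", {"skin": "xmas-skin"}, " site "))
print(call_service("block:skin:/site2xhtml", {"skin": "killer-skin"}, " site "))
```

The calling sitemap is identical in both cases; only the wiring decided at deployment time changes.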

>which indicates
>
> block -> use the block protocol
> skin -> use the 'skin' prefix to lookup the block behavior URI and thus
>the block which implements it for this block (the block manager knows
>this)
> /stylesheets/site2xhtml.xslt -> since the 'skin' block doesn't expose a
>sitemap, give me the file located in that position of the internal block
>file space (except /BLOCK-INF/ which is protected)
>
>[in case the block exposes a sitemap, the block: protocol connects to
>the URI space exposed by the sitemap... before you start suggesting a
>block-raw: protocol to get access to that, think twice because, to me,
>it smells like FS a lot!]
>
>Dereferencing navigation
>------------------------
>
>Not only does a sitemap need to connect to the resources contained in the
>blocks on which the block depends, but the resulting pages do as well.
>
>In fact, suppose you have a block that exposes a web service and another
>one that exposes a web application that wraps that web service. For
>sure, the generated web page will have to have a URI to connect to that
>service, since it's the client's browser that makes the call (unless we
>want to virtualize everything thru the sitemaps, but I wouldn't suggest
>it).
>
>So, a possible solution is to use the "block:" protocol in the pages as
>well and have a URI-mapping transformer right before the serialization
>stage.
>
>For example, things like
>
><form action="block:web-service:/post">...</form>
>
>is transformed into
>
><form action="/servizio-web/post">...</form>
>
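
The mapping step described above can be sketched in a few lines of Python. This is only an illustration on plain strings: a real transformer would work on the SAX stream, and the regular expression, the function name and the shape of the mount table are all assumptions of mine.

```python
import re

# Matches block:<prefix>:<path> up to the closing quote, space or '>'.
BLOCK_URI = re.compile(r"block:([A-Za-z0-9_-]+):(/[^\"'\s>]*)")

def map_block_uris(markup, mounts):
    """Rewrite block: URIs to the URI where each block is actually mounted."""
    def replace(match):
        prefix, path = match.groups()
        # An unknown prefix raises KeyError: the dependency was never wired.
        return mounts[prefix].rstrip("/") + path
    return BLOCK_URI.sub(replace, markup)

page = '<form action="block:web-service:/post">...</form>'
print(map_block_uris(page, {"web-service": "/servizio-web"}))
```

The page author only knows the prefix; the mount point is supplied by whoever manages the deployment.
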
>                                 - 0 -
>
>Some design decision taken
>--------------------------
>
>
>o) NO BEHAVIOR VALIDATION: 
>
>I thought a lot about it but I think that having 'behavior description
>languages' (such as the WSDL-equivalent for blocks) is going to be
>terribly complicated, expensive to implement and hard to use and
>enforce, even for simple blocks which don't expose a sitemap and are
>just repositories for information.
>
>For this reason, there is no validation taking place: if a block
>implements a particular behavior and exposes it thru its descriptor
>file, Cocoon automatically assumes it implements the behavior correctly.
>
>In the future, we might think of adding a behavior description layer to
>enforce a little more validation, but I fear the complexity (for
>example) of validating stylesheets against a particular required
>behavior.
>
>IMO, only human try/fail and patching will allow interoperability.
>  
>

Agree. Let's keep validation away for now and see in the future if it's 
really useful and if/how it can be implemented.

>o) VERSIONING AS PART OF THE BEHAVIOR URI
>
>The behavior URI *MUST* terminate with a /x.y that indicates the
>major.minor version of the behavior that a block implements.
>
>On dependencies, each block must be able to specify the 'ranges' of
>versioning that it is known to work with. For example
>
>  <block behavior="http://xml.apache.org/forrest/skin/1.x" prefix="skin"/>
>
>But I haven't really thought about the patterns that could be used for
>this. 
>
>Please, help on this.
>  
>

Once again, have you looked at Phoenix blocks ? The version of a block 
isn't in its name but as a separate attribute, and IIRC there are some 
features to check for version compatibility.
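
Just to feed the discussion on patterns, here is the simplest reading of the "1.x" example above, sketched in Python: same base URI, same major version, and "x" accepts any minor version. These semantics are purely my assumption from that single example; they say nothing about how Phoenix does it, and a separate version attribute would make the URI splitting unnecessary.

```python
def split_behavior(uri):
    """Split '<base>/<major>.<minor>' into its three parts."""
    base, _, version = uri.rpartition("/")
    major, _, minor = version.partition(".")
    return base, major, minor

def satisfies(required, offered):
    """True if the offered behavior URI falls in the required range."""
    req_base, req_major, req_minor = split_behavior(required)
    off_base, off_major, off_minor = split_behavior(offered)
    if req_base != off_base or req_major != off_major:
        return False
    # Assumption: 'x' is a wildcard for the minor version only.
    return req_minor == "x" or req_minor == off_minor

BASE = "http://xml.apache.org/forrest/skin/"
print(satisfies(BASE + "1.x", BASE + "1.0"))
print(satisfies(BASE + "1.x", BASE + "2.0"))
```

Anything richer (minimum minor version, several accepted majors) would need a real range syntax, which is exactly the open question.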

>o) CROSS-BLOCK SECURITY
>
>Even if I don't think anybody is stupid enough to use a single Cocoon
>instance to run a full ISP and ask for sandboxing of the single blocks,
>cross-block security is a big concern, especially since you might be
>deploying components on the fly in a binary format.
>
>So, first thing is to protect the /BLOCK-INF/ directory.
>
>The second thing is to wrap each block with its own classloader,
>connected to the block dependency map, so that each class discovery is
>done only on the class space of the dependent blocks.
>
>[NOTE: this doesn't prevent people from using blocks as trojans, but we
>won't host blocks which don't come with the source code so we solve that
>problem].
>
>
>o) COCOON MANAGER SECURITY
>
>The cocoon manager might be a block itself that connects to specific
>cocoon internals and provides a web interface for it. So, it can be
>removed or disabled when put on production.
>
>Also, the feature of automatic discovery of blocks thru the 'cocoon
>block library' can be turned off or substituted with its own (even the
>'cocoon block library' could be a block, so you could have your own
>block library on your system instead of connecting to the apache one).
>
>
>o) OPTIONAL COP 
>
>The block.info file makes it *optional* to expose behaviors or to depend
>on them. This allows the COP model to nicely downgrade to the good old
>single-archive WAR paradigm for those who don't care about block
>polymorphism.
>  
>

This point is really important : people shouldn't have to package a COB 
and deploy it just to write a few pipelines. If a Processor 
implementation is used to mount blocks, this problem disappears as the 
current architecture is kept.

>Possible Problems
>-----------------
>
>1) classloading performance:
>
>since classloading will become more complicated, it will be slower, but
>this will impact only the startup performance not the runtime
>performance so no real issues here.
>  
>

Why will it be slower ? Because there's an additional classloader in the 
hierarchy ? As you say, this shouldn't be an issue.

>2) possible reduced portability of Cocoon:
>
>some servlet containers don't like servlets to come up with their own
>classloaders. In those environments a block-enabled Cocoon might simply
>not work. This doesn't mean that Cocoon won't work, but that blocks
>can't be deployed.
>
>NOTE: the next servlet API might fix that by requiring a better
>classloading behavior by the containers.
>  
>

What about the note you wrote for the servlet JSR group ?

>3) difficult block interoperability
>
>without a way to automatically validate if a block implements a behavior
>correctly, the type of that component is inherently weak and might lead
>to problems that might become hard to fix.
>
>The block manager *must* be able to *clone* a block and let you modify
>one clone without disturbing the other. [but these are implementation
>details and we'll see in the future how serious this problem becomes. in
>fact, sitemap pipelines aren't validated either but nobody had enough
>itch to scratch this]
>
>4) difficult transition
>
>When we have blocks, it's easy to imagine that there will be pure-code
>blocks that wrap around libraries and provide only sitemap components
>(think FOP, POI, Batik and so on).
>
>In that case, a 'naked Cocoon' becomes "de facto" backward incompatible
>because some sitemap components (which are now included by default in
>Cocoon) might not be present anymore, unless you wrap your code in
>blocks and you depend explicitly on those blocks that expose that
>specific behavior.
>
>So, some working is required.
>  
>

Is this really an issue ? People that want to use FOP or Batik without 
blocks can still put these blocks' libs into the good old WEB-INF/lib to 
have the behaviour we have today.

>This might force us to call a block-enabled Cocoon: Cocoon 3.0
>  
>

Or 2.2 if the architecture is backwards compatible, which seems to be 
possible.

>                                 - o -
>
>Conclusions
>-----------
>
>I think I have exposed a detailed plan on how to implement blocks and
>solve a number of issues we are having:
>
> 1) allow users to 'compose' Cocoon only with those modules they need
> 2) allow users to easily deploy their stuff on cocoon
> 3) allow users to easily reuse web applications components without
>sacrificing coherence
> 4) allow users to be helped by Cocoon to 'fill the gaps', with
>suggestions on which components are required and automatic fetching
>(apt-get like)
> 5) allow the Cocoon communities to clearly separate concerns between
>the core and the application-level stuff (a-la Zope)
> 6) allow, for the first time in the history of the web, the use of
>polymorphism and COP at a web application level.
>
>That's all folks.
>
>Fire your comments and try to tear it apart: I'm pretty confident this
>is really a big thing for Cocoon!
>  
>

Sure this is a big thing ! Here are some additional random thoughts :

How to "discover" available blocks to automatically update the structure 
of a portal site ?

Will it be possible to have several blocks fulfilling the same contract 
in a single Cocoon instance ? For example, a site could use different 
skins depending on the user agent or depending on the location of the 
client (intranet / internet). This could be handled by the root 
Processor that handles block mounting.


Sylvain

-- 
Sylvain Wallez
 Anyware Technologies                  Apache Cocoon
 http://www.anyware-tech.com           mailto:[EMAIL PROTECTED]



