Re: data goes in, data goes out

Jeremy Quinn Thu, 29 Nov 2001 05:10:59 -0800

At 11:41 pm +0100 28/11/01, Stefano Mazzocchi wrote:
>I'm restarting Jeremy's thread about WritableSource in order to achieve a
>more architecture-oriented view of the problem.


Great to have your input on this!

>It is obvious that, in order to be a useful piece of a successful CMS,
>Cocoon must be equally able to drive both data flows: inward and
>outward.
>
>For the outward flow, Cocoon beats the crap out of almost any other
>solution (IMNSHO).
>
>For the inward flow, well.... let's be honest: it more or less sucks.

+1

>ok, ok, maybe I'm kinda rude but this is the truth: there are frameworks
>out there that give you much better tools to handle the data that flows
>in.
>
>Cocoon is currently a publishing framework, where publishing means
>"outward-flow oriented", but all of us would like to see a better
>synergy (or equilibrium) between the two flows, in order to make web-app
>building easier, but without losing the technological advancements that
>Cocoon introduced.
>
>Is this possible?
>
>Giacomo and I are holding back a big RT about the next generation of
>pipeline component assembling (a.k.a. flowmap, sitemapNG or more
>generally "statemap"). We spent a few days talking about it and we
>didn't come up with something solid enough to propose as a solution...
>
>... but we *did* identify some general rules and general concepts that
>might help trigger the discussion about this.
>
>Anyway, this is not what this mail is about.
>
>The question, today, gets deeper: is the current Cocoon architecture
>asymmetric with respect to data flow?
>
>First (and obvious) thought is to *reverse* the terminology:
>
> generators -> transformers -> serializers
>
>becomes
>
> deserializer -> transformers -> accumulator
>
>Is this something good?
>
>No, I don't think so: what would be the difference between a generator
>and a deserializer in terms of interfaces? I can't see one.
>
>And what about accumulators? well, they'd store information, but would
>need to give some information back. If this is in the form of SAX events, an
>accumulator is a transformer, if this is a binary stream, it's a
>serializer.
>
>But one concept here is good: the concept that the opposite of a
>generator is an accumulator (I'm an electronic engineer, you know :), in
>electrical terms, a battery, a place where energy (here, data) is stored
>for future use.
>
>In software terminology, an accumulator is normally called a "store",
>"archive", "cabinet", "safe" and so on with terms that give you an idea
>of "something" that allows you to place information that will
>persistently remain there.
>
>So, does this mean we need a "Store"-behaving component in the cocoon
>pipelines?
>
>No, such a store-behaving component would not be distinguishable from
>another transformer.
>
>So, I think I just proved that the current pipeline architecture is
>*not* inherently biased toward outward data flow, despite the component
>names that seem to inspire the opposite.

excellent analysis

>                            - o -
>
>The above is an important result: it states that Cocoon doesn't require
>core architecture modifications. I already knew this, otherwise I would
>not be +1 on a final Cocoon 2.0 release. But it's good to let you guys
>know about this as well.
>
>So, now that we have an architecture that is, at least, in theory,
>general enough to handle the problem, how do we attack it?
>
>The "accumulator" concept gives interesting ideas:
>
> 1) it must receive SAX events (it would force concern overlap if the
>binary adaptation was performed by the store itself, so
>"parsing"/"deserialization" should be performed before, at XML
>generation stage)
> 2) it must throw data out (otherwise we would not be able to know what
>happened)
> 3) it should be SAX events (otherwise, we would not be able to process
>the data further)
>
>Thus, a "store" is behaviorally equivalent to a "transformer".

I agree
It is like fitting a 'Tee' junction.

Modification of an asset is a side effect of the pipeline; the Transformer
will probably pass everything through, wrapped with a generated report
about its activities.

It is kind of opposite to the XInclude Transformer (which was why I called
the XInclude Transformer the 'ReadableSourceTransformer' in my sample).
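
A minimal sketch of the 'Tee' idea in plain SAX, not in Cocoon's own
Transformer interface. TeeFilter, EventLog and TeeDemo are hypothetical
names made up for this illustration, and the generated report is left out;
the "store" here is just a handler that records what it was given.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical tee stage: copies element and text events to a "store"
// handler, then forwards them downstream unchanged. Other event types
// (prefix mappings, processing instructions) are forwarded but not copied.
class TeeFilter extends XMLFilterImpl {
    private final ContentHandler store;

    TeeFilter(ContentHandler store) { this.store = store; }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        store.startElement(uri, local, qName, atts);
        super.startElement(uri, local, qName, atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        store.endElement(uri, local, qName);
        super.endElement(uri, local, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        store.characters(ch, start, length);
        super.characters(ch, start, length);
    }
}

// Stand-in for both the store and the next pipeline stage: it only
// records the events it receives as a flat string.
class EventLog extends DefaultHandler {
    final StringBuilder seen = new StringBuilder();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        seen.append('<').append(qName).append('>');
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        seen.append("</").append(qName).append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        seen.append(ch, start, length);
    }
}

class TeeDemo {
    // Returns { what the store saw, what the downstream stage saw }.
    // Checked exceptions are wrapped to keep call sites short.
    static String[] run(String xml) {
        try {
            EventLog store = new EventLog();
            EventLog downstream = new EventLog();
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            TeeFilter tee = new TeeFilter(store);
            tee.setParent(parser);
            tee.setContentHandler(downstream);
            tee.parse(new InputSource(new StringReader(xml)));
            return new String[] { store.seen.toString(), downstream.seen.toString() };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] r = run("<article><title>Hi</title></article>");
        System.out.println("stored:     " + r[0]);
        System.out.println("downstream: " + r[1]);
    }
}
```

The point of the sketch is only that the writing stage has the same shape
as any other filter: events in, the same events out, storage as a side
effect.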

>                           - o -
>
>Now, how can this be handled? should we reverse the Source interface and
>come up with a "Destination" (sort of InputStream/OutputStream
>parallel)?
>
>If this XML contained data, we could come up with an automatic
>relational mapping between the arriving XML and the SQL update query on
>a RDBMS. It could be pretty straightforward.
>
>At the same time, how would you do such a thing on an article stored on
>a native XML DB?
>
>As Jeremy pointed out, XUpdate is a choice. I must tell you that I don't
>like XUpdate that much, but I admit it's a choice.

I am in two minds about an XUpdate Transformer per se.
The 'standard' is not a Standard, the draft is incomplete.

The job of modifying a fragment of XML (from file, xmldb etc.) is easily
achievable with purpose-built XSLT stylesheets, so a Transformer that
specifically handles the XUpdate namespace, while it might be useful for
people who are not so happy with XSLT, is not actually required for the job.

What is actually missing (as you have stated) is the ability to write to
datasources in a standard way (set up from the sitemap).

Once that is in place, adding an XUpdate Transformer to the mix merely
provides an alternative to writing your own XSLT to make the transformation.

Incidentally, I spent some time yesterday trying to work out if standard
XUpdate transformation could be handled by an XSLT Stylesheet rather than
written in Java. I suspect it would be extremely difficult, if not
impossible. XSLT is not good at selecting nodes using dynamic XPaths (am I
right?).
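
For comparison, a rough sketch of what the Java side of a dynamic select
could look like. It assumes the JDK's javax.xml.xpath package (a later
addition to the platform, so not what a Cocoon 2.0 transformer would have
used) and a made-up DynamicSelect helper; the XPath arrives as a plain
string at run time, the way an XUpdate select attribute would.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: applies an "update" to the first node matched by an
// XPath that is only known at run time.
class DynamicSelect {

    // Parses xml, replaces the text content of the node selected by
    // 'select' with newText, and returns the modified DOM document.
    static Document update(String xml, String select, String newText) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Node target = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate(select, doc, XPathConstants.NODE);
            if (target == null) {
                throw new IllegalArgumentException("no node matches " + select);
            }
            target.setTextContent(newText);
            return doc;
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            // Checked parser/XPath exceptions are wrapped to keep call sites short.
            throw new IllegalStateException(e);
        }
    }

    // Evaluates an XPath against the document and returns its string value.
    static String text(Document doc, String expr) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expr, doc);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<article><title>Old</title><body>text</body></article>";
        Document doc = update(xml, "/article/title", "New");
        System.out.println(text(doc, "/article/title"));
    }
}
```

Only the text-replacement case is shown; inserts, removes and the rest of
the XUpdate vocabulary are not attempted here.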

>Or we could use CVS, file systems, relational BLOBs, defragment the
>document into LDAP nodes, FTP, POST it somewhere else, email it to your
>mom, fax it to your brother, print it on a piece of paper for your
>favorite human slave to subsequently crawl...
>
>... up to you.
>
>Now, the real question is: do we need any special protocol handlers for
>these things?
>
>In all honesty, I say that it's, again, up to you: you might find it
>easier to come up with something like
>
>  Destination d = DestinationFactory.create("dbxml://host/db/path/");
>
>then serialize the input data into the source stream and have the
>protocol handler automagically generate the XUpdate for you.
>
>In this case, admittedly, we need the "reversed source".
>
>But the above could be done by hand using the DBXML API, or simply
>calling a special xupdate wrapper around it (as an avalon component?).
>
>Why am I telling you this?
>
>Well, from an architecture design point of view, I can't see any
>limitation in what we already have, but the "reversed source".
>
>Jeremy named it "WritableSource" (which is admittedly an oxymoron).

I did not come up with the name, I will withhold the name of the author to
avoid further embarrassment ;)

>Berin proposed to look at Monitors (which is not exactly a name that
>reminds me of a storage location, BTW)
>
>Anyway, this is, at this point, more an Avalon discussion than a Cocoon
>one, since this is a general thing.

I think the Monitor package provides a JavaBean-type callback mechanism to
be notified of asset changes; this will be useful for cache control, no?

>As for helping Jeremy to implement a Cocoon 2.0 version of FP, my
>suggestion would be to aim at creating a general XUpdateTransformer that
>wraps round the existing DB:XML API to make code persistent in a native
>XML DB solution.

It is not a 'version' of FP I'd like to produce (FP taglib was horrid!),
but a way to re-create the same functionality in a datasource-independent
way.

How can we do this in a way that allows people to write protocols for
reading from and writing to any datasource? While I applaud the use of
XML:DB (it's great!), file-systems, webdav, cvs, ftp, etc. will also be
relevant for others. Given the interface, I would not be surprised to see
these contributed (we got an FTP generator recently!).

For instance, imagine being able to develop a prototype CMS that stores
content in files (for ease of development) then to be able to move the
whole thing to XML:DB merely by changing a couple of protocol strings in
their sitemap and importing the files into a collection. This would deeply
rock IMHO!
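
A rough sketch of how that protocol switch might look, borrowing the
Destination / DestinationFactory names from Stefano's one-liner above.
Nothing here is an existing Cocoon or Avalon API: the interface, the
scheme registry and the "mem" scheme (an in-memory stand-in for something
like an XML:DB collection) are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical "reversed source": somewhere serialized data can be written.
interface Destination {
    OutputStream getOutputStream() throws IOException;
}

// Hypothetical factory that picks a handler from the URI scheme.
class DestinationFactory {
    private static final Map<String, Function<URI, Destination>> HANDLERS = new HashMap<>();

    // Backing storage for the made-up "mem" scheme, keyed by full URI.
    static final Map<String, ByteArrayOutputStream> MEMORY = new HashMap<>();

    static {
        // file: writes to the local filesystem.
        register("file", uri -> () -> Files.newOutputStream(Paths.get(uri)));
        // mem: keeps the bytes in memory, replacing any earlier content.
        register("mem", uri -> () -> {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            MEMORY.put(uri.toString(), buffer);
            return buffer;
        });
    }

    // Lets other protocols (webdav, cvs, ftp, ...) be plugged in later.
    static void register(String scheme, Function<URI, Destination> handler) {
        HANDLERS.put(scheme, handler);
    }

    static Destination create(String location) {
        URI uri = URI.create(location);
        Function<URI, Destination> handler = HANDLERS.get(uri.getScheme());
        if (handler == null) {
            throw new IllegalArgumentException("no handler for " + location);
        }
        return handler.apply(uri);
    }
}

class ContentStore {
    // Application code only ever sees the location string, so moving from
    // files to another backend means changing that string and nothing else.
    static void save(String location, String xml) {
        try (OutputStream out = DestinationFactory.create(location).getOutputStream()) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        save("mem://drafts/article.xml", "<article/>");
        byte[] stored = DestinationFactory.MEMORY.get("mem://drafts/article.xml").toByteArray();
        System.out.println(new String(stored, StandardCharsets.UTF_8));
    }
}
```

In a real sitemap the location string would presumably come from
configuration rather than from code, which is what would make the
file-to-XML:DB move a matter of editing a couple of strings.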

>Having the ability to go both ways from a native xml db would instantly
>turn Cocoon into a CMS, or, at least, into a toolkit to create your
>customized CMS (which is what I'd really like to have, rather than a
>fixed CMS)

Definitely, a toolkit is the way to go!
I have tried using other people's idea of "The Ideal CMS"; it seldom is!
I have even written several of my own; they were never ideal for anybody else.

>Hope this helps.

very much, thanks for your input.
-- 
   ___________________________________________________________________

   Jeremy Quinn                                           Karma Divers
                                                       webSpace Design
                                            HyperMedia Research Centre

   <mailto:[EMAIL PROTECTED]>                    <http://www.media.demon.co.uk>
   <phone:+44.[0].20.7737.6831>             <pager:[EMAIL PROTECTED]>
