Re: data goes in, data goes out

giacomo Thu, 29 Nov 2001 05:56:26 -0800

On Wed, 28 Nov 2001, Stefano Mazzocchi wrote:

> I'm restarting Jeremy's thread about WritableSource in order to achieve a
> more architecture-oriented view of the problem.
>
> It is obvious that, in order to be a useful piece of a successful CMS,
> Cocoon must be equally able to drive both data flows: inward and
> outward.


Agreed.

> For the outward flow, Cocoon beats the crap out of almost any other
> solution (IMNSHO).
>
> For the inward flow, well.... let's be honest: it more or less sucks.

It has never been taken into consideration so far, and that sucks now.

> ok, ok, maybe I'm kinda rude but this is the truth: there are frameworks
> out there that give you much better tools to handle the data that flows
> in.

Prowler is one of those: it abstracts the sources so that they look XMLish.

> Cocoon is currently a publishing framework, where publishing means
> "outward-flow oriented",

That's what I mentioned above.

> but all of us would like to see a better
> synergy (or equilibrium) between the two flows, in order to make web-app
> building easier, but without losing the technological advancements that
> Cocoon introduced.
>
> Is this possible?
>
> Giacomo and I are holding back a big RT about the next generation of
> pipeline component assembling (a.k.a. flowmap, sitemapNG or more
> generally "statemap"). We spent a few days talking about it and we
> didn't come up with something solid enough to propose as a solution...

That RT is still in my Drafts folder. I haven't found the right words and
diagrams for it yet, nor, of course, a proposable solution to all that.

> ... but we *did* identify some general rules and general concepts that
> might help trigger the discussion about this.
>
> Anyway, this is not what this mail is about.
>
> The question, today, gets deeper: is the current Cocoon architecture
> asymmetric in respect of data flow?

Regarding the concepts (URLFactory, Source), yes it is.

> First (and obvious) thought is to *reverse* the terminology:
>
>  generators -> transformers -> serializers
>
> becomes
>
>  deserializer -> transformers -> accumulator
>
> Is this something good?
>
> No, I don't think so: what would be the difference between a generator
> and a deserializer in terms of interfaces? I can't see one.

Exactly.

> And what about accumulators? well, they'd store information, but would
> need to give some information back. If this is in form of SAX events, an
> accumulator is a transformer, if this is a binary stream, it's a
> serializer.
>
> But one concept here is good: the concept that the opposite of a
> generator is an accumulator (I'm an electronic engineer, you know :), in
> electrical terms, a battery, a place where energy (here, data) is stored
> for future use.
>
> In software terminology, an accumulator is normally called a "store",
> "archive", "cabinet", "safe" and so on with terms that give you an idea
> of "something" that allows you to place information that will
> persistently remain there.

A Sink as opposed to a Source. But the two seem to be mutually exclusive,
which is the trouble with what Jeremy is proposing with his WritableSource
(IMO that isn't a good name, and neither is SinkSource).

> So, does this mean we need a "Store"-behaving component in the cocoon
> pipelines?
>
> No, such a store-behaving component would not be distinguishable from
> another transformer.

I'm not so sure. Today we feed the Serializer with an OutputStream onto
which it can "store" its output (even if this is outward from a requestor's
point of view). What about using the same architecture for a
StoringPipeline, where the last component (accumulator for now) is
given a Sink in which to store the output it produces?
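
A minimal sketch of that idea, in Java. Everything here is an assumption
made for illustration: Sink, MemorySink and Accumulator are hypothetical
names, not existing Cocoon or Avalon interfaces. The only point is that the
last component is handed a Sink in the same way a Serializer is handed an
OutputStream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StoringPipelineDemo {
    public static void main(String[] args) {
        MemorySink sink = new MemorySink();
        Accumulator accumulator = new Accumulator();
        accumulator.setSink(sink);                 // analogous to handing a Serializer its OutputStream
        accumulator.store("<article>hello</article>");
        System.out.println(sink.contents());
    }
}

/** Hypothetical counterpart of a Source: a place a pipeline can store its output. */
interface Sink {
    OutputStream getOutputStream();
}

/** In-memory Sink, standing in for a file, a database, a CVS checkout and so on. */
class MemorySink implements Sink {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    @Override
    public OutputStream getOutputStream() {
        return buffer;
    }

    String contents() {
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}

/** Hypothetical last component of a StoringPipeline. */
class Accumulator {
    private Sink sink;

    void setSink(Sink sink) {
        this.sink = sink;
    }

    /** Writes the (already serialized) document into whatever the Sink points at. */
    void store(String document) {
        try (OutputStream out = sink.getOutputStream()) {
            out.write(document.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The Accumulator never learns where the data ends up; swapping MemorySink for
another Sink implementation would not change it.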

> So, I think I just proved that the current pipeline architecture is
> *not* inherently biased toward outward data flow, despite the component
> names that seem to inspire the opposite.
>
>                             - o -
>
> The above is an important result: it states that Cocoon doesn't require
> core architecture modifications. I already knew this, otherwise I would
> not be +1 on a final Cocoon 2.0 release. But it's good to let you guys
> know about this as well.
>
> So, now that we have an architecture that is, at least, in theory,
> general enough to handle the problem, how do we attack it?
>
> The "accumulator" concept gives interesting ideas:
>
>  1) it must receive SAX events (it would force concern overlap if the
> binary adaptation was performed by the store itself, so
> "parsing"/"deserialization" should be performed before, at XML
> generation stage)
>  2) it must throw data out (otherwise we would not be able to know what
> happened)
>  3) it should be SAX events (otherwise, we would not be able to process
> the data further)
>
> Thus, a "store" is behaviorally equivalent to a "transformer".

Well, in terms of web applications (and a CMS is more a web application than
a publishing system) I'd like to see the inward step separated from the
outward one. We need to be able to perform some logic in between inward
and outward processing. I don't think you can directly connect the inward
pipe to the outward pipe without giving an Action a chance to act on the
results of the inward pipe.
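
A rough sketch of both points, using plain SAX from the JDK rather than any
Cocoon API. RecordingStore, ElementCounter and InwardPipeline are
hypothetical names. The store receives SAX events and passes them on, so it
looks like a transformer, while the caller gets to inspect what was stored
before deciding what the outward pipe should do.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

class InwardPipelineDemo {
    public static void main(String[] args) {
        ElementCounter downstream = new ElementCounter();
        List<String> stored =
                InwardPipeline.run("<article><title>Hi</title></article>", downstream);
        // The "Action" step: logic that runs between the inward and outward pipes.
        String view = stored.contains("title") ? "confirmation" : "error";
        System.out.println(stored + ", " + downstream.count + " events forwarded -> " + view);
    }
}

/** Hypothetical store: records the elements it sees and forwards every event. */
class RecordingStore extends XMLFilterImpl {
    final List<String> stored = new ArrayList<>();

    RecordingStore(XMLReader parent) {
        super(parent);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        stored.add(localName);                            // "store" the data
        super.startElement(uri, localName, qName, atts);  // pass it downstream, like a transformer
    }
}

/** Stand-in for whatever follows the store; it only counts start-element events. */
class ElementCounter extends DefaultHandler {
    int count;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        count++;
    }
}

class InwardPipeline {
    /** Parses xml, runs it through the store, and returns what the store recorded. */
    static List<String> run(String xml, ContentHandler downstream) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            RecordingStore store = new RecordingStore(reader);
            store.setContentHandler(downstream);
            store.parse(new InputSource(new StringReader(xml)));
            return store.stored;
        } catch (Exception e) {
            throw new IllegalStateException("inward pipeline failed", e);
        }
    }
}
```

The decision in main is deliberately outside the pipeline: that is the place
where an Action would sit.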

>
>                            - o -
>
> Now, how can this be handled? should we reverse the Source interface and
> come up with a "Destination" (sort of InputStream/OutputStream
> parallel)?
>
> If this XML contained data, we could come up with an automatic
> relational mapping between the arriving XML and the SQL update query on
> a RDBMS. It could be pretty straightforward.
>
> At the same time, how would you do such a thing on an article stored on
> a native XML DB?
>
> As Jeremy pointed out, XUpdate is a choice. I must tell you that I don't
> like XUpdate that much, but I admit it's a choice.

Please share your opinions on XUpdate with us.

>
> Or we could use CVS, file systems, relational BLOBs, defragment the
> document into LDAP nodes, FTP, POST it somewhere else, email it to your
> mom, fax it to your brother, print it on a piece of paper for your
> favorite human slave to subsequently crawl...
>
> ... up to you.

Cries out for an abstraction like Prowler!?

>
> Now, the real question is: do we need any special protocol handlers for
> these things?
>
> In all honesty, I say that it's, again, up to you: you might find it
> easier to come up with something like
>
>   Destination d = DestinationFactory.create("dbxml://host/db/path/");
>
> then serialize the input data into the source stream and have the
> protocol handler automagically generate the XUpdate for you.
>
> In this case, admittedly, we need the "reversed source".
>
> But the above could be done by hand using the DBXML API, or simply
> calling a special xupdate wrapper around it (as an avalon component?).
>
> Why am I telling you this?
>
> Well, from an architecture design point of view, I can't see any
> limitation in what we already have, but the "reversed source".
>
> Jeremy named it "WritableSource" (which is admittedly an ossimoron).
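
One way the "reversed source" quoted above might look, as a sketch only.
Destination and DestinationFactory are the names from Stefano's snippet, but
the interface methods, the register() call and BufferingDestination are
assumptions made here for illustration. The "dbxml" handler below merely
buffers what it receives; it does not talk to a database or generate
XUpdate.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class DestinationDemo {
    public static void main(String[] args) {
        DestinationFactory.register("dbxml", BufferingDestination::new);
        Destination d = DestinationFactory.create("dbxml://host/db/path/");
        d.write("<article/>");
        System.out.println(d.uri() + " received " + ((BufferingDestination) d).received());
    }
}

/** Hypothetical write-side counterpart of Source. */
interface Destination {
    URI uri();

    OutputStream getOutputStream();

    /** Convenience: serialize a document into the destination's stream. */
    default void write(String xml) {
        try (OutputStream out = getOutputStream()) {
            out.write(xml.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

/** Picks a protocol handler by URI scheme, in the spirit of a URL factory. */
class DestinationFactory {
    private static final Map<String, Function<URI, Destination>> HANDLERS = new HashMap<>();

    static void register(String scheme, Function<URI, Destination> handler) {
        HANDLERS.put(scheme, handler);
    }

    static Destination create(String location) {
        URI uri = URI.create(location);
        Function<URI, Destination> handler = HANDLERS.get(uri.getScheme());
        if (handler == null) {
            throw new IllegalArgumentException("no handler for scheme: " + uri.getScheme());
        }
        return handler.apply(uri);
    }
}

/** Stand-in handler: a real one would turn the received data into an update. */
class BufferingDestination implements Destination {
    private final URI uri;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    BufferingDestination(URI uri) {
        this.uri = uri;
    }

    @Override
    public URI uri() {
        return uri;
    }

    @Override
    public OutputStream getOutputStream() {
        return buffer;
    }

    String received() {
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```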

[haven't found 'ossimoron' in my dict(?)]

>
> Berin proposed to look at Monitors (which is not exactly a name that
> reminds me of a storage location, BTW)
>
> Anyway, this is, at this point, more an Avalon discussion than a Cocoon
> one, since this is a general thing.
>
> As for helping Jeremy to implement a Cocoon 2.0 version of FP, my
> suggestion would be to aim at creating a general XUpdateTransformer that
> wraps around the existing DB:XML API to make code persistent in a native
> XML DB solution.
>
> Having the ability to go both ways from a native xml db would instantly
> turn Cocoon into a CMS, or, at least, into a toolkit to create your
> customized CMS (which is what I'd really like to have, rather than a
> fixed CMS)
>
> Hope this helps.
>
>

