Eric van der Vlist wrote:
> The fact that we can define targets doesn't mean that the execution
> of a processor is always explicit. Look for instance at make. When
> you give no target, the first one in the Makefile is chosen. A rule
> that won't break current applications could be that when you specify
> no target, all the processors without output are executed.
I think breaking compatibility is not a big problem, because XPL 1.0 will probably feature a version attribute.
Most of the time I don't like implicit rules, especially when they look arbitrary. In your example, an XPL programmer would have to say, ok, if I add a target, then this processor executes. But if I don't, all of them execute. And then scratch his / her head ;-) I would prefer a unique ground rule, even if it has some non-intuitive ramifications.
> It would make sense IMO to be able to define the "default target" in
> the pipeline and to be able to overwrite it in the pipeline
> invocation (like a xsl:param).
But then you need a way of passing that parameter to the pipeline invocation. As long as that's done through an XML Infoset, that would be fine. Otherwise, again, you add a new concept to the language, that of being able to pass parameters that are not expressed as XML Infosets.
> You refer to one of them as a "black hole" :-) ... We could call
> then "actions" but that could be confusing with PFC actions as they
> can have outputs...
Here we are looking at an XPL that is completely independent from OPS. So if the right term was "action", we would maybe go for it. But I don't think "action" is the right term. Producing a serialized XML document on stdout does not sound more like an "action" to me than transforming a document into another through XSLT.
Others use the term "sink". In the beginning, we constantly used "generator", "transformer", and "serializer", maybe under the influence of Cocoon. But in XPL those denominations don't make that much sense purely from the perspective of the inputs and outputs they have, because often processors that "generate" an XML Infoset, like a URL "generator", actually take an XML Infoset input to configure them as well. So you can't just look at the processor as a black box and say it's a generator because it has only an output, because in fact it has an input as well!
So those terms seem more appropriate to describe a particular processor subjectively, based on functionality, rather than based on the inputs and outputs they have.
For now the spec just talks about "processors that don't have any connected outputs" ;-)
>>Maybe this could be changed to performing the initialization phase on
>>all the processors, like suggested in the post linked above. This
>>could make the processing model a little easier to understand, because
>>then even a processor with an unconnected output has an opportunity to
>>do something on an equal footing with processors that do not have
>>outputs. If it wants to perform some actions during its initialization
>>phase, it may do so; if it wants to use the lazy approach and do its
>>work only when its output(s) is/are read, it may do so as well.
>
> Yes, at first thought, I think that this would be better...
This has pretty big implications though, on the execution of things like p:choose, on sub-pipelines, etc. I temporarily convinced myself yesterday that this was after all not so good ;-) And that if we decide to be "lazy" in the execution process, it is better to accept the implications. But that's probably not the end of it.
> What would also be most useful is to document for each processor,
> what is done during the each initialisation and read phase.
Absolutely. Usually, the solution is simple, because when you have outputs, nothing visible is done during the initialization phase, everything appears to be done in read phases, in a lazy approach (things are used whenever they are needed). With processors that don't have outputs, everything is done in the initialization phase.
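To make the two phases concrete, here is a minimal toy model in Python. It is only a sketch of the rule described above, not the OPS implementation: the Processor class, its connected_output flag and the trace list are hypothetical names invented for this illustration (only the method name start() is borrowed from the current implementation).

```python
# Toy model of the two phases; all names here are hypothetical.
trace = []


class Processor:
    def __init__(self, name, connected_output):
        self.name = name
        self.connected_output = connected_output

    def start(self):
        """Initialization phase."""
        if not self.connected_output:
            # No connected output: all the work happens right here.
            trace.append(f"{self.name}: work done in start()")
        # With a connected output, nothing visible happens in this phase.

    def read(self):
        """Read phase: the work is done lazily, when the output is read."""
        if not self.connected_output:
            raise RuntimeError(f"{self.name} has no connected output to read")
        trace.append(f"{self.name}: work done in read()")
        return f"<output-of-{self.name}/>"


serializer = Processor("serializer", connected_output=False)
xslt = Processor("xslt", connected_output=True)

# Initialize both processors: only the serializer does visible work.
for p in (serializer, xslt):
    p.start()
after_start = list(trace)

# The XSLT processor works only once something reads its output.
result = xslt.read()
```

In this sketch, initializing both processors records work only for the serializer; the XSLT processor does nothing until its output is read.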
> It could also be nice to provide a kind of visualisation (graphical
> or not) of the flows of actions that can be expected from a pipe. Do
> you think a XSLT transformation could do that (take a XPL document
> and generate a XHTML (or SVG) representation of what can be
> expected) or would that be too complex?
Yes in theory, but it is difficult. We tried doing this at some point but the issue of actually doing a layout was complex and we gave up. There are some open source libraries that can help us do that though.
>>Whether the behavior remains the same or not depends on how you write
>>the pipeline, but it is likely to remain the same. Clearly, reading
>>the output may cause other tasks to be performed (i.e. other
>>processors to be executed related to producing that output). In theory
>>it could even cause the order of execution of processors to be
>>different, but in your particular examples with serializers only, the
>>execution order would remain the same, because all processors without
>>outputs are guaranteed to be executed first.
>
> Hmmm... In that case, I wouldn't want the serialisers to be called when
> there is an output...
>
> The following pipe should be valid:
>
> <p:config>
>
>   <p:param name="data" type="input">
>   <p:param name="target" type="input">
>   <p:param name="data" type="output">
>
>   <p:choose href="#target">
>     <p:when test="target='xml'"> xml serializer </p:when>
>     <p:when test="target='html'"> html serializer </p:when>
>   </p:choose>
>
>   <p:processor name="oxf:identity">
>     <p:input name="data" href="#data"/>
>     <p:output name="data" ref="data"/>
>   </p:processor>
>
> </p:config>
>
> and given what you've said when a target is specified and the output
> isn't used, only the init phase of one of the serialisers would be
> called while when a target isn't specified and the output is used
> only the identity processor would be called.
Not quite. If you don't connect the "data" output of the pipeline when you use it, the pipeline doesn't have any connected outputs. So it executes as a "serializer", which means that it goes through the initialization phase. During that phase, one of the two serializers, XML or HTML, will be executed through the execution of the p:choose.
If you do connect an output, when you read that output, the pipeline also goes through the initialization phase. So your XML or HTML serializer is also run.
Which seems fair to me, because, as you said, reading the pipeline output should if possible not change the behavior of the pipeline. In this particular case, you are getting what you want from that point of view.
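The two cases can be played through with a small Python model of your example pipeline. This is a sketch under assumptions, not the real engine: the Pipeline class, its trace list and the read_data() method are hypothetical, and the guard that makes initialization happen only once is my assumption for the illustration.

```python
# Toy model of the example pipeline; all names here are hypothetical.
class Pipeline:
    def __init__(self, data, target):
        self.data = data
        self.target = target
        self.started = False
        self.trace = []

    def start(self):
        """Initialization phase: runs the p:choose, hence one serializer."""
        if self.started:  # assumption: initialization happens at most once
            return
        self.started = True
        if self.target == "xml":
            self.trace.append("xml serializer")
        elif self.target == "html":
            self.trace.append("html serializer")

    def read_data(self):
        """Read phase of the "data" output: initialize first, then read."""
        self.start()
        self.trace.append("identity")
        return self.data


# Case 1: "data" output not connected, the pipeline runs as a "serializer".
unconnected = Pipeline("<doc/>", target="xml")
unconnected.start()

# Case 2: "data" output connected and read by the caller.
connected = Pipeline("<doc/>", target="html")
output = connected.read_data()
```

In both cases the selected serializer appears in the trace; connecting and reading the output only adds the identity processor after it.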
> If I understand correctly, the pipe itself would then be considered
> as having an output and wouldn't be called at all (from another
> pipe) if this output isn't connected and one would have to use the
> null serializer to force its execution...
There are two use cases:
1. Call the pipeline from some Java code. Here you would have control through an API over what you do with the pipeline: initialize it, and read from it. So let's forget this one.
2. Call the pipeline through the pipeline processor within another pipeline. In this case, if the processor does not have any connected outputs, it is run as a serializer, so it is initialized when the calling pipeline initializes. If it does have a connected output, it is initialized when that output is read, and then a read phase follows.
> That makes sense, but I feel a little bit uneasy by the way all that
> is working. But I can't even explain why! Maybe that's just the fact
> that actual actions (such as sending a HTTP response) is done in a
> method called "init" which hurts me...
In the current implementation the method is actually called start() ;-) This said, I called this phase "initialization", but maybe we can find a better term. Maybe it should just be an "execution" phase and zero or more "read" phases. The issue is that both phases do belong to what is considered the execution of the pipeline.
The only thing that bothers me slightly at the moment is the determination of when a processor without connected outputs is executed. As you pointed out initially, that's what bothered you too. What I see at this point is that we can:
1. Leave things the way they are.
2. Try to figure out some kind of "target" approach, by which we would remove the automatic execution of processors without connected outputs. This could be real targets a la ant, or simply a mechanism to tell whether such processors are executed or not.
3. Go the other way and put all processors on an equal footing, by saying that processors with connected outputs are also initialized.
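To compare the three options side by side, here is a rough Python sketch of which processors would be initialized when a pipeline starts under each one. Everything in it is hypothetical: the Proc record, the policy names "current", "targets" and "all", and the idea of passing targets as a set of processor names are assumptions made for the illustration, not anything in the spec.

```python
# Toy comparison of the three options; all names here are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    name: str
    connected_output: bool


def initialized_at_start(procs, policy, targets=()):
    """Return the names of processors initialized when the pipeline starts."""
    if policy == "current":
        # Option 1: every processor without connected outputs.
        return [p.name for p in procs if not p.connected_output]
    if policy == "targets":
        # Option 2: only the explicitly requested ones among those.
        return [p.name for p in procs
                if not p.connected_output and p.name in targets]
    if policy == "all":
        # Option 3: every processor, whether its outputs are connected or not.
        return [p.name for p in procs]
    raise ValueError(f"unknown policy: {policy}")


procs = [
    Proc("xslt", connected_output=True),
    Proc("xml-serializer", connected_output=False),
    Proc("html-serializer", connected_output=False),
]

option1 = initialized_at_start(procs, "current")
option2 = initialized_at_start(procs, "targets", targets={"html-serializer"})
option3 = initialized_at_start(procs, "all")
```

The sketch only captures the selection rule; it says nothing about the harder questions, such as how option 3 would interact with p:choose or sub-pipelines.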
> The fact that it seems possible to emulate targets with a simple
> choose would tend to show that the current balance between
> simplicity and features is good!
Thanks! But don't believe that we are inflexible on XPL at this point.
> What about a compact syntax (ala RELAX NG)?
That could be good too, but probably after the good old verbose XML syntax is specified - unless there is a volunteer to make a proposal in parallel.
-Erik
_______________________________________________
orbeon-user mailing list
https://lists.sourceforge.net/lists/listinfo/orbeon-user
