The Processor interface used to be very simple, and reasonably
documented. Over time it has adopted new methods as part of its
contract, and those have not been well documented. The only reason that
I am bringing this up is that I am trying to implement my own Processor,
and there is a lot that the interface requires that is of little or no
concern to me. First lets see what it used to be 2 years and 7 months ago:
interface Processor
{
boolean process(Environment env) throws Exception;
// the remaining methods were introduced in 2.1
ProcessingPipeline processInternal(Environment env) throws Exception;
Configuration getComponentConfigurations();
}
Already we see we added some scope creep from the 2.0 to the 2.1 series
(the last I worked on Cocoon was the 2.0 series). For example, why is
it necessary for a Processor contract to expose the component
configurations? The "processInternal" method is a coin toss.
Presumably it is to enable cocoon:// or sitemap:// psuedo-protocols to
be more consistent--allowing a parent processor to call
processInternal() on child processors. Nevertheless, one would wonder
why the original process() method wasn't changed to return a
ProcessingPipeline instead of a boolean in this case.
At this point I also want to point out that the original process()
method has decent JavaDocs so that you can understand its purpose and
why it exists, the remaining methods are not that way.
A month later the getComponentConfigurations() method was refactored to
return a Map--presumably of component Configuration objects, but there
is still no documentation on what the expected keys are.
Three months later processInternal was changed to buildPipline (same
arguments and return value)--a better picture but still nothing in the
JavaDocs to help understand the method purpose.
Two months later we add the Processor getRootProcessor() method to
support internal redirects. Now this is one thing that makes Processors
much more difficult to implement. Why can't such a thing be handled by
a ProcessorHelper or something. The root processor problem is
orthagonal to the responsibilities of just one processor.
16 months, 2 weeks ago we had the biggest change to the whole
interface. We have an interface with an internal class?! The
InternalPipelineDescription has a reason for existing, I'm sure.
However I do have to wonder why it is part of the interface. At this
point we are specifying implementation details in the interface. The
contract of the Processor is no longer an active component (i.e. I tell
you how to do something), but a passive one (i.e. I ask you how to do
stuff for myself). The buildPipeline() method is now altered to use the
InternalPipelineDescription instead of return a ProcessingPipeline. At
the same time we add the getContext() and getSourceResolver() methods.
My head is now realing. This is pure insanity. Why not just get rid of
the interface and simply use a base class? After all we are no longer
documenting a contract, we are documenting how to implement the
Processor. My guess is that limitations in the TreeProcessor approach
caused this to be necessary. But again, couldn't most of these things
have been handled by an external helper or utility class? Does it
really need to affect the interface?
11 months, 3 weeks ago we refactored the getComponentConfigurations()
again so that we now have just an array of configurations. Not a biggy,
but I'm still not convinced it is needed here.
3 months ago we have the last change to the Processor interface, and I
am convinced this should have been a TreeProcessor interface that
extends the core Processor interface. We added methods for setting,
getting, and removing attributes for the sitemap interpreters.
The bottom line is that we have exploded the complexity of what was
originally intended to be a light-weight interface. The only solution
for the processor is a complex solution. The only implementation for a
processor is the tree processor. We've made sure that the interface
requires it to be that way. I've got much simpler needs, and there is a
whole host of issues with implementing all these methods that do
nothing. I'd like to see if we can't separate all the different
concerns in the Processor interface into multiple interfaces. What is
the core concerns?
I'm in the process of identifying the real contracts, and I'll have
another post about that.