The Processor interface used to be very simple, and reasonably documented. Over time it has adopted new methods as part of its contract, and those have not been well documented. The only reason that I am bringing this up is that I am trying to implement my own Processor, and there is a lot that the interface requires that is of little or no concern to me. First lets see what it used to be 2 years and 7 months ago:

interface Processor
{
   boolean process(Environment env) throws Exception;

   // the remaining methods were introduced in 2.1
   ProcessingPipeline processInternal(Environment env) throws Exception;
   Configuration getComponentConfigurations();
}

Already we see we added some scope creep from the 2.0 to the 2.1 series (the last I worked on Cocoon was the 2.0 series). For example, why is it necessary for a Processor contract to expose the component configurations? The "processInternal" method is a coin toss. Presumably it is to enable cocoon:// or sitemap:// psuedo-protocols to be more consistent--allowing a parent processor to call processInternal() on child processors. Nevertheless, one would wonder why the original process() method wasn't changed to return a ProcessingPipeline instead of a boolean in this case.

At this point I also want to point out that the original process() method has decent JavaDocs so that you can understand its purpose and why it exists, the remaining methods are not that way.

A month later the getComponentConfigurations() method was refactored to return a Map--presumably of component Configuration objects, but there is still no documentation on what the expected keys are.

Three months later processInternal was changed to buildPipline (same arguments and return value)--a better picture but still nothing in the JavaDocs to help understand the method purpose.

Two months later we add the Processor getRootProcessor() method to support internal redirects. Now this is one thing that makes Processors much more difficult to implement. Why can't such a thing be handled by a ProcessorHelper or something. The root processor problem is orthagonal to the responsibilities of just one processor.

16 months, 2 weeks ago we had the biggest change to the whole interface. We have an interface with an internal class?! The InternalPipelineDescription has a reason for existing, I'm sure. However I do have to wonder why it is part of the interface. At this point we are specifying implementation details in the interface. The contract of the Processor is no longer an active component (i.e. I tell you how to do something), but a passive one (i.e. I ask you how to do stuff for myself). The buildPipeline() method is now altered to use the InternalPipelineDescription instead of return a ProcessingPipeline. At the same time we add the getContext() and getSourceResolver() methods. My head is now realing. This is pure insanity. Why not just get rid of the interface and simply use a base class? After all we are no longer documenting a contract, we are documenting how to implement the Processor. My guess is that limitations in the TreeProcessor approach caused this to be necessary. But again, couldn't most of these things have been handled by an external helper or utility class? Does it really need to affect the interface?

11 months, 3 weeks ago we refactored the getComponentConfigurations() again so that we now have just an array of configurations. Not a biggy, but I'm still not convinced it is needed here.

3 months ago we have the last change to the Processor interface, and I am convinced this should have been a TreeProcessor interface that extends the core Processor interface. We added methods for setting, getting, and removing attributes for the sitemap interpreters.

The bottom line is that we have exploded the complexity of what was originally intended to be a light-weight interface. The only solution for the processor is a complex solution. The only implementation for a processor is the tree processor. We've made sure that the interface requires it to be that way. I've got much simpler needs, and there is a whole host of issues with implementing all these methods that do nothing. I'd like to see if we can't separate all the different concerns in the Processor interface into multiple interfaces. What is the core concerns?

I'm in the process of identifying the real contracts, and I'll have another post about that.

Reply via email to