Re: [Chain] DTD, Design Considerations, etc. (was Re: [Chain] examples XML file available?)

Craig R. McClanahan Wed, 24 Sep 2003 10:01:20 -0700

Sgarlata Matt wrote:

The conversation threads are getting kind of crazy, so I'm going to skip
inline quotes for parts of this.

Craig, now I do see your point about using the ConfigRuleSet to digest a
portion of an arbitrary XML file.  That is very cool, and now I understand
why the
names of the XML elements are configurable.  I always understood the value
of allowing arbitrary attributes on the <command> element (and other elements).  This
is a much slicker approach than creating nested <set-property> elements.
Still, I think it could be useful to have a DTD for the *default* behavior
of the ConfigRuleSet.  I think in general users will start off using the
default behavior, and then may at a later date decide to fold their commands
into some other file, so a DTD will be nice to get people started with the
Chain package.  I agree with most of your points about the constraints I placed
in the DTD being inappropriate, and will explicitly address each if we ever
decide to make a DTD.

Unfortunately, none of the Command implementations I am interested in building into my Chains would be usable in a document parsed against such a DTD, for the technical reasons we discussed earlier. Therefore, I'm not interested in any sort of DTD that someone would actually try to use in a validating parser. If what you want is a more high level view of the typical structure of such a file (i.e. documentation), that's a different matter, and should be addressed with diagrams, examples, and how-to guides; minimally, to the level of detail you see in the "Package Description" docs for things like Digester and BeanUtils.

As I've thought more about the package, I don't understand why some design
decisions were made.

Ted points out the GoF patterns this package was created to realize. I discuss below some of my motivations, upon observing limitations in the extensibility of Struts based on the design decisions we made.

 First let me say that I think of Chains as being an OO
way to simulate procedural logic.

It is actually intended to *compose* procedural logic, not to *simulate* it.

In an O-O world, you often compose complex things out of simple things by encapsulating the result in a "black box" method that can be called by your user. Ideally, the user doesn't have to know much about what happens inside the box -- they just use it. But, sometimes, the default behavior is not enough and you need to specialize. A typical approach to specializing in O-O languages is to subclass, and then override the appropriate methods.

Now, it's real easy to add some logic at the beginning, then call super.foo(), and perhaps add some logic at the end. But what happens if you need to add a bunch of stuff in the middle? Or replace one of the method calls performed inside the black box with one that does something slightly different? You're often stuck with having to cut-n-paste the logic of the method you're specializing, then change the calls you want, and hope you notice whenever the class you copied has changed so that you can replicate those changes in your version. Yuck.

One approach to making this work better was tried in Struts 1.1, where the RequestProcessor has a single public process() method to process each request. In turn, the process() method calls a bunch of protected processFoo() and processBar() methods that each perform a single step. People who wanted to specialize only that step needed to subclass RequestProcessor and then just override that single step. Because the steps were fairly fine-grained, this was a big improvement, and lots of innovative things were done to extend Struts. However, there are two big areas of difficulty left:

* What happens when someone wants to insert new processing phases,
 remove old ones, or reorder the existing ones?  They have to override
 the relatively complex process() method, and hope they get all the
 details exactly right in their cut-n-paste.

* What happens when you, as an application author, want to take advantage
 of customized extensions provided by more than one library, where both
 of them have extended RequestProcessor?  Java does not support multiple
 inheritance, so you're stuck manually weaving the RequestProcessor changes
 together.  Again, yuck.

Far better would be to divide the procedural flow into small steps that are externally configurable. And, let each step have its own arbitrarily complex internal structure (by virtue of the fact that it can be a Command or a Chain of its own). Oh, by the way, the procedural steps become small enough and narrowly focused enough to write high quality unit tests for. And, because commands interact with each other *solely* through a Context, you can easily create mock objects (like a ServletContext or an HttpSession, in a chain destined for a web application) that let you thoroughly test things in a standalone environment (in case it's not obvious, I'm a *huge* fan of JUnit :-).
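To make the testing point concrete, here is a minimal sketch, not the actual commons-chain API. The Command interface below is a simplified stand-in, a plain Map plays the role of the Context, and the SelectLocale step and its attribute names ("locale.requested", "locale.selected") are invented for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the Command idea; the Map plays the role of the Context.
interface Command {
    // Returns true when processing is finished, false to let later steps run.
    boolean execute(Map<String, Object> context) throws Exception;
}

// One narrowly focused step (hypothetical): reads "locale.requested"
// and writes "locale.selected", falling back to "en".
class SelectLocale implements Command {
    @Override
    public boolean execute(Map<String, Object> context) {
        Object requested = context.get("locale.requested");
        context.put("locale.selected", requested != null ? requested : "en");
        return false; // not finished; later steps may still run
    }
}

class SelectLocaleDemo {
    // Standalone check: a plain HashMap stands in for the real context,
    // so no servlet container is needed to exercise the step.
    static Object run(Object requested) {
        Map<String, Object> context = new HashMap<>();
        if (requested != null) {
            context.put("locale.requested", requested);
        }
        new SelectLocale().execute(context);
        return context.get("locale.selected");
    }
}
```

Because the step touches nothing but the context, a JUnit test only has to populate a map, call execute(), and inspect the map afterwards.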

In an environment like this, people who extend the basic Struts request processing pipeline will do so by providing implementations of the fine-grained commands and chains that make up their extension, and an example of how to update the standard processing chain to include their functionality. For the 80% case, we should even be able to do this without actually modifying the standard chain (by having the standard chain include a step between each actual processing step that says "if there is a command registered under name XYZ, process it here"; so the act of registering your customized chain under a particular name automatically customizes the standard chain).

For a beginning glimmer of this, see how the struts-chain example code (in the Struts sources referenced earlier) uses the generic LookupCommand to conditionally execute a chain named "servlet-complete-preprocess" if and only if it has been registered. Because the "optional" flag is set to true, it doesn't complain if there is no such chain -- it just proceeds on.
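The "execute it only if it is registered" idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the real LookupCommand: the OptionalLookup class, its constructor, and the use of a plain Map as the catalog are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for the Command idea; the Map plays the role of the Context.
interface Command {
    boolean execute(Map<String, Object> context);
}

// Hypothetical illustration of an optional lookup step (not the real LookupCommand).
class OptionalLookup implements Command {
    private final Map<String, Command> catalog; // name -> command or chain
    private final String name;
    private final boolean optional;

    OptionalLookup(Map<String, Command> catalog, String name, boolean optional) {
        this.catalog = catalog;
        this.name = name;
        this.optional = optional;
    }

    @Override
    public boolean execute(Map<String, Object> context) {
        Command target = catalog.get(name);
        if (target == null) {
            if (optional) {
                return false; // nothing registered: just proceed
            }
            throw new IllegalArgumentException("No command registered as " + name);
        }
        return target.execute(context); // delegate to whatever was registered
    }
}

class OptionalLookupDemo {
    // Runs the hook with or without an extension registered under the name.
    static String run(boolean register) {
        Map<String, Command> catalog = new HashMap<>();
        if (register) {
            catalog.put("servlet-complete-preprocess", ctx -> {
                ctx.put("preprocessed", Boolean.TRUE);
                return false;
            });
        }
        Map<String, Object> context = new HashMap<>();
        boolean finished =
            new OptionalLookup(catalog, "servlet-complete-preprocess", true).execute(context);
        return finished + "/" + context.get("preprocessed");
    }
}
```

In this sketch, registering something under the agreed name is the whole customization; the standard chain itself is left untouched.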

 Many of my comments will basically be
concerning why some of the standard control flow abilities (if/then
statements, loops, exception handling, etc) in Java aren't more easily
done/simulated using the Chain package.

For an example of "been there, done that" on trying to script control flows in XML, see the commons-workflow package that is also in the Sandbox. For a far more complex and successful approach at scripting in XML, see commons-jelly (in the main jakarta-commons repository).

Fundamentally, if you feel a big need for if/then and loops, I would suggest you haven't decomposed your procedural logic into small enough pieces yet. That sort of control flow does not itself need to be expressed in chains -- it can be done in the language in which you embed commons-chain (typically Java).

Exception handling is a slightly different story -- it was one of the motivations for the Filter API (see below).

 So, here I go with questions:
1) How come Chains have a static structure?

Chains are static so that you can reuse them in an application without any fear that they are mutating underneath you. In addition, the static structure allows optimized processing in a multithread environment like a web application. Nothing stops you, of course, from (inside a Command) building up your own custom Chain, then executing it, then throwing it away.

 Related to this, how come
Command.execute returns boolean instead of returning Command?

Commands should not know whether or not they have been composed into a chain. Returning a Command would require that type of knowledge. Instead, a Command only says whether or not the computation has been finished.

Other approaches to this kind of thing have also been tried -- see, for example, how javax.servlet.Filter works in Servlet 2.3. This adds the complexity of a FilterChain that has to be passed along (where the knowledge of the chain's constituent Commands is maintained), plus it creates a horrendously deep call stack.

I opted for the simplest possible APIs to make commons-chain very easy to understand and use.

 If it
returned Command this would basically eliminate the need for a Chain
interface altogether.

Your approach only works for the top-level chain -- the current APIs allow chains to use arbitrarily complex subchains. Essentially, you'd force anyone who wants subchains to re-invent what Chain already does.
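A rough sketch of why the boolean return is enough, assuming simplified stand-ins rather than the real commons-chain classes: SimpleChain below is hypothetical, a plain Map plays the role of the Context, and the step() helper exists only to record the order of execution.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the Command idea; the Map plays the role of the Context.
interface Command {
    // true means "processing is finished", false means "keep going".
    boolean execute(Map<String, Object> context);
}

// Hypothetical chain: it is itself a Command, so it can be nested as a subchain,
// and no individual step needs to know it is part of a chain.
class SimpleChain implements Command {
    private final List<Command> commands;

    SimpleChain(Command... commands) {
        this.commands = Arrays.asList(commands); // fixed-size: structure stays static
    }

    @Override
    public boolean execute(Map<String, Object> context) {
        for (Command command : commands) {
            if (command.execute(context)) {
                return true; // a step reported completion: stop here
            }
        }
        return false; // every step ran and none finished the computation
    }
}

class ChainDemo {
    // Illustrative step that appends its label to a "trace" attribute.
    static Command step(String label, boolean finished) {
        return context -> {
            context.merge("trace", label, (soFar, next) -> soFar.toString() + next);
            return finished;
        };
    }

    static String run() {
        Command chain = new SimpleChain(
            step("a", false),
            new SimpleChain(step("b", false), step("c", true)), // nested subchain
            step("d", false));
        Map<String, Object> context = new HashMap<>();
        boolean finished = chain.execute(context);
        return finished + "/" + context.get("trace");
    }
}
```

Here step "c" reports completion inside the subchain, so the outer chain stops as well and step "d" never runs.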

 Chain would become a concrete implementation of
Command that repeatedly executed Commands until the last command executed
returned null (which would be the new value to indicate the end of a chain).
Static chains (such as those configured using an XML file) would easily be
supported by another concrete implementation of Command which executed a
series of commands in order, completely ignoring their return values.

You can already do exactly this kind of thing with the current API, without imposing on yourself the restrictions described above. Your proposed changes would remove functionality and add complexity.

2) How come Filters have a postprocess method but no catchexception method?

The postprocess method does both things, so you don't need a separate method.

The postprocess block can deal with exceptions, but it seems to me like it
would be more natural for exceptions to be dealt with in a catchexception
block and for postprocess to be strictly for freeing resources that the
Command acquired when its execute method was called.

My experience has been that the catch exception processing normally has to free the resources anyway, so it would end up either calling the postprocess method or duplicating the logic. Either of those is tedious and error prone. Far better just to have a single cleanup method no matter what happened.
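As a sketch of the single-cleanup-method idea, under assumptions: the interfaces below are simplified stand-ins that use a plain Map as the Context, and CleanupChain and ResourceFilter are hypothetical names. The cleanup hook receives the exception that ended processing, or null if there was none, so one method covers both paths.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

interface Command {
    boolean execute(Map<String, Object> context) throws Exception;
}

// Stand-in for the Filter idea: one cleanup hook for both normal and failed runs.
interface Filter extends Command {
    // exception is null on a normal run; return true if the exception was handled.
    boolean postprocess(Map<String, Object> context, Exception exception);
}

// Hypothetical chain that always calls postprocess on the filters it entered.
class CleanupChain implements Command {
    private final List<Command> commands;

    CleanupChain(Command... commands) {
        this.commands = Arrays.asList(commands);
    }

    @Override
    public boolean execute(Map<String, Object> context) throws Exception {
        Deque<Filter> entered = new ArrayDeque<>();
        boolean finished = false;
        Exception failure = null;
        try {
            for (Command command : commands) {
                if (command instanceof Filter) {
                    entered.push((Filter) command);
                }
                if (command.execute(context)) {
                    finished = true;
                    break;
                }
            }
        } catch (Exception e) {
            failure = e;
        }
        boolean handled = false;
        while (!entered.isEmpty()) { // pop() yields the filters in reverse order
            if (entered.pop().postprocess(context, failure)) {
                handled = true;
            }
        }
        if (failure != null && !handled) {
            throw failure;
        }
        return finished;
    }
}

// Illustrative filter: "opens" a resource in execute and always releases it.
class ResourceFilter implements Filter {
    @Override
    public boolean execute(Map<String, Object> context) {
        context.put("resource.open", Boolean.TRUE);
        return false;
    }

    @Override
    public boolean postprocess(Map<String, Object> context, Exception exception) {
        context.put("resource.open", Boolean.FALSE); // same cleanup on both paths
        context.put("cleanup.sawException", exception != null);
        return false; // does not claim to have handled the exception
    }
}

class CleanupDemo {
    static String run(boolean fail) {
        Command work = context -> {
            if (fail) {
                throw new IllegalStateException("boom");
            }
            return false;
        };
        Map<String, Object> context = new HashMap<>();
        boolean thrown = false;
        try {
            new CleanupChain(new ResourceFilter(), work).execute(context);
        } catch (Exception e) {
            thrown = true;
        }
        return thrown + "/" + context.get("resource.open") + "/" + context.get("cleanup.sawException");
    }
}
```

With a separate catch-exception method, the release of the resource in ResourceFilter would have to be written (or called) twice; in this sketch it lives in one place.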

Ted Husted wrote:

Of course, another way to go would be to make the Catalog a singleton, or available through some registry, but I'm thinking that going through the Context may be the cleanest approach, since the Context is essentially a Registry too.
How about making a Registry class which is an implementation of Context
which stores only references to Catalogs?  We could make the registry itself
a singleton, and write in design notes that since the registry is shared
between apps, each app should store its Catalog(s) in an
application-specific attribute like below:
Registry
|
|---org.apache.struts
|---|---actions.RequestProcessor
|---|---somethingElse
|---com.bah.krm
|---org.apache.commons.something
That's not explained incredibly well, but if each application component reserves its own spot in the registry, we should be able to make the registry a singleton everyone can share. This keeps us from tying ourselves to the Servlet API, as was mentioned in the ChainServlet discussion thread.

It should be obvious that creating such a class is trivially simple, but you should first study how class loaders work in servlet containers, and then think about what happens if commons-chain were placed in a parent class loader (meaning that any singletons you try to create with static variables are shared across *all* webapps, not just the current one). The docs for Tomcat's class loader are specific to Tomcat, but typical of the architecture that most web containers offer:

http://jakarta.apache.org/tomcat/tomcat-4.1-doc/class-loader-howto.html

The other reason I didn't go here is that we're actually at the edge of the patterns that commons-chain was created to implement. How it gets embedded into a particular application should be up to that application.

Indeed, the only reason that Catalog exists is to allow chains to refer to other chains in an organized way. One could argue that even this is out of scope; however, it's very useful to be able to write a Command that uses complex processing logic to decide which other commands (or chains, since you can't tell in the catalog what something is) should be used to actually perform a task.

Class libraries should be narrowly focused on as few fundamental ideas as possible, and then do them well. They should not impose onerous restrictions on applications that want to use them (which is why the config package that uses Digester is optional -- you can compose chains any way you like). The fundamental APIs we've defined so far (Command, Chain, Context, and Filter) are the minimum needed to implement the desired design patterns. Catalog is on the edge, but useful for combining chains/commands in a controlled manner. Registry is more about how catalogs would be accessed, rather than what they do. Indeed, you don't even need such a thing in many environments; for example, a web app's ServletContext attributes do exactly what your registry does for that particular use case.

Matt

Craig
