Hi James developers,

I enclose a design that uses the composite pattern in the James mail-server
to permit declaration of complex matchers in the james-config.xml file
(deployed as config.xml).

A summary of the Mailet API changes, for consideration and voting, is
enclosed:

• Matchers can be pre-declared before use in a Mailet through a <matcher>
element declaration which must precede the first use in a Mailet.
• The Mailet refers to the pre-declared matchers via the supplied name
attribute, the name being an alias to the composite class instance.
• The Matchers are loaded and initialised by the JamesMatcherLoader, which
derives from the MatcherLoader interface; that interface has been modified to
include an additional signature accepting the alias name.
• A Not matcher has been proposed to negate another matcher's result to
provide negated logic construction - it mimics the implementation of the Not
functionality performed in processor recipient handling.
• Three composites are initially proposed: And, Or and Xor.
• The And produces the intersection of two or more child-matcher recipient
results.
• The Or produces the union of two or more child-matcher recipient results.
• The Xor produces the exclusive or (non-identity) composition of two or
more child-matcher recipient results. 
• All operations are commutative. Each is applied to the child matchers'
recipient collections in order, but in certain cases evaluation is
short-circuited to optimise performance (assuming that matchers do not have
side-effects).
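
The set semantics in the list above can be sketched with plain java.util
collections. This is only an illustration under stated assumptions, not the
proposed implementation: the class RecipientSets and its method names are
hypothetical, recipients are modelled as strings rather than mail addresses,
and Xor is read here as "matched by at least one child but not by every
child", which is one interpretation of the description.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of James: models the recipient-set semantics only.
class RecipientSets {

    // And: intersection of all child results. Stops early once the
    // intersection is empty, since later children cannot add recipients back.
    static Set<String> and(List<? extends Collection<String>> results) {
        Set<String> out = null;
        for (Collection<String> r : results) {
            if (out == null) {
                out = new LinkedHashSet<String>(r);
            } else {
                out.retainAll(r);
            }
            if (out.isEmpty()) {
                break;
            }
        }
        return out == null ? new LinkedHashSet<String>() : out;
    }

    // Or: union of all child results.
    static Set<String> or(List<? extends Collection<String>> results) {
        Set<String> out = new LinkedHashSet<String>();
        for (Collection<String> r : results) {
            out.addAll(r);
        }
        return out;
    }

    // Xor (assumed reading): recipients in the union but not in the intersection.
    static Set<String> xor(List<? extends Collection<String>> results) {
        Set<String> out = or(results);
        out.removeAll(and(results));
        return out;
    }

    public static void main(String[] args) {
        List<String> first = Arrays.asList("alice", "bob");
        List<String> second = Arrays.asList("bob", "carol");
        List<List<String>> results = Arrays.asList(first, second);
        System.out.println("And: " + and(results));
        System.out.println("Or:  " + or(results));
        System.out.println("Xor: " + xor(results));
    }
}
```

With two children this reading of Xor is the ordinary symmetric difference;
the two readings only diverge when there are three or more children.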

The following composite pattern declaration is typical of what can be
achieved, and is an extract from my James server config.xml file:

         <!-- this isn't a good spam-check but it illustrates what you can do -->
         <matcher name="spam-check" match="Or">
                <matcher match="HasRecipientsInDomainNotMatchingRegex=arising.com.au,.*(ralph|pocketfms|angelflight|vk1brh|accounts|cuteftp|ralph\.holland|resume|trx)@arising.com.au.*"/>
                <matcher match="And">
                        <matcher match="Not">
                            <matcher match="HostIs=65.55.116.84"/>
                        </matcher>
                        <matcher match="HasHeaderWithRegex=X-Verify-SMTP,Host(.*)sending to us was not listening"/>
                </matcher>
                <matcher match="HasHeaderWithRegex=X-DNS-Paranoid,(.*)"/>
                <matcher match="HasHeaderWithRegex=Subject,(.*)([Vv][Ii][Aa][Gg][Rr][Aa]|[Cc][Ii][Aa][Ll][Ii][Ss]|[Vv][Ii][Cc][Oo][Dd][Ii][Nn])(.*)"/>
                <matcher match="HasHeaderWithRegex=From,(.*)([Vv][Ii][Aa][Gg][Rr][Aa]|[Cc][Ii][Aa][Ll][Ii][Ss]|[Vv][Ii][Cc][Oo][Dd][Ii][Nn])(.*)"/>
                <matcher match="HasHeaderWithRegex=Subject,.*Download Adobe PDF Reader For Windows.*"/>
                <matcher match="HasHeaderWithRegex=From,(.*)([Ee][Nn][Ll][Aa][Rr][Gg][Ee][Mm][Ee][Nn][Tt])(.*)([Pp][Ii][Ll][Ll][Ss])(.*)"/>
                <matcher match="HasHeaderWithRegex=Subject,(.*)([Ee][Nn][Ll][Aa][Rr][Gg][Ee][Mm][Ee][Nn][Tt])(.*)([Pp][Ii][Ll][Ll][Ss])(.*)"/>
                <matcher match="InSpammerBlacklist=zen.spamhaus.org"/>
                <matcher match="SenderIsRegex=(.*)([Vv][Ii][Aa][Gg][Rr][Aa]|[Cc][Ii][Aa][Ll][Ii][Ss]|[Vv][Ii][Cc][Oo][Dd][Ii][Nn])(.*)"/>
         </matcher>

         <mailet match="spam-check" class="ToProcessor">
                <processor>spam</processor>
         </mailet>

The affected code is:

MatcherLoader and the JamesMatcherLoader derivative with the new included
signature:

         /**
          * @param matchName the regular class name with an optional condition
          *                  expression
          * @param alias the alias, taken from the name attribute
          */
         public Matcher getMatcher(String matchName, String alias)
                 throws MessagingException;
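
One way to picture the alias handling is a small registry that maps each name
attribute to the instance declared under it. The sketch below is hypothetical
and is not the JamesMatcherLoader code: the class AliasRegistry, its methods,
and the choice to reject duplicate names are all assumptions made for
illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: shows how a name attribute could resolve to a
// pre-declared instance. Not taken from the proposed loader.
class AliasRegistry<T> {

    private final Map<String, T> byAlias = new HashMap<String, T>();

    // Remember an instance under its alias. Rejecting duplicate names is an
    // assumption of this sketch, not a documented rule.
    void register(String alias, T instance) {
        if (alias == null || alias.isEmpty()) {
            throw new IllegalArgumentException("alias is required");
        }
        if (byAlias.containsKey(alias)) {
            throw new IllegalStateException("duplicate matcher name: " + alias);
        }
        byAlias.put(alias, instance);
    }

    // Returns the pre-declared instance, or null when the match attribute is
    // not a known alias and should be treated as an ordinary class name.
    T lookup(String match) {
        return byAlias.get(match);
    }

    public static void main(String[] args) {
        AliasRegistry<String> registry = new AliasRegistry<String>();
        registry.register("spam-check", "the pre-declared Or composite");
        System.out.println(registry.lookup("spam-check"));
        System.out.println(registry.lookup("HostIs=65.55.116.84"));
    }
}
```

A lookup that returns null would then fall through to the normal class-name
loading path, which is why the declaration has to precede the first use.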

New org.apache.mailet.CompositeMatcher:

/**
 * A CompositeMatcher contains child matchers that are invoked in turn and
 * their recipient results are composed using the composite class operation.
 * (See And, Or, Xor and Not.)
 * One or more children may be supplied to a composite via declarations
 * inside a <processor> element in the james-config.xml file. When the
 * composite is the outer-level declaration it must be named as in the
 * example below.
 * The composite matcher is referenced by name in the match attribute of a
 * subsequent mailet. It may be referenced any number of times in this way.
 * Any matcher may be included as a child of a composite matcher, including
 * another composite matcher or Not.
 * As a consequence, the class names And, Or, Not and Xor are permanently
 * reserved.
 * <pre>
 *   <matcher name="a-composite" match="Or">
 *              <matcher match="And">
 *                      <matcher match="Not">
 *                          <matcher match="HostIs=65.55.116.84"/>
 *                      </matcher>
 *                      <matcher match="HasHeaderWithRegex=X-Verify-SMTP,Host(.*)sending to us was not listening"/>
 *              </matcher>
 *              <matcher match="HasHeaderWithRegex=X-DNS-Paranoid,(.*)"/>
 *       </matcher>
 *       <mailet match="a-composite" class="ToProcessor">
 *              <processor>spam</processor>
 *       </mailet>
 * </pre>
 * @author Ralph Holland
 *
 */
public interface CompositeMatcher extends Matcher 
{
     
   /**
    * @return an iterator over the child matchers
    */
    public Iterator iterator();
    
  /**
    * Add a child matcher to this composite matcher. This is called by
    * SpoolManager.setupMatcher().
    * @param matcher the child matcher to add
    */
    public void add(Matcher matcher);
    
    
}

New org.apache.mailet.CompositeMatcherBase:

public abstract class CompositeMatcherBase extends GenericMatcher
        implements CompositeMatcher
{
    /**
     * This lets the configurator build up the composition (which might be
     * composed of other composites).
     * @param matcher the child matcher to add
     */
    public void add(Matcher matcher)
    {
        matchers.add(matcher);
    }
    
    /**
     * @return iterator to child-matchers
     */
    public Iterator iterator()
    {
        return matchers.iterator();
    }
    
    
    private Collection matchers = new ArrayList();
    
}
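
To show how such a tree is assembled and evaluated, here is a self-contained
sketch of the composite pattern using stand-in types. Everything in it is
hypothetical: SimpleMatcher replaces the real Matcher and Mail types,
recipients are plain strings, the allIf leaf stands in for all-or-nothing
matchers such as HostIs, and the And, Or and Not bodies are one plausible
reading of the descriptions in this mail rather than the submitted code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Stand-in types for illustration only.
class CompositeSketch {

    interface SimpleMatcher {
        Set<String> match(Set<String> recipients);
    }

    // Plays the role of CompositeMatcherBase: it only stores the children.
    abstract static class Composite implements SimpleMatcher {
        protected final List<SimpleMatcher> children = new ArrayList<SimpleMatcher>();

        Composite add(SimpleMatcher child) {
            children.add(child);
            return this;
        }
    }

    static class Or extends Composite {
        public Set<String> match(Set<String> recipients) {
            Set<String> out = new LinkedHashSet<String>();
            for (SimpleMatcher m : children) {
                out.addAll(m.match(recipients));
            }
            return out;
        }
    }

    static class And extends Composite {
        // With no children this sketch returns all recipients.
        public Set<String> match(Set<String> recipients) {
            Set<String> out = new LinkedHashSet<String>(recipients);
            for (SimpleMatcher m : children) {
                out.retainAll(m.match(recipients));
                if (out.isEmpty()) {
                    break; // short-circuit: the intersection cannot grow again
                }
            }
            return out;
        }
    }

    static class Not extends Composite {
        // Recipients that none of the children matched.
        public Set<String> match(Set<String> recipients) {
            Set<String> out = new LinkedHashSet<String>(recipients);
            for (SimpleMatcher m : children) {
                out.removeAll(m.match(recipients));
            }
            return out;
        }
    }

    // Leaf matcher that matches every recipient or none.
    static SimpleMatcher allIf(final boolean condition) {
        return new SimpleMatcher() {
            public Set<String> match(Set<String> recipients) {
                return condition
                        ? new LinkedHashSet<String>(recipients)
                        : new LinkedHashSet<String>();
            }
        };
    }

    // Same shape as the "a-composite" example: Or(And(Not(host), headerA), headerB).
    static SimpleMatcher build(boolean host, boolean headerA, boolean headerB) {
        return new Or()
                .add(new And().add(new Not().add(allIf(host))).add(allIf(headerA)))
                .add(allIf(headerB));
    }

    public static void main(String[] args) {
        Set<String> recipients = new LinkedHashSet<String>(
                Arrays.asList("a@example.com", "b@example.com"));
        System.out.println(build(false, true, false).match(recipients));
        System.out.println(build(true, true, false).match(recipients));
    }
}
```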

New org.apache.james.transport.matchers.And:

public class And extends CompositeMatcherBase
{
   /**
     * This is the And CompositeMatcher - consider it to be an intersection
     * of the results.
     * If any match returns an empty recipient result the matching is
     * short-circuited.
     * @return the And composition of the nested matchers.
     */
    public Collection match(Mail mail) throws MessagingException
...

New org.apache.james.transport.matchers.Or:

public class Or extends CompositeMatcherBase {
    

    /**
     * This is the Or CompositeMatcher - consider it to be a union of the
     * results.
     * If any match returns an empty recipient result the matching is
     * short-circuited.
     * @return the Or composition of the nested matchers.
     */
    public Collection match(Mail mail) throws MessagingException
...

New org.apache.james.transport.matchers.Xor:
public class Xor extends CompositeMatcherBase {

    /**
     * This is the Xor CompositeMatcher - consider it to be the inequality
     * operator for recipients.
     * If any recipients match in all the matcher results then the result
     * does not include that recipient.
     * @return the Xor composition of the composed matchers.
     */
    public Collection match(Mail mail) throws MessagingException
...

New org.apache.james.transport.matchers.Not:
public class Not extends CompositeMatcherBase {

    /**
     * This is the Not CompositeMatcher - consider what wasn't in the result
     * set of each matcher.
     * Of course it is easier to understand if it only includes one matcher.
     * @return the Negated composition of the nested matchers.
     */
    public Collection match(Mail mail) throws MessagingException
    ...

Modifications to SpoolManager.initialize() to handle loading and
initialization of the composites.


Various unit tests have been constructed to test the composite operations,
though there are no unit tests yet to ensure that the actual
JamesMatcherLoader loads any matcher correctly. I think such a test should
exist rather than mocking the class loading.

If you like this design and vote for it, I will provide the complete source
code for evaluation and re-distribution in James, with the only caveat being
attribution acknowledgement. 

I am also open to renaming the composite classes because, in terms of the
recipient result sets, And is Intersection, Or is Union, and Xor is NotEqual.
As per the James processing chain, a result set counts as true if at least
one recipient is returned. If not all recipients are returned, a clone of the
mail carrying the remaining recipients (the Not of the result) is sent to the
next processor in the processing chain until the mail is acquitted.
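
The partial-match rule just described can be sketched as follows. This is a
hypothetical illustration, not James code: the class PartialMatchSketch is
invented here, recipients are plain strings, and treating a null result as
"no match" is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of how a partial match divides the recipients.
class PartialMatchSketch {

    // A result counts as a match when it holds at least one recipient.
    // Treating null as "no match" is an assumption of this sketch.
    static boolean isMatch(Collection<String> matched) {
        return matched != null && !matched.isEmpty();
    }

    // Recipients that were not matched; these would travel on with the clone.
    static Set<String> remainder(Collection<String> all, Collection<String> matched) {
        Set<String> rest = new LinkedHashSet<String>(all);
        if (matched != null) {
            rest.removeAll(matched);
        }
        return rest;
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("alice", "bob", "carol");
        List<String> matched = Arrays.asList("bob");
        System.out.println("match:     " + isMatch(matched));
        System.out.println("remaining: " + remainder(all, matched));
    }
}
```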


Regards,
Ralph Holland
Managing Director
www.arising.com.au
BH:    61 2 61271265
AH:    61 2 62312869
Fax:    61 2 62312768
Mob:  0417 312869 (AH/weekends only)
www.arising.com.au/aviation


