While working on the configuration file definition of Licenses it became
apparent that a "NOT" IHeaderMatcher was required that would reverse the
state of another IHeaderMatcher.

Specifically, the problem arises when looking at the Apache 2.0 matcher and
the Applied Apache 2.0 matcher.  The difference appears to be that the
"applied" matcher has a copyright statement.  This single difference
means that the Applied matcher will only match if the copyright statement
comes before the Apache 2.0 license statement in the file, and then only if
the applied matcher is queried for approval first.  What is needed is a
"not" so that the Apache 2.0 matcher can be written as:
{noformat}
<all>
    <any id="ALStandard">
        <text>Licensed under the Apache License, Version 2.0 (the
"License")</text>
        <text>Licensed to the Apache Software Foundation (ASF) under one
            or more contributor license agreements; and to You under the
            Apache License, Version 2.0.</text>
        <text>http://www.apache.org/licenses/LICENSE-2.0</text>
        <text>https://www.apache.org/licenses/LICENSE-2.0</text>
        <text>http://www.apache.org/licenses/LICENSE-2.0.html</text>
        <text>https://www.apache.org/licenses/LICENSE-2.0.html</text>
        <text>http://www.apache.org/licenses/LICENSE-2.0.txt</text>
        <text>https://www.apache.org/licenses/LICENSE-2.0.txt</text>
        <spdx name='Apache-2.0' />
    </any>
    <not><copyright /></not>
</all>
{noformat}

This ensures that the plain Apache 2.0 match does not have a copyright
statement, while the applied license is written as:
{noformat}
<all>
    <matcher_ref refid="ALStandard" />
    <copyright />
</all>
{noformat}
So the basic text or SPDX matches in both cases, but the Apache 2.0 matcher
requires that there be no copyright statement and the applied matcher
requires one.

All IHeaderMatchers return false by default, except "not", which logically
would return true. However, because we read the file one line at a time and
try to perform the matches in the most efficient way possible, "not" is
undecidable until either the enclosed matcher returns true or the end of
the file is reached.

The only matchers that contain other matchers are "not", "any", and "all".
The "not" case is described above; "any" and "all" must behave as follows
if they contain a "not":

 * any - any enclosed matcher may trigger a "true".
 * all - may trigger "true" only if all enclosed matchers have returned
"true".

So the problem is to determine how to trigger "not" to return true only at
EOF.  Since we don't have pointers to all of the "not" IHeaderMatcher
instances and we don't have a mechanism to traverse the IHeaderMatcher
tree, I think that the easiest solution is to add a method to
IHeaderMatcher called poll().  poll() will query the state of the
IHeaderMatcher to return the result for EOF.  For most instances this is
the last state returned; for "any" and "all" it means polling the enclosed
objects; for "not" it means polling the contained matcher and returning the
logical opposite of that.  This also means that "not" always returns false
for the test(String) call.
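
To make the proposal concrete, here is a minimal sketch of how poll() could
behave.  It is not the real IHeaderMatcher: the Matcher interface, the
text() leaf (used here as a stand-in for both the license text and the
copyright matcher), and the scan() driver are all hypothetical names made
up for illustration.  Only test(String) and poll() come from the proposal
above.

```java
public class PollSketch {

    /** Hypothetical, simplified stand-in for IHeaderMatcher. */
    interface Matcher {
        boolean test(String line); // called once for each line read
        boolean poll();            // proposed: the final answer at EOF
    }

    /** Leaf matcher (assumption): latches true once a line contains the text. */
    static Matcher text(String needle) {
        return new Matcher() {
            private boolean matched;
            public boolean test(String line) {
                matched = matched || line.contains(needle);
                return matched;
            }
            public boolean poll() { return matched; } // last state returned
        };
    }

    /** "not": undecidable while reading, so test() is always false. */
    static Matcher not(Matcher inner) {
        return new Matcher() {
            public boolean test(String line) {
                inner.test(line); // the enclosed matcher still sees the line
                return false;
            }
            public boolean poll() { return !inner.poll(); }
        };
    }

    /** "any": true as soon as one enclosed matcher is true. */
    static Matcher any(Matcher... enclosed) {
        return new Matcher() {
            public boolean test(String line) {
                boolean result = false;
                // No short circuit, so every enclosed matcher sees every line.
                for (Matcher m : enclosed) { result |= m.test(line); }
                return result;
            }
            public boolean poll() {
                for (Matcher m : enclosed) { if (m.poll()) { return true; } }
                return false;
            }
        };
    }

    /** "all": true only when every enclosed matcher is true. */
    static Matcher all(Matcher... enclosed) {
        return new Matcher() {
            public boolean test(String line) {
                boolean result = true;
                for (Matcher m : enclosed) { result &= m.test(line); }
                return result;
            }
            public boolean poll() {
                for (Matcher m : enclosed) { if (!m.poll()) { return false; } }
                return true;
            }
        };
    }

    /** Hypothetical driver: stop early on a match, otherwise poll at EOF. */
    static boolean scan(Matcher matcher, String... lines) {
        for (String line : lines) {
            if (matcher.test(line)) { return true; }
        }
        return matcher.poll();
    }
}
```

With this shape, an "all" that encloses a "not" can never succeed from
test(String) alone, so it is always decided by poll() at EOF, which is the
behaviour described above.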

There are other possible solutions, but I think this may be the cleanest,
and I wanted to run it by any interested parties before making the change,
as it will change the logic in the reporting engine itself.

 Thanks for your time and attention,
 Claude
