Started to type this in the previous email and wow did it get too long....
In general I'm still not sure what kind of properties we might want to use to
configure all this. Here's a sample of the XML I imagine:
An include based approach:
  <scanning>
    <includes>
      <package>org.superbiz</package>
      <package>org.wonderbiz</package>
      <class>com.techie.Widget</class>
      <exceptions>
        <package>org.superbiz.util</package>
        <class>com.superbiz.Foo</class>
        <pattern>.*Test</pattern>
      </exceptions>
    </includes>
  </scanning>
An exclude based approach:
  <scanning>
    <excludes>
      <pattern>org.*</pattern>
      <exceptions>
        <package>org.superbiz</package>
        <package>org.wonderbiz</package>
      </exceptions>
    </excludes>
  </scanning>
Let me get straight to the point and say I'm not sure an exclude based approach
is useful. Factually, whichever has the fewest rules will be faster. Our
current classpath filtering has a lot of built-in rules we turn on if you
change the default settings, and is hence a little slow if you actually use it.
I copied the Include+Exclude vs Exclude+Include from HTTPD. They call it
Allow,Deny vs Deny,Allow. The names and descriptions are not great and people
clearly misunderstand them. Here's how they have them documented:
  Ordering is one of:

  Allow,Deny
      First, all Allow directives are evaluated; at least one must
      match, or the request is rejected. Next, all Deny directives are
      evaluated. If any matches, the request is rejected. Last, any
      requests which do not match an Allow or a Deny directive are
      denied by default.

  Deny,Allow
      First, all Deny directives are evaluated; if any match, the
      request is denied unless it also matches an Allow directive. Any
      requests which do not match any Allow or Deny directives are
      permitted.
The descriptions are OK enough, but the names imply the opposite in my brain.
They seem to conflict in my reading of them because there are actually THREE
levels of things going on. The default behavior is NOT in the name.
It should be something like:
  Default Deny, Allow, Deny
      First, all Allow directives are evaluated....

  Default Allow, Deny, Allow
      First, all Deny directives are evaluated....
As a result of the default not being in the name and the topic being somewhat
confusing, you can find tons of blog posts from people advising using
"Deny,Allow" with a "Deny All" rule. It feels good to have deny listed first in
your ordering and it feels great to *see* "Deny All" as a rule, but what you've
actually done is:
  Default Allow, Deny All rule, Allow rules
You just lost one of your three levels. That very last level is meant to allow
you to configure exceptions to your rules. So I don't think they should be
called "Deny,Allow" but simply referred to as they are: exceptions. That gives
you:
  Default Deny, Allow rules, Allow rule exceptions
      First, all Allow directives are evaluated....

  Default Allow, Deny rules, Deny rule exceptions
      First, all Deny directives are evaluated....
If people actually had to *see* Default Allow sitting next to Deny All, they'd
probably be more inclined to read the docs better.
Anyway I'm not sure you ever really need both. The most useful to our
situation is likely Default Deny, Allow Rules, Allow Rule Exceptions. Or in
our terms, Default Exclude, Include Rules, Include Rule Exceptions.
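To make the three levels concrete, here is a minimal sketch in Java. Everything
in it is hypothetical: ScanFilter, pkg(), cls() and pattern() are names made up
for illustration, and it assumes a <package> rule also covers sub-packages and
a <pattern> must match the whole class name. It is only meant to show the
evaluation order, not how the real filter is or should be written.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names exist in the current code.
class ScanFilter {
    private final boolean defaultInclude;               // level 1: the default
    private final List<Predicate<String>> rules;        // level 2: the rules
    private final List<Predicate<String>> exceptions;   // level 3: exceptions to the rules

    ScanFilter(boolean defaultInclude,
               List<Predicate<String>> rules,
               List<Predicate<String>> exceptions) {
        this.defaultInclude = defaultInclude;
        this.rules = rules;
        this.exceptions = exceptions;
    }

    boolean accept(String className) {
        boolean ruleHit = rules.stream().anyMatch(r -> r.test(className));
        boolean excepted = exceptions.stream().anyMatch(e -> e.test(className));
        // A rule flips the default unless an exception cancels the rule.
        boolean flipped = ruleHit && !excepted;
        // This line is the only place the two modes differ.
        return flipped ? !defaultInclude : defaultInclude;
    }

    // Assumption: a package rule matches classes in sub-packages too.
    static Predicate<String> pkg(String name) {
        return c -> c.startsWith(name + ".");
    }

    static Predicate<String> cls(String name) {
        return name::equals;
    }

    // Assumption: the regex must match the entire class name.
    static Predicate<String> pattern(String regex) {
        Pattern p = Pattern.compile(regex);
        return c -> p.matcher(c).matches();
    }

    public static void main(String[] args) {
        // Default Exclude, Include Rules, Include Rule Exceptions
        ScanFilter includeBased = new ScanFilter(false,
                List.of(pkg("org.superbiz"), pkg("org.wonderbiz"), cls("com.techie.Widget")),
                List.of(pkg("org.superbiz.util"), cls("com.superbiz.Foo"), pattern(".*Test")));
        System.out.println(includeBased.accept("org.superbiz.Foo"));
        System.out.println(includeBased.accept("org.superbiz.util.Strings"));
    }
}
```

Passing true for defaultInclude with exclude rules and exclude exceptions gives
the inverse mode with no other change.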
Implementation-wise it's extremely easy to support both. They're one or two
lines of code different. But...
Configuration- and documentation-wise, I'm not sure whether we want to present
it all, or how. If we do accept both, we should definitely flag
situations where people make the above mistake.
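Flagging that mistake could be as simple as the sketch below. It is a guess at
one way to do it, not a proposal for the actual code: ScanConfigCheck, its
method and the list of catch-all patterns are all made up, and it only catches
the literal "exclude everything" spellings rather than every regex that happens
to match everything.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch; the class and the catch-all list are assumptions.
class ScanConfigCheck {
    // Literal spellings of "match everything" we would look for.
    private static final Set<String> CATCH_ALL = Set.of(".*", ".+", "^.*$");

    // Returns a warning when an exclude-based (default include) config
    // contains an exclude-everything rule, which silently turns it into
    // default exclude and uses up the exceptions level as the include list.
    static Optional<String> warning(boolean defaultInclude, List<String> excludePatterns) {
        if (!defaultInclude) {
            return Optional.empty(); // include-based configs cannot make this mistake
        }
        for (String p : excludePatterns) {
            if (CATCH_ALL.contains(p.trim())) {
                return Optional.of("Exclude pattern '" + p.trim()
                        + "' matches everything; use an include-based configuration instead");
            }
        }
        return Optional.empty();
    }
}
```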
Open to any thoughts on how to expose the possibilities in both property and
xml forms.
I'm somewhat leaning towards only ever showing and documenting a default deny,
includes, include exceptions approach and just having the inverse (default
allow, excludes, exclude exceptions) be something that's there but that we
don't mention unless someone complains.
Technically speaking, whatever has the fewest rules will perform the best.
-David