There are examples of regexp matchers in the core sitemap. I'm pretty
poor with regular expressions myself, so if you don't know what to put
in the pattern, ask here. I'm sure someone can tell you how to match
**.html but not (**/menu-*.html or **/body-*.html or **/tabs-*.html)
(I think they are the only ones you need to avoid).
So this would be something like ^(?!tab-|menu-|body-).*\.html$ and
^.*/(?!tab-|menu-|body-).*\.html$ respectively.
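For anyone who wants to see the intended behaviour in an engine that does
understand lookahead, here is a minimal sketch using java.util.regex. That
engine is an assumption made for illustration only, and the class and method
names are made up. The patterns in the sketch escape the dot before html, and
the path variant uses [^/]* after the lookahead: with .* there, a backtracking
engine can retreat to an earlier slash and still accept something like
docs/sub/tab-x.html.

```java
import java.util.regex.Pattern;

// Sketch only: java.util.regex stands in for "an engine with lookahead".
// LookaheadSketch and its methods are hypothetical names for this example.
class LookaheadSketch {
    // Bare file name: reject names starting with tab-, menu- or body-.
    static final Pattern NAME =
        Pattern.compile("^(?!tab-|menu-|body-).*\\.html$");

    // Path with at least one directory: the lookahead guards the last
    // segment, and [^/]* keeps the match from crossing another slash.
    static final Pattern PATH =
        Pattern.compile("^.*/(?!tab-|menu-|body-)[^/]*\\.html$");

    static boolean nameMatches(String s) {
        return NAME.matcher(s).matches();
    }

    static boolean pathMatches(String s) {
        return PATH.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] names = {"index.html", "tab-index.html", "table.html"};
        for (String n : names) {
            System.out.println(n + " -> " + nameMatches(n));
        }
        String[] paths = {"docs/index.html", "docs/menu-a.html",
                          "docs/sub/tab-x.html"};
        for (String p : paths) {
            System.out.println(p + " -> " + pathMatches(p));
        }
    }
}
```

Note that the path variant requires a slash, so a bare index.html only matches
the first pattern.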
Unfortunately jakarta-regexp (which is used inside Cocoon) doesn't seem
to support negative lookahead (?!...) and gives me a
'RESyntaxException: Syntax error: Missing operand to closure'.
This has already been reported on the regexp mailing list (see:
http://permalink.gmane.org/gmane.comp.jakarta.regexp.user/168).
Too bad, since jakarta-oro supports Perl5 regexps.
I'll go hunting for a supported regexp and will report back later.
Since I promised an update:
A working regular expression (without negative lookahead) is the following:
^(([^t^m^b].*)|((t[^a].*)|(ta[^b].*)|(tab[^\-].*))|((m[^e].*)|(me[^n].*)|(men[^u].*)|(menu[^\-].*))|((b[^o].*)|(bo[^d].*)|(bod[^y].*)|(body[^\-].*)))\.html$
But then again jakarta-regexp leaves me out in the cold with:
java.lang.StackOverflowError
at org.apache.regexp.RE.matchNodes(Unknown Source)
at org.apache.regexp.RE.matchNodes(Unknown Source)
...
at org.apache.cocoon.matching.AbstractRegexpMatcher.preparedMatch(AbstractRegexpMatcher.java:86)
Again jakarta-oro matches this without problems.
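To check the lookahead-free expression against yet another backtracking
engine, here is a small sketch with java.util.regex. This is an assumption
for illustration, not something Cocoon uses, and the class name is made up.
The pattern is the one posted above, split across string literals only for
readability. Two things the sketch makes visible: names that are a bare
prefix, such as tab.html, are not matched, and the expression only looks at
the start of the string, so it is meant for the file name rather than a full
path.

```java
import java.util.regex.Pattern;

// Sketch only: NoLookaheadSketch is a hypothetical name, and
// java.util.regex is used purely as a second engine to test against.
class NoLookaheadSketch {
    // Same expression as posted, concatenated from four pieces.
    static final Pattern P = Pattern.compile(
        "^(([^t^m^b].*)"
        + "|((t[^a].*)|(ta[^b].*)|(tab[^\\-].*))"
        + "|((m[^e].*)|(me[^n].*)|(men[^u].*)|(menu[^\\-].*))"
        + "|((b[^o].*)|(bo[^d].*)|(bod[^y].*)|(body[^\\-].*)))\\.html$");

    static boolean matches(String name) {
        return P.matcher(name).matches();
    }

    public static void main(String[] args) {
        String[] names = {
            "index.html", "table.html", "books.html",
            "tab-index.html", "menu-index.html", "body-index.html",
            "tab.html"
        };
        for (String n : names) {
            System.out.println(n + " -> " + matches(n));
        }
    }
}
```

The [^t^m^b] class also excludes a leading literal ^, which should not matter
for ordinary file names.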
*sigh*
Torsten