Hi,
I am not exactly sure whether this is meant to be a feature or is a
bug, but when parsing files as input streams (with the help of the
org.apache.oro.text.awk package), patterns are matched on a line-by-line
basis (i.e. delimited by a newline character). Therefore, if a
pattern extends beyond a single line (as is often the case with HTML
files), it is not matched.
If this is indeed a bug, all that has to be done is to comment out line
614 in the file AwkCompiler.java, which is the call that excludes the
newline character from the set matched by '.':
    ...
    else if(__lookahead == '.') {
      CharacterClassNode characterSet;
      __match('.');
      characterSet = new NegativeCharacterClassNode(__position++);
      // characterSet._addToken('\n');
      current = characterSet;
    ...
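
To illustrate the effect of the change, here is a minimal sketch that
uses java.util.regex as a stand-in; it does not use the ORO Awk classes.
The class name DotNewlineDemo, the helper found() and the sample HTML
string are made up for illustration. The class [^\n] plays the role of
'.' as currently compiled, and '.' with Pattern.DOTALL plays the role of
'.' once line 614 is commented out.

```java
import java.util.regex.Pattern;

public class DotNewlineDemo {
    // Stand-in for '.' in the unmodified code: any character except '\n'.
    static final Pattern DOT_EXCLUDES_NEWLINE =
        Pattern.compile("<b>[^\\n]*</b>");

    // Stand-in for '.' with the _addToken('\n') call commented out:
    // any character at all, including '\n'.
    static final Pattern DOT_INCLUDES_NEWLINE =
        Pattern.compile("<b>.*</b>", Pattern.DOTALL);

    // Returns true if the pattern matches anywhere in the input.
    static boolean found(Pattern p, String input) {
        return p.matcher(input).find();
    }

    public static void main(String[] args) {
        // An HTML element whose content spans two lines.
        String html = "<b>bold text split\nacross two lines</b>";

        System.out.println(found(DOT_EXCLUDES_NEWLINE, html)); // prints "false"
        System.out.println(found(DOT_INCLUDES_NEWLINE, html)); // prints "true"
    }
}
```

The first pattern cannot get past the newline inside the element, which
is the behaviour described above; the second matches the whole element.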
I am presently using this modified version of the class. However, if
you see fit to include it in the official distribution, it would be much
appreciated.
Take Care.
Terence Jacyno