Github user nishihatapalmer commented on the issue:
https://github.com/apache/incubator-metron/pull/541
There is a slightly out of date (note to self: update this!) syntax
document at:
https://github.com/nishihatapalmer/byteseek/blob/master/src/main/java/net/byteseek/parser/regex/Regular%20Expression%20syntax.txt
It gives an overview of most of the syntax, but some of it is only usable
by full regexes, not sequence matchers. In particular, a sequence matcher can
only accept syntax which leads to a fixed-length expression, so these are
**excluded**:
```
* zero to many
+ one to many
() groups
{n,m} n to m copies.
X | Y alternatives.
```
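To picture why variable-length constructs are ruled out, here is a rough sketch in Python. It is a simplified model for illustration only, not byteseek's actual classes or API: the function name `matches_at` and the example sequence are made up. It treats a sequence matcher as a fixed list of byte sets, one per position, so the length of a match is always known before any data is examined.

```python
# Simplified model, NOT byteseek's API: a sequence matcher is a list of
# byte sets, one per position, so its length is always known up front.
def matches_at(sequence, data, pos):
    """Return True if each position's byte set accepts the byte at pos + i."""
    if pos < 0 or pos + len(sequence) > len(data):
        return False
    return all(data[pos + i] in allowed for i, allowed in enumerate(sequence))

# Hypothetical three-byte sequence: 0x01, then 0x02 or 0x03, then any byte.
ANY_BYTE = frozenset(range(256))
sequence = [frozenset({0x01}), frozenset({0x02, 0x03}), ANY_BYTE]

data = bytes([0x00, 0x01, 0x03, 0xFF])

# Try every start position; only offset 1 lines up with the sequence.
found = [p for p in range(len(data)) if matches_at(sequence, data, p)]
```

A construct such as `*` or `X | Y` with alternatives of different lengths has no single length, so it cannot be expressed as a list like this.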
Shorthands defined in this document also do not currently function properly
(e.g. [ascii]).
Finally, note that inversion ^ functions differently to most regular
expression syntaxes. It inverts the token that follows it, rather than acting
as a marker inside a set that inverts the entire set. So most regex syntaxes
would say something like [^ 01 02 03], meaning every byte except 01, 02 and 03.
In byteseek this would be ^[ 01 02 03], as you are inverting the set.
[ ^01 02 03] is also valid, but you are now specifying a set containing
everything but 01 (which already covers 02 and 03).
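The difference can be sketched with plain Python sets. This only models the semantics described above and does not call byteseek; the variable names are made up for illustration.

```python
# Model of the inversion semantics using plain sets of byte values.
# This does not use byteseek; all names here are illustrative.
ALL_BYTES = frozenset(range(256))

def invert(byte_set):
    """Return every byte value that is not in byte_set."""
    return ALL_BYTES - frozenset(byte_set)

# Most regex syntaxes: [^ 01 02 03] means every byte except 01, 02 and 03.
conventional = invert({0x01, 0x02, 0x03})

# byteseek: ^[ 01 02 03] inverts the whole set, giving the same result.
inverted_set = invert({0x01, 0x02, 0x03})

# byteseek: [ ^01 02 03] is a set whose members are "not 01", 02 and 03.
# "Not 01" already contains 02 and 03, so only 01 ends up excluded.
inverted_first_token = invert({0x01}) | {0x02} | {0x03}
```

In this model `inverted_set` equals `conventional` and excludes three byte values, while `inverted_first_token` excludes only 01.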
It's fairly easy to create a different parser if necessary, but most of
the byteseek regex syntax is standard, just oriented towards bytes rather
than strings as the default atomic unit.
Any questions please feel free to ask (and I really must update the syntax
document!).
Regards,
Matt.