> > Although fpc has a regexpr unit:
> > http://svn.freepascal.org/svn/fpc/trunk/packages/base/regexpr/regexpr.pp
> > It has many todos, such as adding support for | in the search expression.
> > So this unit doesn't have enough functionality.
>
> While '|' support is to be considered basic regex functionality, what is the
> really expected functionality?
>
> Basic seems to be: |()?*+ (non-UNICODE) support (from wikipedia).

| is not basic afaik. From the re_format BSD manpage:

    Obsolete (``basic'') regular expressions differ in several respects.
    `|' is an ordinary character and there is no equivalent for its
    functionality. `+' and `?' are ordinary characters, and their
    functionality can be expressed using bounds (`{1,}' or `{0,1}'
    respectively). Also note that `x+' in modern REs is equivalent to
    `xx*'. The delimiters for bounds are `\{' and `\}', with `{' and `}'
    by themselves ordinary characters. The parentheses for nested
    subexpressions are `\(' and `\)', with `(' and `)' by themselves
    ordinary characters. `^' is an ordinary character except at the
    beginning of the RE or the beginning of a parenthesized
    subexpression, `$' is an ordinary character except at the end of the
    RE or the end of a parenthesized subexpression, and `*' is an
    ordinary character if it appears at the beginning of the RE or the
    beginning of a parenthesized subexpression (after a possible leading
    `^'). Finally, there is one new type of atom, a back reference: `\'
    followed by a non-zero decimal digit d matches the same sequence of
    characters matched by the dth parenthesized subexpression (numbering
    subexpressions by the positions of their opening parentheses, left
    to right), so that (e.g.) `\([bc]\)\1' matches `bb' or `cc' but not
    `bc'.

_______________________________________________
fpc-pascal maillist - fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
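
For anyone who wants to check the equivalences in the re_format excerpt, here is a small sketch. It uses Python's `re` module only because it is easy to run; that is my choice, not anything from the fpc regexpr unit. Python uses the "modern" syntax, so the basic-RE spellings `\(`, `\)`, `\{`, `\}` are written `(`, `)`, `{`, `}` here. The helper name `same_matches` and the sample strings are made up for illustration.

```python
import re

# Python's re module uses "modern" (extended-style) syntax, so the basic-RE
# spellings \( \) \{ \} from the manpage are written ( ) { } below.

def same_matches(pat_a, pat_b, samples):
    """Return True if both patterns accept or reject every sample identically.

    Uses re.fullmatch, so each pattern must match the whole sample string.
    """
    return all(
        bool(re.fullmatch(pat_a, s)) == bool(re.fullmatch(pat_b, s))
        for s in samples
    )

# Illustrative inputs only; not an exhaustive proof of equivalence.
samples = ["", "x", "xx", "xxx", "y", "xy"]

# `x+` behaves like `xx*` and like the bound `x{1,}`.
plus_equiv = (
    same_matches(r"x+", r"xx*", samples)
    and same_matches(r"x+", r"x{1,}", samples)
)

# `x?` behaves like the bound `x{0,1}`.
opt_equiv = same_matches(r"x?", r"x{0,1}", samples)

# Back reference: \1 must repeat exactly what group 1 matched.
backref = re.compile(r"([bc])\1")
backref_results = {s: bool(backref.fullmatch(s)) for s in ("bb", "cc", "bc")}

print(plus_equiv, opt_equiv, backref_results)
```

So `+` and `?` can be replaced by bounds, but nothing in the basic syntax stands in for `|`, which is the point of the manpage text.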