Hi everybody. Been lurking for a while, nice to meet you all.

How simple is simple?

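If it's as simple as fragments of words, and your text is really long
and doesn't change, you'll want to index the text in advance to speed up
the searches. For this case, suffix trees or suffix arrays are very
fast: once the index is built, finding a fragment takes time that depends
on the fragment's length (times a log factor for a plain suffix array)
rather than on a scan of the whole text. Gusfield's "Algorithms on
Strings, Trees, and Sequences" has a good analysis of many methods for
long strings (it's oriented towards genetics...).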
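
To make the suffix array idea concrete, here is a minimal sketch in
Python. The function names are my own, and the construction is the naive
"sort all the suffixes" one, which is only meant to show the lookup; a
really long text would need a proper construction algorithm.

```python
def build_suffix_array(text):
    # Naive construction: sort every suffix start position by the suffix
    # it begins. Simple, but too slow and memory-hungry for huge texts.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, sa, fragment):
    """Return the sorted start positions of every occurrence of fragment."""
    m = len(fragment)

    # Suffixes are sorted, so their first m characters are sorted too.
    # First binary search: leftmost suffix whose prefix is >= fragment.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < fragment:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Second binary search: leftmost suffix whose prefix is > fragment.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= fragment:
            lo = mid + 1
        else:
            hi = mid

    # Everything in sa[start:lo] begins with fragment.
    return sorted(sa[start:lo])


text = "banana bandana"
sa = build_suffix_array(text)   # build once, reuse for every search
print(find_all(text, sa, "ana"))
```

The index is built once, and each search after that is two binary
searches over it, so the text itself is never scanned again.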

If your regexes are not quite that simple, but each one does contain
some fixed fragment of text, you might still use the index for an
initial screening: look up the fragments, then match the regexes only
around those parts of the text.
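
A rough sketch of that screening step, in Python. The rule format
(pattern, literal, window) is something I made up for illustration: it
assumes you know, for each regex, a literal that every match must
contain and a bound on how far a match can extend around it. The plain
str.find scan stands in for the index lookup.

```python
import re


def literal_hits(text, literal):
    # Stand-in for an index lookup (suffix array or suffix tree); this
    # just scans the text and yields each start position of literal.
    pos = text.find(literal)
    while pos != -1:
        yield pos
        pos = text.find(literal, pos + 1)


def screened_matches(text, rules):
    """rules: iterable of (pattern, literal, window) tuples (hypothetical).

    literal must occur in every match of pattern, and window must be
    large enough that no match extends further than that on either side
    of the literal; otherwise matches can be truncated or missed.
    """
    found = set()
    for pattern, literal, window in rules:
        rx = re.compile(pattern)
        for pos in literal_hits(text, literal):
            lo = max(0, pos - window)
            hi = min(len(text), pos + len(literal) + window)
            # Run the regex only inside the window around the hit.
            for m in rx.finditer(text, lo, hi):
                found.add((pattern, m.start(), m.end()))
    return sorted(found)


text = "error code 42 at 10:15, warning code 7 at 11:02"
rules = [(r"code \d+", "code", 8), (r"\d\d:\d\d", ":", 4)]
for pattern, start, end in screened_matches(text, rules):
    print(pattern, start, end, text[start:end])
```

Whether this pays off depends on how rare the fragments are: a literal
that turns up everywhere screens out nothing.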

Daniel Vainsencher

Eli Marmor <[EMAIL PROTECTED]> wrote:
> And a similar question: If I have a collection of hundreds (simple)
> regular expressions, and want to find all the matches of them in a long
> free text, is there any Open Source library for this purpose?  (like
> flex, but without generating C code + compilation to machine code; Just
> a function library).

