In message <[EMAIL PROTECTED]>, "Noel J. Bergman" writes:
>An alternative would be to add an Observer (actually, wouldn't that be a
>Listener, to remain consistent with Java terminology? :-)) with the pattern,
>although it seems that none of the regex engines support compiling multiple
>patterns, which I find truly bizarre. That would allow the Listener code
>to execute as each pattern is found in the stream.

There are some classes related to this in org.apache.oro.text that I've never been particularly satisfied with. MatchAction is basically a listener/observer for MatchActionProcessor, which processes input line by line, awk-style, and invokes registered MatchActions when their respective patterns are matched. The classes are too special-purpose and are geared toward filtering operations: they require an output stream to be provided along with an input, and they support an awk-style field separator. However, the motivation for them is similar, and this might be a good opportunity to generalize their basis to accommodate the use case you have in mind.

Actually, on re-reading your original message, I think I didn't understand what you were looking for. You want to be able to read from an input stream or write to an output stream as you normally would. But, transparently to this reading or writing, you want the data read or written to be tested for pattern matches (on the continuous stream of data), and you want to be notified of these matches. If I understand that correctly, then that is different from what I had initially understood (although you can use a tee-like stream copier to graft on AwkStreamInput).

In that case, the kicker is the problem I mentioned in my last message: you cannot definitively identify matches in a stream without reading (and buffering) the entire stream. This is not a problem with strictly DFA matching as per AwkMatcher, but it is a problem for Perl-type matching. If you are willing to live with an Expect-like compromise of limited buffering, or with my suggestion of specifying a bound on the length of a match, this is tractable and perhaps an appropriate addition to org.apache.oro.io. At the same time, we can take the opportunity to implement the matcher factories we've been putting off, so that we can wrap jakarta-regexp, java.util.regex, or whatever, and you can use whichever regular expression package you want with the class.

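To make the bounded-match compromise concrete, here is a minimal sketch of a reader that passes data through unchanged while testing it for matches and notifying a listener. It is written against java.util.regex rather than ORO, and the names MatchNotifyingReader, MatchListener, and maxMatchLength are hypothetical, not existing classes or parameters in any of the packages mentioned. It assumes the pattern never matches the empty string, and it rescans its retained window from scratch on every read.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: a pass-through Reader that reports pattern matches
// found in the data flowing through it, buffering only a bounded window.
class MatchNotifyingReader extends FilterReader {

    // Hypothetical listener interface, not part of any existing package.
    interface MatchListener {
        void matchFound(String matchedText);
    }

    private final Pattern pattern;
    private final int maxMatchLength;   // assumed upper bound on a match
    private final MatchListener listener;
    private final StringBuilder window = new StringBuilder();

    MatchNotifyingReader(Reader in, Pattern pattern, int maxMatchLength,
                         MatchListener listener) {
        super(in);
        this.pattern = pattern;
        this.maxMatchLength = maxMatchLength;
        this.listener = listener;
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        if (c >= 0) {
            window.append((char) c);
            scan(false);
        } else {
            scan(true);
        }
        return c;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            window.append(buf, off, n);
            scan(false);
        } else if (n < 0) {
            scan(true);
        }
        return n;
    }

    // Quick and dirty: restart matching over the whole retained window.
    private void scan(boolean eof) {
        Matcher m = pattern.matcher(window);
        int consumed = 0;
        while (m.find()) {
            boolean mayGrow = m.hitEnd() && !eof
                    && window.length() - m.start() < maxMatchLength;
            if (mayGrow) {
                // More input could extend this match: keep it and wait.
                window.delete(0, m.start());
                return;
            }
            listener.matchFound(m.group());
            consumed = m.end();
        }
        if (eof) {
            window.setLength(0);
            return;
        }
        // Drop reported text, but keep a tail that could begin a later match.
        int keepFrom = Math.max(consumed, window.length() - (maxMatchLength - 1));
        window.delete(0, keepFrom);
    }
}
```

The trade-off is the one described above: a match longer than maxMatchLength is reported as soon as the bound is reached and may be split, and constructs that look backward (anchors, lookbehind) can behave differently once the front of the window has been discarded.
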
I think the interface will be some variation of what Leo offered (with whatever additional tweaks may arise; e.g., you have to be able to specify the patterns to be matched, and maybe you want to register multiple listeners instead of just one), but the internals of an implementation are dependent on the limitations I mentioned. There's a quick and dirty way to get this done, but I believe later fine-tuning for performance or soundness of overall design may require whatever regex matcher is used to support incremental matching. For example, if you're in the middle of a match but not enough data has been written or read yet to determine whether a match exists, you want to be able to continue the matching process from where it left off rather than restarting from scratch. Preserving matching progress is definitely doable for DFAs, but NFAs may just have to start over again. In practice, it may not make a big difference, which is why I would go the quick and dirty way first.

daniel
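
As an illustration of why preserving matching progress is straightforward for a DFA, the sketch below uses the simplest possible case: a literal string, whose Knuth-Morris-Pratt failure table acts as a small DFA. This is not AwkMatcher and not an ORO API; IncrementalLiteralMatcher and feed are hypothetical names. The only thing carried between chunks is the current state, so nothing has to be buffered or rescanned.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: incremental matching of a literal with a KMP automaton.
final class IncrementalLiteralMatcher {
    private final char[] pat;
    private final int[] fail;   // KMP failure table
    private int state = 0;      // pattern characters matched so far
    private long offset = 0;    // absolute position in the stream

    IncrementalLiteralMatcher(String literal) {
        if (literal.isEmpty()) {
            throw new IllegalArgumentException("literal must not be empty");
        }
        pat = literal.toCharArray();
        fail = new int[pat.length];
        for (int i = 1, k = 0; i < pat.length; i++) {
            while (k > 0 && pat[i] != pat[k]) {
                k = fail[k - 1];
            }
            if (pat[i] == pat[k]) {
                k++;
            }
            fail[i] = k;
        }
    }

    // Feeds one chunk and returns the absolute start offsets of the matches
    // completed within it. State survives between calls, so a match may
    // begin in one chunk and finish in a later one.
    List<Long> feed(CharSequence chunk) {
        List<Long> starts = new ArrayList<>();
        for (int i = 0; i < chunk.length(); i++, offset++) {
            char c = chunk.charAt(i);
            while (state > 0 && c != pat[state]) {
                state = fail[state - 1];
            }
            if (c == pat[state]) {
                state++;
            }
            if (state == pat.length) {
                starts.add(offset - pat.length + 1);
                state = fail[state - 1];   // allow overlapping matches
            }
        }
        return starts;
    }
}
```

A backtracking matcher has no comparably small state to save, which is why restarting over a bounded buffer is the pragmatic first step.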
