Wouldn't this interact rather badly with the /gc option (which also leaves
C<pos> set on failure)?

This question arose because I was trying to work out how one would write a
lexer with the new /z option, and it made my head ache ;-)
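
For comparison, this is roughly how such a lexer looks with today's /gc and
\G. It is only a minimal sketch: the C<lex> sub and the token names are
invented for illustration. Each failed alternative leaves C<pos> untouched
because of /c, which is exactly the behaviour a /z that also sets C<pos> on
failure would have to coexist with.

```perl
use strict;
use warnings;

# Tokenise a string with \G and /gc. Because of /c, a failed alternative
# leaves pos() where it was, so the next pattern retries at the same offset.
sub lex {
    my ($input) = @_;
    my @tokens;
    pos($input) = 0;
    while (pos($input) < length($input)) {
        if    ($input =~ /\G\s+/gc)            { next }
        elsif ($input =~ /\G(\d+)/gc)          { push @tokens, [NUM   => $1] }
        elsif ($input =~ /\G([A-Za-z_]\w*)/gc) { push @tokens, [IDENT => $1] }
        elsif ($input =~ /\G([-+*\/=()])/gc)   { push @tokens, [OP    => $1] }
        else  { die "Unexpected character at offset " . pos($input) . "\n" }
    }
    return @tokens;
}

# prints: IDENT(x) OP(=) NUM(42) OP(+) IDENT(y1)
print join(' ', map { "$_->[0]($_->[1])" } lex('x = 42 + y1')), "\n";
```

Note that the error branch can still report C<pos> after every alternative
has failed; that only works because /c preserves it.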
> As you can see from the example code, the program flow stays very close
> to what people would ordinarily program under normal circumstances.
>
> By contrast, RFC 93 proposes another solution to the same problem, but
> using callbacks. Since the same sub must do one of several things, the
> first thing that needs to be done is to channel different kinds of
> requests to their own handler. As a result, you need a complete rewrite
> from what you'd use in the ordinary case.
>
> I think that a lot of people will find my approach far less
> intimidating.

I'm not sure I see that this:

> my $chunksize = 1024;
> while(read FH, my $buffer, $chunksize) {
>     while(/(abcd|bc)/gz) {
>         # do something boring with the matched string:
>         print "$1\n";
>     }
>     if(defined pos) { # end-of-buffer exception
>         # append the next chunk to the current one
>         read FH, $buffer, $chunksize, length $buffer;
>         # retry matching
>         redo;
>     }
> }

is less intimidating or closer to the "ordinary program flow" than:

    \*FH =~ /(abcd|bc)/g;

(as proposed in RFC 93).

> =head2 Match prefix
>
> It can be useful to be able to recognize if a string could possibly be a
> prefix for a potential match. For example in an interactive program,
> you want to allow a user to enter a number into an input field, but
> nothing else. After every single keystroke, you can test what he just
> entered against a regex matching the valid format for a number, so that
> C<1234E> can be recognized as a prefix for the regex
>
> /^\d+\.?\d*(?:E[+-]?\d+)$/

Isn't this just:

    \*STDIN =~ /^\d+\.?\d*(?:E[+-]?\d+)$/
        or die "Not a number";

???
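
For what it's worth, current Perl 5 has neither /z nor RFC 93's stream
matching, so the usual workaround is a second pattern that accepts every
prefix of a valid number. The sketch below assumes that approach. The
C<is_number_prefix> name is invented here, and the prefix pattern is derived
by hand from the regex quoted above, not generated from it.

```perl
use strict;
use warnings;

# Full format, as in the regex quoted above (the exponent is mandatory there).
my $number = qr/^\d+\.?\d*(?:E[+-]?\d+)$/;

# Hand-derived companion pattern: accepts any string that can still be
# extended into a match of $number, including the empty string.
my $prefix = qr/^(?:\d+\.?\d*(?:E[+-]?\d*)?)?$/;

# Hypothetical helper: true if $s could be the start of a valid number.
sub is_number_prefix {
    my ($s) = @_;
    return $s =~ $prefix ? 1 : 0;
}

for my $typed ('1234E', '1234E+', '12.5E-3', '12x', 'E5') {
    printf "%-8s prefix=%d complete=%d\n",
        $typed, is_number_prefix($typed), ($typed =~ $number ? 1 : 0);
}
```

Here C<1234E> is accepted as a prefix but not as a complete number, while
C<12x> and C<E5> are rejected outright. The obvious drawback is that the two
patterns have to be kept in sync by hand.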
Damian