Paul Tremblay wrote at Mon, 15 Jul 2002 18:45:21 +0200:

> On Mon, Jul 15, 2002 at 10:26:25AM +0200, Janek Schleicher wrote:
> 
> 
>> To increase speed, we can also add a lookahead statement:
>> 
>> my @tokens = split / ( \\ (?=\S)            # there's never a whitespace
>>                           (?: [^\s{}]+  |
>>                               [^\s\\}]+ |
>>                               [\\}]       )
>>                        | }
>>                      )/x
>>              => $line;
>> 
>> 
> Can you explain the lookahead statement to me, or better yet, point me to some good
> documentation?
> It is not explained in *Perl Cookbook.*

Try the Camel Book ("Programming Perl") and "Mastering Regular Expressions".


You can write (?= ... ) in a regex
to say that ... has to come next.
But the regex engine stays where it is; the lookahead consumes nothing.
E.g. /(?=\w)(?=\d)(?=[^39])/ still matches in front of a "4",
because all three lookaheads test the same character.
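
To see that a lookahead really consumes nothing, here is a small sketch.
The sample string "a4b" and the variable names are just my assumptions
for illustration:

```perl
use strict;
use warnings;

my $text = "a4b";    # hypothetical sample input

# All three lookaheads inspect the same next character without consuming it.
my ($offset, $length);
if ($text =~ /(?=\w)(?=\d)(?=[^39])/) {
    # @- and @+ hold the start and end offsets of the last successful match.
    ($offset, $length) = ($-[0], $+[0] - $-[0]);
    print "matched at offset $offset, length $length\n";
}

# Because nothing was consumed, a pattern after the lookahead still sees the "4".
my ($digit) = $text =~ /(?=\d)(.)/;
print "captured: $digit\n";
```

The match has length 0: the lookaheads only check what comes next,
they never move the current position.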

A typical example is something like

$keyword =~ /(for|while|do|nothing|die)/;

which is quite slow, as alternations tend to be expensive for the regex engine.

But you can tell perl not to go into the alternation
when there's a comma or so with

$keyword =~ /(?=\w)(for|while|do|nothing|die)/;

as it will only go into the alternation when the next char is a word character,
which is a quick standard test, implemented directly in C.
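
A rough sketch to check that the guard does not change what is matched.
The input strings are made up for this example. Whether the guarded
version is actually faster depends on your perl version and your data,
so it is worth timing both with the core Benchmark module before relying on it:

```perl
use strict;
use warnings;

my $plain   = qr/(for|while|do|nothing|die)/;
my $guarded = qr/(?=\w)(for|while|do|nothing|die)/;

# Hypothetical inputs: some contain a keyword, one is punctuation only.
my @inputs = ('while (1)', ', ; ( )', 'x = 1; die', 'nothing');

my (@plain_hits, @guarded_hits);
for my $input (@inputs) {
    # A failed match in list context returns an empty list,
    # so $p or $g stays undefined and is recorded as 'none'.
    my ($p) = $input =~ $plain;
    my ($g) = $input =~ $guarded;
    push @plain_hits,   defined $p ? $p : 'none';
    push @guarded_hits, defined $g ? $g : 'none';
}

print "plain:   @plain_hits\n";
print "guarded: @guarded_hits\n";
```

Both patterns find the same keywords; the guard only lets the engine
give up earlier at positions where no keyword can start.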

In your case,
the next character can be many things,
but it is definitely not whitespace.
That's why I proposed using (?=\S).
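
As a sanity check, here is a sketch that runs the split from above with
and without the (?=\S) guard. The sample line is only my guess at what
such input might look like, not your real data. Every alternative after
the backslash needs a non-whitespace character anyway, so the guard
should only change the speed, never the tokens:

```perl
use strict;
use warnings;

# The split pattern with the lookahead guard ...
my $with_guard = qr/ ( \\ (?=\S)
                       (?: [^\s{}]+  |
                           [^\s\\}]+ |
                           [\\}]       )
                     | }
                     )/x;

# ... and the same pattern without it.
my $without_guard = qr/ ( \\
                          (?: [^\s{}]+  |
                              [^\s\\}]+ |
                              [\\}]       )
                        | }
                        )/x;

my $line = '{\rtf1\ansi \b bold\b0  text \ }';    # hypothetical sample line

my @guarded   = split $with_guard,    $line;
my @unguarded = split $without_guard, $line;

# Join with NUL so that empty fields cannot hide a difference.
my $same = join("\0", @guarded) eq join("\0", @unguarded);

print join('|', @guarded), "\n";
print $same ? "same tokens\n" : "tokens differ\n";
```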

> As John pointed out, your solution leaves off the leading open bracket. However, it
> is about twice as quick as the previous solution. Your solution tokenizes this line:

I tried something different;
look at my answer to John's response.
Perhaps (?!) that works :-)


Best Wishes,
Janek

