[REBOL] Regular Expressions Re:(4)

ole_f Sat, 8 Jan 2000 16:27:27 -0800
Hi Petr, 8-Jan-2000 you wrote:

[...]
>> Ah, then I just got you wrong. The easy way to do the above is first to
>> split the text up with
>>
>> parse str none
>>
>> and then match the individual words. However, this should work too:
>>
>> sep: charset " ,.!?" ; and whatever else you want to split up words
>> b: [skip b | "ing"]
>> a: [b | to "ing" [sep to end | a]]
>> parse str a
>>
>> though it's completely untested, and I agree that it's ugly :-)
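
For reference, the split-then-match idea quoted above could be sketched
roughly like this. It is untested; the function name ing-words and the extra
delimiter string are only assumptions made for the example:

```rebol
; Sketch only: collect the words of a string that end in "ing".
; 'ing-words is a made-up name; ",.!?" are extra delimiters (an assumption)
; that parse uses in addition to whitespace when /all is not given.
ing-words: func [str [string!] /local result] [
    result: copy []
    foreach word parse str ",.!?" [
        ; compare the last three characters of the word with "ing"
        if "ing" = skip tail word -3 [append result word]
    ]
    result
]

print mold ing-words "ringing bells, nothing else"
```

This avoids recursion in the parse rule entirely, at the cost of building a
block of words first.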

>This will not work imho :-)

You're right. As I said, it was completely untested ;-)

I used "to" instead of "thru", and we need the /all refinement. But apparently
that's still not enough, though I don't know why...

To make things even more spooky, look at this:

sep: charset " ,.!?"
b1: [skip b1 | "ing"]
b2: [skip b2 | "ing" sep to end]
a: [b1 | b2]

The idea here is that b1 will match any string ending in "ing", and b2 will
match any string which contains "ing" followed by a separator. Now, watch
this:

>> parse/all "ringing bells" b2
== true
>> parse/all "ringing bells" a 
== false

Since the parse with b2 returned true, the parse with 'a should indeed be
true, too, since 'a is true if either b1 or b2 gives true. Now, if instead I
define 'a as [b2 | b1], we get true with 'a instead.

I just cannot see why this should be so?
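
One way to dig into this would be to mark where b1 leaves the input when it
succeeds. The sketch below reuses the rules from above; the word pos and the
trailing "to end" are only there for the experiment, and I have not run it:

```rebol
sep: charset " ,.!?"
b1: [skip b1 | "ing"]
b2: [skip b2 | "ing" sep to end]

; pos: records the input position right after b1 has matched.
; "to end" just consumes the rest, so the overall parse can succeed.
parse/all "ringing bells" [
    b1 pos: (print ["b1 stopped at" index? pos mold pos]) to end
]

; the same alternatives as in 'a, but in the other order
parse/all "ringing bells" [b2 | b1]
```

If b1 stops short of the end of the string, then the question becomes whether
Parse is supposed to try the next alternative in that case, or only when the
first alternative fails outright.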

>as for "ringing bells"

>Again, you are going to reach end of the string, then going back recursively
>until first "ing" (applied from end of the string) is not matched. :

>>> b: [skip markb: (print ["markb: " markb index? markb]) b |
>       back-b: (print ["back-b: " back-b index? back-b]) "ing"]
>== [skip markb: (print ["markb: " markb index? markb]) b |
>       back-b: (print ["back-b: " back-b index? back-b]) "ing"]
>>> parse str a
[...]
>back-b:  g bells 7
>back-b:  ng bells 6
>back-b:  ing bells 5
>== false

>Your code will fail with something like "ringing sounding bell" ... it will
>match sounding, so generally said - the last occurrence of "ing" contained in
>the string ...

It should then try to match the part to the right of the "|" in the top rule
(rule a), which should match. But that doesn't seem to happen.

>And because of that, second part of 'a - {to "ing" [sep to end | a]} will be
>NEVER applied, as your pointer is just behind last occurrence of "ing"
>contained in your string ... that's why you got false result.

So the second part of 'a will pick up where the first part left off? If that's
the case, then Parse _really_ has a bug, since the second part of 'a should
pick up at the very same spot where the first part started, which is the start
of the entire string.

>change 'a to [b to end] and once successfully back from 'b, it will continue
>"to end" and return "true" ...

But that's not what it should do. Then the rule would match all strings with
"ing" in them. We only want to match strings that either end in "ing" or
contain a word (words are separated by spaces, commas, etc.) that ends with
"ing".
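
To spell out what we are after, here is a rough sketch of a check with those
semantics. It is untested, and the names ing-word? and found are made up for
the example:

```rebol
sep: charset " ,.!?"

; Sketch: true if str ends in "ing" or contains "ing" followed by a separator.
ing-word?: func [str [string!] /local found] [
    found: false
    parse/all str [
        any [
            ; "ing" directly before a separator or the end of the string
            "ing" [sep | end] (found: true) to end
            | skip   ; otherwise move one character forward and retry
        ]
    ]
    found
]

print ing-word? "ringing bells"
print ing-word? "singer"
```

The flag is used instead of the return value of parse because the loop over
skip reaches the end of the string whether or not a match was found.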

Kind regards,
-- 
Ole Friis <[EMAIL PROTECTED]>

"Ignorance is bliss"
(Cypher, The Matrix)