# [racket-users] Unexpected pattern matching behavior

```I'm confused by the behavior of pattern matching in some cases.  I'll
discuss a few sets of examples involving segment patterns, then a mix of
`and`, `or`, and `not` patterns.```
```

Here is a basic segment pattern matching the entire contents of a list:
> (match '(1 2 3)
((list x ...) `(result: ,x)))
'(result: (1 2 3))

Let's add a prefix to the scrutinized list, and match it with a separate
pattern variable:
> (match '((1 2 3) 1 2 3)
((list w x ...) `(result: ,w ,x)))
'(result: (1 2 3) (1 2 3))

Now let's constrain the prefix to be the same as the matched segment:
> (match '((1 2 3) 1 2 3)
((list x x ...) `(result: ,x)))
'(result: (1 2 3))

Everything seems fine so far.  Let's make the segment pattern more
interesting.

This time, we'll have our segment pattern match elements that are lists of
two elements:
> (match '((a 1) (b 2) (c 3))
((list (list k v) ...) `(result: ,k ,v)))
'(result: (a b c) (1 2 3))

We can add a prefix as before, corresponding to the first element of each
two-element list in the remaining segment, and match it with a separate
pattern variable:
> (match '((a b c) (a 1) (b 2) (c 3))
((list j (list k v) ...) `(result: ,j ,k ,v)))
'(result: (a b c) (a b c) (1 2 3))

So far so good.  Now let's try constraining the prefix to be the same as
the first element of each two-element list in the segment pattern:
> (match '((a b c) (a 1) (b 2) (c 3))
((list k (list k v) ...) `(result: ,k ,v)))
; k113: unbound identifier;
;  also, no #%top syntax transformer is bound
;   in: k113
; [,bt for context]

Something went wrong.  Is this behavior expected?  This error doesn't look
like the kind you'd expect a user to see.

Let's try another example with segment patterns:
> (match '(1 2 3 : 1 2 3)
((list x ... ': y ...) `(result: ,x ,y)))
'(result: (1 2 3) (1 2 3))

We're using : as a divider, allowing both the x and y segments to be
non-empty.  In this case, they are bound to the same values.

What happens if we try to constrain the two segments to be identical by
repeating the variable?:
> (match '(1 2 3 : 1 2 3)
((list x ... ': x ...) `(result: ,x)))
non-linear pattern used in `match` with ... at #<syntax:4:19 x> and
#<syntax:4:10 x>
'(result: (1 2 3))

It looks like we got the right result (see the next example), but it looks
like an error message was printed beforehand.  Is this message intended as
a warning?  Should it actually be raising an error instead?  It almost
looks like debug logging.

Why might an error be the expected behavior?  It seems we get the right
result only coincidentally.  If we remove the divider and use separate
variables, we can see the greedy behavior of segment patterns:
> (match '(1 2 3 1 2 3)
((list x ... y ...) `(result: ,x ,y)))
'(result: (1 2 3 1 2 3) ())

When we try to constrain the segment by repeating the same variable, we get
the printed warning followed by an unexpected result:
> (match '(1 2 3 1 2 3)
((list x ... x ...) `(result: ,x)))
non-linear pattern used in `match` with ... at #<syntax:8:16 x> and
#<syntax:8:10 x>
'(result: (1 2 3 1 2 3))

Is this greedy behavior expected even when the segment pattern variable is
constrained in this way?  It seems like we should at least get an error.

Moving on, here's an example using `and` and `not` that behaves reasonably:
> (match (list 1 2 3)
((and (list a _ _) (not (list _ a _))) `(result: ,a)))
'(result: 1)

We can see how the scrutinized list is being constrained to not repeat the
same first two elements:
> (match (list 1 1 3)
((and (list a _ _) (not (list _ a _))) `(result: ,a)))
; match: no matching clause for '(1 1 3) [,bt for context]

If we swap the order of the sub-patterns, expecting the same behavior as
the first example, we get a surprise:
> (match (list 1 2 3)
((and (not (list _ a _)) (list a _ _)) `(result: ,a)))
; match: no matching clause for '(1 2 3) [,bt for context]

It seems that the order of sub-patterns in an `and` pattern matters, and
determines logical scoping of pattern variables.  In the first pattern, we
are getting the behavior of the logical proposition:
(exists (a)
(and (== VALUE (list a _ _)) (not (== VALUE (list _ a _)))))

While in the second example, we seem to be getting the behavior of the
logical proposition:
(exists (a)
(and (== VALUE (list a _ _)) (not (exists (a) (== VALUE (list _ a _))))))
Or, pushing the `not` inward slightly:
(exists (a)
(and (== VALUE (list a _ _)) (forall (a) (not (== VALUE (list _ a _))))))

Is this non-commutative behavior of `and` intended?

Here's an example involving `and` and `or` that attempts to ensure that a
three-element list either has identical first and second elements, or
identical second and third elements:
> (match (list 1 2 2)
((and (list a b c) (or (list _ a _) (list _ _ b))) 'success))
; readline-input:2:47: match: variable not bound in all or patterns
;   at: b
;   in: (or (list _ a _) (list _ _ b))
; [,bt for context]

Given the binding behavior we saw with `and` in the previous example, I
figured this one would work, since we're just reusing the same pattern
variables.  But instead we get an error about a variable not being bound in
all `or` patterns.

The documentation suggests something like this might be the case,
"instances of an id in different or and not sub-patterns are independent.
The binding for id is not available in other parts of the same pattern."

But this seems inconsistent with the successful `(and ... (not ...))`
example.  Why does that one work, but this one not?

What makes this really perplexing is that a slightly different set of
examples, where we make sure to mention the same pattern variables in each
disjunct, does work, this time ensuring that the first element is repeated
in at least one of the other two positions:
> (match (list 1 2 3)
((and (list a b c) (or (list _ a _) (list _ _ a))) 'success))
; match: no matching clause for '(1 2 3) [,bt for context]
> (match (list 1 2 1)
((and (list a b c) (or (list _ a _) (list _ _ a))) 'success))
'success
> (match (list 1 1 3)
((and (list a b c) (or (list _ a _) (list _ _ a))) 'success))
'success

Here's an example involving `not` and `or` that ensures the scrutinized
three-element list does not repeat consecutive elements:
> (match (list 1 2 3)
((not (or (list a a _) (list _ a a))) 'success))
'success

If we change the pattern variable in the second disjunct, the intended
logical meaning is the same, but the example breaks:
> (match (list 1 2 3)
((not (or (list a a _) (list _ b b))) 'success))
; readline-input:13:34: match: variable not bound in all or patterns
;   at: b
;   in: (or (list a a _) (list _ b b))
; [,bt for context]

I can understand this behavior in the case of `or` appearing in a positive
context (since otherwise we could introduce ambiguous bindings on the
right-hand-side), but when negated, bindings for these pattern variables
won't be available on the right-hand-side anyway.  Is this behavior
intended for simplicity of implementation?

--
You received this message because you are subscribed to the Google Groups
"Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email