John Cowan <co...@mercury.ccil.org> writes: > Jim Wise scripsit: > >> Are there plans to include explicit pattern matching operations, such as >> Common Lisp's DESTRUCTURING-BIND or Racket's match[1]? > > We don't work in terms of plans, but in terms of proposals, and right now > there are no concrete proposals. If you put together a concrete proposal > (doesn't have to be detailed, just clear) and send it to me, I will be > happy to post it for WG consideration.
Enclosed is the draft I've put together for this. Let me know if you would like me to send this to one or more of the wg or reports lists for further comment. Thanks,
== Pattern Matching == This is a proposal for a WG2 pattern matching library. The motivation of this proposal is to provide a pattern-matching facility similar to that in the ML family of languages [Harper 2008], as such a facility allows the succinct expression of some algorithms in a language very similar to the underlying mathematics of the problem set. The Pattern Matching package can be imported as (rnrs match), and an additional library, (rnrs match core) provides a versions of several core forms which use the pattern syntax of this library. Pattern matching algorithms have been provided as an extension to standard Scheme syntax by a number of existing implementations, including Bigloo [Serrano 2010], Chicken Scheme [Chicken Team 2009], Racket [Flatt 2010] and others. Many of these implementations have been influenced by a widely used platform independent pattern matching library by Andrew K. Wright [Wright 1994]. A more recent library by Alex Shinn [Shinn 2010] provides similar functionality in a mostly R5RS-compliant implementation. A similar pattern matching language for macro definition is also provided by the core `syntax-case` form. In addition, very limited forms of non-pattern value matching are supported by the argument syntax of the R6RS Scheme `lambda`, `case-lambda`, and `define` forms. A limited pattern functionality is provided in Common Lisp by the `DESTRUCTURING-BIND` macro. Where the above sources for pattern matching come into conflict, this proposal attempts to provide compatibility first with the RNRS `syntax-case` form, and secondly to the existing implementations. == Patterns == The forms in this library work by attempting to ''match'' a Scheme value of arbitrary complexity against one or more patterns. If such a match succeeds, zero or more variable bindings are created for the scope of a corresponding body expression or expressions. Match failure generally triggers attempted matching against another pattern specified in the same form, and a ''match error'' is generated if none of the provided patterns match the given value. === Simple Patterns === A pattern may be * a boolean, number, string, character, or byte-vector literal, or the empty list examples: {{{ #\A 42 () }}} in this case, the value must match in the sense of `equal?`, and no variable is bound by this match. * a quoted datum examples: {{{ 'b (quote (a b (c d))) }}} in this case, the value must match in the sense of `equal?`, and no variable is bound by this match * the special symbol `_` in this case, any value will match, and no variable is bound by this match. * any other symbol examples: {{{ a matched-val }}} in this case, any value will match, and that symbol is bound to the matched value for the scope of the body expression or expressions of the pattern matching form used. === Aggregate Patterns === Patterns may be aggregated into more complex patterns in a couple of ways. When this is done, multiple constituent patterns ("subpatterns") may be required to match, and all symbols bound to a value by any matching subpattern are bound for the scope of the body expression or expressions of the pattern matching form used. It is a syntax violation for the same symbol to appear more than once in a pattern. An aggregate pattern may be * a proper list of N subpatterns examples: {{{ (a b c) (a '= b) (_ middle _) (a (b c) d) }}} in this case, any list value of the same length where each subpattern in the pattern list matches the corresponding element in the value list will match, and the bindings corresponding to each subpattern will be carried out. * an improper list of N subpatterns examples: {{{ (a b . c) (a (b c) . d) }}} in this case, every subpattern but the last must match the corresponding element of the value list, and the last subpattern of the improper pattern list must match the corresponding element of the value list if the value list is improper and has the same length as the pattern list, or the (possibly empty) remainder of the value list otherwise, and the bindings corresponding to each subpattern will be carried out. * a proper or improper list of N subpatterns with the literal `...` symbol occurring once or more in any position except either the first position of any list, or the last position of an improper list. examples: {{{ (a ... b c) (a (b c) ... d) (a b ... . d) }}} in this case, the pattern list is matched as in the above proper or improper list cases, except that the subpattern before the `...` symbol may match zero or more instances of the pattern preceding the `...` symbol. Variables bound by a subpattern preceding a `...` are bound to a (possibly empty) list of all values matched by the pattern as if by a `map` of the `match` operator for that subpattern over each individual value matched. * a vector of N subpatterns with the literal `...` symbol in any position except the first examples: {{{ #(a b ... c) #(1 ... b) #((a b) ... (c d)) }}} in this case, the pattern list is matched as in the above vector case, except that the subpattern before the `...` symbol may match zero or more instances of the pattern preceding the `...` symbol. Variables bound by a pattern preceding a `...` are bound to a (possibly empty) list of all values matched by the pattern as if by a `map` of the `match` operator over each individual value matched. == Exceptions == In all of the matching forms, a failure to match the input value against any of the supplied patterns shall result in an error of type `&match`. The following condition constructor and predicate shall be exported by the `(rnrs match)` library `(make-match-condition)` procedure `(match-condition?)` procedure This condition type could be defined by {{{ (define-condition-type &match &serious make-match-condition match-condition?) }}} == Basic Forms == The `(rnrs match)` library shall export the form `(match ....)` syntax `match` shall have the syntax {{{ (match <val> (<pat> <body> <body> ...) (<pat> <body> <body> ...) ...) }}} and shall attempt to match the input value ''<val>'' against each specified pattern ''<pat>'' in order. For the first pattern which successfully matches the input value, `match` shall evaluate the corresponding expressions ''<body>'' in the scope of the bindings created by the successful match, and shall return the value of the last corresponding ''<body>'' expression, which shall be evaluated in tail context. If no pattern ''<pat>'' successfully matches the input value ''<val>'', a condition of type `&match` shall be raised. == Derived Forms == For convenience, the following derived forms shall also be exported by the `(rnrs match)` library `(match-lambda ....)` syntax `(match-lambda* ....)` syntax `(match-let ....)` syntax `(match-let* ....)` syntax `(match-letrec ....)` syntax `match-lambda`, which is useful for creating a function of one argument which matches that argument against one or more patterns, shall have the syntax {{{ (match-lambda (<pat> <body> <body> ...) (<pat> <body> <body> ...) ...) }}} and shall be equivalent to {{{ (lambda (id) (match (<pat> <body> <body> ...) (<pat> <body> <body> ...) ...)) }}} `match-lambda*`, which is useful for creating a function of zero or more arguments which matches its arguments taken as a list against one or more patterns, shall have the syntax {{{ (match-lambda (<pat> <body> <body> ...) (<pat> <body> <body> ...) ...) }}} and shall be equivalent to {{{ (lambda id (match (<pat> <body> <body> ...) (<pat> <body> <body> ...) ...)) }}} Note that it is not required that each pattern ''<pat>'' be a list, but the meaning of a non-list pattern should be considered carefully. In particular, a pattern consisting of a single symbol will always match, and bind the entire list of arguments to that symbol (this can be used in a manner similar to the `else` clause in a `cond` form). `match-let`, analogous to `let`, shall have the syntax {{{ (match-let ((<pat> <expr>) ...) <body> <body> ...) }}} Each ''<expr>'' shall be evaluated and then pattern matched against the corresponding pattern ''<pat>'' in the current environment (in some unspecified order). The expressions ''<body>'' shall then be evaluated in an environment containing all of the bindings resulting from each pattern match, and the value of the last ''<body>'' expression, which shall be evaluated in tail context, shall be returned. If any of the patterns ''<pat>'' fail to match the value of the corresponding expression <expr>, none of the expressions ''<body>'' shall be evaluated, and a condition of type `&match` shall be raised. It is not specified which of the expressions '<expr>' shall have already been evaluated when this exception is raised. `match-let*`, analogous to `let*`, shall have the syntax {{{ (match-let* ((<pat> <expr>) ...) <body> <body> ...) }}} `match-let* `is similar to `match-let`, except that each `(<pat> <expr>)` clause shall be evaluated and then pattern matched sequentially, in left-to-right order, and identifiers bound by matching each pattern ''<pat>'' match shall be available in the environment in which each subsequent expression ''<expr>'' is evaluated. If any of the patterns ''<pat>'' fail to match the value of the corresponding expression <expr>, none of the subsequent expressions ''<expr>'' and none of the expressions ''<body>'' shall be evaluated, and a condition of type `&match` shall be raised. `match-letrec`, analogous to `letrec`, shall have the syntax {{{ (match-letrec ((<pat> <expr>) ...) <body> <body> ...) }}} `match-letrec `is similar to `match-let`, except that each expression ''<expr>'' clause shall be evaluated in an environment containing all bindings from all patterns ''<pat>''. As with the core `letrec` form, each variable to be bound by a pattern ''<pat>'' of the `match-letrec` form is bound (in a new environment) to a fresh location prior to the evaluation of any expression ''<expr>'', the expressions ''<body>'' are then evaluated (in some unspecified order) in this environment. As with `match-let`, if any of the patterns ''<pat>'' fail to match the value of the corresponding expression <expr>, none of the expressions ''<body>'' shall be evaluated, and a condition of type `&match` shall be raised. It is not specified which of the expressions '<expr>' shall have already been evaluated when this exception is raised. == Pattern Versions of Core RNRS forms == The `(rnrs match core)` library module contains no forms of its own, but imports and re-exports several forms from the `(rnrs match)` module under new names: || `(rnrs match)` form || core form | match-lambda* | lambda | match-let | let | match-let* | let* | match-letrec | letrec This library module is intended to allow a style of programming where `matching` is a core feature of the language, and to allow experimenting with wider support of the pattern-matching syntax defined here in a future RNRS revision. It should be noted that the RNRS library ''import spec'' syntax can be used to select which of these forms to import. The `(rnrs match core)` library is in no way needed for use of the `(rnrs match)` library. == Notes == The following questions are not dealt with in this draft, but merit consideration as part of the RNRS standardization process, or in future RNRS revisions: * Quasipattern forms, defined with quasiquotation syntax, as supported by several existing implementations. These are most powerful when used with the splicing operator `,@` to extend `...` notation to a repeating sequence of more than one subpattern occuring directly within a list. * boolean patterns defined with `and`, `or`, or `not`, allowing more complex combinations of subpatterns in an aggregate pattern. The `or` form carries with it the possibility of binding a matching value to a given variable in some matches of the same pattern, but not others, while the `not` form does not perform variable binding itself (but is often used within an `and` form). * Extending the provided pattern grammar to include constructors of RNRS syntactic or procedural record types, or enumeration types, either automatically at record or enumeration type definition time, or by manually specifying records to be eligible for such matching through some additional syntax. Such support might allow, given an enumeration construcor `e` and a record constructor `make-r`, pattern matches on the form {{{ ((make-r w x y) (e z)) }}} against a value of {{{ (a b) }}} where `a` is a record of the type constructed by `r`, and `b` is an enumeration of the type constructed by `e`, resulting in the variables `w`, `x`, `y`, and `z` being bound for the scope of the match operator used. Exceptions, as a special case of record types, could be similarly matched, and pattern-matching variants of `with-exception-handler` and `guard` might be provided. Such matching is not specified in this proposal, as it would require changes to the `(rnrs records ...)` and `(rnrs enums)` libraries. Typically, such interaction would extend to definining appropriate pattern-matching scaffolding when a new record or enumeration constructor is defined. Such an interaction should be considered if this form of pattern matching is later aded to the core language. * Allowing the specification of guard expressions in `match` clauses which must evaluate to true for that clause to match. This has proved a very useful feature in languages of the ML family [Ruegg 2009], but adds some complexity to the matching implementation. If this is examined in future versions, a variant of the ''fender'' syntax of the RNRS `syntax-case` library form or the `=>` failure continuation support of the Wright [Wright 1994] pattern matching library merit particular consideration. * Laying the groundwork for replacing the core version of `lambda`, `let`, `let*`, and `letrec` with the versions provided by the `(rnrs match core)` library in a future revision in the RNRS series * Unifying the syntax used here with that of the core syntax-case form, which uses a very similar pattern matching language == Sources == [Chicken Team 2009] Chicken Team, "Pattern Matching" from the Chicken Scheme Wiki, revision 14108, 2009, available at http://wiki.call-cc.org/man/3/Pattern%20matching [Harper 2008] Harper Robert, ''Programming in Standard ML'', 2008, Carnegie Melon, ch. 5, available at http://www.cs.cmu.edu/~rwh/smlbook/ [Flatt 2010] Flatt, Matthew and PLT, ''Reference: Racket'', Version 5.0, 2010, available at http://download.racket-lang.org/docs/5.0/pdf/reference.pdf [Ruegg 2009] Ruegg, Michael "Pattern Matching in Scala", University of Applied Sciences, Rapperswil, 2009, available at http://wiki.ifs.hsr.ch/SemProgAnTr/files/PatternMatchingInScala.pdf which provides an overview of pattern matching in Haskell, Erlang, F#, XSLT, and Prolog as well as in Scala [Serrano 2010] Serrano, Manuel ''Bigloo: A practical Scheme compiler, User manual for version 3.4b'', 2010, Bigloo Project, ch. 7, available at http://www-sop.inria.fr/mimosa/fp/Bigloo/doc/bigloo.html [Shinn 2010] Shinn, Alex ''match.scm'', 2010, available at http://synthcode.com/scheme/match.scm a USENET post describing the initial release of this library is available at https://groups.google.com/forum/?pli=1#msg/comp.lang.scheme/Rc2gH1YJpDA/_S8R500jQQkJ [Wright 1996] Wright, Andrew K. "Pattern Matching for Scheme", 1996, Rice University, available at http://download.plt-scheme.org/doc/103p1/pdf/match.pdf
-- Jim Wise jw...@draga.com
pgpbgdUnBOtbw.pgp
Description: PGP signature
_______________________________________________ r6rs-discuss mailing list r6rs-discuss@lists.r6rs.org http://lists.r6rs.org/cgi-bin/mailman/listinfo/r6rs-discuss