Looks good—glad it worked out! My main question: what happens if the whole string is matched but a particular subgroup isn't? This can happen with alt or star, for instance. I couldn't tell from the types. I would expect a return type for match to be something like `match r (option string)` instead of `option (match r string)`. I don't know the underlying regex library, so perhaps there's a good reason for this.
I also thought it may be useful if alt allowed the same group to be captured on either side, but I realize that takes further fiddling with subgroup indices and post-processing of the results (each group field name would map to a list of indices instead of a single index), so it can wait for version 2 :). On Tue, Mar 7, 2017 at 02:11 Artyom Shalkhakov <[email protected]> wrote: > Hello Ziv, > > I've implemented your idea [1], and it works. Looking for feedback. > > 2017-02-20 15:48 GMT+06:00 Ziv Scully <[email protected]>: > > By parametrizing over a type-level record, you can encode the names of > > matched subgroups in the type of the regular expression. The interface > > probably looks something like: > > > > (* Abstract type of uncompiled regular expressions. *) > > con t :: {Unit} (* names of captured subgroups *) -> Type > > > > val literal : > > string -> t [] > > val any : t [] > > val concat : r1 ::: {Unit} -> r2 ::: {Unit} -> [r1 ~ r2] => > > t r1 -> t r2 -> t (r1 ++ r2) > > val star : r ::: {Unit} -> > > t r -> t r > > val alt : r ::: {Unit} -> s1 ::: {Unit} -> s2 ::: {Unit} -> [r ~ s1] => > [r ~ > > s2] => [s1 ~ s2] => > > t (r ++ s1) -> t (r ++ s2) -> t (r ++ s1 ++ s2) > > val capture : r ::: {Unit} -> > > nm :: Name -> [r ~ [nm]] => t r -> t (r ++ [nm]) > > > > (* Abstract type of compiled regular expressions *) > > con compiled :: {Unit} -> Type > > > > val compile : r ::: {Unit} -> > > t r -> compiled r > > > > Obviously this probably won't exactly work. For instance, if the > underlying > > library requires explicit names for captured groups, you'll need > something > > more like > > > > val capture : r ::: {Unit} -> nm :: Name -> [r ~ [nm]] => > > {nm : string} -> t r -> t (r ++ [nm]) > > > > because there's no way to extract a string from a type-level Name. In > fact, > > this way has another benefit: the internal representation of the type can > > just be a string! All the tracking of subgroups can happen at the type > > level. You may not even need compile. > > > > This interface assumes each subgroup may appear 0 or 1 times. You could > > maybe keep track of the number of matches more precisely by using {Type} > > instead of {Unit} and using a type class to implement type-level > functions. > > > > con t : {Type} (* maps names of captured subgroups to type-level > > representation of number of matched fields *) -> Type > > > > type one > > type oneOrNone > > > > class meet : Type -> Type -> Type > > val meet_one_oneOrNone : weaken one oneOrNone oneOrNone > > val meet_oneOrNone_one : weaken oneOrNone one oneOrNone > > val meet_oneOrNone_oneOrNone : weaken oneOrNone oneOrNone oneOrNone > > > > This makes the types more complicated, but I suspect (though I don't know > > for sure) that Ur/Web will be able to infer the extra stuff in any > concrete > > (that is, monomorphic) use, so the user code will still look pretty. > Here's > > how some of the functions look with this. The place weaken gets used is > alt. > > > > val star : r ::: {Type} -> > > t r -> t (map (fn _ => oneOrNo) r) > > > > val alt : r ::: {Type * Type * Type} -> s1 ::: {Type} -> s2 ::: {Type} -> > > [map fst3 r ~ s1] => [map snd3 r ~ s2] => [s1 ~ s2] => > > $(map (fn (t1, t2, t3) => weaken t1 t2 t3) r) > > t (map fst3 r ++ s1) -> t (map snd3 r ++ s2) > > -> t (map thd3 r ++ map (fn _ => oneOrNo) (s1 ++ s2)) > > > > You could extend the same sort of idea to types like oneOrMore and > > noneOrMore—useful if the underlying library can return a list of matches > > inside a + or *—by adding more cases to weaken. > > > > Here's a Haskell library that takes basically this approach: > > https://github.com/sweirich/dth/tree/master/popl17/src > > > > Hope this helps! > > Ziv > > > > > > On Mon, Feb 20, 2017 at 12:00 AM, Artyom Shalkhakov > > <[email protected]> wrote: > >> > >> 2017-02-20 10:47 GMT+06:00 <[email protected]>: > >>> > >>> Maybe just define another attribute in basis.urs? > >>> > >>> It would be much interesting to define a DSEL though. I have not seen > any > >>> attempts to create a DSEL for regular expressions. > >>> > >> > >> Indeed! And it seems like it would be a nice exercise as well having > some > >> practical uses to it. > >> > >> I envision something along the lines of: > >> > >> datatype RegExp = ... (* constructors *) > >> > >> type t (* abstract, not exported outside the module *) > >> val compile : RegExp -> t (* the trusted function *) > >> val matches : string -> t -> bool (* check if passed-in string matches > the > >> expression *) > >> > >> And the implementation might use something unsafe (like a string, or an > >> actual regexp object in JS). Then we could offload the interpretation of > >> regexps to an off-the-shelf library. The [compile] function would print > and > >> escape the AST to a regexp string, and the Ur/Web compiler might > actually > >> optimize all of this away. > >> > >> If we are to take care of matching subgroups, that would require a more > >> complicated type. I guess that should be doable as well, but I don't see > >> immediately how to handle this. > >> > >>> > >>> > >>> > >>> 20. Feb 2017 16:28 by [email protected]: > >>> > >>> > >>> Hello all, > >>> > >>> 'm doing form validation stuff, following examples on MDN. [1] > >>> > >>> It seems like Ur/Web doesn't handle regular expressions? > >>> > >>> I guess it should not be too diffcult to define a DSEL in Ur/Web (just > >>> AST constructors and a few trusted functions: e.g. [compile : AST -> > >>> regexp], where [regexp] is an abstract type, or maybe just a string > :-)) and > >>> have it work at least in the browser. > >>> > >>> Could somebody point me in the right direction here? > >>> > >>> -- > >>> Cheers, > >>> Artyom Shalkhakov > >>> > >>> [1] > >>> > https://developer.mozilla.org/en-US/docs/Web/Guide/HTML/HTML5/Constraint_validation > >>> > >> > >> > >> > >> -- > >> Cheers, > >> Artyom Shalkhakov > >> > >> _______________________________________________ > >> Ur mailing list > >> [email protected] > >> http://www.impredicative.com/cgi-bin/mailman/listinfo/ur > >> > > > > > > _______________________________________________ > > Ur mailing list > > [email protected] > > http://www.impredicative.com/cgi-bin/mailman/listinfo/ur > > > > -- > Cheers, > Artyom Shalkhakov > > [1] https://github.com/bbarenblat/urweb-regex/pull/1 > > _______________________________________________ > Ur mailing list > [email protected] > http://www.impredicative.com/cgi-bin/mailman/listinfo/ur >
_______________________________________________ Ur mailing list [email protected] http://www.impredicative.com/cgi-bin/mailman/listinfo/ur
