On 12/06/2010 12:43 PM, zaid khalid wrote:
> I want some help writing regular expressions in OCaml. I know how to
> write them informally, but not in OCaml syntax. For example, I want to
> write "a* | (aba)*".
>
> Another question: if I want the whole string to be matched against the
> regular expression, not just a substring, what symbol do I need to
> attach to the expression? That is, I want only concrete strings like
> "", a, aa, aaa, aba, abaaba to be accepted, but not ab or abaa.
>
I also had problems with Str (regexp descriptions being unreadable,
error-prone and hard to generate dynamically) and decided just to stop
using Str.
I have a tiny module [1] written with clarity in mind. It is pure OCaml.
It defines operators like $$ to be used in regexp construction. This way
the syntax of the expressions is checked at compile time. Also, it is
trivial to build them at run time.
The whole "engine" is contained in a relatively short function
HRegex.subwords_of_subexpressions, so I believe anybody can hack it
without much effort.
I haven't measured performance of this implementation. I expect it to be
slow when processing long strings. It's just OK for my needs so far.
Anyway, the important part is the module interface. It expresses my
point of view on this topic.
The code is available in a mercurial repository [2].
The example "a* | (aba)*" would become:
open HRegex.Operators
let rx = (!* !$ "a") +$ (!* !$ "aba")
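To illustrate the general idea without depending on HRegex, here is a minimal self-contained sketch. It is not the HRegex implementation or its API: the type rx, the function matches_whole and the operator definitions below are assumptions made up for this example, and only the operator spellings are borrowed from the line above. It also shows one way to require that the whole string is matched rather than a substring.

```ocaml
(* Hypothetical mini regex type, for illustration only (not HRegex). *)
type rx =
  | Lit of string   (* a literal string *)
  | Seq of rx * rx  (* concatenation *)
  | Alt of rx * rx  (* alternation *)
  | Star of rx      (* zero or more repetitions *)

(* Operators building values of type rx; the meanings are assumed here. *)
let ( !$ ) s = Lit s
let ( !* ) r = Star r
let ( +$ ) a b = Alt (a, b)
let ( $$ ) a b = Seq (a, b)

(* [go r s i k] tries to match [r] at position [i] of [s]; for each
   candidate end position it calls the continuation [k], and succeeds as
   soon as one call to [k] returns true (simple backtracking). *)
let rec go r s i k =
  match r with
  | Lit l ->
    let n = String.length l in
    i + n <= String.length s && String.sub s i n = l && k (i + n)
  | Seq (a, b) -> go a s i (fun j -> go b s j k)
  | Alt (a, b) -> go a s i k || go b s i k
  | Star a ->
    (* Either take one more iteration, which must consume input so that
       empty matches cannot loop forever, or stop at the current position. *)
    go a s i (fun j -> j > i && go r s j k) || k i

(* Accept only if the match ends exactly at the end of the string. *)
let matches_whole r s = go r s 0 (fun j -> j = String.length s)

let rx = ( !* !$ "a") +$ ( !* !$ "aba")
```

With these definitions, matches_whole rx "abaaba" is true while matches_whole rx "abaa" is false, because the final continuation only accepts the end of the string. Like the module described above, this sketch makes no attempt at efficiency.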
Dawid
[1] http://hg.ocamlcore.org/cgi-bin/hgwebdir.cgi/hlibrary/hlibrary/raw-file/tip/HRegex.mli
[2] http://hg.ocamlcore.org/cgi-bin/hgwebdir.cgi/hlibrary/hlibrary