Hi Bryan,
I wrote the current regex API, so your suggestions are interesting to me. The
same goes for anyone else's opinions on the regex API, of course.
Bryan O'Sullivan wrote:
Ketil Malde wrote:
Python used to do pretty well here compared
to Haskell, with rather efficient hashes and text parsing, although I
suspect ByteString IO and other optimizations may have changed that
now.
It still does just fine. For typical "munge a file with regexps, lists,
and maps" tasks, Python and Perl remain on par with comparably written
Haskell. This is because the scripting-level code acts as a thin layer of
glue around I/O, regexps, lists, and dicts, all of which are written in
native code.
The Haskell regexp libraries actually give us something of a leg down
with respect to Python and Perl.
True, the pure Haskell library is not as fast as a C library. In particular,
the current regex-tdfa handles lazy ByteStrings in a sub-optimal manner. This
may eventually be fixed.
But the native code libraries have also been wrapped in the same API, and they
are quite fast when combined with strict ByteStrings.
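As a minimal sketch of that combination, assuming the regex-posix package
(which wraps the C library's POSIX regex functions) and strict ByteStrings
from the bytestring package. The sample input and pattern are made up for
illustration:

```haskell
import qualified Data.ByteString.Char8 as B
import Text.Regex.Posix ((=~))

-- Hypothetical sample data, packed into strict ByteStrings.
line, pat :: B.ByteString
line = B.pack "user=chris id=42"
pat  = B.pack "[0-9]+"

-- Same (=~) operator as with String; the annotation picks the result.
hasNumber :: Bool
hasNumber = line =~ pat

-- (text before the match, the match itself, text after the match)
parts :: (B.ByteString, B.ByteString, B.ByteString)
parts = line =~ pat

main :: IO ()
main = do
  print hasNumber
  print parts
```

Only the import and the input type change relative to a String-based
version; the matching itself is done by the C library.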
The aggressive use of polymorphism in
the return type of (=~) makes it hard to remember which of the possible
return types gives me what information. Not only did I write a regexp
tutorial to understand the API in the first place, I have to reread it
every time I want to match a regexp.
The (=~) operator gets its many return types from the instances of
RegexContext. These are all thin wrappers around the monomorphic return types
of the RegexLike class methods. So (=~) could be avoided altogether, or another
API could be created.
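To make that concrete, here is a small sketch, assuming the regex-tdfa
package and a made-up input string. The first four values use (=~) at
different return types; the last two skip (=~) and call RegexLike methods
on a compiled Regex directly:

```haskell
import Text.Regex.TDFA (Regex, makeRegex, matchCount, matchTest, (=~))

-- Hypothetical sample input and pattern.
input, pat :: String
input = "foo 123 bar 45"
pat   = "[0-9]+"

-- (=~) at several RegexContext return types.
asBool :: Bool
asBool = input =~ pat                 -- is there any match?

firstMatch :: String
firstMatch = input =~ pat             -- text of the first match

beforeMatchAfter :: (String, String, String)
beforeMatchAfter = input =~ pat       -- (before, first match, after)

countViaContext :: Int
countViaContext = input =~ pat        -- number of matches

-- The same questions asked through RegexLike, without (=~).
compiled :: Regex
compiled = makeRegex pat

testDirect :: Bool
testDirect = matchTest compiled input

countDirect :: Int
countDirect = matchCount compiled input

main :: IO ()
main = do
  print asBool
  print firstMatch
  print beforeMatchAfter
  print countViaContext
  print testDirect
  print countDirect
```

With the RegexLike calls, the function name says what comes back, so no
type annotation is needed to select the behavior.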
A suitable solution would be a return type of RegexpMatch a = Maybe a
(to live alongside the existing types, but aiming to become the one
that's easy to remember), with appropriate methods on a, but I don't
have time to write up a patch.
b
The (=~~) operator is the monadic variant of (=~), which allows for different
failure behaviors. So using (=~~) with Maybe is already possible, and it gives
Nothing whenever there are zero matches.
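A short sketch of that usage, assuming the regex-tdfa package. The helper
name firstNumber is hypothetical and exists only for this example:

```haskell
import Text.Regex.TDFA ((=~~))

-- Hypothetical helper: the first run of digits in a string, if any.
-- Choosing Maybe as the monad turns "no match" into Nothing.
firstNumber :: String -> Maybe String
firstNumber s = s =~~ "[0-9]+"

main :: IO ()
main = do
  print (firstNumber "foo 123 bar")
  print (firstNumber "no digits here")
```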
But more interesting to me is learning what API you would like to see.
What would you like the code that uses the API to look like?
Could you sketch either the definition or usage of your RegexpMatch
suggestion?
I don't use my own regex API much, so external feedback and ideas would be
wonderful.
--
Chris