Jason Dagit wrote:
> On Mon, Oct 5, 2009 at 6:25 PM, Trent W. Buck
> <[email protected]> wrote:
>> Ben Franksen <[email protected]> writes:
>>
>> > Jason Dagit wrote:
>> >> It's possible that regex-pcre gives better performance than
>> >> regex-posix
>> >
>> > I made some tests using the new criterion package (excellent for stuff
>> > like that) and found regex-pcre to be faster by a factor of 3 to 9,
>> > depending on regex and test string. I did not test it with darcs, but
>> > I took some random regexes from the standard boring file and a random
>> > file with a long path.
>>
>> Does the default ERE matcher use the OS's regexp implementation, or is
>> it purely done in Haskell? In the former case, the OS and OS version
>> might be significant -- e.g. AIX 4 might have much slower EREs than a
>> recent GNU/Linux.
>
> I don't think regex-posix is the OS's implementation. Or if it is, then
> I'm not sure what magic is done to provide that implementation on Windows.
> Perhaps there is a bundled source version which is used if the OS doesn't
> provide an implementation? I guess this makes performance testing
> worthwhile on Windows.
>
> Ben, could you make the source for your test publicly available (why
> duplicate effort if we don't have to)? Maybe someone will volunteer to
> crunch some numbers on Windows.

Sure, the file is attached. It's not much, though! I just played around with a
few regexes from the latest default boring file and a random path from my darcs
repo. It should be easy to extend. Note that I am not explicitly compiling the
regexes, so one could argue that this in fact tests the speed of the regex
compiler plus the speed of the regex engine. This can also easily be fixed.

import Control.Monad
import Criterion.Main
import Text.Regex.Base
import qualified Text.Regex.PCRE
import qualified Text.Regex.Posix
import qualified Text.Regex.TDFA

texts = [
  "src/Darcs/Commands/ShowRepo.lhs"
  ]

regexes = [
  "(^|/)\\.waf-[[:digit:].]+-[[:digit:]]+($|/)",
  "(^|/)autom4te\\.cache($|/)",
  "\\.(obj|a|exe|so|lo|la)$"
  ]

data Engine = Pcre | Posix | Tdfa deriving (Enum,Show)

test :: Engine -> String -> String -> Int -> Bool
test eng text re _ = text * re where
  (*) = case eng of
    Posix -> (Text.Regex.Posix.=~)
    Tdfa -> (Text.Regex.TDFA.=~)
    Pcre -> (Text.Regex.PCRE.=~)

main = defaultMain $
  flip map texts $ \t ->
    bgroup (show t ++ "\n") $
      flip map regexes $ \re ->
        bgroup (show re ++ "\n") $
          flip map [Pcre ..] $ \eng ->
            bench (show eng) $ test eng t re
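
Regarding the note above about not compiling the regexes explicitly: a minimal
sketch of how compilation could be separated from matching, using makeRegex and
matchTest from Text.Regex.Base. Only regex-tdfa is shown here, and the helper
names compileTdfa and matchCompiled are made up for illustration; they are not
part of the attached file. The other engines should work the same way with
their own Regex types, but I have not checked that here.

```haskell
import Text.Regex.Base (makeRegex, matchTest)
import qualified Text.Regex.TDFA as TDFA

-- Hypothetical helper: build the regex value once, outside the timed code.
compileTdfa :: String -> TDFA.Regex
compileTdfa = makeRegex

-- Hypothetical helper: match a path against an already built regex.
-- This is the part one would hand to the benchmark.
matchCompiled :: TDFA.Regex -> String -> Bool
matchCompiled = matchTest

main :: IO ()
main = do
  let path = "src/Darcs/Commands/ShowRepo.lhs"
      res  = map compileTdfa
               [ "(^|/)autom4te\\.cache($|/)"
               , "\\.(obj|a|exe|so|lo|la)$"
               ]
  -- Print one Bool per regex for the sample path.
  mapM_ (print . (`matchCompiled` path)) res
```

Since Haskell is lazy, a real benchmark would also have to force the compiled
regex before the timed section; otherwise the compilation cost may still end up
inside the first measured match.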