Hi,
Our team is affected by issue 0012381, which causes extremely poor performance
in CTest. Details here:
http://public.kitware.com/Bug/view.php?id=12381
I've created a small test case that demonstrates the problem. Please find the
.cpp file attached.
From what I see, the RegularExpression class uses Henry Spencer's regex
implementation, which is known to be slow in some cases.
On my machine, the attached example runs in 0.8 sec. Just to process one string!
$ time ./repr
real 0m0.865s
user 0m0.862s
sys 0m0.002s
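As a rough illustration of why a backtracking matcher can struggle with a long line, here is a simplified model. It is my assumption about the mechanism, not the actual RegularExpression code, and count_backtracking_steps is a hypothetical name. It only models the "[^:]+" prefix followed by ":" being retried at every start offset of the line:

```cpp
#include <cstddef>
#include <string>

// Simplified model (an assumption, not the KWSys implementation): counts the
// character checks a naive backtracking search performs for "[^:]+" followed
// by ':' when the match is attempted at every start offset of the line.
std::size_t count_backtracking_steps(const std::string& line) {
    std::size_t steps = 0;
    for (std::size_t start = 0; start < line.size(); ++start) {
        // Greedy "[^:]+": consume as many non-colon characters as possible.
        std::size_t end = start;
        while (end < line.size() && line[end] != ':') {
            ++end;
            ++steps;
        }
        // Try to match ':' after the run, giving back one character at a
        // time whenever the check fails.
        for (std::size_t pos = end; pos > start; --pos) {
            ++steps;
            if (pos < line.size() && line[pos] == ':') {
                break;  // prefix matched; a real matcher would go on from here
            }
        }
    }
    return steps;
}
```

Under this model a line of n characters with no colon costs n(n+1) steps, so doubling the line length roughly quadruples the work.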
Grep can process 100k such strings in 0.5 sec (which includes reading a 570MB
file from disk):
$ wc -l big.str.txt
100000 big.str.txt
$ ls -lh big.str.txt
-rw-r--r-- 1 alex staff 572M 14 Nov 12:30 big.str.txt
$ time grep "([^:]+): warning[ \t]*[0-9]+[ \t]*:" big.str.txt
real 0m0.525s
user 0m0.255s
sys 0m0.269s
I see three ways to fix this problem:
A) use a trusted 3rd party regex library, like re2 or pcre
B) find another self-contained regex implementation
C) try to use the standard POSIX regex available in regex.h on most systems
I tried to find another self-contained regex implementation that we could use.
I found Tiny REX, but in this case it is as slow as Henry Spencer's
implementation.
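For option C, a minimal sketch of what matching through regex.h could look like. The helper name matches_warning is hypothetical, the pattern is the one from the grep command above, and this is not a proposed patch for RegularExpression. It assumes a platform that provides POSIX regex.h:

```cpp
#include <regex.h>  // POSIX regcomp / regexec / regfree
#include <string>

// Hypothetical helper: returns true if `line` matches the warning pattern.
bool matches_warning(const std::string& line) {
    regex_t re;
    // REG_EXTENDED makes ( ) and + act as operators; REG_NOSUB is enough
    // because only a yes/no answer is needed. "\t" is a C++ escape here, so
    // the bracket expressions contain a real tab character.
    if (regcomp(&re, "([^:]+): warning[ \t]*[0-9]+[ \t]*:",
                REG_EXTENDED | REG_NOSUB) != 0) {
        return false;  // the pattern failed to compile
    }
    const bool found = regexec(&re, line.c_str(), 0, nullptr, 0) == 0;
    regfree(&re);
    return found;
}
```

In real use the pattern would be compiled once and reused for every line. How fast this is depends on the regex implementation in the system's C library, so it would need to be measured on each platform.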
So what do you think is the best way to proceed with this problem?
Sincerely,
Alex Ciobanu
Attachments: repr.cpp, Makefile
