Hi,

Our team is affected by issue 0012381, which causes extremely poor CTest 
performance. Details here: 
     http://public.kitware.com/Bug/view.php?id=12381

I've created a small test case that demonstrates the problem. Please find the 
.cpp file attached.

From what I can see, the RegularExpression class uses Henry Spencer's regex 
implementation, which is known to be slow in some cases.

On my machine, the attached example takes about 0.87 sec to process a single 
string:
   $ time ./repr
       real     0m0.865s
       user     0m0.862s
       sys      0m0.002s

Grep can process 100k such strings in 0.5 sec (which includes reading a 570MB 
file from disk):
   $ wc -l big.str.txt 
      100000 big.str.txt
   $ ls -lh big.str.txt 
       -rw-r--r--  1 alex  staff   572M 14 Nov 12:30 big.str.txt
   $ time grep "([^:]+): warning[ \t]*[0-9]+[ \t]*:" big.str.txt
       real     0m0.525s
       user     0m0.255s
       sys      0m0.269s

I see three ways to fix this problem:
  A) use a trusted third-party regex library, such as re2 or pcre
  B) find another self-contained regex implementation 
  C) try to use the standard POSIX regex available in regex.h on most systems
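
To make option C more concrete, here is a minimal sketch of a thin wrapper 
around the POSIX functions in regex.h (regcomp, regexec, regfree). The names 
PosixRegex and kWarningPattern are hypothetical and are not part of CMake. It 
also assumes that "\t" in the pattern is meant as a literal tab character, 
and it only reports whether a line matches; capture groups are not handled.

```cpp
#include <regex.h>   // POSIX: regcomp, regexec, regfree

#include <string>

// Same pattern as in the grep command above. Note that "\t" here is
// expanded to a real tab by the C++ compiler (an assumption about intent).
static const char* const kWarningPattern =
    "([^:]+): warning[ \t]*[0-9]+[ \t]*:";

// Hypothetical RAII wrapper around a compiled POSIX extended regex.
class PosixRegex {
public:
    explicit PosixRegex(const char* pattern)
        : ok_(regcomp(&re_, pattern, REG_EXTENDED | REG_NOSUB) == 0) {}

    ~PosixRegex() {
        if (ok_) regfree(&re_);  // only free a successfully compiled regex
    }

    // regex_t owns resources, so copying is disabled.
    PosixRegex(const PosixRegex&) = delete;
    PosixRegex& operator=(const PosixRegex&) = delete;

    // True if the pattern compiled successfully.
    bool valid() const { return ok_; }

    // True if the pattern matches anywhere in the string.
    bool find(const std::string& s) const {
        return ok_ && regexec(&re_, s.c_str(), 0, nullptr, 0) == 0;
    }

private:
    regex_t re_;
    bool ok_;
};
```

The idea would be to compile the pattern once and then call find() for every 
line of output. Whether this is actually faster than the current 
implementation depends on the platform's regex library, so it would need to 
be measured against the attached test case.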

I tried to find another self-contained regex implementation that we could use. 
I found Tiny REX, but in this case it is as slow as Henry Spencer's 
implementation.

What do you think is the best way to proceed?

Sincerely,
Alex Ciobanu 



Attachment: repr.cpp
Description: Binary data

Attachment: Makefile
Description: Binary data
