On Sun, Feb 7, 2010 at 10:00 AM, Paolo Bonzini <[email protected]> wrote:
> It means we identified the culprit (the regex engine; possibly the
> strcoll/wcscoll calls used to handle [0-9]), but it is still strange
> because, AFAICT, the regex code should not even be invoked unless you use
> grep -o or --color or similar options. Instead, grep should use its own DFA
> implementation.
>
> Running under LC_ALL=C should mitigate or eliminate the bad performance.

Thanks Paolo! Indeed, with LC_ALL=C I get a speedup of a factor of ~8 on
Linux and a factor of >1400 on OSX. If you want me to run any analyses on
OSX, let me know.

Titus
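
For anyone wanting to reproduce the comparison, here is a rough sketch of how the two runs can be set side by side. The sample file and the '[0-9]' pattern are assumptions for illustration only, not the original test case from this thread. The script only checks that both locales report the same number of matching lines; the timing itself is left to the reader.

```shell
#!/bin/sh
# Hypothetical sample data: 2000 lines, half of them containing digits.
sample=$(mktemp)
i=0
while [ "$i" -lt 1000 ]; do
    printf 'line %s\nno digits here\n' "$i"
    i=$((i + 1))
done > "$sample"

# Count matching lines under the current locale and under the C locale.
count_default=$(grep -c '[0-9]' "$sample")
count_c=$(LC_ALL=C grep -c '[0-9]' "$sample")

echo "default locale: $count_default matching lines"
echo "C locale:       $count_c matching lines"

# To compare run time on a real input, time each variant, e.g.
#   time grep -c '[0-9]' yourfile
#   time env LC_ALL=C grep -c '[0-9]' yourfile
rm -f "$sample"
```

Setting LC_ALL=C only for the single grep invocation, as above, leaves the locale of the rest of the shell session untouched. Note that in the C locale bracket ranges and character classes are interpreted bytewise, so results can differ on input containing non-ASCII text.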
