> I am writing a multithreaded Apache log parser that uses the Boost
> 1_29_0 regex split function to separate elements in the entry. Each
> thread parses a separate log file. The code seems to be working
> correctly on a 1-CPU system, but when I use a 14-CPU Sun server, I
> see massive locking (LCK column of prstat -amLvu username), and
> performance suffers horribly (as measured by the lines processed per
> second). I spent a lot of time checking to see where the locking was
> occurring. I went so far as to compile the code with Sun's Forte 6u2
> and use their analysis tools to identify the problem area. I've
> compiled all code (including Boost) with both gcc 3.2.2 and Forte to
> create 64-bit binaries, if that makes any difference.
>
> If I read the Forte analysis tools correctly, the place I'm seeing
> all the locking is the call to malloc in the void *operator
> new(unsigned long), which is called by
> boost::re_detail::match_results_base and _priv_match_data. Those are
> in turn called by query_match_aux, which is called by reg_grep2.
> Assuming I'm reading it right...
>
> At this point it seems like the issue is either with the library or
> my usage of it. Has anyone seen this before? Any pointers on what I
> may be doing wrong and how to fix it would be appreciated.

The locking is occurring in your runtime library's memory allocator
(malloc) rather than in Boost.Regex as such. You have two choices:

1) Use a custom allocator, one that draws from thread-specific memory
   pools, for the match_results class instance that you are using.

2) Wait for the next release (probably still a couple of months away),
   which will use much less dynamic memory allocation (almost none at
   all in recursive mode).

John.
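
As a rough illustration of option 1, here is a minimal sketch of an allocator backed by per-thread free lists, so that most allocations never reach the locked malloc. Every name in it (ThreadCache, ThreadPoolAllocator, bucket_for and so on) is hypothetical, and it is written in C++11 (thread_local), which is newer than the compilers discussed in this thread; on those you would need pthread thread-specific data instead. It is not taken from Boost and is only one of several ways to build such a pool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

namespace pool_sketch {

constexpr std::size_t kMinBlock = 16;  // smallest pooled block, in bytes
constexpr std::size_t kBuckets  = 16;  // size classes: 16 B up to 512 KiB

struct FreeNode { FreeNode* next; };

// One cache per thread: no other thread touches it, so no lock is needed.
struct ThreadCache {
    FreeNode* free_list[kBuckets] = {};
    ~ThreadCache() {
        // Hand cached blocks back to the runtime when the thread exits.
        for (FreeNode* node : free_list) {
            while (node != nullptr) {
                FreeNode* next = node->next;
                std::free(node);
                node = next;
            }
        }
    }
};

inline ThreadCache& cache() {
    thread_local ThreadCache c;
    return c;
}

// Index of the smallest power-of-two size class that holds `bytes`;
// returns kBuckets when the request is too large to pool.
inline std::size_t bucket_for(std::size_t bytes) {
    std::size_t b = 0;
    std::size_t cap = kMinBlock;
    while (cap < bytes && b < kBuckets) { cap <<= 1; ++b; }
    return b;
}

inline void* raw_allocate(std::size_t bytes) {
    const std::size_t b = bucket_for(bytes);
    if (b >= kBuckets) return std::malloc(bytes);   // oversized: bypass pool
    FreeNode*& head = cache().free_list[b];
    if (head != nullptr) {                          // fast path, lock-free
        FreeNode* node = head;
        head = node->next;
        return node;
    }
    return std::malloc(kMinBlock << b);             // miss: shared heap
}

inline void raw_deallocate(void* p, std::size_t bytes) noexcept {
    assert(p != nullptr);
    const std::size_t b = bucket_for(bytes);
    if (b >= kBuckets) { std::free(p); return; }
    FreeNode*& head = cache().free_list[b];
    head = new (p) FreeNode{head};                  // push onto this thread's list
}

// Minimal C++11 allocator. Blocks come from malloc, so over-aligned
// types are not supported by this sketch.
template <class T>
struct ThreadPoolAllocator {
    using value_type = T;
    ThreadPoolAllocator() noexcept = default;
    template <class U>
    ThreadPoolAllocator(const ThreadPoolAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        void* p = raw_allocate(n * sizeof(T));
        if (p == nullptr) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        raw_deallocate(p, n * sizeof(T));
    }
};

template <class T, class U>
bool operator==(const ThreadPoolAllocator<T>&, const ThreadPoolAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const ThreadPoolAllocator<T>&, const ThreadPoolAllocator<U>&) noexcept { return false; }

// Demonstration: a block freed by a thread is handed back to that same
// thread on its next request of the same size class.
inline bool reuses_freed_block() {
    ThreadPoolAllocator<int> a;
    int* p = a.allocate(8);
    a.deallocate(p, 8);
    int* q = a.allocate(8);
    const bool same = (p == q);
    a.deallocate(q, 8);
    return same;
}

}  // namespace pool_sketch
```

An allocator of this general shape would be passed as the allocator template argument of the match_results type you instantiate, in the same way it can be passed to a standard container such as std::vector<int, pool_sketch::ThreadPoolAllocator<int>>. Note that pre-C++11 libraries expect the full C++98 allocator interface (rebind, pointer and reference typedefs, construct, destroy, max_size), so the sketch would need those members added before it could be used with a library of that era.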
