On Fri, Sep 27, 2013 at 12:00:45AM +0200, JR wrote:
> I'm working on a toy IRC bot. Much of the logic involved is
> translating the incoming raw IRC string into something that makes
> sense (so now I have two problems, etc). But I managed to cook up a
> regex that so far seems to work well. Time for callgrind!
>
> Grouped by source file, most time is spent in regex.d (as would seem
> natural) but more time is spent in gc.d than I would have expected.
> Looking at the callgraph I see that there's a curious amount of
> calls to _d_arrayliteralTX from (around) where the regex matching is
> done. (There's some inlining going on.)
>
> Example: http://dpaste.dzfl.pl/3932a231 (needs dmd head)
>
> Callgraph: http://i.imgur.com/AZEutCE.png

Actually, never mind what I said in the last post. Obviously you're
already using ctRegex. The problem is in this code:

    scope fields = raw.match(ircRegexPattern).front;

> TL;DR: 67 regex matches are done in that example snippet, on real
> but (hopefully) anonymized raw irc strings; _d_arrayliteralTX sees
> 800+ calls.

Herein lies the hint: the exact number of calls (as seen from your
callgraph) is 804, and 804 / 67 = 12 (exactly). This means that there
are precisely 12 calls to _d_arrayliteralTX per regex match.

So that leads to the question of why this is happening. I don't know
the answer, but does it help if you don't call .front on the match
object? I.e., try this:

    auto m = raw.match(ircRegexPattern);
    auto c = m.captures;
    // c now contains the captured fields: c[1] returns the matching
    // text for the first pair of parentheses, c[2] returns the
    // matching text for the second pair, etc. c[0] returns the
    // entire match (uninteresting in your case).


T

-- 
Windows: the ultimate triumph of marketing over technology. -- Adrian von Bidder
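
P.S. A minimal, self-contained sketch of the .captures suggestion, in case it helps with testing. The pattern, the helper name commandOf and the sample line are all made up for illustration; they are not taken from the paste, and I have not checked whether this form avoids the _d_arrayliteralTX calls.

```d
import std.regex : ctRegex, match;
import std.stdio : writeln;

// The arithmetic from the callgraph: 804 calls over 67 matches
// divides evenly, i.e. 12 calls per match.
static assert(804 % 67 == 0 && 804 / 67 == 12);

// Hypothetical helper: returns the command field of a raw IRC line,
// or null if the line does not match the (simplified) pattern.
string commandOf(string raw)
{
    // Simplified, assumed pattern: ":prefix command rest".
    // This is NOT the pattern from the original example.
    auto pattern = ctRegex!(`^:(\S+) (\S+) (.*)$`);

    auto m = raw.match(pattern);
    if (m.empty)
        return null;

    // Use .captures on the match object, as suggested above.
    auto c = m.captures;
    return c[2]; // text matched by the second pair of parentheses
}

void main()
{
    // Made-up sample line.
    writeln(commandOf(":nick!user@host PRIVMSG #chan :hello world"));
}
```

Running this under callgrind with and without .front should show whether the call count changes.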