Paul Eggert wrote:
> Anyway, an 18% speedup is still a speedup, so I looked into it.
> GCC 4.9.0 misses a non-obvious opportunity for function inlining. I
> installed a tweak (attached) that should make the inlining opportunity
> obvious to compilers nowadays. On my platform this gave a 28% speedup,
> i.e., a bit better than the macro-using patch would have.
You are right. My compiler was too old: it was GCC 4.1.2 on CentOS 5.10.
I retried with GCC 4.4.7 and got the good performance.
(I also tried to build GCC 4.9.0, but the build has not succeeded yet.)

By the way, I looked into why it was slow with GCC 4.1.2, and found that
tr() is not inlined unless the `-finline-functions' option is given,
because the `-finline-small-functions' option is available only from
GCC 4.3 onward.

I am submitting the attached patch anyway, although it may not be very
important.

Thanks,
Norihiro
From 4ce067b4d75bc65c25431aa9defbb2a07fc3df23 Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <[email protected]>
Date: Mon, 28 Apr 2014 21:28:51 +0900
Subject: [PATCH] kwset: improve the performance by inlining tr

* src/kwset.c (tr): Make it inline.
---
 src/kwset.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/kwset.c b/src/kwset.c
index 8e9b510..6d21893 100644
--- a/src/kwset.c
+++ b/src/kwset.c
@@ -114,7 +114,7 @@ struct kwset
 };
 
 /* Use TRANS to transliterate C.  A null TRANS does no transliteration.  */
-static char
+static inline char
 tr (char const *trans, char c)
 {
   return trans ? trans[U(c)] : c;
-- 
1.9.2
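
For anyone who wants to try the effect outside of grep, here is a minimal
standalone sketch of the patched function. It is not taken from kwset.c
apart from the body of tr(): the U() definition is a stand-in that I am
assuming behaves like the real macro (conversion to unsigned char), and
init_fold_table() and count_matches() are hypothetical helpers added only
to give tr() a hot loop to be inlined into. Note that "inline" is only a
hint; whether the call really disappears depends on the compiler version
and optimization level.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: stand-in for the U() macro in src/kwset.c.  Here it only
   converts a char to unsigned char so that it can index a table.  */
#define U(c) ((unsigned char) (c))

/* Number of distinct unsigned char values.  */
enum { NCHAR = UCHAR_MAX + 1 };

/* Use TRANS to transliterate C.  A null TRANS does no transliteration.
   Same body as in the patch above.  */
static inline char
tr (char const *trans, char c)
{
  return trans ? trans[U(c)] : c;
}

/* Hypothetical helper: fill TABLE so that it folds upper case to
   lower case, as a case-insensitive search would.  */
static void
init_fold_table (char table[NCHAR])
{
  for (int i = 0; i < NCHAR; i++)
    table[i] = (char) tolower (i);
}

/* Hypothetical caller: count the bytes in BUF[0..LEN-1] that equal C
   after both sides are transliterated through TRANS.  */
static size_t
count_matches (char const *trans, char const *buf, size_t len, char c)
{
  size_t n = 0;
  for (size_t i = 0; i < len; i++)
    n += tr (trans, buf[i]) == tr (trans, c);
  return n;
}
```

Compiling such a file to assembly (for example with -O2 -S) under different
GCC versions and looking for a call to tr in count_matches is one way to
check whether the inlining happened.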
