-#ifdef MBS_SUPPORT
Please drop the !__STDC__ part.
It stopped being useful about 10 years ago.
With pleasure.
I also expanded the macros, since FUNC was a bit redundant.
+#ifdef __STDC__
+#define FUNC(F, P) static int F(int c) { return P(c); }
+#else
+#define FUNC(F, P) static int F(c) int c; { return P(c); }
+#endif
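For illustration, dropping the K&R branch and expanding FUNC by hand leaves plain ANSI wrappers roughly like the ones below. This is a minimal sketch, not the posted patch: the wrapper names is_alpha and is_digit, and the direct use of isalpha/isdigit as the wrapped predicates, are assumptions.

```c
#include <ctype.h>

/* Hand-expanded equivalents of FUNC(is_alpha, isalpha) and
   FUNC(is_digit, isdigit).  The names are hypothetical; the real
   patch may wrap different predicates.  */
static int is_alpha (int c) { return isalpha (c); }
static int is_digit (int c) { return isdigit (c); }
```

The wrappers still serve a purpose after the macro is gone: the <ctype.h> predicates may be implemented as macros, so a real function is needed wherever a function pointer is stored.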
...
+static predicate *
+find_pred (const char *str)
+{
+ int i;
s/int/unsigned int/
Please use "unsigned int" not just "unsigned".
Fine. I fixed it throughout dfa.[ch], see posted patch. I also prefer
"unsigned int" but I used plain "unsigned" since it prevailed in those two files.
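A table lookup with the index declared as "unsigned int" might look like the sketch below. The prednames table, its NULL terminator, and the choice to return NULL for an unknown name are assumptions made for the example; only the find_pred signature comes from the quoted hunk.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef int predicate (int);

static int is_alpha (int c) { return isalpha (c); }
static int is_digit (int c) { return isdigit (c); }

/* Hypothetical name-to-predicate table; a NULL name terminates it.  */
static const struct
{
  const char *name;
  predicate *func;
} prednames[] = {
  { "alpha", is_alpha },
  { "digit", is_digit },
  { NULL, NULL }
};

static predicate *
find_pred (const char *str)
{
  unsigned int i;   /* an array index is never negative */

  for (i = 0; prednames[i].name; ++i)
    if (strcmp (str, prednames[i].name) == 0)
      break;

  /* On a miss the loop stops at the terminator, whose func is NULL.  */
  return prednames[i].func;
}
```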
+ if (MB_CUR_MAX > 1)
+ {
+ REALLOC_IF_NECESSARY(dfa->mbcsets, struct mb_char_classes,
+ dfa->mbcsets_alloc, dfa->nmbcsets + 1);
+
+ /* dfa->multibyte_prop[] hold the index of dfa->mbcsets.
+ We will update dfa->multibyte_prop[] in addtok(), because we can't
+ decide the index in dfa->tokens[]. */
+
+ /* Initialize work area. */
+ work_mbc = &(dfa->mbcsets[dfa->nmbcsets++]);
+ work_mbc->nchars = work_mbc->nranges = work_mbc->nch_classes = 0;
+ work_mbc->nequivs = work_mbc->ncoll_elems = 0;
I was going to write "Please don't put multiple initializations on
the same line." but then saw this was merely an indentation change,
so never mind.
Indeed; the logical step is to change it to a single memset, actually.
I did it as a follow-up.
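The single-memset form of that initialization could look like this. It is a sketch only: the struct below is a trimmed, hypothetical stand-in for struct mb_char_classes, keeping the five counters from the quoted hunk plus one pointer member for illustration.

```c
#include <string.h>

/* Hypothetical stand-in for struct mb_char_classes.  */
struct mb_char_classes_sketch
{
  int nchars;
  int nranges;
  int nch_classes;
  int nequivs;
  int ncoll_elems;
  char *chars;      /* assumed member, here to show pointers get cleared too */
};

/* Clear the whole work area in one call instead of chained assignments.  */
static void
init_work_area (struct mb_char_classes_sketch *work_mbc)
{
  memset (work_mbc, 0, sizeof *work_mbc);
}
```

Note that memset yields all-bits-zero pointers, which ISO C does not strictly guarantee to be null pointers, although that holds on the usual GNU targets.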
ACK!
Finally. Thanks!
You didn't ack one patch, but it's quite trivial so I pushed it too.
The only missing optimization WRT glibc would be to expand UTF-8 "." in
terms of character sets. I don't plan to do it anytime soon, but it
should be easy, just special-case ANYCHAR in atom().
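To make the idea concrete: in UTF-8, "any character" can be written as an alternation of short sequences of byte ranges, which is the shape a byte-oriented DFA can handle directly. The sketch below encodes the well-formed UTF-8 byte sequences as such a table and matches one character against it. It is not dfa.c code; the names byte_range, utf8_alt, utf8_any and match_utf8_any are hypothetical, and the table deliberately ignores regex details such as whether "." should match newline or NUL.

```c
#include <stddef.h>

struct byte_range { unsigned char lo, hi; };

/* One alternative: LEN byte ranges that must match in order.  */
struct utf8_alt { unsigned int len; struct byte_range r[4]; };

/* Well-formed UTF-8: overlong forms, surrogates and code points
   above U+10FFFF are excluded by the second-byte ranges.  */
static const struct utf8_alt utf8_any[] = {
  { 1, { {0x00, 0x7F} } },
  { 2, { {0xC2, 0xDF}, {0x80, 0xBF} } },
  { 3, { {0xE0, 0xE0}, {0xA0, 0xBF}, {0x80, 0xBF} } },
  { 3, { {0xE1, 0xEC}, {0x80, 0xBF}, {0x80, 0xBF} } },
  { 3, { {0xED, 0xED}, {0x80, 0x9F}, {0x80, 0xBF} } },
  { 3, { {0xEE, 0xEF}, {0x80, 0xBF}, {0x80, 0xBF} } },
  { 4, { {0xF0, 0xF0}, {0x90, 0xBF}, {0x80, 0xBF}, {0x80, 0xBF} } },
  { 4, { {0xF1, 0xF3}, {0x80, 0xBF}, {0x80, 0xBF}, {0x80, 0xBF} } },
  { 4, { {0xF4, 0xF4}, {0x80, 0x8F}, {0x80, 0xBF}, {0x80, 0xBF} } },
};

/* Return the byte length of the UTF-8 character at S (at most N bytes
   available), or 0 if no alternative matches.  */
static size_t
match_utf8_any (const unsigned char *s, size_t n)
{
  size_t a;

  for (a = 0; a < sizeof utf8_any / sizeof utf8_any[0]; a++)
    {
      unsigned int k;

      if (utf8_any[a].len > n)
        continue;
      for (k = 0; k < utf8_any[a].len; k++)
        if (s[k] < utf8_any[a].r[k].lo || s[k] > utf8_any[a].r[k].hi)
          break;
      if (k == utf8_any[a].len)
        return k;
    }
  return 0;
}
```

In a real implementation each row would presumably become a concatenation of character-set tokens joined by alternation, rather than a run-time table walk.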
Anyway, I think we are now at comparable performance to, and considerably
less buggy than, what most GNU/Linux distros around ship.
Thanks for the reviews!
Paolo