RE_DOT_NEWLINE and RE_DOT_NOT_NULL apply only to '.' in regex. In DFA, on the other hand, they also apply to MBCSET (bracket expressions) in addition to '.'. This patch makes DFA behave the same way as regex.
BTW, at the moment grep and gawk never call the match_mb_charset function that this patch fixes.

From 10876f05010e2df3a0705e95a1be62cbb990fbfa Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <nori...@kcn.ne.jp>
Date: Wed, 15 Oct 2014 08:24:23 +0900
Subject: [PATCH] dfa: don't consider RE_DOT_NEWLINE and RE_DOT_NOT_NULL in
 matching with a bracket expression

RE_DOT_NEWLINE and RE_DOT_NOT_NULL should apply only to a dot, which
matches any character.  So don't consider RE_DOT_NEWLINE and
RE_DOT_NOT_NULL in matching with a bracket expression.

* src/dfa.c (match_mb_charset): Remove RE_DOT_NEWLINE and
RE_DOT_NOT_NULL.
---
 src/dfa.c | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/src/dfa.c b/src/dfa.c
index 58a4b83..a4c48b5 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -2998,17 +2998,7 @@ match_mb_charset (struct dfa *d, state_num s, position pos,
   int context;
 
   /* Check syntax bits. */
-  if (wc == (wchar_t) eolbyte)
-    {
-      if (!(syntax_bits & RE_DOT_NEWLINE))
-        return 0;
-    }
-  else if (wc == (wchar_t) '\0')
-    {
-      if (syntax_bits & RE_DOT_NOT_NULL)
-        return 0;
-    }
-  else if (wc == WEOF)
+  if (wc == WEOF)
     return 0;
 
   context = wchar_context (wc);
-- 
2.1.1