RE_DOT_NEWLINE and RE_DOT_NOT_NULL apply only to '.' in regex.  OTOH, in
DFA they apply to MBCSET in addition to '.'.  This patch adapts the behavior
of DFA to that of regex.

BTW, at the moment, grep and gawk never call the match_mb_charset function
that this patch fixes.
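
For reference, here is a minimal sketch of the regex-side behavior that the
patch aligns DFA with.  It assumes glibc's GNU regex API (re_set_syntax,
re_compile_pattern, re_search) and the default single-byte C locale; the
helper name `matches` is hypothetical and is not part of grep, gawk or dfa.c.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE            /* expose the GNU regex API in <regex.h> */
#endif
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: return 1 if PATTERN matches somewhere in the
   LEN bytes at TEXT under the syntax bits SYNTAX, 0 if it does not,
   and -1 if the pattern fails to compile.  */
static int
matches (reg_syntax_t syntax, const char *pattern,
         const char *text, size_t len)
{
  struct re_pattern_buffer buf;
  const char *err;
  regoff_t r;

  assert (pattern != NULL && text != NULL);

  /* re_compile_pattern expects a zeroed buffer on first use.  */
  memset (&buf, 0, sizeof buf);

  /* The syntax bits are global state consulted at compile time.  */
  re_set_syntax (syntax);
  err = re_compile_pattern (pattern, strlen (pattern), &buf);
  if (err != NULL)
    return -1;

  /* Try every start position in TEXT; no registers are needed.  */
  r = re_search (&buf, text, len, 0, len, NULL);
  regfree (&buf);
  return r >= 0;
}
```

With glibc I would expect matches (0, ".", "\n", 1) to return 0 and
matches (RE_DOT_NEWLINE, ".", "\n", 1) to return 1, while
matches (0, "[^a]", "\n", 1) returns 1 regardless of RE_DOT_NEWLINE.
Likewise, RE_DOT_NOT_NULL should stop "." from matching a NUL byte but
leave "[^a]" unaffected.
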
From 10876f05010e2df3a0705e95a1be62cbb990fbfa Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <nori...@kcn.ne.jp>
Date: Wed, 15 Oct 2014 08:24:23 +0900
Subject: [PATCH] dfa: don't consider RE_DOT_NEWLINE and RE_DOT_NOT_NULL in
 matching with a bracket expression

RE_DOT_NEWLINE and RE_DOT_NOT_NULL should apply only to a dot,
which matches any character.  So don't consider RE_DOT_NEWLINE and
RE_DOT_NOT_NULL when matching against a bracket expression.

* src/dfa.c (match_mb_charset): Remove RE_DOT_NEWLINE and RE_DOT_NOT_NULL.
---
 src/dfa.c | 12 +-----------
 1 file changed, 1 insertion(+), 11 deletions(-)

diff --git a/src/dfa.c b/src/dfa.c
index 58a4b83..a4c48b5 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -2998,17 +2998,7 @@ match_mb_charset (struct dfa *d, state_num s, position pos,
   int context;
 
   /* Check syntax bits.  */
-  if (wc == (wchar_t) eolbyte)
-    {
-      if (!(syntax_bits & RE_DOT_NEWLINE))
-        return 0;
-    }
-  else if (wc == (wchar_t) '\0')
-    {
-      if (syntax_bits & RE_DOT_NOT_NULL)
-        return 0;
-    }
-  else if (wc == WEOF)
+  if (wc == WEOF)
     return 0;
 
   context = wchar_context (wc);
-- 
2.1.1
