Ted Unangst <t...@tedunangst.com> writes:

> For simple patterns, grep has an optimization to avoid regex and run
> about 50% faster. The problem is its idea of simple patterns is too
> simple.

IIUC the idea is to optimize for a lazy user that didn't want to
type grep -F or fgrep:

$ grep foo /bar
$ grep -R foo /baz/

Thus I think it would make sense to test whether ERE are on, since '|',
'+', etc aren't special under BRE.

The following diff also tests for ']', ')' and '}' so that unbalanced
use of those can be catched by regcomp if the latter happens to become
stricter.

Regress passes; does it seem OK?
-- 
Jérémie Courrèges-Anglas
PGP Key fingerprint: 61DB D9A0 00A4 67CF 2A90  8961 6191 8FBF 06A1 1494

Index: util.c
===================================================================
RCS file: /cvs/src/usr.bin/grep/util.c,v
retrieving revision 1.45
diff -u -p -r1.45 util.c
--- util.c      29 Dec 2012 01:32:44 -0000      1.45
+++ util.c      1 May 2013 02:17:41 -0000
@@ -348,15 +348,8 @@ fastcomp(fastgrep_t *fg, const char *pat
        /* Look for ways to cheat...er...avoid the full regex engine. */
        for (i = 0; i < fg->patternLen; i++)
        {
-               /* Can still cheat? */
-               if ((isalnum(fg->pattern[i])) || isspace(fg->pattern[i]) ||
-                   (fg->pattern[i] == '_') || (fg->pattern[i] == ',') ||
-                   (fg->pattern[i] == '=') || (fg->pattern[i] == '-') ||
-                   (fg->pattern[i] == ':') || (fg->pattern[i] == '/')) {
-                       /* As long as it is good, upper case it for later. */
-                       if (iflag)
-                               fg->pattern[i] = toupper(fg->pattern[i]);
-               } else if (fg->pattern[i] == '.') {
+               switch (fg->pattern[i]) {
+               case '.':
                        hasDot = i;
                        if (i < fg->patternLen / 2) {
                                if (firstHalfDot < 0)
@@ -368,11 +361,28 @@ fastcomp(fastgrep_t *fg, const char *pat
                                if (firstLastHalfDot < 0)
                                        firstLastHalfDot = i;
                        }
-               } else {
+                       break;
+               case '(': case ')':
+               case '{': case '}':
+                       /* Special in BRE if preceded by '\\' */
+               case '?':
+               case '+':
+               case '|':
+                       /* Not special in BRE. */
+                       if (!Eflag)
+                               goto nonspecial;
+               case '\\':
+               case '*':
+               case '[': case ']':
                        /* Free memory and let others know this is empty. */
                        free(fg->pattern);
                        fg->pattern = NULL;
                        return (-1);
+               default:
+nonspecial:
+                       if (iflag)
+                               fg->pattern[i] = toupper(fg->pattern[i]);
+                       break;
                }
        }
 

Reply via email to