James Howard scribbled this message on Jul 29:
> On Thu, 29 Jul 1999, Tim Vanderhoek wrote:
> 
> > fgetln() does a complete copy of the line buffer whenever an
> > excessively long line is found.  On this point, it's hard to do better
> > without using mmap(), but mmap() has its own disadvantages.  My last
> > suggestion to James was to assume a worst case for long lines and mark
> > the worst worst case with an XXX "this is unfortunate".
> 
> <warning type="Anything said here wrong is my fault, not DES's">
> 
> DES tells me he has a new version (0.10) which mmap()s.  It supposedly
> cuts the run time down significantly; I do not have the numbers in front
> of me.  Unfortunately he has not posted this version yet, so I cannot
> download it and run it myself.  He also says that if mmap() fails, he drops
> back to stdio.  This should only happen in the NFS case, the > 2G case,
> etc.
> 
> </warning>
> 
> > [Never mind that it should be spending near 100% of its time in
> >  procline...that just means he's still got work to do... :-]
> 
> I'd rather see it spending 100% of its time in regexec(), then I can just
> blame Henry Spencer :)
> 
> Someone said there was new regex code out, is this true?  Can anyone with
> a copy test grep with it?

OK, I just made a patch to eliminate the copy that was happening in
procfile, and it sped up a grep of a 5 MB termcap from about 2.9 sec
down to 0.6 sec, and that includes the time spent profiling the
program.  GNU grep without profiling takes only 0.15 sec, so we ARE
getting closer to GNU grep.

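It was VERY simple to do, and the patch is attached.  It uses the
regexec() flag REG_STARTEND to do what the copy was trying to do: bound
the match to the current line without needing a NUL-terminated string.
All of the code to use REG_STARTEND was already there (pmatch.rm_so and
pmatch.rm_eo were already being set); it just needed to be enabled.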
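
For anyone who has not used REG_STARTEND before, here is a minimal
standalone sketch of the idea.  It is not code from grep-0.10:
match_span() is a hypothetical helper made up for illustration, and
REG_STARTEND itself is a BSD extension to regexec(), so the sketch
guards on it and reports an error where the flag is missing.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of grep-0.10): match `re` against the
 * `len` bytes starting at `buf`.  The region need not be NUL-terminated,
 * which is what lets the caller skip the malloc()/memcpy() of each line.
 *
 * Returns 1 on a match, 0 on no match, and -1 if REG_STARTEND is not
 * available or regexec() reports some other error.
 */
static int
match_span(const regex_t *re, const char *buf, size_t len)
{
#ifdef REG_STARTEND
	regmatch_t pm;
	int r;

	pm.rm_so = 0;			/* start offset of the search region */
	pm.rm_eo = (regoff_t)len;	/* end offset, exclusive */

	/* With REG_STARTEND, regexec() reads the bounds from pm. */
	r = regexec(re, buf, 1, &pm, REG_STARTEND);
	if (r == 0)
		return (1);
	return (r == REG_NOMATCH ? 0 : -1);
#else
	/* Assumption: without the extension a copy is still required. */
	(void)re;
	(void)buf;
	(void)len;
	return (-1);
#endif
}
```

Given a buffer holding "foo bar\nbaz qux\n" and a pattern "baz"
compiled with regcomp(), calling match_span() on the first 7 bytes
should report no match, while calling it on the 8 bytes starting at
offset 8 should report a match, with no copy in either case.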

enjoy!

-- 
  John-Mark Gurney                              Voice: +1 541 684 8449
  Cu Networking                                   P.O. Box 5693, 97405

  "The soul contains in itself the event that shall presently befall it.
  The event is only the actualizing of its thought." -- Ralph Waldo Emerson
diff -u grep-0.10.orig/util.c grep-0.10/util.c
--- grep-0.10.orig/util.c       Thu Jul 29 05:00:15 1999
+++ grep-0.10/util.c    Thu Jul 29 16:38:06 1999
@@ -93,7 +93,6 @@
        file_t *f;
        str_t ln;
        int c, t, z;
-       char *tmp;
 
        if (fn == NULL) {
                fn = "(standard input)";
@@ -119,13 +118,8 @@
                initqueue();
        for (c = 0; !(lflag && c);) {
                ln.off = grep_tell(f);
-               if ((tmp = grep_fgetln(f, &ln.len)) == NULL)
+               if ((ln.dat = grep_fgetln(f, &ln.len)) == NULL)
                        break;
-               ln.dat = grep_malloc(ln.len + 1);
-               memcpy(ln.dat, tmp, ln.len);
-               ln.dat[ln.len] = 0;
-               if (ln.len > 0 && ln.dat[ln.len - 1] == '\n')
-                       ln.dat[--ln.len] = 0;
                ln.line_no++;
 
                z = tail;
@@ -133,7 +127,6 @@
                        enqueue(&ln);
                        linesqueued++;
                }
-               free(ln.dat);
                c += t;
        }
        if (Bflag > 0)
@@ -174,7 +167,8 @@
        pmatch.rm_so = 0;
        pmatch.rm_eo = l->len;
        for (c = i = 0; i < patterns; i++) {
-               r = regexec(&r_pattern[i], l->dat, 0, &pmatch, eflags);
+               r = regexec(&r_pattern[i], l->dat, 0, &pmatch,
+                   eflags | REG_STARTEND);
                if (r == REG_NOMATCH && t == 0)
                        continue;
                if (wflag && r == 0) {
