On Sat, Apr 04, 2015 at 09:11:10PM -0400, Jeff King wrote:

> I also considered optimizing the "term == '\n'" case by using fgets, but
> it gets rather complex (you have to pick a size, fgets into it, and then
> keep going if you didn't get a newline). Also, fgets sucks, because you
> have to call strlen() immediately after to find out how many bytes you
> got!

My initial attempt at this had been to _just_ use fgets, but the
optimization becomes much simpler if you just do an initial fgets, and
then follow up with character reads. In most cases, the initial fgets
is big enough to get the whole line.

I.e., doing:

diff --git a/strbuf.c b/strbuf.c
index 2facd5f..f319d8d 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -443,6 +443,18 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int 
term)
                return EOF;
 
        strbuf_reset(sb);
+
+       if (term == '\n') {
+               strbuf_grow(sb, 256);
+               if (!fgets(sb->buf, sb->alloc - 1, fp)) {
+                       strbuf_release(sb);
+                       return EOF;
+               }
+               sb->len = strlen(sb->buf);
+               if (sb->buf[sb->len - 1] == '\n')
+                       return 0;
+       }
+
        flockfile(fp);
        while ((ch = getc_unlocked(fp)) != EOF) {
                strbuf_grow_ch(sb);

on top of the series drops me from:

  real    0m8.573s
  user    0m8.072s
  sys     0m0.508s

to:

  real    0m6.671s
  user    0m6.216s
  sys     0m0.460s

which is back to the v2.0.0 number. Even with the extra strlen, it seems
that what fgets does internally beats repeated getc calls. Which I guess
is not too surprising, as each getc() will have to check for underflow
in the buffer. Perhaps there is more room to micro-optimize
strbuf_getwholeline, but I kind of doubt it.

The big downside is that our input strings are no longer NUL-clean
(reading "foo\0bar\n" would yield just "foo". I doubt that matters in
the real world, but it does fail a few of the tests (e.g., t7008 tries
to read a list of patterns which includes NUL, and we silently truncate
the pattern rather than read in the NUL and barf).

So we'd have to either:

  1. Decide that doesn't matter.

  2. Have callers specify a "damn the NULs, I want it fast" flag.

  3. Find some alternative that is more robust than fgets, and faster
     than getc. I don't think there is anything in stdio, but I am not
     above dropping in a faster non-portable call if it is available,
     and then falling back to the current code otherwise.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to