On Wed, Jun 08, 2016 at 06:20:41PM +0200, Michael Haggerty wrote:
> Instead, one could write
>
> > static int feed_object(const unsigned char *sha1, int fd, int negative)
> > {
> > char buf[GIT_SHA1_HEXSZ + 2];
> > struct strbuf line = WRAPPED_FIXED_STRBUF(buf);
> >
> > if (negative && !has_sha1_file(sha1))
> > return 1;
> >
> > if (negative)
> > strbuf_addch(&line, '^');
> > strbuf_add(&line, sha1_to_hex(sha1), GIT_SHA1_HEXSZ);
> > strbuf_addch(&line, '\n');
> > return write_or_whine(fd, line.buf, line.len, "send-pack: send refs");
> > }
Hmm. I'm not sure that just replacing that with a regular heap-allocated
strbuf is so bad. It additionally gets rid of the SHA1_HEXSZ math in the
allocation.
So from your list of advantages:
> * It's a little less manual bookkeeping, and thus less error-prone,
> than the current code.
We have this, but better.
> * If somebody decides to add another character to the line but
> forgets to increase the allocation size, the code dies in testing
> rather than (a) overflowing the buffer, like the current
> code, or (b) silently becoming less performant, as if it used a
> preallocated but non-fixed strbuf.
Instead of overflowing, it just silently works.
> * There's no need to strbuf_release() (which can be convenient in
> a function with multiple exit paths).
Same.
The downside, obviously, is the cost of malloc/free. It may even be
noticeable here here because this really is a tight loop of strbuf
allocation (OTOH, we immediately make a syscall; how expensive is
write() compared to malloc()?).
We can hack around that by reusing the same strbuf.
Unfortunately the usual trick of:
struct strbuf buf = STRBUF_INIT;
for (...) {
strbuf_reset(&buf);
...
}
strbuf_release(&buf);
does not work, because we are in a sub-function. We can pass in the
buffer as scratch space, but that makes the function interface a little
uglier than it needs to be.
Likewise, we could make the strbuf static inside feed_object(). It's
not so bad here, where we know there aren't re-entrancy issues, but it's
not a very safe pattern in general (and it leaks the memory when we're
done with the function).
That made me wonder if we could repeatedly reuse a buffer attached to
the file descriptor. And indeed, isn't that what stdio is? The whole
reason this buffer exists is because we are using a direct descriptor
write. If we switched this function to use fprintf(), we'd avoid the
whole buffer question, have a fixed cap on our memory use (since we just
flush anytime the buffer is full) _and_ we'd reduce the number of
write syscalls we're making by almost a factor of 100.
> I don't know whether this particular function should be rewritten; I'm
> just giving an example of the type of scenario where I think it could be
> useful.
>
> In a world without fixed strbufs, what would one use in this situation?
I know I've done the exact opposite of what you wanted here and talked
about this specific function. But I _do_ think this is a pattern I've
seen several times, where we format into a buffer only to write() it
out. I think they may comprise a reasonable number of our buffer-using
loops.
-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html