[PATCHv2 06/10] pkt-line: share buffer/descriptor reading implementation

2013-02-18 Thread Jeff King
The packet_read function reads from a descriptor. The
packet_get_line function is similar, but reads from an
in-memory buffer, and uses a completely separate
implementation. This patch teaches the internal function
backing packet_read to accept either source, and use the
appropriate one. As a result:

  1. The function has been renamed to packet_read_from_buf,
 which more clearly indicates its relationship to the
 packet_read function.

  2. The original packet_get_line wrote to a strbuf; the new
 function, like its descriptor counterpart, reads into a
 buffer provided by the caller.

  3. The original function did not die on any errors, but
 instead returned an error code. Now we have the usual
 normal and gently forms.

There are only two existing calls to packet_get_line which
have to be converted, and both are in remote-curl. The first
reads and checks the # service=git-foo line from a smart
http server.  The second just reads past any additional
smart headers, without bothering to look at them.

This patch converts both to the new form, with a few
implications:

  1. Because we use the non-gentle form, the first caller
 can drop its own error checking. As a result, we will get
 more accurate error reporting about protocol breakage,
 since the errors come from inside the protocol code. We
 will no longer print the URL as part of the error, but
 that's OK. Protocol breakages should be rare (and we
 are pretty sure at this point in the code that it is a
 real smart server, so we won't be confused by dumb
 servers), and the first debugging step would probably
 be GIT_CURL_VERBOSE, anyway.

  2. The second caller did not error check at all, and now
 does. This can help us catch broken or truncated input
 close to the source.

  3. Since we are no longer using a strbuf, we now have a
 1000-byte limit on the smart-http headers. That should
 be fine, as the only header that has ever been sent
 here is the short service=git-foo header.

Signed-off-by: Jeff King p...@peff.net
---
The diffstat shows more lines appearing, but it is mainly from comments
and from the various parse_line{_from_buf,}{,_gently} variants; we
really do get rid of a duplicate parsing implementation, and we
harmonize all of the error conditions and messages.

We can also make gently a parameter to avoid the proliferation of
related functions, but would mean all but one callsite would have to
pass an extra 0. Choose your poison, I guess.

 pkt-line.c| 69 ---
 pkt-line.h| 11 +-
 remote-curl.c | 19 
 3 files changed, 53 insertions(+), 46 deletions(-)

diff --git a/pkt-line.c b/pkt-line.c
index 85faf73..7ee91e0 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -103,12 +103,26 @@ static int safe_read(int fd, void *buffer, unsigned size, 
int gently)
strbuf_add(buf, buffer, n);
 }
 
-static int safe_read(int fd, void *buffer, unsigned size, int gently)
+static int get_packet_data(int fd, char **src_buf, size_t *src_size,
+  void *dst, unsigned size, int gently)
 {
-   ssize_t ret = read_in_full(fd, buffer, size);
-   if (ret  0)
-   die_errno(read error);
-   else if (ret  size) {
+   ssize_t ret;
+
+   /* Read up to size bytes from our source, whatever it is. */
+   if (src_buf) {
+   ret = size  *src_size ? size : *src_size;
+   memcpy(dst, *src_buf, ret);
+   *src_buf += size;
+   *src_size -= size;
+   }
+   else {
+   ret = read_in_full(fd, dst, size);
+   if (ret  0)
+   die_errno(read error);
+   }
+
+   /* And complain if we didn't get enough bytes to satisfy the read. */
+   if (ret  size) {
if (gently)
return -1;
 
@@ -143,12 +157,13 @@ static int packet_read_internal(int fd, char *buffer, 
unsigned size, int gently)
return len;
 }
 
-static int packet_read_internal(int fd, char *buffer, unsigned size, int 
gently)
+static int packet_read_internal(int fd, char **src_buf, size_t *src_len,
+   char *buffer, unsigned size, int gently)
 {
int len, ret;
char linelen[4];
 
-   ret = safe_read(fd, linelen, 4, gently);
+   ret = get_packet_data(fd, src_buf, src_len, linelen, 4, gently);
if (ret  0)
return ret;
len = packet_length(linelen);
@@ -162,7 +177,7 @@ static int packet_read_internal(int fd, char *buffer, 
unsigned size, int gently)
if (len = size)
die(protocol error: line too large: (expected %u, got %d),
size, len);
-   ret = safe_read(fd, buffer, len, gently);
+   ret = get_packet_data(fd, src_buf, src_len, buffer, len, gently);
if (ret  0)
return ret;
buffer[len] = 0;
@@ -172,40 

Re: [PATCHv2 06/10] pkt-line: share buffer/descriptor reading implementation

2013-02-18 Thread Jonathan Nieder
Jeff King wrote:

 The packet_read function reads from a descriptor.

Ah, so this introduces a new analagous helper that reads from
a strbuf, to avoid the copy-from-async-procedure hack?

[...]
 --- a/pkt-line.c
 +++ b/pkt-line.c
 @@ -103,12 +103,26 @@ static int safe_read(int fd, void *buffer, unsigned 
 size, int gently)
   strbuf_add(buf, buffer, n);
  }
  
 -static int safe_read(int fd, void *buffer, unsigned size, int gently)
 +static int get_packet_data(int fd, char **src_buf, size_t *src_size,
 +void *dst, unsigned size, int gently)
  {
 - ssize_t ret = read_in_full(fd, buffer, size);
 - if (ret  0)
 - die_errno(read error);
 - else if (ret  size) {
 + ssize_t ret;
 +
 + /* Read up to size bytes from our source, whatever it is. */
 + if (src_buf) {
 + ret = size  *src_size ? size : *src_size;
 + memcpy(dst, *src_buf, ret);
 + *src_buf += size;
 + *src_size -= size;
 + }
 + else {

Style: git cuddles its elses.

assert(src_buf ? fd  0 : fd = 0);

if (src_buf) {
...
} else {
...
}

 + ret = read_in_full(fd, dst, size);
 + if (ret  0)
 + die_errno(read error);

This is noisy about upstream pipe gone missing, which makes sense
since this is transport-related.  Maybe that deserves a comment.

[...]
 --- a/remote-curl.c
 +++ b/remote-curl.c
 @@ -138,28 +138,29 @@ static struct discovery* discover_refs(const char 
 *service)
   if (maybe_smart 
   (5 = last-len  last-buf[4] == '#') 
   !strbuf_cmp(exp, type)) {
 + char line[1000];
 + int len;
 +
   /*
* smart HTTP response; validate that the service
* pkt-line matches our request.
*/
 - if (packet_get_line(buffer, last-buf, last-len) = 0)
 - die(%s has invalid packet header, refs_url);
 - if (buffer.len  buffer.buf[buffer.len - 1] == '\n')
 - strbuf_setlen(buffer, buffer.len - 1);
 + len = packet_read_from_buf(line, sizeof(line), last-buf, 
 last-len);
 + if (len  line[len - 1] == '\n')
 + len--;

Was anything guaranteeing that buffer.len  1000 before this change?

The rest looks good from a quick glance.

Jonathan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 06/10] pkt-line: share buffer/descriptor reading implementation

2013-02-18 Thread Jeff King
On Mon, Feb 18, 2013 at 02:43:50AM -0800, Jonathan Nieder wrote:

 Jeff King wrote:
 
  The packet_read function reads from a descriptor.
 
 Ah, so this introduces a new analagous helper that reads from
 a strbuf, to avoid the copy-from-async-procedure hack?

Not from a strbuf, but basically, yes.

  +   ret = read_in_full(fd, dst, size);
  +   if (ret  0)
  +   die_errno(read error);
 
 This is noisy about upstream pipe gone missing, which makes sense
 since this is transport-related.  Maybe that deserves a comment.

That is not new code; it is just reindented from the original safe_read.
But is it noisy about a missing pipe? We do not get EPIPE for reading.
We should just get a short read or EOF, both of which is handled later.

  +   len = packet_read_from_buf(line, sizeof(line), last-buf, 
  last-len);
  +   if (len  line[len - 1] == '\n')
  +   len--;
 
 Was anything guaranteeing that buffer.len  1000 before this change?

No. That's discussed in point (3) of the implications in the commit
message.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2 06/10] pkt-line: share buffer/descriptor reading implementation

2013-02-18 Thread Jonathan Nieder
Jeff King wrote:

 But is it noisy about a missing pipe? We do not get EPIPE for reading.

Ah, that makes more sense.

[...]
 +   len = packet_read_from_buf(line, sizeof(line), last-buf, 
 last-len);
 +   if (len  line[len - 1] == '\n')
 +   len--;

 Was anything guaranteeing that buffer.len  1000 before this change?

 No. That's discussed in point (3) of the implications in the commit
 message.

Thanks.  Sorry I missed it the first time.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html