Re: [PATCH v2 00/21] object_id part 7
On Tue, Mar 28, 2017 at 12:40:29PM -0700, Junio C Hamano wrote:

> > Here's that minor tweak, in case anybody is interested. It's less useful
> > without that follow-on that touches "eol" more, but perhaps it increases
> > readability on its own.
>
> Yup, the only thing that the original (with Brian's fix) appears to
> be more careful about is it tries very hard to avoid setting boc
> past eoc. As we are not checking "boc != eoc" but doing the
> comparison, that "careful" appearance does not give us any benefit
> in practice, other than having to do an extra "eol ? eol+1 : eoc";
> the result of this patch is easier to read.
>
> By the way, eoc is "one past the end" of the array that begins at
> boc, so setting a pointer to eoc+1 may technically be in violation.
> I do not know how much it matters, though ;-)

I think that is OK. We are reading a strbuf, so eoc must either be the
first character of the PGP signature, or the terminating NUL if there
was no signature block[1]. So it's actually _inside_ the array, and
eoc+1 is our "one past".

-Peff

[1] Arguably we should bail when parse_signature() does not find a PGP
    signature at all. We already bail with "malformed push certificate"
    when there are other syntactic anomalies.
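The pointer argument above can be checked with a small sketch. This is not
code from git; eoc_plus_one_is_valid() and its parameters are hypothetical,
and it assumes a buffer of len bytes followed by a terminating NUL (as a
strbuf guarantees), with eoc pointing somewhere in [buf, buf + len].

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * buf holds `len` bytes of content plus a terminating NUL, so the
 * underlying array has len + 1 elements. `sig_offset` plays the role of
 * parse_signature()'s return value: the offset of the signature, or
 * `len` when there is none (i.e. eoc points at the NUL).
 *
 * Returns 1 when eoc + 1 is no further than one past the end of the array.
 */
static int eoc_plus_one_is_valid(const char *buf, size_t len, size_t sig_offset)
{
	const char *eoc = buf + sig_offset;   /* inside the array, never past it */
	const char *one_past = buf + len + 1; /* one past the terminating NUL */

	assert(sig_offset <= len);
	return eoc + 1 <= one_past;
}
```

Under the stated precondition the result is always 1, which is the point:
because eoc never exceeds buf + len, eoc + 1 never exceeds the one-past
position.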
Re: [PATCH v2 00/21] object_id part 7
Jeff King writes:

> Here's that minor tweak, in case anybody is interested. It's less useful
> without that follow-on that touches "eol" more, but perhaps it increases
> readability on its own.

Yup, the only thing that the original (with Brian's fix) appears to
be more careful about is it tries very hard to avoid setting boc
past eoc. As we are not checking "boc != eoc" but doing the
comparison, that "careful" appearance does not give us any benefit
in practice, other than having to do an extra "eol ? eol+1 : eoc";
the result of this patch is easier to read.

By the way, eoc is "one past the end" of the array that begins at
boc, so setting a pointer to eoc+1 may technically be in violation.
I do not know how much it matters, though ;-)

> -- >8 --
> Subject: [PATCH] receive-pack: simplify eol handling in cert parsing
>
> The queue_commands_from_cert() function wants to handle each
> line of the cert individually. It looks for "\n" in the
> to-be-parsed bytes, and special-cases each use of eol (the
> end-of-line variable) when we didn't find one. Instead, we
> can just set the end-of-line variable to end-of-cert in the
> latter case.
>
> For advancing to the next line, it's OK for us to move our
> pointer past end-of-cert, because our loop condition just
> checks for pointer inequality. And it doesn't even violate
> the ANSI C "no more than one past the end of an array" rule,
> because we know in the worst case we've hit the terminating
> NUL of the strbuf.
>
> Signed-off-by: Jeff King
> ---
>  builtin/receive-pack.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index 5d9e4da0a..58de2a1a9 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -1524,8 +1524,10 @@ static void queue_commands_from_cert(struct command **tail,
>
>  	while (boc < eoc) {
>  		const char *eol = memchr(boc, '\n', eoc - boc);
> -		tail = queue_command(tail, boc, eol ? eol - boc : eoc - boc);
> -		boc = eol ? eol + 1 : eoc;
> +		if (!eol)
> +			eol = eoc;
> +		tail = queue_command(tail, boc, eol - boc);
> +		boc = eol + 1;
>  	}
>  }
Re: [PATCH v2 00/21] object_id part 7
On Tue, Mar 28, 2017 at 01:35:36PM -0400, Jeff King wrote:

> I thought I'd knock this out quickly before I forgot about it. But it
> actually isn't so simple.
>
> The main caller in read_head_info() does indeed just pass strlen(line)
> as the length in each case. But the cert parser really does need us to
> respect the line length. So we either have to pass it in, or tie off the
> string.
>
> The latter looks something like the patch below (on top of a minor
> tweak around "eol" handling). It's sufficiently ugly that it may not
> count as an actual cleanup, though. I'm OK if we just drop the idea.

Here's that minor tweak, in case anybody is interested. It's less useful
without that follow-on that touches "eol" more, but perhaps it increases
readability on its own.

-- >8 --
Subject: [PATCH] receive-pack: simplify eol handling in cert parsing

The queue_commands_from_cert() function wants to handle each
line of the cert individually. It looks for "\n" in the
to-be-parsed bytes, and special-cases each use of eol (the
end-of-line variable) when we didn't find one. Instead, we
can just set the end-of-line variable to end-of-cert in the
latter case.

For advancing to the next line, it's OK for us to move our
pointer past end-of-cert, because our loop condition just
checks for pointer inequality. And it doesn't even violate
the ANSI C "no more than one past the end of an array" rule,
because we know in the worst case we've hit the terminating
NUL of the strbuf.

Signed-off-by: Jeff King
---
 builtin/receive-pack.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 5d9e4da0a..58de2a1a9 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1524,8 +1524,10 @@ static void queue_commands_from_cert(struct command **tail,

 	while (boc < eoc) {
 		const char *eol = memchr(boc, '\n', eoc - boc);
-		tail = queue_command(tail, boc, eol ? eol - boc : eoc - boc);
-		boc = eol ? eol + 1 : eoc;
+		if (!eol)
+			eol = eoc;
+		tail = queue_command(tail, boc, eol - boc);
+		boc = eol + 1;
 	}
 }
-- 
2.12.2.845.g55fcf8b10
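For readers who want to try the simplified loop outside of git, here is a
minimal standalone sketch. count_lines() is a hypothetical stand-in for
queue_commands_from_cert(); instead of queueing commands it counts lines and
records the length of the last one. It assumes, as in the patch, that *eoc is
a readable byte inside the buffer (for example the NUL of a C string).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical illustration of the loop shape from the patch above.
 * Walks the lines in [boc, eoc), stores the length of the most recent
 * line in *last_len, and returns the number of lines seen.
 */
static size_t count_lines(const char *boc, const char *eoc, size_t *last_len)
{
	size_t n = 0;

	assert(boc <= eoc);
	while (boc < eoc) {
		const char *eol = memchr(boc, '\n', eoc - boc);
		if (!eol)
			eol = eoc;      /* last line has no trailing newline */
		*last_len = eol - boc;
		n++;
		boc = eol + 1;          /* may reach eoc + 1; '<' still ends the loop */
	}
	return n;
}
```

With the input "one\ntwo\nthree" the final iteration leaves boc at eoc + 1,
which is exactly one past the string's NUL; the `boc < eoc` test then stops
the loop where a `boc != eoc` test would not.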
Re: [PATCH v2 00/21] object_id part 7
On Tue, Mar 28, 2017 at 11:13:15AM +0000, brian m. carlson wrote:

> > I suggested an additional cleanup around "linelen" in one patch. In the
> > name of keeping the number of re-rolls sane, I'm OK if we skip that for
> > now (the only reason I mentioned it at all is that you have to justify
> > the caveat in the commit message; with the fix, that justification can
> > go away).
>
> Let's leave it as it is, assuming Junio's okay with it. I can send in a
> few more patches to clean that up and use skip_prefix that we can drop
> on top and graduate separately.
>
> I think the justification is useful as it is, since it explains why we
> no longer want to check that particular value for historical reasons.

I thought I'd knock this out quickly before I forgot about it. But it
actually isn't so simple.

The main caller in read_head_info() does indeed just pass strlen(line)
as the length in each case. But the cert parser really does need us to
respect the line length. So we either have to pass it in, or tie off the
string.

The latter looks something like the patch below (on top of a minor
tweak around "eol" handling). It's sufficiently ugly that it may not
count as an actual cleanup, though. I'm OK if we just drop the idea.

---
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 58de2a1a9..561a982e7 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1483,13 +1483,10 @@ static void execute_commands(struct command *commands,
 }

 static struct command **queue_command(struct command **tail,
-				      const char *line,
-				      int linelen)
+				      const char *line)
 {
 	struct object_id old_oid, new_oid;
 	struct command *cmd;
-	const char *refname;
-	int reflen;
 	const char *p;

 	if (parse_oid_hex(line, &old_oid, &p) ||
@@ -1498,9 +1495,7 @@ static struct command **queue_command(struct command **tail,
 	    *p++ != ' ')
 		die("protocol error: expected old/new/ref, got '%s'", line);

-	refname = p;
-	reflen = linelen - (p - line);
-	FLEX_ALLOC_MEM(cmd, ref_name, refname, reflen);
+	FLEX_ALLOC_STR(cmd, ref_name, p);
 	oidcpy(&cmd->old_oid, &old_oid);
 	oidcpy(&cmd->new_oid, &new_oid);
 	*tail = cmd;
@@ -1510,7 +1505,7 @@ static struct command **queue_command(struct command **tail,
 static void queue_commands_from_cert(struct command **tail,
 				     struct strbuf *push_cert)
 {
-	const char *boc, *eoc;
+	char *boc, *eoc;

 	if (*tail)
 		die("protocol error: got both push certificate and unsigned commands");
@@ -1523,10 +1518,17 @@ static void queue_commands_from_cert(struct command **tail,
 	eoc = push_cert->buf + parse_signature(push_cert->buf, push_cert->len);

 	while (boc < eoc) {
-		const char *eol = memchr(boc, '\n', eoc - boc);
+		char *eol = memchr(boc, '\n', eoc - boc);
+		char tmp;
+
 		if (!eol)
 			eol = eoc;
-		tail = queue_command(tail, boc, eol - boc);
+
+		tmp = *eol;
+		*eol = '\0';
+		tail = queue_command(tail, boc);
+		*eol = tmp;
+
 		boc = eol + 1;
 	}
 }
@@ -1590,7 +1592,7 @@ static struct command *read_head_info(struct oid_array *shallow)
 			continue;
 		}

-		p = queue_command(p, line, linelen);
+		p = queue_command(p, line);
 	}

 	if (push_cert.len)
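The "tie off the string" option can be tried in isolation with the sketch
below. None of these names come from git: line_fn, for_each_line_tied_off()
and add_length() are hypothetical, and the sketch assumes the buffer is
writable and that eoc points at a byte inside it (such as the terminating
NUL).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives one NUL-terminated line. */
typedef void (*line_fn)(const char *line, void *data);

/*
 * Sketch of the tie-off approach: temporarily overwrite the byte that
 * ends each line with NUL so the callback can treat the line as a plain
 * C string, then put the original byte back before moving on.
 */
static void for_each_line_tied_off(char *boc, char *eoc, line_fn fn, void *data)
{
	assert(boc <= eoc);
	while (boc < eoc) {
		char *eol = memchr(boc, '\n', eoc - boc);
		char tmp;

		if (!eol)
			eol = eoc;

		tmp = *eol;     /* save the newline (or whatever sits at eoc) */
		*eol = '\0';
		fn(boc, data);
		*eol = tmp;     /* restore, leaving the buffer unchanged */

		boc = eol + 1;
	}
}

/* Example callback: add strlen() of each line to a running total. */
static void add_length(const char *line, void *data)
{
	*(size_t *)data += strlen(line);
}
```

The cost of this design is visible in the signature: the buffer can no longer
be const, which is part of why the patch above is described as ugly.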
Re: [PATCH v2 00/21] object_id part 7
Jeff King writes:

> On Sun, Mar 26, 2017 at 04:01:22PM +0000, brian m. carlson wrote:
>
>> This is part 7 in the continuing transition to use struct object_id.
>>
>> This series focuses on two main areas: adding two constants for the
>> maximum hash size we'll be using (which will be suitable for allocating
>> memory) and converting struct sha1_array to struct oid_array.
>
> Both changes are very welcome. I do think it's probably worth changing
> the name of sha1-array.[ch], but it doesn't need to happen immediately.
>
> I read through the whole series and didn't find anything objectionable.
> The pointer-arithmetic fix should perhaps graduate separately.

I didn't see anything incorrect when I queued the series, either,
and after I re-read it I saw a few minor readability issues, but
modulo that this looks ready.

I did split the push-cert parsing fix and applied to an older base
independently, though.

> I suggested an additional cleanup around "linelen" in one patch. In the
> name of keeping the number of re-rolls sane, I'm OK if we skip that for
> now (the only reason I mentioned it at all is that you have to justify
> the caveat in the commit message; with the fix, that justification can
> go away).

A follow-up after the dust settles could also mention "we earlier
mentioned this caveat but with this fix we no longer have to worry
about it", no?

Thanks both, anyways.
Re: [PATCH v2 00/21] object_id part 7
On Tue, Mar 28, 2017 at 03:31:59AM -0400, Jeff King wrote:

> I read through the whole series and didn't find anything objectionable.
> The pointer-arithmetic fix should perhaps graduate separately.

Junio's welcome to take that patch separately if he likes.

> I suggested an additional cleanup around "linelen" in one patch. In the
> name of keeping the number of re-rolls sane, I'm OK if we skip that for
> now (the only reason I mentioned it at all is that you have to justify
> the caveat in the commit message; with the fix, that justification can
> go away).

Let's leave it as it is, assuming Junio's okay with it. I can send in a
few more patches to clean that up and use skip_prefix that we can drop
on top and graduate separately.

I think the justification is useful as it is, since it explains why we
no longer want to check that particular value for historical reasons.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204
Re: [PATCH v2 00/21] object_id part 7
On Sun, Mar 26, 2017 at 04:01:22PM +0000, brian m. carlson wrote:

> This is part 7 in the continuing transition to use struct object_id.
>
> This series focuses on two main areas: adding two constants for the
> maximum hash size we'll be using (which will be suitable for allocating
> memory) and converting struct sha1_array to struct oid_array.

Both changes are very welcome. I do think it's probably worth changing
the name of sha1-array.[ch], but it doesn't need to happen immediately.

I read through the whole series and didn't find anything objectionable.
The pointer-arithmetic fix should perhaps graduate separately.

I suggested an additional cleanup around "linelen" in one patch. In the
name of keeping the number of re-rolls sane, I'm OK if we skip that for
now (the only reason I mentioned it at all is that you have to justify
the caveat in the commit message; with the fix, that justification can
go away).

-Peff
[PATCH v2 00/21] object_id part 7
This is part 7 in the continuing transition to use struct object_id.

This series focuses on two main areas: adding two constants for the
maximum hash size we'll be using (which will be suitable for allocating
memory) and converting struct sha1_array to struct oid_array.

The rationale for adding separate constants for allocating memory is
that with a new 256-bit hash function, we're going to need two different
items: a constant for allocating memory that's as large as the largest
hash, and a global variable telling us what size the current hash is.
I've opted to provide GIT_MAX_RAWSZ and GIT_MAX_HEXSZ for allocating
memory, and leave GIT_SHA1_RAWSZ and GIT_SHA1_HEXSZ as values that can
be later replaced by the aforementioned global.

Replacing struct sha1_array with struct oid_array necessarily involves
converting the shallow code, so I did that. The structure now handles
objects of struct object_id. While I renamed the documentation (since
people will search for that), I chose not to rename the sha1-array.[ch]
files or the test helper because I didn't think it was worth the hassle,
especially for people who don't have rename support turned on by
default.

There is also a patch for fixing some broken pointer arithmetic that was
discovered during review of v1. I don't think it's exploitable, but it
seems good to fix anyway. Additional eyes on this patch are welcomed.

I chose to use Coccinelle quite a bit in this series, as it automates a
lot of the manual work and aids in review. There is also some use of
Perl one-liners.

This series is available at https://github.com/bk2204/git under
object-id-part7; it may be rebased.

Changes from v1:
* Rebase on current master (no changes).
* Remove check for empty line in queue_command.
* Add patch 6 to fix invalid pointer arithmetic.

brian m. carlson (21):
  Define new hash-size constants for allocating memory
  Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ
  Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ
  builtin/diff: convert to struct object_id
  builtin/pull: convert portions to struct object_id
  builtin/receive-pack: fix incorrect pointer arithmetic
  builtin/receive-pack: convert portions to struct object_id
  fsck: convert init_skiplist to struct object_id
  parse-options-cb: convert sha1_array_append caller to struct object_id
  test-sha1-array: convert most code to struct object_id
  sha1_name: convert struct disambiguate_state to object_id
  sha1_name: convert disambiguate_hint_fn to take object_id
  submodule: convert check_for_new_submodule_commits to object_id
  builtin/pull: convert to struct object_id
  sha1-array: convert internal storage for struct sha1_array to object_id
  Make sha1_array_append take a struct object_id *
  Convert remaining callers of sha1_array_lookup to object_id
  Convert sha1_array_lookup to take struct object_id
  Convert sha1_array_for_each_unique and for_each_abbrev to object_id
  Rename sha1_array to oid_array
  Documentation: update and rename api-sha1-array.txt

 .../{api-sha1-array.txt => api-oid-array.txt} |  44 +++
 bisect.c                                      |  43 ---
 builtin/blame.c                               |   4 +-
 builtin/cat-file.c                            |  14 +--
 builtin/diff.c                                |  40 +++---
 builtin/fetch-pack.c                          |   2 +-
 builtin/fetch.c                               |   6 +-
 builtin/merge-index.c                         |   2 +-
 builtin/merge.c                               |   2 +-
 builtin/pack-objects.c                        |  24 ++--
 builtin/patch-id.c                            |   2 +-
 builtin/pull.c                                |  98 +++
 builtin/receive-pack.c                        | 136 ++---
 builtin/rev-list.c                            |   2 +-
 builtin/rev-parse.c                           |   4 +-
 builtin/send-pack.c                           |   4 +-
 cache.h                                       |  10 +-
 combine-diff.c                                |  18 +--
 commit.h                                      |  14 +--
 connect.c                                     |   8 +-
 diff.c                                        |   4 +-
 diff.h                                        |   4 +-
 fetch-pack.c                                  |  32 ++---
 fetch-pack.h                                  |   4 +-
 fsck.c                                        |  17 +--
 fsck.h                                        |   2 +-
 hex.c                                         |   2 +-
 parse-options-cb.c                            |   8 +-
 patch-ids.c                                   |   2 +-
 patch-ids.h                                   |   2 +-
 ref-filter.c                                  |  22 ++--
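
The split the cover letter describes, between a compile-time constant sized
for the largest hash and a run-time value for the hash in use, might look
roughly like the sketch below. Everything here is hypothetical: the EX_ names,
the value 32 (chosen on the assumption that a 256-bit hash is the largest),
and the ex_current_rawsz variable are illustrative and are not git's actual
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names and values, for illustration only. */
#define EX_SHA1_RAWSZ 20
#define EX_MAX_RAWSZ  32                 /* assumes a 256-bit hash is the largest */
#define EX_MAX_HEXSZ  (2 * EX_MAX_RAWSZ)

struct ex_object_id {
	unsigned char hash[EX_MAX_RAWSZ];    /* storage sized for the largest hash */
};

/* Hypothetical run-time value: raw size of the hash currently in use. */
static size_t ex_current_rawsz = EX_SHA1_RAWSZ;

/* Copy only as many bytes as the current hash uses; zero the rest. */
static void ex_oid_set(struct ex_object_id *oid, const unsigned char *raw)
{
	assert(ex_current_rawsz <= EX_MAX_RAWSZ);
	memset(oid->hash, 0, sizeof(oid->hash));
	memcpy(oid->hash, raw, ex_current_rawsz);
}
```

Allocation sites use the MAX constant so they never need to change, while
code that reads or compares hashes consults the run-time size.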