Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread Jeff King
On Tue, Mar 28, 2017 at 12:40:29PM -0700, Junio C Hamano wrote:

> > Here's that minor tweak, in case anybody is interested. It's less useful
> > without that follow-on that touches "eol" more, but perhaps it increases
> > readability on its own.
> 
> Yup, the only thing that the original (with Brian's fix) appears to
> be more careful about is it tries very hard to avoid setting boc
> past eoc.  As we are not checking "boc != eoc" but doing the
> comparison, that "careful" appearance does not give us any benefit
> in practice, other than having to do an extra "eol ? eol+1 : eoc";
> the result of this patch is easier to read.
> 
> By the way, eoc is "one past the end" of the array that begins at
> boc, so setting a pointer to eoc+1 may technically be in violation.
> I do not know how much it matters, though ;-)

I think that is OK. We are reading a strbuf, so eoc must either be the
first character of the PGP signature, or the terminating NUL if there
was no signature block[1]. So it's actually _inside_ the array, and
eoc+1 is our "one past".

-Peff

[1] Arguably we should bail when parse_signature() does not find a PGP
signature at all. We already bail with "malformed push certificate"
when there are other syntactic anomalies.
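
To make the "one past" argument concrete, here is a small sketch. It is not code from git: bytes_available_after() is a hypothetical helper, and it assumes a strbuf-style buffer that holds len content bytes followed by a terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from git). For a buffer with `len` content
 * bytes plus a terminating NUL, the array holds len + 1 bytes. That
 * makes buf + len the NUL itself (inside the array) and buf + len + 1
 * the "one past the end" pointer, which may be formed but not read.
 */
static size_t bytes_available_after(const char *buf, size_t len, const char *end)
{
	const char *one_past = buf + len + 1;

	assert(buf[len] == '\0');               /* strbuf-style invariant */
	assert(end >= buf && end <= buf + len); /* end lies inside the array */
	return (size_t)(one_past - end);
}
```

When end points at the terminating NUL the result is 1, so end + 1 is exactly the one-past pointer and no further.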


Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread Junio C Hamano
Jeff King  writes:

> Here's that minor tweak, in case anybody is interested. It's less useful
> without that follow-on that touches "eol" more, but perhaps it increases
> readability on its own.

Yup, the only thing that the original (with Brian's fix) appears to
be more careful about is it tries very hard to avoid setting boc
past eoc.  As we are not checking "boc != eoc" but doing the
comparison, that "careful" appearance does not give us any benefit
in practice, other than having to do an extra "eol ? eol+1 : eoc";
the result of this patch is easier to read.

By the way, eoc is "one past the end" of the array that begins at
boc, so setting a pointer to eoc+1 may technically be in violation.
I do not know how much it matters, though ;-)

> -- >8 --
> Subject: [PATCH] receive-pack: simplify eol handling in cert parsing
>
> The queue_commands_from_cert() function wants to handle each
> line of the cert individually. It looks for "\n" in the
> to-be-parsed bytes, and special-cases each use of eol (the
> end-of-line variable) when we didn't find one.  Instead, we
> can just set the end-of-line variable to end-of-cert in the
> latter case.
>
> For advancing to the next line, it's OK for us to move our
> pointer past end-of-cert, because our loop condition just
> checks for pointer inequality. And it doesn't even violate
> the ANSI C "no more than one past the end of an array" rule,
> because we know in the worst case we've hit the terminating
> NUL of the strbuf.
>
> Signed-off-by: Jeff King 
> ---
>  builtin/receive-pack.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index 5d9e4da0a..58de2a1a9 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -1524,8 +1524,10 @@ static void queue_commands_from_cert(struct command **tail,
>  
>  	while (boc < eoc) {
>  		const char *eol = memchr(boc, '\n', eoc - boc);
> -		tail = queue_command(tail, boc, eol ? eol - boc : eoc - boc);
> -		boc = eol ? eol + 1 : eoc;
> +		if (!eol)
> +			eol = eoc;
> +		tail = queue_command(tail, boc, eol - boc);
> +		boc = eol + 1;
>  	}
>  }


Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread Jeff King
On Tue, Mar 28, 2017 at 01:35:36PM -0400, Jeff King wrote:

> I thought I'd knock this out quickly before I forgot about it. But it
> actually isn't so simple.
> 
> The main caller in read_head_info() does indeed just pass strlen(line)
> as the length in each case. But the cert parser really does need us to
> respect the line length. So we either have to pass it in, or tie off the
> string.
> 
> The latter looks something like the patch below (on top of a minor
> tweak around "eol" handling). It's sufficiently ugly that it may not
> count as an actual cleanup, though. I'm OK if we just drop the idea.

Here's that minor tweak, in case anybody is interested. It's less useful
without that follow-on that touches "eol" more, but perhaps it increases
readability on its own.

-- >8 --
Subject: [PATCH] receive-pack: simplify eol handling in cert parsing

The queue_commands_from_cert() function wants to handle each
line of the cert individually. It looks for "\n" in the
to-be-parsed bytes, and special-cases each use of eol (the
end-of-line variable) when we didn't find one.  Instead, we
can just set the end-of-line variable to end-of-cert in the
latter case.

For advancing to the next line, it's OK for us to move our
pointer past end-of-cert, because our loop condition just
checks for pointer inequality. And it doesn't even violate
the ANSI C "no more than one past the end of an array" rule,
because we know in the worst case we've hit the terminating
NUL of the strbuf.

Signed-off-by: Jeff King 
---
 builtin/receive-pack.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 5d9e4da0a..58de2a1a9 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1524,8 +1524,10 @@ static void queue_commands_from_cert(struct command **tail,
 
 	while (boc < eoc) {
 		const char *eol = memchr(boc, '\n', eoc - boc);
-		tail = queue_command(tail, boc, eol ? eol - boc : eoc - boc);
-		boc = eol ? eol + 1 : eoc;
+		if (!eol)
+			eol = eoc;
+		tail = queue_command(tail, boc, eol - boc);
+		boc = eol + 1;
 	}
 }
 
-- 
2.12.2.845.g55fcf8b10
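
For anyone who wants to try the loop outside of receive-pack, a standalone sketch follows. It mirrors the shape of the patched loop, but for_each_line(), line_fn and add_len() are hypothetical names, with the callback standing in for queue_command(). It assumes the byte at eoc is part of the same array (for a strbuf, the terminating NUL at worst).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type standing in for queue_command(). */
typedef void (*line_fn)(const char *line, size_t len, void *data);

/*
 * Walk [boc, eoc) one line at a time and return the number of lines.
 * The caller must guarantee that the byte at eoc belongs to the same
 * array, so that eoc + 1 is still a pointer we are allowed to form.
 */
static size_t for_each_line(const char *boc, const char *eoc,
			    line_fn fn, void *data)
{
	size_t count = 0;

	assert(boc <= eoc);
	while (boc < eoc) {
		const char *eol = memchr(boc, '\n', eoc - boc);
		if (!eol)
			eol = eoc;	/* final line without a newline */
		fn(boc, eol - boc, data);
		boc = eol + 1;		/* may be eoc + 1; the loop test still fails */
		count++;
	}
	return count;
}

/* Example callback: add up the line lengths (newlines excluded). */
static void add_len(const char *line, size_t len, void *data)
{
	(void)line;
	*(size_t *)data += len;
}
```

Because the loop condition is "boc < eoc" rather than "boc != eoc", stepping to eoc + 1 after an unterminated last line ends the loop just as stepping to eoc does.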



Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread Jeff King
On Tue, Mar 28, 2017 at 11:13:15AM +, brian m. carlson wrote:

> > I suggested an additional cleanup around "linelen" in one patch. In the
> > name of keeping the number of re-rolls sane, I'm OK if we skip that for
> > now (the only reason I mentioned it at all is that you have to justify
> > the caveat in the commit message; with the fix, that justification can
> > go away).
> 
> Let's leave it as it is, assuming Junio's okay with it.  I can send in a
> few more patches to clean that up and use skip_prefix that we can drop
> on top and graduate separately.
> 
> I think the justification is useful as it is, since it explains why we
> no longer want to check that particular value for historical reasons.

I thought I'd knock this out quickly before I forgot about it. But it
actually isn't so simple.

The main caller in read_head_info() does indeed just pass strlen(line)
as the length in each case. But the cert parser really does need us to
respect the line length. So we either have to pass it in, or tie off the
string.

The latter looks something like the patch below (on top of a minor
tweak around "eol" handling). It's sufficiently ugly that it may not
count as an actual cleanup, though. I'm OK if we just drop the idea.

---
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 58de2a1a9..561a982e7 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1483,13 +1483,10 @@ static void execute_commands(struct command *commands,
 }
 
 static struct command **queue_command(struct command **tail,
-					  const char *line,
-					  int linelen)
+					  const char *line)
 {
 	struct object_id old_oid, new_oid;
 	struct command *cmd;
-	const char *refname;
-	int reflen;
 	const char *p;
 
 	if (parse_oid_hex(line, &old_oid, &p) ||
@@ -1498,9 +1495,7 @@ static struct command **queue_command(struct command **tail,
 	    *p++ != ' ')
 		die("protocol error: expected old/new/ref, got '%s'", line);
 
-	refname = p;
-	reflen = linelen - (p - line);
-	FLEX_ALLOC_MEM(cmd, ref_name, refname, reflen);
+	FLEX_ALLOC_STR(cmd, ref_name, p);
 	oidcpy(&cmd->old_oid, &old_oid);
 	oidcpy(&cmd->new_oid, &new_oid);
 	*tail = cmd;
@@ -1510,7 +1505,7 @@ static struct command **queue_command(struct command **tail,
 static void queue_commands_from_cert(struct command **tail,
 				     struct strbuf *push_cert)
 {
-	const char *boc, *eoc;
+	char *boc, *eoc;
 
 	if (*tail)
 		die("protocol error: got both push certificate and unsigned commands");
@@ -1523,10 +1518,17 @@ static void queue_commands_from_cert(struct command **tail,
 	eoc = push_cert->buf + parse_signature(push_cert->buf, push_cert->len);
 
 	while (boc < eoc) {
-		const char *eol = memchr(boc, '\n', eoc - boc);
+		char *eol = memchr(boc, '\n', eoc - boc);
+		char tmp;
+
 		if (!eol)
 			eol = eoc;
-		tail = queue_command(tail, boc, eol - boc);
+
+		tmp = *eol;
+		*eol = '\0';
+		tail = queue_command(tail, boc);
+		*eol = tmp;
+
 		boc = eol + 1;
 	}
 }
@@ -1590,7 +1592,7 @@ static struct command *read_head_info(struct oid_array *shallow)
 			continue;
 		}
 
-		p = queue_command(p, line, linelen);
+		p = queue_command(p, line);
 	}
 
 	if (push_cert.len)
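
The "tie off the string" approach can be tried in isolation with the sketch below. It is an illustration only: for_each_line_tied_off(), str_fn and longest() are hypothetical names, and the callback stands in for a queue_command() that takes a NUL-terminated string. It assumes the buffer is writable and that the byte at eoc belongs to it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical consumer that only accepts NUL-terminated strings. */
typedef void (*str_fn)(const char *str, void *data);

/*
 * Walk [boc, eoc) line by line. Each line is temporarily terminated
 * with a NUL so the callback can treat it as a C string, and the
 * overwritten byte is put back afterwards.
 */
static size_t for_each_line_tied_off(char *boc, char *eoc,
				     str_fn fn, void *data)
{
	size_t count = 0;

	assert(boc <= eoc);
	while (boc < eoc) {
		char *eol = memchr(boc, '\n', eoc - boc);
		char saved;

		if (!eol)
			eol = eoc;

		saved = *eol;	/* needs the byte at eoc to be writable */
		*eol = '\0';
		fn(boc, data);
		*eol = saved;

		boc = eol + 1;
		count++;
	}
	return count;
}

/* Example callback: remember the length of the longest line seen. */
static void longest(const char *str, void *data)
{
	size_t *max = data;
	size_t len = strlen(str);

	if (len > *max)
		*max = len;
}
```

The cost is visible in the signature: the pointers lose their const, and the buffer is briefly modified while the callback runs, which is the ugliness mentioned above.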


Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread Junio C Hamano
Jeff King  writes:

> On Sun, Mar 26, 2017 at 04:01:22PM +, brian m. carlson wrote:
>
>> This is part 7 in the continuing transition to use struct object_id.
>> 
>> This series focuses on two main areas: adding two constants for the
>> maximum hash size we'll be using (which will be suitable for allocating
>> memory) and converting struct sha1_array to struct oid_array.
>
> Both changes are very welcome. I do think it's probably worth changing
> the name of sha1-array.[ch], but it doesn't need to happen immediately.
>
> I read through the whole series and didn't find anything objectionable.
> The pointer-arithmetic fix should perhaps graduate separately.

I didn't see anything incorrect when I queued the series, either,
and after I re-read it I saw a few minor readability issues, but
modulo that this looks ready.  I did split the push-cert parsing fix
and applied to an older base independently, though.

> I suggested an additional cleanup around "linelen" in one patch. In the
> name of keeping the number of re-rolls sane, I'm OK if we skip that for
> now (the only reason I mentioned it at all is that you have to justify
> the caveat in the commit message; with the fix, that justification can
> go away).

A follow-up after the dust settles could also mention "we earlier
mentioned this caveat but with this fix we no longer have to worry
about it", no?


Thanks both, anyways.


Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread brian m. carlson
On Tue, Mar 28, 2017 at 03:31:59AM -0400, Jeff King wrote:
> I read through the whole series and didn't find anything objectionable.
> The pointer-arithmetic fix should perhaps graduate separately.

Junio's welcome to take that patch separately if he likes.

> I suggested an additional cleanup around "linelen" in one patch. In the
> name of keeping the number of re-rolls sane, I'm OK if we skip that for
> now (the only reason I mentioned it at all is that you have to justify
> the caveat in the commit message; with the fix, that justification can
> go away).

Let's leave it as it is, assuming Junio's okay with it.  I can send in a
few more patches to clean that up and use skip_prefix that we can drop
on top and graduate separately.

I think the justification is useful as it is, since it explains why we
no longer want to check that particular value for historical reasons.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204




Re: [PATCH v2 00/21] object_id part 7

2017-03-28 Thread Jeff King
On Sun, Mar 26, 2017 at 04:01:22PM +, brian m. carlson wrote:

> This is part 7 in the continuing transition to use struct object_id.
> 
> This series focuses on two main areas: adding two constants for the
> maximum hash size we'll be using (which will be suitable for allocating
> memory) and converting struct sha1_array to struct oid_array.

Both changes are very welcome. I do think it's probably worth changing
the name of sha1-array.[ch], but it doesn't need to happen immediately.

I read through the whole series and didn't find anything objectionable.
The pointer-arithmetic fix should perhaps graduate separately.

I suggested an additional cleanup around "linelen" in one patch. In the
name of keeping the number of re-rolls sane, I'm OK if we skip that for
now (the only reason I mentioned it at all is that you have to justify
the caveat in the commit message; with the fix, that justification can
go away).

-Peff


[PATCH v2 00/21] object_id part 7

2017-03-26 Thread brian m. carlson
This is part 7 in the continuing transition to use struct object_id.

This series focuses on two main areas: adding two constants for the
maximum hash size we'll be using (which will be suitable for allocating
memory) and converting struct sha1_array to struct oid_array.

The rationale for adding separate constants for allocating memory is
that with a new 256-bit hash function, we're going to need two different
items: a constant for allocating memory that's as large as the largest
hash, and a global variable telling us what size the current hash is.  I've
opted to provide GIT_MAX_RAWSZ and GIT_MAX_HEXSZ for allocating memory,
and leave GIT_SHA1_RAWSZ and GIT_SHA1_HEXSZ as values that can be later
replaced by the aforementioned global.
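
As a rough sketch of that split (not code from the series): the EX_* names, struct ex_object_id, ex_current_rawsz and ex_oidcmp() below are all hypothetical, and the 32-byte maximum is only an assumed value for a 256-bit hash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative values only: a 20-byte hash and an assumed 32-byte one. */
#define EX_SHA1_RAWSZ 20
#define EX_MAX_RAWSZ  32
#define EX_MAX_HEXSZ  (2 * EX_MAX_RAWSZ)

/* Storage is always sized for the largest hash we might have to hold. */
struct ex_object_id {
	unsigned char hash[EX_MAX_RAWSZ];
};

/* Hypothetical run-time global: size of the hash currently in use. */
static size_t ex_current_rawsz = EX_SHA1_RAWSZ;

/* Compare only the bytes that belong to the active hash. */
static int ex_oidcmp(const struct ex_object_id *a,
		     const struct ex_object_id *b)
{
	assert(ex_current_rawsz <= EX_MAX_RAWSZ);
	return memcmp(a->hash, b->hash, ex_current_rawsz);
}
```

Allocation uses the compile-time maximum, while anything that inspects hash contents consults the run-time size.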

Replacing struct sha1_array with struct oid_array necessarily involves
converting the shallow code, so I did that.  The structure now handles
objects of struct object_id.  While I renamed the documentation (since
people will search for that), I chose not to rename the sha1-array.[ch]
files or the test helper because I didn't think it was worth the hassle,
especially for people who don't have rename support turned on by
default.

There is also a patch for fixing some broken pointer arithmetic that was
discovered during review of v1.  I don't think it's exploitable, but it
seems good to fix anyway.  Additional eyes on this patch are welcomed.

I chose to use Coccinelle quite a bit in this series, as it automates a
lot of the manual work and aids in review.  There is also some use of
Perl one-liners.

This series is available at https://github.com/bk2204/git under
object-id-part7; it may be rebased.

Changes from v1:
* Rebase on current master (no changes).
* Remove check for empty line in queue_command.
* Add patch 6 to fix invalid pointer arithmetic.

brian m. carlson (21):
  Define new hash-size constants for allocating memory
  Convert GIT_SHA1_HEXSZ used for allocation to GIT_MAX_HEXSZ
  Convert GIT_SHA1_RAWSZ used for allocation to GIT_MAX_RAWSZ
  builtin/diff: convert to struct object_id
  builtin/pull: convert portions to struct object_id
  builtin/receive-pack: fix incorrect pointer arithmetic
  builtin/receive-pack: convert portions to struct object_id
  fsck: convert init_skiplist to struct object_id
  parse-options-cb: convert sha1_array_append caller to struct object_id
  test-sha1-array: convert most code to struct object_id
  sha1_name: convert struct disambiguate_state to object_id
  sha1_name: convert disambiguate_hint_fn to take object_id
  submodule: convert check_for_new_submodule_commits to object_id
  builtin/pull: convert to struct object_id
  sha1-array: convert internal storage for struct sha1_array to
    object_id
  Make sha1_array_append take a struct object_id *
  Convert remaining callers of sha1_array_lookup to object_id
  Convert sha1_array_lookup to take struct object_id
  Convert sha1_array_for_each_unique and for_each_abbrev to object_id
  Rename sha1_array to oid_array
  Documentation: update and rename api-sha1-array.txt

 .../{api-sha1-array.txt => api-oid-array.txt}  |  44 +++
 bisect.c   |  43 ---
 builtin/blame.c|   4 +-
 builtin/cat-file.c |  14 +--
 builtin/diff.c |  40 +++---
 builtin/fetch-pack.c   |   2 +-
 builtin/fetch.c|   6 +-
 builtin/merge-index.c  |   2 +-
 builtin/merge.c|   2 +-
 builtin/pack-objects.c |  24 ++--
 builtin/patch-id.c |   2 +-
 builtin/pull.c |  98 +++
 builtin/receive-pack.c | 136 ++---
 builtin/rev-list.c |   2 +-
 builtin/rev-parse.c|   4 +-
 builtin/send-pack.c|   4 +-
 cache.h|  10 +-
 combine-diff.c |  18 +--
 commit.h   |  14 +--
 connect.c  |   8 +-
 diff.c |   4 +-
 diff.h |   4 +-
 fetch-pack.c   |  32 ++---
 fetch-pack.h   |   4 +-
 fsck.c |  17 +--
 fsck.h |   2 +-
 hex.c  |   2 +-
 parse-options-cb.c |   8 +-
 patch-ids.c|   2 +-
 patch-ids.h|   2 +-
 ref-filter.c   |  22 ++--