Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-26 Thread Junio C Hamano
René Scharfe  writes:

> Actually I didn't sign-off on purpose originally.  But OK, let's keep
> the version below.  I just feel strangely sad seeing that concise magic
> go.  Nevermind.

I actually share the sadness, too, but let's be stupid and obvious
here.

Thanks.

>
> Signed-off-by: Rene Scharfe 
>
>> -- >8 --
>> From: René Scharfe 
>> Date: Sun, 23 Oct 2016 19:57:30 +0200
>> Subject: [PATCH] hex: make wraparound of the index into ring-buffer explicit
>>
>> Overflow is defined for unsigned integers, but not for signed ones.
>>
>> We could make the ring-buffer index in sha1_to_hex() and
>> get_pathname() unsigned to be on the safe side to resolve this, but
>> let's make it explicit that we are wrapping around at whatever the
>> number of elements the ring-buffer has.  The compiler is smart enough
>> to turn modulus into bitmask for these codepaths that use
>> ring-buffers of a size that is a power of 2.
>>
>> Signed-off-by: René Scharfe 
>> Signed-off-by: Junio C Hamano 
>> ---
>>  hex.c  | 3 ++-
>>  path.c | 3 ++-
>>  2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/hex.c b/hex.c
>> index ab2610e498..845b01a874 100644
>> --- a/hex.c
>> +++ b/hex.c
>> @@ -78,7 +78,8 @@ char *sha1_to_hex(const unsigned char *sha1)
>>  {
>>  static int bufno;
>>  static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
>> -return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
>> +bufno = (bufno + 1) % ARRAY_SIZE(hexbuffer);
>> +return sha1_to_hex_r(hexbuffer[bufno], sha1);
>>  }
>>
>>  char *oid_to_hex(const struct object_id *oid)
>> diff --git a/path.c b/path.c
>> index fe3c4d96c6..9bfaeda207 100644
>> --- a/path.c
>> +++ b/path.c
>> @@ -24,7 +24,8 @@ static struct strbuf *get_pathname(void)
>>  STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
>>  };
>>  static int index;
>> -struct strbuf *sb = _array[3 & ++index];
>> +struct strbuf *sb = _array[index];
>> +index = (index + 1) % ARRAY_SIZE(pathname_array);
>>  strbuf_reset(sb);
>>  return sb;
>>  }
>>


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-26 Thread René Scharfe

Am 25.10.2016 um 20:28 schrieb Junio C Hamano:

Jeff King  writes:


On Mon, Oct 24, 2016 at 04:53:50PM -0700, Junio C Hamano wrote:


So how about this?  It gets rid of magic number 3 and works for array
size that's not a power of two.  And as a nice side effect it can't
trigger a signed overflow anymore.


Looks good to me.  Peff?


Any of the variants discussed in this thread is fine by me.


OK, here is what I'll queue then.
I assumed that René wants to sign it off ;-).


Actually I didn't sign-off on purpose originally.  But OK, let's keep
the version below.  I just feel strangely sad seeing that concise magic
go.  Nevermind.

Signed-off-by: Rene Scharfe 


-- >8 --
From: René Scharfe 
Date: Sun, 23 Oct 2016 19:57:30 +0200
Subject: [PATCH] hex: make wraparound of the index into ring-buffer explicit

Overflow is defined for unsigned integers, but not for signed ones.

We could make the ring-buffer index in sha1_to_hex() and
get_pathname() unsigned to be on the safe side to resolve this, but
let's make it explicit that we are wrapping around at whatever the
number of elements the ring-buffer has.  The compiler is smart enough
to turn modulus into bitmask for these codepaths that use
ring-buffers of a size that is a power of 2.

Signed-off-by: René Scharfe 
Signed-off-by: Junio C Hamano 
---
 hex.c  | 3 ++-
 path.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hex.c b/hex.c
index ab2610e498..845b01a874 100644
--- a/hex.c
+++ b/hex.c
@@ -78,7 +78,8 @@ char *sha1_to_hex(const unsigned char *sha1)
 {
static int bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
-   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
+   bufno = (bufno + 1) % ARRAY_SIZE(hexbuffer);
+   return sha1_to_hex_r(hexbuffer[bufno], sha1);
 }

 char *oid_to_hex(const struct object_id *oid)
diff --git a/path.c b/path.c
index fe3c4d96c6..9bfaeda207 100644
--- a/path.c
+++ b/path.c
@@ -24,7 +24,8 @@ static struct strbuf *get_pathname(void)
STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
};
static int index;
-   struct strbuf *sb = _array[3 & ++index];
+   struct strbuf *sb = _array[index];
+   index = (index + 1) % ARRAY_SIZE(pathname_array);
strbuf_reset(sb);
return sb;
 }



Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-25 Thread Junio C Hamano
Jeff King  writes:

>> diff --git a/path.c b/path.c
>> index fe3c4d96c6..9bfaeda207 100644
>> --- a/path.c
>> +++ b/path.c
>> @@ -24,7 +24,8 @@ static struct strbuf *get_pathname(void)
>>  STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
>>  };
>>  static int index;
>> -struct strbuf *sb = _array[3 & ++index];
>> +struct strbuf *sb = _array[index];
>> +index = (index + 1) % ARRAY_SIZE(pathname_array);
>>  strbuf_reset(sb);
>>  return sb;
>
> This converts the pre-increment to a post-increment, but I don't think
> it matters.

Yes, I think that using the ring buffer from the beginning, not from
the second element from the beginning, is conceptually cleaner ;-).



Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-25 Thread Jeff King
On Tue, Oct 25, 2016 at 11:28:40AM -0700, Junio C Hamano wrote:

> OK, here is what I'll queue then.
> I assumed that René wants to sign it off ;-).
> 
> -- >8 --
> From: René Scharfe 
> Date: Sun, 23 Oct 2016 19:57:30 +0200
> Subject: [PATCH] hex: make wraparound of the index into ring-buffer explicit
> 
> Overflow is defined for unsigned integers, but not for signed ones.
> 
> We could make the ring-buffer index in sha1_to_hex() and
> get_pathname() unsigned to be on the safe side to resolve this, but
> let's make it explicit that we are wrapping around at whatever the
> number of elements the ring-buffer has.  The compiler is smart enough
> to turn modulus into bitmask for these codepaths that use
> ring-buffers of a size that is a power of 2.

Looks good to me.

> diff --git a/path.c b/path.c
> index fe3c4d96c6..9bfaeda207 100644
> --- a/path.c
> +++ b/path.c
> @@ -24,7 +24,8 @@ static struct strbuf *get_pathname(void)
>   STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
>   };
>   static int index;
> - struct strbuf *sb = _array[3 & ++index];
> + struct strbuf *sb = _array[index];
> + index = (index + 1) % ARRAY_SIZE(pathname_array);
>   strbuf_reset(sb);
>   return sb;

This converts the pre-increment to a post-increment, but I don't think
it matters.

-Peff


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-25 Thread Junio C Hamano
Jeff King  writes:

> On Mon, Oct 24, 2016 at 04:53:50PM -0700, Junio C Hamano wrote:
>
>> > So how about this?  It gets rid of magic number 3 and works for array
>> > size that's not a power of two.  And as a nice side effect it can't
>> > trigger a signed overflow anymore.
>> 
>> Looks good to me.  Peff?
>
> Any of the variants discussed in this thread is fine by me.

OK, here is what I'll queue then.
I assumed that René wants to sign it off ;-).

-- >8 --
From: René Scharfe 
Date: Sun, 23 Oct 2016 19:57:30 +0200
Subject: [PATCH] hex: make wraparound of the index into ring-buffer explicit

Overflow is defined for unsigned integers, but not for signed ones.

We could make the ring-buffer index in sha1_to_hex() and
get_pathname() unsigned to be on the safe side to resolve this, but
let's make it explicit that we are wrapping around at whatever the
number of elements the ring-buffer has.  The compiler is smart enough
to turn modulus into bitmask for these codepaths that use
ring-buffers of a size that is a power of 2.

Signed-off-by: René Scharfe 
Signed-off-by: Junio C Hamano 
---
 hex.c  | 3 ++-
 path.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hex.c b/hex.c
index ab2610e498..845b01a874 100644
--- a/hex.c
+++ b/hex.c
@@ -78,7 +78,8 @@ char *sha1_to_hex(const unsigned char *sha1)
 {
static int bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
-   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
+   bufno = (bufno + 1) % ARRAY_SIZE(hexbuffer);
+   return sha1_to_hex_r(hexbuffer[bufno], sha1);
 }
 
 char *oid_to_hex(const struct object_id *oid)
diff --git a/path.c b/path.c
index fe3c4d96c6..9bfaeda207 100644
--- a/path.c
+++ b/path.c
@@ -24,7 +24,8 @@ static struct strbuf *get_pathname(void)
STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
};
static int index;
-   struct strbuf *sb = _array[3 & ++index];
+   struct strbuf *sb = _array[index];
+   index = (index + 1) % ARRAY_SIZE(pathname_array);
strbuf_reset(sb);
return sb;
 }
-- 
2.10.1-777-gd068e6bde7



Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread Jeff King
On Mon, Oct 24, 2016 at 04:53:50PM -0700, Junio C Hamano wrote:

> > So how about this?  It gets rid of magic number 3 and works for array
> > size that's not a power of two.  And as a nice side effect it can't
> > trigger a signed overflow anymore.
> 
> Looks good to me.  Peff?

Any of the variants discussed in this thread is fine by me.

-Peff


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread Junio C Hamano
René Scharfe  writes:

> Am 24.10.2016 um 19:27 schrieb Junio C Hamano:
>> Junio C Hamano  writes:
>> 
 I think it would be preferable to just fix it inline in each place.
>>>
>>> Yeah, probably.
>>>
>>> My initial reaction to this was
>>>
>>>  char *sha1_to_hex(const unsigned char *sha1)
>>>  {
>>> -   static int bufno;
>>> +   static unsigned int bufno;
>>> static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
>>> return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
>>>
>>> "ah, we do not even need bufno as uint; it could be ushort or even
>>> uchar".  If this were a 256 element ring buffer and the index were
>>> uchar, we wouldn't even be having this discussion, and "3 &" is a
>>> way to get a fake type that is a 2-bit unsigned integer that wraps
>>> around when incremented.
>>>
>>> But being explicit, especially when we know that we can rely on the
>>> fact that the compilers are usually intelligent enough, is a good
>>> idea, I would think.
>>>
>>> Isn't size_t often wider than uint, by the way?  It somehow makes me
>>> feel dirty to use it when we know we only care about the bottom two
>>> bit, especially with the explicit "bufno %= ARRAY_SIZE(hexbuffer)",
>>> but I may be simply superstitious in this case.  I dunno.
>> 
>> If we are doing the wrap-around ourselves, I suspect that the index
>> should stay "int" (not even unsigned), as that is supposed to be the
>> most natural and performant type on the architecture.  Would it
>> still result in better code to use size_t instead?
>
> Right.
>
> Gcc emits an extra cltq instruction for x86-64 (Convert Long To Quad,
> i.e. 32-bit to 64-bit) if we wrap explicitly and still use an int in
> sha1_to_hex().  It doesn't if we use an unsigned int or size_t.  It
> also doesn't for get_pathname().  I guess updating the index variable
> only after use makes the difference there.
>
> But I think we can ignore that; it's just an extra cycle.  I only
> even noticed it when verifying that "% 4" is turned into "& 3"..
> Clang and icc don't add the cltq, by the way.
>
> So how about this?  It gets rid of magic number 3 and works for array
> size that's not a power of two.  And as a nice side effect it can't
> trigger a signed overflow anymore.

Looks good to me.  Peff?

> Just adding "unsigned" looks more attractive to me for some reason.
> Perhaps I stared enough at the code to get used to the magic values
> there..

I somehow share that feeling, too.


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread René Scharfe
Am 24.10.2016 um 19:27 schrieb Junio C Hamano:
> Junio C Hamano  writes:
> 
>>> I think it would be preferable to just fix it inline in each place.
>>
>> Yeah, probably.
>>
>> My initial reaction to this was
>>
>>  char *sha1_to_hex(const unsigned char *sha1)
>>  {
>> -static int bufno;
>> +static unsigned int bufno;
>>  static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
>>  return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
>>
>> "ah, we do not even need bufno as uint; it could be ushort or even
>> uchar".  If this were a 256 element ring buffer and the index were
>> uchar, we wouldn't even be having this discussion, and "3 &" is a
>> way to get a fake type that is a 2-bit unsigned integer that wraps
>> around when incremented.
>>
>> But being explicit, especially when we know that we can rely on the
>> fact that the compilers are usually intelligent enough, is a good
>> idea, I would think.
>>
>> Isn't size_t often wider than uint, by the way?  It somehow makes me
>> feel dirty to use it when we know we only care about the bottom two
>> bit, especially with the explicit "bufno %= ARRAY_SIZE(hexbuffer)",
>> but I may be simply superstitious in this case.  I dunno.
> 
> If we are doing the wrap-around ourselves, I suspect that the index
> should stay "int" (not even unsigned), as that is supposed to be the
> most natural and performant type on the architecture.  Would it
> still result in better code to use size_t instead?

Right.

Gcc emits an extra cltq instruction for x86-64 (Convert Long To Quad,
i.e. 32-bit to 64-bit) if we wrap explicitly and still use an int in
sha1_to_hex().  It doesn't if we use an unsigned int or size_t.  It
also doesn't for get_pathname().  I guess updating the index variable
only after use makes the difference there.

But I think we can ignore that; it's just an extra cycle.  I only
even noticed it when verifying that "% 4" is turned into "& 3"..
Clang and icc don't add the cltq, by the way.

So how about this?  It gets rid of magic number 3 and works for array
size that's not a power of two.  And as a nice side effect it can't
trigger a signed overflow anymore.

Just adding "unsigned" looks more attractive to me for some reason.
Perhaps I stared enough at the code to get used to the magic values
there..

René

---
 hex.c  | 3 ++-
 path.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hex.c b/hex.c
index ab2610e..845b01a 100644
--- a/hex.c
+++ b/hex.c
@@ -78,7 +78,8 @@ char *sha1_to_hex(const unsigned char *sha1)
 {
static int bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
-   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
+   bufno = (bufno + 1) % ARRAY_SIZE(hexbuffer);
+   return sha1_to_hex_r(hexbuffer[bufno], sha1);
 }
 
 char *oid_to_hex(const struct object_id *oid)
diff --git a/path.c b/path.c
index a8e7295..52d889c 100644
--- a/path.c
+++ b/path.c
@@ -25,7 +25,8 @@ static struct strbuf *get_pathname(void)
STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
};
static int index;
-   struct strbuf *sb = _array[3 & ++index];
+   struct strbuf *sb = _array[index];
+   index = (index + 1) % ARRAY_SIZE(pathname_array);
strbuf_reset(sb);
return sb;
 }
-- 
2.10.1


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread Junio C Hamano
Junio C Hamano  writes:

>> I think it would be preferable to just fix it inline in each place.
>
> Yeah, probably.
>
> My initial reaction to this was
>
>  char *sha1_to_hex(const unsigned char *sha1)
>  {
> - static int bufno;
> + static unsigned int bufno;
>   static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
>   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
>
> "ah, we do not even need bufno as uint; it could be ushort or even
> uchar".  If this were a 256 element ring buffer and the index were
> uchar, we wouldn't even be having this discussion, and "3 &" is a
> way to get a fake type that is a 2-bit unsigned integer that wraps
> around when incremented.
>
> But being explicit, especially when we know that we can rely on the
> fact that the compilers are usually intelligent enough, is a good
> idea, I would think.
>
> Isn't size_t often wider than uint, by the way?  It somehow makes me
> feel dirty to use it when we know we only care about the bottom two
> bit, especially with the explicit "bufno %= ARRAY_SIZE(hexbuffer)",
> but I may be simply superstitious in this case.  I dunno.

If we are doing the wrap-around ourselves, I suspect that the index
should stay "int" (not even unsigned), as that is supposed to be the
most natural and performant type on the architecture.  Would it
still result in better code to use size_t instead?



Author: René Scharfe 
Date:   Sun Oct 23 19:57:30 2016 +0200

hex: make wraparound of the index into ring-buffer explicit

Overflow is defined for unsigned integers, but not for signed ones.

We could make the ring-buffer index in sha1_to_hex() and
get_pathname() unsigned to be on the safe side to resolve this, but
let's make it explicit that we are wrapping around at whatever the
number of elements the ring-buffer has.  The compiler is smart enough
to turn modulus into bitmask for these codepaths that use
ring-buffers of a size that is a power of 2.
---
 cache.h | 3 +++
 hex.c   | 4 ++--
 path.c  | 4 ++--
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index 4cba08ecb1..5429df0f92 100644
--- a/cache.h
+++ b/cache.h
@@ -547,6 +547,9 @@ extern int daemonize(void);
} \
} while (0)
 
+#define NEXT_RING_ITEM(array, index) \
+   (array)[(index) = ((index) + 1) % ARRAY_SIZE(array)]
+
 /* Initialize and use the cache information */
 struct lock_file;
 extern int read_index(struct index_state *);
diff --git a/hex.c b/hex.c
index ab2610e498..5e711b9e32 100644
--- a/hex.c
+++ b/hex.c
@@ -76,9 +76,9 @@ char *oid_to_hex_r(char *buffer, const struct object_id *oid)
 
 char *sha1_to_hex(const unsigned char *sha1)
 {
-   static int bufno;
+   static size_t bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
-   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
+   return sha1_to_hex_r(NEXT_RING_ITEM(hexbuffer, bufno), sha1);
 }
 
 char *oid_to_hex(const struct object_id *oid)
diff --git a/path.c b/path.c
index fe3c4d96c6..5b2ab2271f 100644
--- a/path.c
+++ b/path.c
@@ -23,8 +23,8 @@ static struct strbuf *get_pathname(void)
static struct strbuf pathname_array[4] = {
STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
};
-   static int index;
-   struct strbuf *sb = _array[3 & ++index];
+   static size_t index;
+   struct strbuf *sb = _RING_ITEM(pathname_array, index);
strbuf_reset(sb);
return sb;
 }


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread Junio C Hamano
Jeff King  writes:

>> > You could also write the second line like:
>> > 
>> >   bufno %= ARRAY_SIZE(hexbuffer);
>> > 
>> > which is less magical (right now the set of buffers must be a power of
>> > 2). I expect the compiler could turn that into a bitmask itself.
>> ...
>
> I think it would be preferable to just fix it inline in each place.

Yeah, probably.

My initial reaction to this was

 char *sha1_to_hex(const unsigned char *sha1)
 {
-   static int bufno;
+   static unsigned int bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);

"ah, we do not even need bufno as uint; it could be ushort or even
uchar".  If this were a 256 element ring buffer and the index were
uchar, we wouldn't even be having this discussion, and "3 &" is a
way to get a fake type that is a 2-bit unsigned integer that wraps
around when incremented.

But being explicit, especially when we know that we can rely on the
fact that the compilers are usually intelligent enough, is a good
idea, I would think.

Isn't size_t often wider than uint, by the way?  It somehow makes me
feel dirty to use it when we know we only care about the bottom two
bit, especially with the explicit "bufno %= ARRAY_SIZE(hexbuffer)",
but I may be simply superstitious in this case.  I dunno.


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-24 Thread Jeff King
On Sun, Oct 23, 2016 at 07:57:30PM +0200, René Scharfe wrote:

> > > Hard to trigger, but probably even harder to diagnose once someone
> > > somehow manages to do it on some uncommon architecture.
> > 
> > Indeed. If we are worried about overflow, we would also want to assume
> > that it wraps at a multiple of 4, but that is probably a sane
> > assumption.
> 
> Hmm, I can't think of a way to violate this assumption except with unsigned
> integers that are only a single bit wide.  That would be a weird machine.
> Are there other possibilities?

No, I don't think so. I don't recall offhand whether the C standard
allows integers that are not powers of 2. But if it does, and somebody
develops such a platform, I have very little sympathy.

My comment was mostly "this is the only other restriction I can think
of, and it is crazy".

> > You could also write the second line like:
> > 
> >   bufno %= ARRAY_SIZE(hexbuffer);
> > 
> > which is less magical (right now the set of buffers must be a power of
> > 2). I expect the compiler could turn that into a bitmask itself.
> 
> Expelling magic is a good idea.  And indeed, at least gcc, clang and icc on
> x86-64 are smart enough to use an AND instead of dividing
> (https://godbolt.org/g/rFPpzF).
> 
> But gcc also adds a sign extension (cltq/cdqe) if we store the truncated
> value, unlike the other two compilers.  I don't see why -- the bit mask
> operation enforces a value between 0 and 3 (inclusive) and the upper bits of
> eax are zeroed automatically, so the cltq is effectively a noop.
> 
> Using size_t gets us rid of the extra instruction and is the right type
> anyway.  It would suffice on its own, hmm..

Yeah, I had assumed you would also switch to some form of unsigned type
either way.

> > I'm fine with any of the options. I guess you'd want a similar patch for
> > find_unique_abbrev on top of jk/no-looking-at-dotgit-outside-repo.
> 
> Actually I'd want you to want to amend your series yourself. ;)  Maybe I can
> convince Coccinelle to handle that issue for us.

I thought that series was in "next" already, but I see it isn't yet. I'd
still wait until the sha1_to_hex() solution settles, and then copy it.

> And there's also path.c::get_pathname().  That's enough cases to justify
> adding a macro, I'd say:
> [...]
> +#define NEXT_RING_ITEM(array, index) \
> + (array)[(index) = ((index) + 1) % ARRAY_SIZE(array)]
> +

I dunno. It hides a lot of magic without saving a lot of lines in the
caller, and the callers have to make sure "array" is an array and that
"index" is unsigned.

E.g., in this code:

> @@ -24,8 +24,8 @@ static struct strbuf *get_pathname(void)
>   static struct strbuf pathname_array[4] = {
>   STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
>   };
> - static int index;
> - struct strbuf *sb = _array[3 & ++index];
> + static size_t index;
> + struct strbuf *sb = _RING_ITEM(pathname_array, index);
>   strbuf_reset(sb);
>   return sb;
>  }

The truly ugly part is the repeated STRBUF_INIT. :)

I think it would be preferable to just fix it inline in each place.

-Peff


Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-23 Thread René Scharfe

Am 23.10.2016 um 11:11 schrieb Jeff King:

On Sun, Oct 23, 2016 at 11:00:48AM +0200, René Scharfe wrote:


Overflow is defined for unsigned integers, but not for signed ones.
Make the ring buffer index in sha1_to_hex() unsigned to be on the
safe side.

Signed-off-by: Rene Scharfe 
---
Hard to trigger, but probably even harder to diagnose once someone
somehow manages to do it on some uncommon architecture.


Indeed. If we are worried about overflow, we would also want to assume
that it wraps at a multiple of 4, but that is probably a sane
assumption.


Hmm, I can't think of a way to violate this assumption except with 
unsigned integers that are only a single bit wide.  That would be a 
weird machine.  Are there other possibilities?



diff --git a/hex.c b/hex.c
index ab2610e..8c6c189 100644
--- a/hex.c
+++ b/hex.c
@@ -76,7 +76,7 @@ char *oid_to_hex_r(char *buffer, const struct object_id *oid)

 char *sha1_to_hex(const unsigned char *sha1)
 {
-   static int bufno;
+   static unsigned int bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
 }


I wonder if just truncating bufno would be conceptually simpler (albeit
longer):

  bufno++;
  bufno &= 3;
  return sha1_to_hex_r(hexbuffer[bufno], sha1);

You could also write the second line like:

  bufno %= ARRAY_SIZE(hexbuffer);

which is less magical (right now the set of buffers must be a power of
2). I expect the compiler could turn that into a bitmask itself.


Expelling magic is a good idea.  And indeed, at least gcc, clang and icc 
on x86-64 are smart enough to use an AND instead of dividing 
(https://godbolt.org/g/rFPpzF).


But gcc also adds a sign extension (cltq/cdqe) if we store the truncated 
value, unlike the other two compilers.  I don't see why -- the bit mask 
operation enforces a value between 0 and 3 (inclusive) and the upper 
bits of eax are zeroed automatically, so the cltq is effectively a noop.


Using size_t gets us rid of the extra instruction and is the right type 
anyway.  It would suffice on its own, hmm..



I'm fine with any of the options. I guess you'd want a similar patch for
find_unique_abbrev on top of jk/no-looking-at-dotgit-outside-repo.


Actually I'd want you to want to amend your series yourself. ;)  Maybe I 
can convince Coccinelle to handle that issue for us.


And there's also path.c::get_pathname().  That's enough cases to justify 
adding a macro, I'd say:


---
 cache.h | 3 +++
 hex.c   | 4 ++--
 path.c  | 4 ++--
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/cache.h b/cache.h
index 05ecb88..8bb4918 100644
--- a/cache.h
+++ b/cache.h
@@ -555,6 +555,9 @@ extern int daemonize(void);
} \
} while (0)

+#define NEXT_RING_ITEM(array, index) \
+   (array)[(index) = ((index) + 1) % ARRAY_SIZE(array)]
+
 /* Initialize and use the cache information */
 struct lock_file;
 extern int read_index(struct index_state *);
diff --git a/hex.c b/hex.c
index ab2610e..5e711b9 100644
--- a/hex.c
+++ b/hex.c
@@ -76,9 +76,9 @@ char *oid_to_hex_r(char *buffer, const struct 
object_id *oid)


 char *sha1_to_hex(const unsigned char *sha1)
 {
-   static int bufno;
+   static size_t bufno;
static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
-   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
+   return sha1_to_hex_r(NEXT_RING_ITEM(hexbuffer, bufno), sha1);
 }

 char *oid_to_hex(const struct object_id *oid)
diff --git a/path.c b/path.c
index a8e7295..60dba6a 100644
--- a/path.c
+++ b/path.c
@@ -24,8 +24,8 @@ static struct strbuf *get_pathname(void)
static struct strbuf pathname_array[4] = {
STRBUF_INIT, STRBUF_INIT, STRBUF_INIT, STRBUF_INIT
};
-   static int index;
-   struct strbuf *sb = _array[3 & ++index];
+   static size_t index;
+   struct strbuf *sb = _RING_ITEM(pathname_array, index);
strbuf_reset(sb);
return sb;
 }
--
2.10.1




Re: [PATCH] hex: use unsigned index for ring buffer

2016-10-23 Thread Jeff King
On Sun, Oct 23, 2016 at 11:00:48AM +0200, René Scharfe wrote:

> Overflow is defined for unsigned integers, but not for signed ones.
> Make the ring buffer index in sha1_to_hex() unsigned to be on the
> safe side.
> 
> Signed-off-by: Rene Scharfe 
> ---
> Hard to trigger, but probably even harder to diagnose once someone
> somehow manages to do it on some uncommon architecture.

Indeed. If we are worried about overflow, we would also want to assume
that it wraps at a multiple of 4, but that is probably a sane
assumption.

> diff --git a/hex.c b/hex.c
> index ab2610e..8c6c189 100644
> --- a/hex.c
> +++ b/hex.c
> @@ -76,7 +76,7 @@ char *oid_to_hex_r(char *buffer, const struct object_id 
> *oid)
>  
>  char *sha1_to_hex(const unsigned char *sha1)
>  {
> - static int bufno;
> + static unsigned int bufno;
>   static char hexbuffer[4][GIT_SHA1_HEXSZ + 1];
>   return sha1_to_hex_r(hexbuffer[3 & ++bufno], sha1);
>  }

I wonder if just truncating bufno would be conceptually simpler (albeit
longer):

  bufno++;
  bufno &= 3;
  return sha1_to_hex_r(hexbuffer[bufno], sha1);

You could also write the second line like:

  bufno %= ARRAY_SIZE(hexbuffer);

which is less magical (right now the set of buffers must be a power of
2). I expect the compiler could turn that into a bitmask itself.

I'm fine with any of the options. I guess you'd want a similar patch for
find_unique_abbrev on top of jk/no-looking-at-dotgit-outside-repo.

-Peff