Re: [PATCH v2] packfile: Correct zlib buffer handling

2018-06-13 Thread Jeremy Linton
Hi,

On Wed, Jun 13, 2018 at 1:38 PM, Junio C Hamano  wrote:
> Eric Sunshine  writes:
>
>>> +   buffer[size] = 0; /* assure that the buffer is still terminated */
>>
>> I think we normally use '\0' for NUL on this project rather than simply 0.
>>
>> The comment is also effectively pure noise since it merely repeats
>> what the code already states clearly (especially when the code says
>> "buffer[size] = '\0';"), so dropping the comment altogether would be
>> reasonable.
>
> Actually, I'd prefer to have comment there, but not about "what this
> line does" (which is useless, as you pointed out) but about "why do
> we do this seemingly redundant clearing".
>
> Here is what I tentatively came up with.
>
> -- >8 --
> From: Jeremy Linton 
> Date: Wed, 13 Jun 2018 09:22:07 -0500
> Subject: [PATCH] packfile: correct zlib buffer handling
>
> The buffer being passed to zlib includes a NUL terminator that git
> needs to keep in place. unpack_compressed_entry() attempts to detect
> the case that the source buffer hasn't been fully consumed by
> checking to see if the destination buffer has been over consumed.
>
> This causes a problem, that more recent zlib patches have been
> poisoning the unconsumed portions of the buffer which overwrites
> the NUL byte, while correctly returning length and status.
>
> Let's place the NUL at the end of the buffer after inflate returns
> to assure that it doesn't result in problems for git even if its
> been overwritten by zlib.
>
> Signed-off-by: Jeremy Linton 
> Signed-off-by: Junio C Hamano 
> ---
>  packfile.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/packfile.c b/packfile.c
> index 4a5fe7ab18..d555699217 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -1422,6 +1422,9 @@ static void *unpack_compressed_entry(struct packed_git 
> *p,
> return NULL;
> }
>
> +   /* versions of zlib can clobber unconsumed portion of outbuf */
> +   buffer[size] = '\0';
> +
> return buffer;
>  }
>
> --
> 2.18.0-rc1-1-g6f333ff2fb

This is all fine with me, the original comment was an attempt to
indicate that the original null may not have been there anymore too..

Shall I resubmit it as above, or can it be picked up like this?


[PATCH v2] packfile: Correct zlib buffer handling

2018-06-13 Thread Jeremy Linton
The buffer being passed to zlib includes a null terminator that
git needs to keep in place. unpack_compressed_entry() attempts to
detect the case that the source buffer hasn't been fully consumed
by checking to see if the destination buffer has been over consumed.

This causes a problem, that more recent zlib patches have been
poisoning the unconsumed portions of the buffer which overwrites
the null, while correctly returning length and status.

Let's replace the null at the end of the buffer to assure that
if its been overwritten by zlib it doesn't result in problems for
git.

Signed-off-by: Jeremy Linton 
---
 packfile.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/packfile.c b/packfile.c
index 7c1a2519f..8db5d90ca 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1433,6 +1433,8 @@ static void *unpack_compressed_entry(struct packed_git *p,
return NULL;
}
 
+   buffer[size] = 0; /* assure that the buffer is still terminated */
+
return buffer;
 }
 
-- 
2.13.6



Re: [PATCH] packfile: Correct zlib buffer handling

2018-06-12 Thread Jeremy Linton
Hi,

Sorry about the delay here (bit of a mix-up and didn't reply to the list).

(see inline )

On Sun, May 27, 2018 at 9:41 PM, Junio C Hamano  wrote:
> Junio C Hamano  writes:
>
>> Duy Nguyen  writes:
>>
>>> On Sun, May 27, 2018 at 1:57 AM, Junio C Hamano  wrote:

(trimming)

>
> Specifically, I was worried about this assertion:
>
> Lets rely on the fact that the source buffer will only be fully
> consumed when the when the destination buffer is inflated to the
> correct size.
>
> which I think is the exact bad thinking that caused troubles for us
> in the past; isn't the explanation in 456cdf6e ("Fix loose object
> uncompression check.", 2007-03-19) relevant here?
>
> -   stream.avail_out = size + 1;
> +   stream.avail_out = size;
> ...
> stream.next_in = in;
> st = git_inflate(, Z_FINISH);
> if (!stream.avail_out)
> -   break; /* the payload is larger than it should be */
> +   break; /* done, st indicates if source fully consumed 
> */
> curpos += stream.next_in - in;
> } while (st == Z_OK || st == Z_BUF_ERROR);
> git_inflate_end();
> if ((st != Z_STREAM_END) || stream.total_out != size) {
> free(buffer);
> return NULL;
> }
>
> With minimum stream.avail_out without slack, when !avail_out, i.e.
> when we fully filled the output buffer, it could be that we had
> correct input that deflates to the correct size, in which case we
> are happy---st would say Z_STREAM_END, we would leave the loop
> because it is neither OK nor BUF_ERROR, and total_out would report
> the size we expected.  Or the input zlib stream may have ended with
> bytes that express "this concludes the stream", and the input bytes
> before that was sufficient to construct the original payload fully,
> and we may have just fed the bytes before that "this concludes the
> stream" to git_inflate().
>
> In such a case, we haven't consumed all the avail_in.  We may
> already have all the correct output, i.e. !avail_out, but because we
> haven't consumed the "this concludes the stream", st is not
> STREAM_END in such a case.

If I understand correctly your concerned the avail_in is longer than
what is required to fill the output buffer..

I'm fairly sure that won't result in a Z_STREAM_END, as you rightfully
point out, but the loop _will_ terminate due to the output buffer
being full and then since its not Z_STREAM_END the
unpack_compressed_entry fails, as it should.

>
> Our existing while() loop, with one-byte slack in avail_out, would
> have let us continue and the next iteration of the loop would have
> consumed the input without producing any more output (i.e. avail_out
> would have been left to 1 in both of these final two rounds) and we
> would have exited the loop.  After calling inflate_end(), we would
> have noticed STREAM_END and correct size and we would have been
> happy.

Your assuming that zlib will terminate with an error, but a fully
decompressed buffer, because it hasn't consumed the entire input
buffer. I don't think that is how it works (its not how the
documentation is written, nor the bits of code i've looked at seem to
work, which granted i'm not a zlib maintainer).


>
> The updated code would handle this latter case rather badly, no?  We
> leave the loop early, notice st is not STREAM_END, and be very
> unhappy, because this patch did not give us to consume the very end
> of the input stream and left the loop early.

Your correct if the above case is a valid zlib behavior then there
would be a problem. But, I don't think the termination is dicated by
insufficient output space until there is an attempt to utilize that
space.


>
>>> This yields two problems, first a single byte overrun won't be detected
>>> properly because the Z_STREAM_END will then be set, but the null
>>> terminator will have been overwritten.
>
> Because we compare total_out and size at the end, we would detect it
> as an error in this function, no?  Then zlib overwriting NUL would
> not be a problem, as we would free the buffer and return NULL, no?
>
>>> The other problem is that
>>> more recent zlib patches have been poisoning the unconsumed portions
>>> of the buffers which also overwrites the null, while correctly
>>> returning length and status.
>
> Isn't that a bug in zlib, though?  Or do they do that deliberately?
>
> I think a workaround with lower impact would be to manually restore
> NUL at the end of the buffer.

I agree, just resetting the NULL its likely safer, and I will repost a
patch soon which if nothing else makes git more robust to errant zlib
behavior.


[PATCH] packfile: Correct zlib buffer handling

2018-05-25 Thread Jeremy Linton
The buffer being passed to zlib includes a null terminator that
git needs to keep in place. unpack_compressed_entry() attempts to
detect the case that the source buffer hasn't been fully consumed
by checking to see if the destination buffer has been over consumed.

This yields two problems, first a single byte overrun won't be detected
properly because the Z_STREAM_END will then be set, but the null
terminator will have been overwritten. The other problem is that
more recent zlib patches have been poisoning the unconsumed portions
of the buffers which also overwrites the null, while correctly
returning length and status.

Lets rely on the fact that the source buffer will only be fully
consumed when the when the destination buffer is inflated to the
correct size. We can do this by passing zlib the correct buffer size
and properly checking the return status. The latter check actually
already exists if the buffer size is correct.

Signed-off-by: Jeremy Linton <lintonrjer...@gmail.com>
---
 packfile.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/packfile.c b/packfile.c
index 7c1a2519f..245eb3204 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1416,7 +1416,7 @@ static void *unpack_compressed_entry(struct packed_git *p,
return NULL;
memset(, 0, sizeof(stream));
stream.next_out = buffer;
-   stream.avail_out = size + 1;
+   stream.avail_out = size;
 
git_inflate_init();
do {
@@ -1424,7 +1424,7 @@ static void *unpack_compressed_entry(struct packed_git *p,
stream.next_in = in;
st = git_inflate(, Z_FINISH);
if (!stream.avail_out)
-   break; /* the payload is larger than it should be */
+   break; /* done, st indicates if source fully consumed */
curpos += stream.next_in - in;
} while (st == Z_OK || st == Z_BUF_ERROR);
git_inflate_end();
-- 
2.13.6



[PATCH] packfile: Correct zlib buffer handling

2018-05-25 Thread Jeremy Linton
The buffer being passed to zlib includes a null terminator that
git needs to keep in place. unpack_compressed_entry() attempts to
detect the case that the source buffer hasn't been fully consumed
by checking to see if the destination buffer has been over consumed.

This yields two problems, first a single byte overrun won't be detected
properly because the Z_STREAM_END will then be set, but the null
terminator will have been overwritten. The other problem is that
more recent zlib patches have been poisoning the unconsumed portions
of the buffers which also overwrites the null, while correctly
returning length and status.

Lets rely on the fact that the source buffer will only be fully
consumed when the when the destination buffer is inflated to the
correct size. We can do this by passing zlib the correct buffer size
and properly checking the return status. The latter check actually
already exists if the buffer size is correct.

Signed-off-by: Jeremy Linton <lintonrjer...@gmail.com>
---
 packfile.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/packfile.c b/packfile.c
index 7c1a2519f..245eb3204 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1416,7 +1416,7 @@ static void *unpack_compressed_entry(struct packed_git *p,
return NULL;
memset(, 0, sizeof(stream));
stream.next_out = buffer;
-   stream.avail_out = size + 1;
+   stream.avail_out = size;
 
git_inflate_init();
do {
@@ -1424,7 +1424,7 @@ static void *unpack_compressed_entry(struct packed_git *p,
stream.next_in = in;
st = git_inflate(, Z_FINISH);
if (!stream.avail_out)
-   break; /* the payload is larger than it should be */
+   break; /* done, st indicates if source fully consumed */
curpos += stream.next_in - in;
} while (st == Z_OK || st == Z_BUF_ERROR);
git_inflate_end();
-- 
2.13.6