Re: [PATCH 4/4] Fix delta offset overflow

2017-08-11 Thread Junio C Hamano
Martin Koegler  writes:

> On Thu, Aug 10, 2017 at 01:49:24PM -0700, Junio C Hamano wrote:
>> The lower 4-byte of moff (before incrementing it with msize) were
>> already encoded to the output stream before this hunk.  Shouldn't
>> we be checking if moff would fit in uint32_t _before_ that happens?
>
> moff is otherwise only decremented or assigned with an offset generated by
> create_delta_index. These offsets are limited by 4GB.
>
> Any larger offets would be a programming bug - so qualify for just a "assert".

OK, in that case, I agree that a check before encoding moff into
(upto) 4 output bytes is unnecessary.  Sorry, I didn't read the
function that populates index->hash[] before responding, and I admit
that I haven't read it for a while.

>> Cutting it off at here by resetting msize to 0 might help the next
>> iteration (I didn't check, but is the effect of it is to corrupt the
>> "val" rolling checksum and make it unlikely that the hash
>> computation would not find a correct match?) but it somehow feels
>> like closing the barn door after the horse has already bolted...
>
> The current code produces incorrect deltas - its not just a checksum issue.

Again, I mis-read what role msize was playing in the original (or in
your update).  I'd need to re-read that part of the code to make
sure I get how your change will fix the issue.

Thanks.

> By the way: 
>
> Somebody interested in JGIT should also look at these two bugs:
>
> https://github.com/eclipse/jgit/blob/005e5feb4ecd08c4e4d141a38b9e7942accb3212/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/pack/DeltaEncoder.java
> copy would also encode beyond 4GB - producing truncated delta offset.
>
> https://github.com/eclipse/jgit/blob/005e5feb4ecd08c4e4d141a38b9e7942accb3212/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/pack/BinaryDelta.java
> apply uses int for decoding length values.

I'll cc: an obvious suspect; thanks for the note.


Re: [PATCH 4/4] Fix delta offset overflow

2017-08-11 Thread Martin Koegler
On Thu, Aug 10, 2017 at 01:49:24PM -0700, Junio C Hamano wrote:
> The lower 4-byte of moff (before incrementing it with msize) were
> already encoded to the output stream before this hunk.  Shouldn't
> we be checking if moff would fit in uint32_t _before_ that happens?

moff is otherwise only decremented or assigned with an offset generated by
create_delta_index. These offsets are limited by 4GB.

Any larger offets would be a programming bug - so qualify for just a "assert".
 
> IOW, in a much earlier part of that "while (data < top)" loop, there
> is a code that tries to find a match that gives us a large msize by
> iterating over index->hash[], and it sets msize and moff as a potential
> location that we may want to use.  If moff is already big there, then
> we shouldn't consider it a match because we cannot encode its location
> using 4-byte anyway, no?

Recovering from an incorrect moff would add more complex code for a case, that
should not happen. A bailout would be sufficient.
 
> Cutting it off at here by resetting msize to 0 might help the next
> iteration (I didn't check, but is the effect of it is to corrupt the
> "val" rolling checksum and make it unlikely that the hash
> computation would not find a correct match?) but it somehow feels
> like closing the barn door after the horse has already bolted...

The current code produces incorrect deltas - its not just a checksum issue.

By the way: 

Somebody interested in JGIT should also look at these two bugs:

https://github.com/eclipse/jgit/blob/005e5feb4ecd08c4e4d141a38b9e7942accb3212/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/pack/DeltaEncoder.java
copy would also encode beyond 4GB - producing truncated delta offset.

https://github.com/eclipse/jgit/blob/005e5feb4ecd08c4e4d141a38b9e7942accb3212/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/pack/BinaryDelta.java
apply uses int for decoding length values.

Regards,
Martin


Re: [PATCH 4/4] Fix delta offset overflow

2017-08-10 Thread Junio C Hamano
Martin Koegler  writes:

> From: Martin Koegler 
>
> Prevent generating delta offsets beyond 4G.
>
> Signed-off-by: Martin Koegler 
> ---
>  diff-delta.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/diff-delta.c b/diff-delta.c
> index 3d5e1ef..633883e 100644
> --- a/diff-delta.c
> +++ b/diff-delta.c
> @@ -454,6 +454,9 @@ create_delta(const struct delta_index *index,
>   moff += msize;
>   msize = left;
>  
> + if (moff > 0x)
> + msize = 0;
> +

The lower 4-byte of moff (before incrementing it with msize) were
already encoded to the output stream before this hunk.  Shouldn't
we be checking if moff would fit in uint32_t _before_ that happens?

IOW, in a much earlier part of that "while (data < top)" loop, there
is a code that tries to find a match that gives us a large msize by
iterating over index->hash[], and it sets msize and moff as a potential
location that we may want to use.  If moff is already big there, then
we shouldn't consider it a match because we cannot encode its location
using 4-byte anyway, no?

Cutting it off at here by resetting msize to 0 might help the next
iteration (I didn't check, but is the effect of it is to corrupt the
"val" rolling checksum and make it unlikely that the hash
computation would not find a correct match?) but it somehow feels
like closing the barn door after the horse has already bolted...

>   if (msize < 4096) {
>   int j;
>   val = 0;


[PATCH 4/4] Fix delta offset overflow

2017-08-10 Thread Martin Koegler
From: Martin Koegler 

Prevent generating delta offsets beyond 4G.

Signed-off-by: Martin Koegler 
---
 diff-delta.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/diff-delta.c b/diff-delta.c
index 3d5e1ef..633883e 100644
--- a/diff-delta.c
+++ b/diff-delta.c
@@ -454,6 +454,9 @@ create_delta(const struct delta_index *index,
moff += msize;
msize = left;
 
+   if (moff > 0x)
+   msize = 0;
+
if (msize < 4096) {
int j;
val = 0;
-- 
2.1.4