Re: [PATCH 4/4] Fix delta offset overflow
Martin Koeglerwrites: > On Thu, Aug 10, 2017 at 01:49:24PM -0700, Junio C Hamano wrote: >> The lower 4-byte of moff (before incrementing it with msize) were >> already encoded to the output stream before this hunk. Shouldn't >> we be checking if moff would fit in uint32_t _before_ that happens? > > moff is otherwise only decremented or assigned with an offset generated by > create_delta_index. These offsets are limited by 4GB. > > Any larger offets would be a programming bug - so qualify for just a "assert". OK, in that case, I agree that a check before encoding moff into (upto) 4 output bytes is unnecessary. Sorry, I didn't read the function that populates index->hash[] before responding, and I admit that I haven't read it for a while. >> Cutting it off at here by resetting msize to 0 might help the next >> iteration (I didn't check, but is the effect of it is to corrupt the >> "val" rolling checksum and make it unlikely that the hash >> computation would not find a correct match?) but it somehow feels >> like closing the barn door after the horse has already bolted... > > The current code produces incorrect deltas - its not just a checksum issue. Again, I mis-read what role msize was playing in the original (or in your update). I'd need to re-read that part of the code to make sure I get how your change will fix the issue. Thanks. > By the way: > > Somebody interested in JGIT should also look at these two bugs: > > https://github.com/eclipse/jgit/blob/005e5feb4ecd08c4e4d141a38b9e7942accb3212/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/pack/DeltaEncoder.java > copy would also encode beyond 4GB - producing truncated delta offset. > > https://github.com/eclipse/jgit/blob/005e5feb4ecd08c4e4d141a38b9e7942accb3212/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/pack/BinaryDelta.java > apply uses int for decoding length values. I'll cc: an obvious suspect; thanks for the note.
Re: [PATCH 4/4] Fix delta offset overflow
On Thu, Aug 10, 2017 at 01:49:24PM -0700, Junio C Hamano wrote: > The lower 4-byte of moff (before incrementing it with msize) were > already encoded to the output stream before this hunk. Shouldn't > we be checking if moff would fit in uint32_t _before_ that happens? moff is otherwise only decremented or assigned with an offset generated by create_delta_index. These offsets are limited by 4GB. Any larger offets would be a programming bug - so qualify for just a "assert". > IOW, in a much earlier part of that "while (data < top)" loop, there > is a code that tries to find a match that gives us a large msize by > iterating over index->hash[], and it sets msize and moff as a potential > location that we may want to use. If moff is already big there, then > we shouldn't consider it a match because we cannot encode its location > using 4-byte anyway, no? Recovering from an incorrect moff would add more complex code for a case, that should not happen. A bailout would be sufficient. > Cutting it off at here by resetting msize to 0 might help the next > iteration (I didn't check, but is the effect of it is to corrupt the > "val" rolling checksum and make it unlikely that the hash > computation would not find a correct match?) but it somehow feels > like closing the barn door after the horse has already bolted... The current code produces incorrect deltas - its not just a checksum issue. By the way: Somebody interested in JGIT should also look at these two bugs: https://github.com/eclipse/jgit/blob/005e5feb4ecd08c4e4d141a38b9e7942accb3212/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/pack/DeltaEncoder.java copy would also encode beyond 4GB - producing truncated delta offset. https://github.com/eclipse/jgit/blob/005e5feb4ecd08c4e4d141a38b9e7942accb3212/org.eclipse.jgit/src/org/eclipse/jgit/internal/storage/pack/BinaryDelta.java apply uses int for decoding length values. Regards, Martin
Re: [PATCH 4/4] Fix delta offset overflow
Martin Koeglerwrites: > From: Martin Koegler > > Prevent generating delta offsets beyond 4G. > > Signed-off-by: Martin Koegler > --- > diff-delta.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/diff-delta.c b/diff-delta.c > index 3d5e1ef..633883e 100644 > --- a/diff-delta.c > +++ b/diff-delta.c > @@ -454,6 +454,9 @@ create_delta(const struct delta_index *index, > moff += msize; > msize = left; > > + if (moff > 0x) > + msize = 0; > + The lower 4-byte of moff (before incrementing it with msize) were already encoded to the output stream before this hunk. Shouldn't we be checking if moff would fit in uint32_t _before_ that happens? IOW, in a much earlier part of that "while (data < top)" loop, there is a code that tries to find a match that gives us a large msize by iterating over index->hash[], and it sets msize and moff as a potential location that we may want to use. If moff is already big there, then we shouldn't consider it a match because we cannot encode its location using 4-byte anyway, no? Cutting it off at here by resetting msize to 0 might help the next iteration (I didn't check, but is the effect of it is to corrupt the "val" rolling checksum and make it unlikely that the hash computation would not find a correct match?) but it somehow feels like closing the barn door after the horse has already bolted... > if (msize < 4096) { > int j; > val = 0;
[PATCH 4/4] Fix delta offset overflow
From: Martin KoeglerPrevent generating delta offsets beyond 4G. Signed-off-by: Martin Koegler --- diff-delta.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/diff-delta.c b/diff-delta.c index 3d5e1ef..633883e 100644 --- a/diff-delta.c +++ b/diff-delta.c @@ -454,6 +454,9 @@ create_delta(const struct delta_index *index, moff += msize; msize = left; + if (moff > 0x) + msize = 0; + if (msize < 4096) { int j; val = 0; -- 2.1.4