Re: [PATCH 0/2] Re-integrate sha1dc

2017-03-17 Thread Jeff King
On Thu, Mar 16, 2017 at 10:22:12PM -0700, Junio C Hamano wrote:

> My .travis.yml suggestion was about testing with SHA1DC in
> preparation for making it the default.  That would give us another
> incentive to keep an eye on its performance, too, before we make it
> the default in Makefile, at which time the forced selection in the
> travis configuration can be removed.

Yeah, I'm not opposed to that, though it requires somebody actually
paying attention to the timings differences. I'm not sure we can do that
automatically on Travis.

-Peff


Re: [PATCH 0/2] Re-integrate sha1dc

2017-03-16 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Mar 16, 2017 at 03:23:59PM -0700, Junio C Hamano wrote:
>
>> I am wondering if we should queue another one for .travis.yml on top
>> to force use of USE_SHA1DC=YesPlease during the tests.  I expect
>> that we'd be encouraging its use for ordinary users without any
>> specific needs in the release notes in 2.13 release.
>
> I don't think it would buy us much. There's not really any way for this
> build to interact with the rest of the code in any interesting way, so
> either it works as a SHA-1 implementation or it doesn't. If you just
> want it exercised, I'll say that it's powering all of github.com right
> now.
>
> I did wonder if we should ship with it as the default (instead of
> openssl). It's definitely slower, but maybe widespread safety is a good
> thing. OTOH, I think we have a fair bit of time before we see any
> real-life collisions, just given the time and expense of generating
> them.

My .travis.yml suggestion was about testing with SHA1DC in
preparation for making it the default.  That would give us another
incentive to keep an eye on its performance, too, before we make it
the default in Makefile, at which time the forced selection in the
travis configuration can be removed.

Thanks.




Re: [PATCH 0/2] Re-integrate sha1dc

2017-03-16 Thread Jeff King
On Thu, Mar 16, 2017 at 03:23:59PM -0700, Junio C Hamano wrote:

> I am wondering if we should queue another one for .travis.yml on top
> to force use of USE_SHA1DC=YesPlease during the tests.  I expect
> that we'd be encouraging its use for ordinary users without any
> specific needs in the release notes in 2.13 release.

I don't think it would buy us much. There's not really any way for this
build to interact with the rest of the code in any interesting way, so
either it works as a SHA-1 implementation or it doesn't. If you just
want it exercised, I'll say that it's powering all of github.com right
now.

I did wonder if we should ship with it as the default (instead of
openssl). It's definitely slower, but maybe widespread safety is a good
thing. OTOH, I think we have a fair bit of time before we see any
real-life collisions, just given the time and expense of generating
them.

-Peff


Re: [PATCH 0/2] Re-integrate sha1dc

2017-03-16 Thread Linus Torvalds
On Thu, Mar 16, 2017 at 3:04 PM, Jeff King  wrote:
>
> There are a few things I think are worth changing. The die() message
> should mention the sha1 we computed. That will be a big help if an old
> version of git tries to unknowingly push a colliding object to a newer
> version. The user will see "collision on sha1 1234.." which gives them a
> starting point to figure out where they got the bad object from.
>
> And to make that work, we have to disable the safe_hash feature (which
> intentionally corrupts a colliding sha1). We _could_ rip it out
> entirely, but since it only kicks in when we see a collision, I doubt
> it's impacting anything.
>
> I also updated the timings in my commit message, and added a basic test.

No complaints about your version.

   Linus


Re: [PATCH 0/2] Re-integrate sha1dc

2017-03-16 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Mar 16, 2017 at 06:04:56PM -0400, Jeff King wrote:
>
>> So here's my version. It's on top of the hash.h tweak, as well.
>> 
>>   [1/5]: add collision-detecting sha1 implementation
>>   [2/5]: sha1dc: adjust header includes for git
>>   [3/5]: sha1dc: disable safe_hash feature
>>   [4/5]: Makefile: add USE_SHA1DC knob
>>   [5/5]: t0013: add a basic sha1 collision detection test
>> [...]
>>  t/t0013/shattered-1.pdf |  Bin 0 -> 422435 bytes

A 420k binary blob has become ~500k base85 binary patch, which is
larger than 100k.

> So obviously I had the same 100K problem you did on the first patch, but
> the fifth one also won't make it to the list. You can pull the whole
> thing from:
>
>   https://github.com/peff/git.git jk/sha1dc

Thanks.  

For today's integration, I have the one from Linus only because it
came earlier and today's integration cycle was already running.  I
agree with this series that disables the safe-hash thing.

I am wondering if we should queue another one for .travis.yml on top
to force use of USE_SHA1DC=YesPlease during the tests.  I expect
that we'd be encouraging its use for ordinary users without any
specific needs in the release notes in 2.13 release.




Re: [PATCH 0/2] Re-integrate sha1dc

2017-03-16 Thread Jeff King
On Thu, Mar 16, 2017 at 06:04:56PM -0400, Jeff King wrote:

> So here's my version. It's on top of the hash.h tweak, as well.
> 
>   [1/5]: add collision-detecting sha1 implementation
>   [2/5]: sha1dc: adjust header includes for git
>   [3/5]: sha1dc: disable safe_hash feature
>   [4/5]: Makefile: add USE_SHA1DC knob
>   [5/5]: t0013: add a basic sha1 collision detection test
> [...]
>  t/t0013/shattered-1.pdf |  Bin 0 -> 422435 bytes

So obviously I had the same 100K problem you did on the first patch, but
the fifth one also won't make it to the list. You can pull the whole
thing from:

  https://github.com/peff/git.git jk/sha1dc

-Peff


Re: [PATCH 0/2] Re-integrate sha1dc

2017-03-16 Thread Jeff King
On Thu, Mar 16, 2017 at 01:24:02PM -0700, Linus Torvalds wrote:

> I suspect the first patch will not make it to the list since it's over 
> 100kB in size, but oh well.. Junio and Jeff will see it.

Yep, it didn't make it, but I got it.

> It "WorksForMe(tm)" and the integration patches are now fairly trivial, 
> since upstream already did the dieting and some of the semantic changes to 
> gits more traditional C code.

There are a few things I think are worth changing. The die() message
should mention the sha1 we computed. That will be a big help if an old
version of git tries to unknowingly push a colliding object to a newer
version. The user will see "collision on sha1 1234.." which gives them a
starting point to figure out where they got the bad object from.

And to make that work, we have to disable the safe_hash feature (which
intentionally corrupts a colliding sha1). We _could_ rip it out
entirely, but since it only kicks in when we see a collision, I doubt
it's impacting anything.

I also updated the timings in my commit message, and added a basic test.

> I did leave the C++ wrapper lines that the sha1dc header files have grown 
> in the meantime, I debated removing them but felt that "closer to 
> upstream" was worth it.

Yeah, I independently made the same decision.

So here's my version. It's on top of the hash.h tweak, as well.

  [1/5]: add collision-detecting sha1 implementation
  [2/5]: sha1dc: adjust header includes for git
  [3/5]: sha1dc: disable safe_hash feature
  [4/5]: Makefile: add USE_SHA1DC knob
  [5/5]: t0013: add a basic sha1 collision detection test

 Makefile|   11 +
 hash.h  |2 +
 sha1dc/LICENSE.txt  |   30 +
 sha1dc/sha1.c   | 1808 +++
 sha1dc/sha1.h   |  122 
 sha1dc/ubc_check.c  |  363 ++
 sha1dc/ubc_check.h  |   44 ++
 t/t0013-sha1dc.sh   |   19 +
 t/t0013/shattered-1.pdf |  Bin 0 -> 422435 bytes
 9 files changed, 2399 insertions(+)
 create mode 100644 sha1dc/LICENSE.txt
 create mode 100644 sha1dc/sha1.c
 create mode 100644 sha1dc/sha1.h
 create mode 100644 sha1dc/ubc_check.c
 create mode 100644 sha1dc/ubc_check.h
 create mode 100755 t/t0013-sha1dc.sh
 create mode 100644 t/t0013/shattered-1.pdf

-Peff