Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

Warren Young Thu, 23 Feb 2017 16:02:39 -0800

On Feb 23, 2017, at 10:50 AM, Marc Simpson <m...@0branch.com> wrote:
> 
> This may be of interest to some here, especially in light of previous
> SHA-1 related discussions on list:
> 
>  https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html


Before I respond, first know that I respond out of concern for Fossil.  I’m a 
staunch Fossil defender, and I’m on the record doing so, many times.  My 
motivation in laying out these criticisms is that I want Fossil to continue to 
be worthy of that defense going forward.

Second, there will be those who say we’ve covered all of this already, multiple 
times.  I know, I was there.  But now we have new data.  Before, this sort of 
attack was theoretical only.  Now it’s not only proven possible, it is already 
within the ROI budget for certain specialized attacks; attacks only get cheaper 
over time.

The new data is not only this news from Google and its research partners.  The 
resulting discussion on Hacker News also made me aware of a way an attacker 
could use this new attack against Fossil despite the fact that this is “just” a 
collision, rather than a way to generate second-preimages inexpensively:

    https://news.ycombinator.com/item?id=13715887

This thus gives me an answer to drh’s challenge to me in one of the prior 
threads on this subject:

    https://goo.gl/2tzdOi

Executive summary for those who don’t want to click either of the above links: 
Challenge: “What can you do with this attack?”  Response: “Replace a good 
checkin during a clone/sync shipping that good checkin to another Fossil 
instance.”  After the attack, the repos do not all contain the same data, but 
they will agree that they’re in sync if you ask them to check.
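
To make that last point concrete, here is a toy sketch in Python.  It is not 
Fossil's actual sync protocol.  It assumes only that two peers decide whether 
they are in sync by comparing artifact IDs, and it stands in for a broken hash 
by truncating SHA-1 to 16 bits so that a collision is cheap to find.  The names 
weak_id, find_collision, and in_sync are made up for this illustration.

```python
import hashlib
from itertools import count


def weak_id(data: bytes) -> str:
    # Stand-in for a broken hash (assumption for this demo only):
    # SHA-1 truncated to 16 bits, so a collision is cheap to find.
    return hashlib.sha1(data).hexdigest()[:4]


def find_collision():
    # Birthday search: hash distinct inputs until two share an ID.
    seen = {}
    for i in count():
        data = b"checkin %d" % i
        h = weak_id(data)
        if h in seen:
            return seen[h], data
        seen[h] = data


def in_sync(repo_a: dict, repo_b: dict) -> bool:
    # Toy sync check: peers compare artifact IDs only, never content.
    return repo_a.keys() == repo_b.keys()


good, evil = find_collision()
upstream = {weak_id(good): good}   # repo holding the good checkin
victim = {weak_id(evil): evil}     # clone that was fed the evil one

print("IDs agree:     ", in_sync(upstream, victim))
print("contents agree:", upstream == victim)
```

The two repos report the same set of IDs while holding different bytes, which 
is the situation described above.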

Previous threads have pointed out that you need to fiddle not only with the 
SHA1 but also with an MD5 and some other kind of non-cryptographic checksum, 
and still have both the evil and useful checkins be working C code.  All of 
that is doable.  A motivated attacker could probably do all of that 
computationally in about a second on modern hardware.  That means these other 
mechanisms add essentially nothing to Fossil’s resistance to sync stream 
tampering.

The classic solution for Byzantine Fault Tolerance in the face of 1 possible 
traitorous general — or an untrustworthy messenger, as with the MITM attack — 
is to have at least 3 replicas, but that only works when the algorithm used to 
achieve consensus among the loyal generals is trustworthy.

This paper is on-point:

    http://zoo.cs.yale.edu/classes/cs426/2012/bib/castro99practical.pdf

Quoting from page 2, “We also assume that the adversary…[is] computationally 
bound so that (with very high probability) it is unable to subvert the 
cryptographic techniques mentioned above.”  We call that “key assumption 
violated” where I come from.  (Check the authors list, by the way.  Yes, *that* 
Liskov.)

According to another post on the Hacker News page linked above, this attack 
cost about $100k in GPGPU resources.  There must be Fossil-hosted projects 
worth that much to attack today, and after a bad actor pulls that one off, 
they’ve paid for the hardware, so a second attack costs only power and cooling.

Given the up-front costs, a bit more to mount a MITM attack doesn’t seem 
infeasible.

Consider also that this attack is “free” to attackers who already have access 
to a pool of compromised PCs, many of which will have powerful GPUs.

Today, I see the following defenses to this problem:

1. “fossil diff --from known-good-release” before electing to use binaries 
built from a given repo you don’t trust implicitly.  (And I hope this news has 
shortened your list of such repos!)

2. Modify all drive-by patches to foil pre-generated collisions.  (Presumably 
you trust those with checkin privileges on the repo.)

3. Put any attackable Fossil repos behind a TLS proxy with a strong cert (i.e. 
not self-signed, and certainly not SHA-1 hashed!) that enforces TLS access, as 
in my HOWTO:

    https://goo.gl/USybpW

TLS proxying prevents the MITM attack, but look at the cost.  I’m happy to pay 
it for my public Fossil repos, but do we really want to make that kind of thing 
a prerequisite for all Fossil users once the cost of this kind of attack drops 
to script-kiddie levels?  That defeats a large chunk of the Fossil value 
proposition, namely that it’s easy to set up.

Fossil’s “Redirect to HTTPS on the Login page” setting (Admin > Access) doesn’t 
solve this problem, since it still allows plaintext clones.  This news shows 
that we need to protect the clone/sync stream now as well, not just the login 
exchange.

Historically, all widely-deployed hash functions have had a finite lifetime.  
SHA-1 has had a good, long run:

    http://valerieaurora.org/hash.html

The PHC scheme would allow Fossil to migrate to something stronger in a 
backwards-compatible fashion:

   https://github.com/P-H-C/phc-string-format/blob/master/phc-sf-spec.md

That is, if the hash argument in the F, P, and Q cards is not 40 characters and 
it has a suitable prefix, it’s a new-style hash, else it’s a legacy SHA-1 hash.

(I’ve previously suggested Modular Crypt Format for this, but PHC has some nice 
properties over MCF.  See the link.)
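
A minimal sketch of that dispatch rule in Python, under loud assumptions: the 
"$sha3-256$<hex>" spelling below is a hypothetical PHC-style prefix invented 
for illustration, not a format Fossil defines, and classify_hash is a made-up 
helper name.  Real PHC strings carry Base64 rather than hex; hex is used here 
only to stay close to how card hashes look today.

```python
import re

# Legacy card argument: exactly 40 lowercase hex digits (SHA-1).
LEGACY_SHA1 = re.compile(r"[0-9a-f]{40}")

# Hypothetical new-style argument: "$<algorithm-id>$<hex digest>".
# The id charset follows the PHC spec's rule for function identifiers.
PREFIXED = re.compile(r"\$([a-z0-9-]{1,32})\$([0-9a-f]+)")


def classify_hash(arg: str) -> tuple:
    """Return (algorithm, digest) for a card's hash argument."""
    if LEGACY_SHA1.fullmatch(arg):
        return ("sha1", arg)
    m = PREFIXED.fullmatch(arg)
    if m:
        return (m.group(1), m.group(2))
    # Design choice in this sketch: reject anything else rather than
    # silently treating it as a legacy hash.
    raise ValueError("unrecognized hash argument: %r" % arg)


print(classify_hash("a" * 40))
print(classify_hash("$sha3-256$" + "ab" * 32))
```

Old artifacts keep parsing exactly as before, and only software that meets a 
prefixed argument needs to know about the new algorithm.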

There could also be a re-write mode, where the hashes are all recomputed.  This 
could only be used for repos where all of the people who have cloned from it 
can be induced to re-clone, but that probably covers the vast majority of 
Fossil users.  I believe this is one of those quiet majority kinds of things: 
you never hear from the majority of private Fossil users.

For the rest, the ability to migrate forward would be sufficient, since by the 
time the hashes on ancient checkins have been broken, the incentive to attack 
those checkins has almost certainly passed.

We might not have to migrate forward more than once anyway.  Quoting Bruce 
Schneier:

    “…brute-force attacks against 256-bit keys will be infeasible
     until computers are built from something other than matter
     and occupy something other than space.”

I think Fossil is in a much better position to do this sort of migration than, 
say, Git, due to its semi-centralized nature.  Massively depended-upon services 
like GitHub tie Git’s hands, because they can’t move forward until the whole 
ecosystem has updated software that can understand the new hashes.  In the 
Fossil world, we can migrate one repo at a time on a schedule sensible to each 
repo’s administrator.

I think all that’s wanted here is for someone to want to do this.

(Before you ask me to do it, first ask yourself if you want me in charge of 
your data integrity mechanisms.  I care about this problem, but I’m no domain 
expert, and I’m pretty sure that isn’t just the Dunning–Kruger effect talking.  
There must be better choices.)
_______________________________________________
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users
