Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-27 Thread K. Fossil user
Hello,
I can't believe that a so-called community manager (Warren) could say these
things ... :D
I assume that most of you have read Warren's stuff?

1/ Hey Warren, you know nothing about security software, so don't speak about
it. :-|

2/ CentOS does not need any advice from you: they know their stuff.
a) The SHA algorithm is not an issue when it is used more as an integrity
check, like MD5 for files, than anything else.
b) You are like a kid who focuses on ONE thing when people like me are
focusing on MANY things, even when it comes to ONE subject.
c) CentOS uses many tools to know who is sending what. You could force people
to follow some procedures before they are allowed to put a given piece of
information onto a specific server:
Say, you can ask the guy to connect over SSH first, and only after that can
he work: no GPG, nothing else needed... I don't even talk about port
knocking...
So they can rely on ONLY SHA1 if they want; that is not an issue at all!
So YOUR sentence "it'll be a problem if git.centos.org is still relying on
SHA1 hashes" IS NOT RELEVANT.
3/ > "Point #2 is also questionable.  Torvalds is assuming that any collision
attack on a Git..."
Really? As I understand it, Fossil's points (security etc.) are not seriously
questionable [for you] ... but Linus's point is? Are you kidding?
MY assumption is that Linus would like a guy such as a Fossil team member to
understand that it is NOT that hard to make some changes to the SHA
algorithm...

In other words:

4/ When you say that plans are easy but execution is hard ... you miss the
point.
a) Plans are necessary for a serious project.
b) Once the plans are done, execution is not that hard: it may take time, but
it is not that hard.
c) Appropriate tools are needed to achieve the plans on time and to
expectations...
A serious project = plans! OK?

If for you SHA1 is not a serious project, then I would like you to explain
that to us... :D

5/ All this said: I've noticed that there are too many details in Linus
Torvalds' post. I suppose he was thinking about guys like you, Warren? You
know, the guy who doesn't get it. :-)

Regards

K.

From: Warren Young <war...@etr-usa.com>
To: Fossil SCM user's discussion <fossil-users@lists.fossil-scm.org>
Sent: Monday, 27 February 2017, 18:10
Subject: Re: [fossil-users] Google Security Blog: Announcing the first SHA1
collision

On Feb 26, 2017, at 2:58 PM, Stephan Beal <sgb...@googlemail.com> wrote:
> 
> just FYI, Linus' own words on the topic, posted yesterday:
> 
> https://plus.google.com/u/0/+LinusTorvalds/posts/7tp2gYWQugL

Point #1 misses the fact that people *do* rely on Git hashes for security.  
Maybe they’re not “supposed” to, but they do.

For example, the CentOS sources are published through Git these days, rather 
than as a pile of potentially-signed SRPM files.  This means the only assurance 
you have that the content checked into Git hasn’t been tampered with is that 
the hashes are consistent.

(I randomly inspected one of their repos, and it doesn’t use GPG signed 
commits, so the hashes are all you’ve got.)

This is adequate security today, but once bad actors can do these SHA1 attacks 
inexpensively, it’ll be a problem if git.centos.org is still relying on SHA1 
hashes.


Point #2 is also questionable.  Torvalds is assuming that any collision attack 
on a Git checkin will be detectable because of the random noise you have to 
insert into both instances to make them match.

Except that you don’t have to do it with random noise.

Thought experiment time: Given that it is now mature technology to be able to 
react to a useful subset of the spoken English language either over a crappy 
cell phone connection or via shouting at a microphone in a canister in the next 
room, complete with query chaining (e.g. Google Now, Amazon Echo, etc.) how 
much more difficult is it to write an “AI” that can automatically generate 
sane-looking but harmless C code in the middle of a pile of other C code to 
fuzz its data bits?

I have no training in AI type stuff, but I think I could do a pretty decent job 
just by feeding a large subset of GitHub into a Markov chain model.  Now 
imagine what someone with training, motivation, and resources could do.

Or, don't imagine.  Just go read the Microsoft Research paper on DeepCoder:

  https://news.ycombinator.com/item?id=13720580

I suspect there are parts of the Linux kernel sources that are 
indistinguishable from the output of a Markov chain model. :)  *Someone* 
allowed those patches to be checked in.


As for his point #3, he just offers it without support.  He says there’s a 
plan.  Well, we have a plan, too.  Plans are easy.  Execution is the hard part.

Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-27 Thread bch
On 2/27/17, Warren Young  wrote:
> On Feb 26, 2017, at 2:58 PM, Stephan Beal  wrote:
>>
>> just FYI, Linus' own words on the topic, posted yesterday:
>>
>> https://plus.google.com/u/0/+LinusTorvalds/posts/7tp2gYWQugL
>
> Point #1 misses the fact that people *do* rely on Git hashes for security.
> Maybe they’re not “supposed” to, but they do.
>
> For example, the CentOS sources are published through Git these days, rather
> than as a pile of potentially-signed SRPM files.  This means the only
> assurance you have that the content checked into Git hasn’t been tampered
> with is that the hashes are consistent.

This is a long-standing peeve of mine, which is: a repository is not a
release. If this is how CentOS does distribution, I'd argue they have
more of a systemic problem than a technical one.

-bch

>
> (I randomly inspected one of their repos, and it doesn’t use GPG signed
> commits, so the hashes are all you’ve got.)
>
> This is adequate security today, but once bad actors can do these SHA1
> attacks inexpensively, it’ll be a problem if git.centos.org is still relying
> on SHA1 hashes.
>
>
> Point #2 is also questionable.  Torvalds is assuming that any collision
> attack on a Git checkin will be detectable because of the random noise you
> have to insert into both instances to make them match.
>
> Except that you don’t have to do it with random noise.
>
> Thought experiment time: Given that it is now mature technology to be able
> to react to a useful subset of the spoken English language either over a
> crappy cell phone connection or via shouting at a microphone in a canister
> in the next room, complete with query chaining (e.g. Google Now, Amazon
> Echo, etc.) how much more difficult is it to write an “AI” that can
> automatically generate sane-looking but harmless C code in the middle of a
> pile of other C code to fuzz its data bits?
>
> I have no training in AI type stuff, but I think I could do a pretty decent
> job just by feeding a large subset of GitHub into a Markov chain model.  Now
> imagine what someone with training, motivation, and resources could do.
>
> Or, don't imagine.  Just go read the Microsoft Research paper on DeepCoder:
>
>https://news.ycombinator.com/item?id=13720580
>
> I suspect there are parts of the Linux kernel sources that are
> indistinguishable from the output of a Markov chain model. :)  *Someone*
> allowed those patches to be checked in.
>
>
> As for his point #3, he just offers it without support.  He says there’s a
> plan.  Well, we have a plan, too.  Plans are easy.  Execution is the hard
> part.


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-27 Thread Warren Young
On Feb 26, 2017, at 2:58 PM, Stephan Beal  wrote:
> 
> just FYI, Linus' own words on the topic, posted yesterday:
> 
> https://plus.google.com/u/0/+LinusTorvalds/posts/7tp2gYWQugL

Point #1 misses the fact that people *do* rely on Git hashes for security.  
Maybe they’re not “supposed” to, but they do.

For example, the CentOS sources are published through Git these days, rather 
than as a pile of potentially-signed SRPM files.  This means the only assurance 
you have that the content checked into Git hasn’t been tampered with is that 
the hashes are consistent.

(I randomly inspected one of their repos, and it doesn’t use GPG signed 
commits, so the hashes are all you’ve got.)

This is adequate security today, but once bad actors can do these SHA1 attacks 
inexpensively, it’ll be a problem if git.centos.org is still relying on SHA1 
hashes.


Point #2 is also questionable.  Torvalds is assuming that any collision attack 
on a Git checkin will be detectable because of the random noise you have to 
insert into both instances to make them match.

Except that you don’t have to do it with random noise.

Thought experiment time: Given that it is now mature technology to be able to 
react to a useful subset of the spoken English language either over a crappy 
cell phone connection or via shouting at a microphone in a canister in the next 
room, complete with query chaining (e.g. Google Now, Amazon Echo, etc.) how 
much more difficult is it to write an “AI” that can automatically generate 
sane-looking but harmless C code in the middle of a pile of other C code to 
fuzz its data bits?

I have no training in AI type stuff, but I think I could do a pretty decent job 
just by feeding a large subset of GitHub into a Markov chain model.  Now 
imagine what someone with training, motivation, and resources could do.
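
As a rough illustration of that thought experiment (a hedged toy sketch, not
anything anyone here has proposed concretely), a token-bigram Markov model
over a corpus of C tokens already emits superficially plausible code:

    import random
    from collections import defaultdict

    def train(tokens):
        """Map each token to the list of tokens observed right after it."""
        model = defaultdict(list)
        for cur, nxt in zip(tokens, tokens[1:]):
            model[cur].append(nxt)
        return model

    def generate(model, start, length=40):
        """Random-walk the bigram table to emit plausible-looking code."""
        out = [start]
        for _ in range(length):
            choices = model.get(out[-1])
            if not choices:
                break
            out.append(random.choice(choices))
        return " ".join(out)

    # Tiny stand-in for "a large subset of GitHub".
    corpus = ("if ( p == NULL ) return 0 ; "
              "for ( i = 0 ; i < n ; i ++ ) sum += a [ i ] ;").split()
    print(generate(train(corpus), "if"))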

Or, don't imagine.  Just go read the Microsoft Research paper on DeepCoder:

   https://news.ycombinator.com/item?id=13720580

I suspect there are parts of the Linux kernel sources that are 
indistinguishable from the output of a Markov chain model. :)  *Someone* 
allowed those patches to be checked in.


As for his point #3, he just offers it without support.  He says there’s a 
plan.  Well, we have a plan, too.  Plans are easy.  Execution is the hard part.


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-27 Thread Warren Young
On Feb 26, 2017, at 2:34 PM, Richard Hipp  wrote:
> 
> On 2/23/17, Warren Young  wrote:
>> 
>> I think Fossil is in a much better position to do this sort of migration
>> than, say, Git, due to its semi-centralized nature.
> 
> it is reasonable to argue that Git(Hub) is more centralized than
> Fossil.

Yes, but that’s my point: because so many people use Git in conjunction with 
some large service with many users — not just GitHub, but also BitBucket, 
visualstudio.com, etc. — they can’t as easily change a hash algorithm like this 
because cutting off “only” some small percentage of users means annoying many 
thousands of users.

Whereas with Fossil, few Fossil instances host many repositories, so any
decision to upgrade to a better hash algorithm affects few enough people that
in many cases they can all be contacted personally to coordinate the upgrade.

GitHub may have the power to declare a flag day[1] but imagine the hue and cry 
if they tried!

Thus, Git is going to have a very hard time moving away from SHA1.


[1]: https://en.wikipedia.org/wiki/Flag_day_(computing)


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-27 Thread Warren Young
On Feb 26, 2017, at 2:04 PM, Ron W  wrote:
> 
> From: Warren Young 
> 
> > The PHC scheme would allow Fossil to migrate to something stronger in a
> > backwards-compatible fashion:
> 
> The PHC scheme is conceptually good, but is not friendly for use by command 
> line tools

I wasn’t suggesting that the end user of Fossil type PHC-formatted hashes.  
That is an implementation detail which would allow Fossil to migrate to 
something better than SHA1 today, and then to something better than that 
tomorrow, all in a forwards-compatible way.

If I say “fossil up 123abc” in this new world, that should match *any* hash 
whose first 6 digits in hex form are “123abc”, no matter the hash algorithm.  
If it matches one artifact hashed with SHA1 and another hashed with SHA2-256, 
it’s still a “collision” in the Fossil sense, and Fossil will still demand more 
digits to disambiguate it, just as it does today.
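
A hedged sketch of that matching rule (the table layout and names here are my
own assumptions, not Fossil's internals):

    # Hypothetical artifact table: (algorithm tag, hex digest) pairs.
    # The digests are illustrative, not real hashes.
    artifacts = [
        ("sha1",     "123abc7d2e" + "00" * 15),
        ("sha2-256", "123abcf00d" + "00" * 27),
    ]

    def resolve(prefix):
        """Match a typed prefix against every hash, whatever the algorithm;
        a cross-algorithm prefix clash still demands more digits."""
        hits = [(algo, h) for algo, h in artifacts if h.startswith(prefix)]
        if not hits:
            raise KeyError("no such artifact")
        if len(hits) > 1:
            raise KeyError("ambiguous prefix; type more digits")
        return hits[0]

    print(resolve("123abc7"))    # unique: the SHA1 artifact
    # resolve("123abc") raises: ambiguous across two algorithms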

If that means we have to modify PHC to encode in hex, that’s fine, because 
Base64 is not the most interesting part of PHC.  The most interesting parts are 
how it encodes the algorithm used, so that the hash is self-documenting.  It 
means you don’t have to memorize (or encode!) imperfect heuristics like “40 hex 
digits == SHA1” because 40 hex digits can also be RIPEMD-160 or SHA3-160.

(The latter is nonstandard, but it’s possible to configure SHA3 to produce it.  
 I assume there are those who have done this on purpose in software where the 
developers want to use a stronger algorithm but can’t resize some fixed-width 
hash field.)


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-26 Thread K. Fossil user
Hello,

Does this mean that it is not so hard to adapt the SHA algorithm to a better
one? :D

DRH suspected that it would be hard :D :D :D
Of course I don't agree with DRH; I will never agree with him about security
discussions either ... :-|
Thanks to "sgbeal". :-)
Best Regards

K.

From: Stephan Beal <sgb...@googlemail.com>
To: Fossil SCM user's discussion <fossil-users@lists.fossil-scm.org>
Sent: Sunday, 26 February 2017, 21:58
Subject: Re: [fossil-users] Google Security Blog: Announcing the first SHA1
collision

On Sun, Feb 26, 2017 at 10:34 PM, Richard Hipp <d...@sqlite.org> wrote:

And in any event, I don't think centralization is a factor here.
Fossil is better positioned than Git or Mercurial to transition to a
different hash algorithm because the Fossil implementation uses a
relational database as its backing store.  Git and Hg, in contrast,
both use bespoke pile-of-files database formats which, I suspect, will
be more difficult to adapt.


just FYI, Linus' own words on the topic, posted yesterday:
https://plus.google.com/u/0/+LinusTorvalds/posts/7tp2gYWQugL
-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of those
who insist on a perfect world, freedom will have to do." -- Bigby Wolf



Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-26 Thread Stephan Beal
On Sun, Feb 26, 2017 at 10:34 PM, Richard Hipp  wrote:

> And in any event, I don't think centralization is a factor here.
> Fossil is better positioned than Git or Mercurial to transition to a
> different hash algorithm because the Fossil implementation uses a
> relational database as its backing store.  Git and Hg, in contrast,
> both use bespoke pile-of-files database formats which, I suspect, will
> be more difficult to adapt.
>

just FYI, Linus' own words on the topic, posted yesterday:

https://plus.google.com/u/0/+LinusTorvalds/posts/7tp2gYWQugL

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do." -- Bigby Wolf


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-26 Thread Richard Hipp
On 2/23/17, Warren Young  wrote:
>
> I think Fossil is in a much better position to do this sort of migration
> than, say, Git, due to its semi-centralized nature.

Though they are technically distinct, in the minds of many users Git
and GitHub are the same thing.  And GitHub is highly centralized.  So
it is reasonable to argue that Git(Hub) is more centralized than
Fossil.

And in any event, I don't think centralization is a factor here.
Fossil is better positioned than Git or Mercurial to transition to a
different hash algorithm because the Fossil implementation uses a
relational database as its backing store.  Git and Hg, in contrast,
both use bespoke pile-of-files database formats which, I suspect, will
be more difficult to adapt.

-- 
D. Richard Hipp
d...@sqlite.org


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-26 Thread Ron W
On Thu, Feb 23, 2017 at 11:23 PM,  wrote:
>
> Date: Fri, 24 Feb 2017 04:23:06 + (UTC)
> From: "K. Fossil user" 
> To: Fossil SCM user's discussion 
> Subject:
> 2/ semi?
>
> > « I think Fossil is in a much better position to do this sort of
> migration than, say, Git, due to its semi-centralized nature »
> This would convince people to use Git not Fossil ...
>
> Git is more secure than Fossil (the first reason to use a VCS). Git could be
> centralized or not. I am wondering if Fossil could be centralized... Now
> you've said that it is semi-centralized by NATURE.
>

git and Fossil are equally decentralized. Both are DVCSs.

The "semi-centralized nature" really refers to the git community coalescing
around huge repository services like GitHub.

Fossil can also be organised around repository services. chiselapp.com is
a dedicated Fossil repository service. There are some repository services,
like SourceForge, that offer several VCS options, including git and Fossil.

FYI, for most organizational purposes, projects tend to "revolve" around a
"central" master repository (or a central cluster of redundant master
repositories). This is equally true for both git and Fossil.

However, truly peer-to-peer relations between developer repositories can be
set up. This setup is basically the same as a central cluster, except that
each member of the cluster is used directly by members of the development
team.


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-26 Thread Ron W
On Thu, Feb 23, 2017 at 7:02 PM, <fossil-users-requ...@lists.fossil-scm.org>
wrote:
>
> Date: Thu, 23 Feb 2017 17:01:56 -0700
> From: Warren Young <war...@etr-usa.com>
> Subject: Re: [fossil-users] Google Security Blog: Announcing the first
> SHA1 collision
>
> The PHC scheme would allow Fossil to migrate to something stronger in a
> backwards-compatible fashion:
>
>https://github.com/P-H-C/phc-string-format/blob/master/phc-sf-spec.md
>
> That is, if the hash argument in the F, P, and Q cards is not 40
> characters and it has a suitable prefix, it’s a new-style hash, else it’s a
> legacy SHA-1 hash.
>

The PHC scheme is conceptually good, but is not friendly for use by command
line tools like Fossil or git. This is mostly because it uses $ as its
field introducer, so it will need quoting. Also, the Base64 encoding relies on
both upper- and lowercase letters, so it is more prone to typographical errors.

I suggest a simpler scheme that provides the benefits of PHC in a more
command line friendly way.

Use ^ as the prefix and data introducers. The prefix would have a 1
character field for the artifact type, followed by the nonce. Then a second
^ separates the prefix from the data, which will be the hash. Base64
encoding would make the hash string use fewer characters, while continuing
to use hexadecimal encoding would be less prone to typographical errors.

Example: ^m1234567890^ab4c90e2.

m is the artifact type. Suggest m for manifest, c for control, etc.
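
A hedged sketch of a parser for that grammar, inferred from the example above
(the exact character classes are my assumption):

    import re

    # ^<type><nonce>^<hexhash> -- type is one letter, nonce alphanumeric.
    PATTERN = re.compile(r"\^([a-z])([0-9A-Za-z]+)\^([0-9a-f]+)$")

    def parse(name):
        m = PATTERN.match(name)
        if m is None:
            raise ValueError("not a prefixed hash; treat as legacy SHA1")
        return m.groups()   # (artifact_type, nonce, hexhash)

    print(parse("^m1234567890^ab4c90e2"))  # ('m', '1234567890', 'ab4c90e2')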


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-26 Thread Ron W
On Fri, Feb 24, 2017 at 5:54 PM, <fossil-users-requ...@lists.fossil-scm.org>
wrote:
>
> Date: Fri, 24 Feb 2017 20:38:48 +0100
> From: Joerg Sonnenberger <jo...@bec.de>
> Subject: Re: [fossil-users] Google Security Blog: Announcing the first
> SHA1 collision
>
> On Fri, Feb 24, 2017 at 10:32:20AM -0800, bch wrote:
> > Are you saying:
> >
> > contenthash = sha256(content);
> > identifier = sha256(contenthash . blobtype . contentsize . content);
> >
> > "blobtype" == cardtype ?
>
> Yes.
>

Wouldn't it be artifact type? (manifest, control, etc.)  rather than card
type?


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-25 Thread Joerg Sonnenberger
On Fri, Feb 24, 2017 at 03:54:56PM -0700, Warren Young wrote:
> On Feb 24, 2017, at 10:37 AM, Joerg Sonnenberger  wrote:
> > 
> > On Thu, Feb 23, 2017 at 05:01:56PM -0700, Warren Young wrote:
> >> But now we have new data.
> >> Before, this sort of attack was theoretical only.  Now it’s not only
> >> proven possible, it is already within the ROI budget for certain
> >> specialized attacks; attacks only get cheaper over time.
> > 
> > Actually, the only thing that changed is that the attack now has a
> > decent price tag.
> 
> We also know it can happen before most of our respective careers are
> over, so it isn’t something we can boot down the road to the next generation.

The attack itself hasn't added anything significant to the SHA1 attack
knowledge AFAICT. As such, it is a good implementation of what has been
known already. Seriously, the only real surprise is the price tag.

> > Fossil had "hash collissions" issues for a long time
> 
> Really?  Fossil has a very nice status page showing what your current
> repository is doing in this regard: Admin > Reports > SHA1 Collisions.

Completely different thing.

> But this report only tells you about accidental collisions, whereas the
> SHAttered attack is about creating purposeful collisions.

Ever tried attaching an existing commit manifest to a bug report or
committing it locally and then rebuilding the repo?

> > The new stored blob should be:
> > - hash of the rest of the blob
> > - blob type
> > - content size
> > - content
> 
> What does the content size buy you?

Making it more difficult to use payload-after-random-data attacks that
exploit the block nature of many hash function constructions. This is
the mechanism that allows stripping aligned suffixes from both documents
without changing the status of the colliding hash.


> The SHAttered attack shows that if you’re only after a collision, you
> can maintain the file size while making your collision.

Creating random collisions like that is only of limited usefulness.
More interesting attacks require tighter control.

> If you’re trying to protect against preimage attacks with this
> modification, the content size isn’t an independent variable with
> respect to the content itself.  I think if you asked a cryptographer,
> they’d tell you it adds nothing to the robustness of the resulting hash.

I didn't claim it to be. It is useful data to have access to as early as
possible and it is a pesky consistency check, even if it doesn't provide
any additional (theoretical) security. But it does help with some classes
of attacks.

> I also don’t see what hashing the blob data twice gets you.  The hash
> value changes, but again not as an independent variable.

When you can create a hash collision for the inner hash, it doesn't
mean you automatically have a collision for the outer hash as well.
You almost certainly do not. At the same time, creating a collision of
the outer hash directly is much harder, as you need to also compensate
for the structure of the data to have a useful result. At the very least
it means that the attacker has to do a full hash recomputation in every
cycle and not just update the last block of a shared prefix.

> If you want to make things more difficult, you could throw a timestamp
> in there, as Git does.

What timestamp again? Like the useless size data you just argued
against?

> > can significantly cut down in the parsing time on rebuilds.
> 
> Why?  If you’re trying to find data of a particular type in the DB,
> doesn’t event.type tell you what you want to know without parsing cards?

Well, where does that table get populated from after the initial clone?

> What became of the idea of skipping rebuilds for very large repos, by
> the way?  Fossil rebuilds are strictly optional, aren’t they?

Not at all. You have to rebuild at least once after the clone.

Joerg


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-24 Thread Ross Berteig


On 2/23/2017 4:01 PM, Warren Young wrote:

The PHC scheme would allow Fossil to migrate to something stronger in a 
backwards-compatible fashion:

https://github.com/P-H-C/phc-string-format/blob/master/phc-sf-spec.md

That is, if the hash argument in the F, P, and Q cards is not 40 characters and 
it has a suitable prefix, it’s a new-style hash, else it’s a legacy SHA-1 hash.

(I’ve previously suggested Modular Crypt Format for this, but PHC has some nice 
properties over MCF.  See the link.)


Tl;dr: Don't forget about human factors when considering a change.

Should we decide to move to a new hash function, something like PHC is a 
decent approach for keeping track of hashes stored internally. But 
without some care, it is a usability nightmare, especially at the 
command line and in URLs (and wiki markup) where any "long enough" 
prefix of the hash serves to identify the target.


One way of keeping that ease of use would be to match user input against 
a prefix of just the "hash string". Since PHC specifies that hashes are 
Base64 encoded, they are unlikely to collide with any existing SHA1 
artifact ids, at least after a reasonable length.
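
A hedged sketch of that matching idea (assuming stored ids are either legacy
40-hex SHA1 strings or PHC-style "$<id>$...$<hash>" strings; the helper names
are mine):

    def hash_string(stored_id):
        """Extract the part a user would type: the final $-field of a
        PHC-formatted id, or the whole thing for a legacy 40-hex SHA1."""
        if stored_id.startswith("$"):
            return stored_id.rsplit("$", 1)[-1]
        return stored_id

    def matches(stored_id, user_prefix):
        return hash_string(stored_id).startswith(user_prefix)

    # Illustrative ids, not real hashes.
    print(matches("9c8e7a40d7e6535ac004b9cd3a8261ae34d1fcd0", "9c8e"))  # True
    print(matches("$sha3-256$salt123$AbC9xYz0etc", "AbC9"))             # True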


We would expect that most or all artifacts in a single repo would have 
the same $id and $parameters, so requiring the user to type them would 
be counter-productive. We should permit them, of course, to allow for 
explicitly naming a single artifact.


Would we hash with salt? I don't know. If we did, then the salt would 
need to remain constant for any particular artifact for the lifetime of 
that artifact in that repo (and its clones). The salt could be as simple 
as the blob type suggested by Joerg [email today 9:37am], or it could 
include something more like a nonce. Using the blob type (perhaps with a 
short nonce appended) would get the advantages noted by Joerg when blobs 
are ingested (during push, pull, or rebuild); specifically blobs that 
smell like manifests but are not can be salted so that they are not 
parsed as manifests when ingested.


Who gets to decide which hash should be used in a repo: Just Fossil's 
developers? The creator of a repo? The user of a repo? Regardless, I 
think we would agree that once a particular artifact is named in a 
manifest it cannot change hashes since that would require changing that 
manifest, which would change its hash, and so on. But perhaps the next 
checkin could use a different hash for some of its content, and to name 
its manifest. That would allow preservation of existing names for old 
artifacts alongside a new choice of hash functions for naming new 
artifacts.


Warren [email today 2:54pm] is right that there are long lead times 
between any change we make and its dispersal into the wide universe of 
official distros and personal users. That tends to imply that if we 
think the threat potential of SHA1 collisions is a concern on the five 
year horizon, we need to implement whatever change we decide on soon so 
that it is in widespread use *before* the threat is real.


All of that said, should we make a change?

I'm not sure. Switching to a new hash has a non-trivial cost. Storing it 
in the PHC style (or inventing our own hash type metadata trick) seems 
like the way to mitigate the least expensive part of that cost. The rest 
of the cost is in the myriad implementation details and in designing for 
best backward compatibility to reduce friction for the user with 100s of 
personal repos.


If we do make a change, I would resist the temptation to immediately 
rewrite the entire history to use the new hash. Certainly it would be 
possible to get a tree of all manifests and work through it replacing 
all SHA1 strings with the new hash. But that ignores all the other 
places that might have referred to a particular checkin by its SHA1 
hash, most obviously wiki markup in wiki pages and technotes but also 
all those communicates about a work in progress. "Hey Joe, what happened 
in checkin [123456]?" in an email or chat log would now be impossible to 
relate back to history.


Perhaps that could be mitigated by tagging each newly rewritten checkin 
with the SHA1 hash, possibly inventing a new kind of tag that can be 
matched by prefix for the purpose.


-- 
Ross Berteig                               r...@cheshireeng.com
Cheshire Engineering Corp.             http://www.CheshireEng.com/
+1 626 303 1602




Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-24 Thread Warren Young
On Feb 24, 2017, at 10:37 AM, Joerg Sonnenberger  wrote:
> 
> On Thu, Feb 23, 2017 at 05:01:56PM -0700, Warren Young wrote:
>> But now we have new data.
>> Before, this sort of attack was theoretical only.  Now it’s not only
>> proven possible, it is already within the ROI budget for certain
>> specialized attacks; attacks only get cheaper over time.
> 
> Actually, the only thing that changed is that the attack now has a
> decent price tag.

We also know it can happen before most of our respective careers are over, so 
it isn’t something we can boot down the road to the next generation.

In 5 years, I expect this attack to either be 10x faster or 10x cheaper, 
depending on the attacker’s needs.  Between distro lag for “stable” OSes and 
the long time some systems stay installed, untouched, a fix today *might* be 
widely distributed by then.

For example, Debian Jessie (the current stable version) is still shipping 1.29, 
which is about 2.5 years old now.  Stretch (the next version, due out sometime 
this year) will ship 1.37 (and that can’t change, because Stretch is 
hard-frozen now) which means a fix today will likely still take at least a few 
years to get into the *next* Debian after that.  And not all Jessie and Stretch 
installs will upgrade to whatever comes next, which could ship a fix made today.

And there you are, all 5 years spent waiting on a fix magically made today to 
get out to the installed base.

We should not wait until the sky is actually falling.

> Fossil had "hash collissions" issues for a long time

Really?  Fossil has a very nice status page showing what your current 
repository is doing in this regard: Admin > Reports > SHA1 Collisions.  Example:

https://sqlite.org/src/hash-collisions

This shows that for the SQLite code base, 8 hex digits is enough to uniquely 
identify any artifact currently in the DB.

But this report only tells you about accidental collisions, whereas the 
SHAttered attack is about creating purposeful collisions.

> The new stored blob should be:
> - hash of the rest of the blob
> - blob type
> - content size
> - content

What does the content size buy you?

The SHAttered attack shows that if you’re only after a collision, you can 
maintain the file size while making your collision.  They didn’t even have to 
shift the untouched bytes in the file, as shown by this simplistic diff demo:

https://news.ycombinator.com/item?id=13721633

If you’re trying to protect against preimage attacks with this modification, 
the content size isn’t an independent variable with respect to the content 
itself.  I think if you asked a cryptographer, they’d tell you it adds nothing 
to the robustness of the resulting hash.

I also don’t see what hashing the blob data twice gets you.  The hash value 
changes, but again not as an independent variable.

If you want to make things more difficult, you could throw a timestamp in 
there, as Git does.  Store it in plaintext, so it can act as a kind of 
primitive salt.  Better would be to include a random nonce, also stored in 
plaintext alongside the blob data.

So, identifier = sha256(contenthash . blobtype . timestamp . nonce)

The latter two are an either/and situation, though having both is probably 
overkill.
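
A hedged sketch of that salted construction (the field encoding is my own
choice; the point is only that timestamp and nonce are stored in plaintext
next to the blob and folded into the outer hash):

    import hashlib, os, time

    def salted_identifier(content, blobtype):
        """Hash the content first, then frame it with type, timestamp, and
        a random nonce; ts and nonce must be stored alongside the blob."""
        contenthash = hashlib.sha256(content).digest()
        ts = str(int(time.time())).encode()
        nonce = os.urandom(16)
        ident = hashlib.sha256(contenthash + blobtype + ts + nonce).hexdigest()
        return ident, ts, nonce

    ident, ts, nonce = salted_identifier(b"int main(void){return 0;}", b"f")
    print(ident)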

> Including the blob type makes it
> more robust

That I’ll agree with, because the blob type is an independent variable.  You 
could have the same block of text being stored as a wiki document, a ticket 
comment, or a file, and it would indeed be good if they hashed differently.

> can significantly cut down in the parsing time on rebuilds.

Why?  If you’re trying to find data of a particular type in the DB, doesn’t 
event.type tell you what you want to know without parsing cards?

What became of the idea of skipping rebuilds for very large repos, by the way?  
Fossil rebuilds are strictly optional, aren’t they?

What I’m asking is, if you have a Fossil repo like your experimental NetBSD 
one, it’s just slower to access if it isn’t rebuilt, right?  So, why couldn’t 
Fossil let you clone it without the rebuild to begin with, then you could 
schedule a rebuild to run over the weekend sometime?


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-24 Thread Joerg Sonnenberger
On Fri, Feb 24, 2017 at 10:32:20AM -0800, bch wrote:
> Are you saying:
> 
> contenthash = sha256(content);
> identifier = sha256(contenthash . blobtype . contentsize . content);
> 
> "blobtype" == cardtype ?

Yes.

Joerg


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-24 Thread bch
Are you saying:

contenthash = sha256(content);
identifier = sha256(contenthash . blobtype . contentsize . content);


"blobtype" == cardtype ?

-bch



On 2/24/17, Joerg Sonnenberger  wrote:
> On Thu, Feb 23, 2017 at 05:01:56PM -0700, Warren Young wrote:
>> Second, there will be those who say we’ve covered all of this already,
>> multiple times.  I know, I was there.  But now we have new data.
>> Before, this sort of attack was theoretical only.  Now it’s not only
>> proven possible, it is already within the ROI budget for certain
>> specialized attacks; attacks only get cheaper over time.
>
> Actually, the only thing that changed is that the attack now has a
> decent price tag. That's really not all that much, contrary to all the
> ranting going on right now.
>
> Fossil had "hash collisions" issues for a long time and I'm not even
> talking about these kinds of issues. It might be a good idea to change the
> storage format for new blobs and optionally provide a one time
> conversion option. But I don't have the time to implement that.
>
> The new stored blob should be:
> - hash of the rest of the blob
> - blob type
> - content size
> - content
>
> The idea is that the object id is the hash of everything. This should be
> much more resilient to preimage attacks based on the block structure of
> the hash function. I haven't had a chance to talk with a real
> cryptographer about this yet, though. Including the blob type makes it
> more robust and can significantly cut down in the parsing time on
> rebuilds. Including the content size is just another way of making
> meaningful preimage attacks somewhat harder.
>
> Joerg


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-24 Thread Joerg Sonnenberger
On Thu, Feb 23, 2017 at 05:01:56PM -0700, Warren Young wrote:
> Second, there will be those who say we’ve covered all of this already,
> multiple times.  I know, I was there.  But now we have new data.
> Before, this sort of attack was theoretical only.  Now it’s not only
> proven possible, it is already within the ROI budget for certain
> specialized attacks; attacks only get cheaper over time.

Actually, the only thing that changed is that the attack now has a
decent price tag. That's really not all that much, contrary to all the
ranting going on right now.

Fossil had "hash collisions" issues for a long time and I'm not even
talking about these kinds of issues. It might be a good idea to change the
storage format for new blobs and optionally provide a one time
conversion option. But I don't have the time to implement that.

The new stored blob should be:
- hash of the rest of the blob
- blob type
- content size
- content

The idea is that the object id is the hash of everything. This should be
much more resilient to preimage attacks based on the block structure of
the hash function. I haven't had a chance to talk with a real
cryptographer about this yet, though. Including the blob type makes it
more robust and can significantly cut down in the parsing time on
rebuilds. Including the content size is just another way of making
meaningful preimage attacks somewhat harder.
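
A hedged sketch of that blob format, matching the pseudocode in bch's reply
above (the exact field encoding and the choice of SHA-256 are my assumptions):

    import hashlib

    def blob_identifier(content, blobtype):
        """Inner hash first, then hash the whole framed blob:
        hash-of-rest + type + size + content."""
        contenthash = hashlib.sha256(content).digest()
        framed = contenthash + blobtype + str(len(content)).encode() + content
        return hashlib.sha256(framed).hexdigest()

    print(blob_identifier(b"some artifact bytes", b"m"))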

Joerg


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-23 Thread K. Fossil user

1/ Warren, the guy who knows nothing about software security, talks about
software security ... Wow. I don't get this.

2/ semi?

> « I think Fossil is in a much better position to do this sort of migration 
> than, say, Git, due to its semi-centralized nature »
This would convince people to use Git not Fossil ...

Git is more secure than Fossil (the first reason to use a VCS). Git could be
centralized or not. I am wondering if Fossil could be centralized... Now
you've said that it is semi-centralized by NATURE.

Regards

K.

From: Warren Young <war...@etr-usa.com>
To: Fossil SCM user's discussion <fossil-users@lists.fossil-scm.org>
Sent: Friday, 24 February 2017, 0:01
Subject: Re: [fossil-users] Google Security Blog: Announcing the first SHA1
collision



On Feb 23, 2017, at 10:50 AM, Marc Simpson <m...@0branch.com> wrote:
> 
> This may be of interest to some here, especially in light of previous
> SHA-1 related discussions on list:
> 
>  https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html

Before I respond, first know that I respond out of concern for Fossil.  I’m a 
staunch Fossil defender, and I’m on the record doing so, many times.  My 
motivation in laying out these criticisms is that I want Fossil to continue to 
be worthy of that defense going forward.

Second, there will be those who say we’ve covered all of this already, multiple 
times.  I know, I was there.  But now we have new data.  Before, this sort of 
attack was theoretical only.  Now it’s not only proven possible, it is already 
within the ROI budget for certain specialized attacks; attacks only get cheaper 
over time.

The new data includes not only this news from Google and its research partners. 
 The resulting discussion on Hacker News made me aware of a way an attacker 
could use this new attack against Fossil despite the fact that this is “just” a 
collision, rather than a way to generate second-preimages inexpensively:

    https://news.ycombinator.com/item?id=13715887

This thus gives me an answer to drh’s challenge to me in one of the prior 
threads on this subject:

    https://goo.gl/2tzdOi

Executive summary for those who don’t want to click either of the above links: 
Challenge: “What can you do with this attack?”  Response: “Replace a good 
checkin during a clone/sync, shipping that good checkin to another Fossil
instance.”  After the attack, the repos do not all contain the same data, but 
they will agree that they’re in sync if you ask them to check.

Previous threads have pointed out that you need to fiddle not only with the 
SHA1 but also with an MD5 and some other kind of non-cryptographic checksum, 
and still have both the evil and useful checkins be working C code.  All of 
that is doable.  A motivated attacker could probably do all of that 
computationally in about a second on modern hardware.  That means these other 
mechanisms add essentially nothing to Fossil’s resistance to sync stream 
tampering.

The classic solution for Byzantine Fault Tolerance in the face of 1 possible 
traitorous general — or an untrustworthy messenger, as with the MITM attack — 
is to have at least 3 replicas, but that only works when the algorithm used to 
achieve consensus among the loyal generals is trustworthy.

This paper is on-point:

    http://zoo.cs.yale.edu/classes/cs426/2012/bib/castro99practical.pdf

Quoting from page 2, “We also assume that the adversary…[is] computationally 
bound so that (with very high probability) it is unable to subvert the 
cryptographic techniques mentioned above.”  We call that “key assumption 
violated” where I come from.  (Check the authors list, by the way.  Yes, *that* 
Liskov.)

According to another post on the Hacker News page linked above, this attack 
cost about $100k in GPGPU resources.  There must be Fossil-hosted projects 
worth that much to attack today, and after a bad actor pulls that one off, 
they’ve paid for the hardware, so a second attack costs only power and cooling. 
 

Given the up-front costs, a bit more to mount a MITM attack doesn’t seem 
infeasible.

Consider also that this attack is “free” to attackers who already have access 
to a pool of compromised PCs, many of which will have powerful GPUs.

Today, I see the following defenses to this problem:

1. “fossil diff --from known-good-release” before electing to use binaries 
built from a given repo you don’t trust implicitly.  (And I hope this news has 
shortened your list of such repos!)

2. Modify all drive-by patches to foil pre-generated collisions.  (Presumably 
you trust those with checkin privileges on the repo.)

3. Put any attackable Fossil repos behind a TLS proxy with a strong cert (i.e. 
not self-signed, and certainly not SHA-1 hashed!) that enforces TLS access, as 
in my HOWTO:

    https://goo.gl/USybpW

TLS proxying prevents the MITM attack, but look at the cost.  I’m happy to pay 
it for my public Fossil repos, but do we

Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-23 Thread K. Fossil user
Thank you Marc...
1/ I've said that people need to be able to choose their digest algorithm...
a) Of course the Fossil team does not take into account what I've said.
b) I was wondering in the past when it would be possible for the average guy
to break SHA1. In the end it is worse than what I expected...
c) I like this sentence: "especially in light of previous
SHA-1 related discussions on list" --
the discussion is not over, it is just the beginning ... :D

2/ When I reply to ALL I get these TWO e-mail addresses ... Which one is the
correct one...?

Best Regards

K.

From: Kees Nuyt <k.n...@zonnet.nl>
To: fossil-us...@mailinglists.sqlite.org
Sent: Thursday, 23 February 2017, 18:15
Subject: Re: [fossil-users] Google Security Blog: Announcing the first SHA1
collision


  
[Default] On Thu, 23 Feb 2017 09:50:12 -0800, Marc Simpson
<m...@0branch.com> wrote:

>This may be of interest to some here, especially in light of previous
>SHA-1 related discussions on list:
>
>  https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html

Interesting.

https://shattered.io/ says:


Who is capable of mounting this attack?

This attack required over 9,223,372,036,854,775,808 SHA1
computations. This took the equivalent processing power as 6,500
years of single-CPU computations and 110 years of single-GPU
computations.

How does this attack compare to the brute force one?

The SHAttered attack is 100,000 times faster than the brute force
attack that relies on the birthday paradox. The brute force
attack would require 12,000,000 GPU years to complete, and it is
therefore impractical. 


-- 
Regards,
Kees Nuyt



Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-23 Thread Joerg Sonnenberger
On Thu, Feb 23, 2017 at 06:12:18PM -0500, Martin Gagnon wrote:
> Seems that Git can store both of them, I believe it calculates the sha1
> on a combination of the filename and the content or something like that.

No, it stores the object type first, which effectively creates a
different block structure. It doesn't mean that the same type of
computation wouldn't create a conflict to hide objects in git. It just
needs slightly different content.
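
Concretely, a git blob id is the SHA1 of a type-and-size header followed by
the content (the filename is not part of it); a quick check in Python:

    import hashlib

    def git_blob_id(content):
        """SHA1 over 'blob <size>\\0' + content, per git's object format."""
        header = b"blob %d\x00" % len(content)
        return hashlib.sha1(header + content).hexdigest()

    # Matches `git hash-object` for the same bytes:
    print(git_blob_id(b"hello\n"))
    # ce013625030ba8dba906f756967f9e9ca394464a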

Joerg


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-23 Thread Warren Young
On Feb 23, 2017, at 10:50 AM, Marc Simpson  wrote:
> 
> This may be of interest to some here, especially in light of previous
> SHA-1 related discussions on list:
> 
>  https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html

Before I respond, first know that I respond out of concern for Fossil.  I’m a 
staunch Fossil defender, and I’m on the record doing so, many times.  My 
motivation in laying out these criticisms is that I want Fossil to continue to 
be worthy of that defense going forward.

Second, there will be those who say we’ve covered all of this already, multiple 
times.  I know, I was there.  But now we have new data.  Before, this sort of 
attack was theoretical only.  Now it’s not only proven possible, it is already 
within the ROI budget for certain specialized attacks; attacks only get cheaper 
over time.

The new data includes not only this news from Google and its research partners. 
 The resulting discussion on Hacker News made me aware of a way an attacker 
could use this new attack against Fossil despite the fact that this is “just” a 
collision, rather than a way to generate second-preimages inexpensively:

https://news.ycombinator.com/item?id=13715887

This thus gives me an answer to drh’s challenge to me in one of the prior 
threads on this subject:

https://goo.gl/2tzdOi

Executive summary for those who don’t want to click either of the above links: 
Challenge: “What can you do with this attack?”  Response: “Replace a good 
checkin during a clone/sync shipping that good checkin to another Fossil 
instance.”  After the attack, the repos do not all contain the same data, but 
they will agree that they’re in sync if you ask them to check.

Previous threads have pointed out that you need to fiddle not only with the 
SHA1 but also with an MD5 and some other kind of non-cryptographic checksum, 
and still have both the evil and useful checkins be working C code.  All of 
that is doable.  A motivated attacker could probably do all of that 
computationally in about a second on modern hardware.  That means these other 
mechanisms add essentially nothing to Fossil’s resistance to sync stream 
tampering.

The classic solution for Byzantine Fault Tolerance in the face of 1 possible 
traitorous general — or an untrustworthy messenger, as with the MITM attack — 
is to have at least 3 replicas, but that only works when the algorithm used to 
achieve consensus among the loyal generals is trustworthy.

This paper is on-point:

http://zoo.cs.yale.edu/classes/cs426/2012/bib/castro99practical.pdf

Quoting from page 2, “We also assume that the adversary…[is] computationally 
bound so that (with very high probability) it is unable to subvert the 
cryptographic techniques mentioned above.”  We call that “key assumption 
violated” where I come from.  (Check the authors list, by the way.  Yes, *that* 
Liskov.)

According to another post on the Hacker News page linked above, this attack 
cost about $100k in GPGPU resources.  There must be Fossil-hosted projects 
worth that much to attack today, and after a bad actor pulls that one off, 
they’ve paid for the hardware, so a second attack costs only power and cooling. 
 

Given the up-front costs, a bit more to mount a MITM attack doesn’t seem 
infeasible.

Consider also that this attack is “free” to attackers who already have access 
to a pool of compromised PCs, many of which will have powerful GPUs.

Today, I see the following defenses to this problem:

1. “fossil diff --from known-good-release” before electing to use binaries 
built from a given repo you don’t trust implicitly.  (And I hope this news has 
shortened your list of such repos!)

2. Modify all drive-by patches to foil pre-generated collisions.  (Presumably 
you trust those with checkin privileges on the repo.)

3. Put any attackable Fossil repos behind a TLS proxy with a strong cert (i.e. 
not self-signed, and certainly not SHA-1 hashed!) that enforces TLS access, as 
in my HOWTO:

https://goo.gl/USybpW

TLS proxying prevents the MITM attack, but look at the cost.  I’m happy to pay 
it for my public Fossil repos, but do we really want to make that kind of thing 
a prerequisite for all Fossil users once the cost of this kind of attack drops 
to script-kiddie levels?  That defeats a large chunk of the Fossil value 
proposition, being that it’s easy to set up.

Fossil’s "Redirect to HTTPS on the Login page” setting (Admin > Access) doesn’t 
solve this problem, since it still allows plaintext clones.  This news shows 
that we need to protect the clone/sync stream now as well, not just the login 
exchange.

Historically, all widely-deployed hash functions have had a finite lifetime.  
SHA-1 has had a good, long run:

http://valerieaurora.org/hash.html

The PHC scheme would allow Fossil to migrate to something stronger in a 
backwards-compatible fashion:

   https://github.com/P-H-C/phc-string-format/blob/master/phc-sf-spec.md

That is, if 

Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-23 Thread Martin Gagnon
On Thu, Feb 23, 2017 at 03:18:29PM -0800, bch wrote:

  [snip]
> 
> Or more correctly, "a *subsequent* file with the same sha1 hash..." If you
> happened to commit the Trojan file first, the "good" commit would have been
> the one to fail.
>

True, but if you pull from an untrusted user (or give push access to an
untrusted user), nothing prevents the trojan file from getting in, even
without a sha1 hash collision.

But at least, someone cannot replace a file you already have with a
malicious one with the same sha1 hash.

-- 
Martin G.


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-23 Thread bch
On Feb 23, 2017 15:12, "Martin Gagnon"  wrote:

On Thu, Feb 23, 2017 at 09:50:12AM -0800, Marc Simpson wrote:
> This may be of interest to some here, especially in light of previous
> SHA-1 related discussions on list:
>
>   https://security.googleblog.com/2017/02/announcing-first-
sha1-collision.html
>

Also, Here's a related discussion from git mailing list:
  https://marc.info/?t=14878688461&r=1&w=2

Somebody tried those 2 colliding pdf's on a git repository:
  https://github.com/joeyh/supercollider

Seems that Git can store both of them, I believe it calculates the sha1
on a combination of the filename and the content or something like that.

So I was curious and I tried to put them on a fossil repository.


  Here's what I got (with repo-cksum enabled):
  --
$ ls
bad.pdf  good.pdf

$ sha1sum bad.pdf good.pdf
d00bbe65d80f6d53d5c15da7c6b4f0a655c5a86a  bad.pdf
d00bbe65d80f6d53d5c15da7c6b4f0a655c5a86a  good.pdf

$ fossil add bad.pdf
ADDED  bad.pdf

$ fossil add good.pdf
ADDED  good.pdf

$ fossil commit
[***] Current default user: mgagnon
vim "./ci-comment-A70B84D400D2.txt"
./bad.pdf contains binary data. Use --no-warnings or the
"binary-glob" setting to disable this warning.
Commit anyhow (a=all/y/N)? a
New_Version: 510b26ef49be508003304840a9ea18894007ad51
ERROR: [good.pdf] is different on disk compared to the repository
NOTICE: Repository version of [good.pdf] stored in
[file-3a8b62456795ffdb]
working checkout does not match what would have ended up in the
repository:  2106b982989e5604ec91523ddd81c879 versus
a388ff244a318ee5904ba276b754d84a
  -

Seems that repo-cksum gives extra protection against this kind of
collision (but it doesn't let the file go in...).

Then, I tried with repo-cksum disabled.

  1- I add good.pdf and commit.
  2- I add bad.pdf and commit (it succeeds)
  3- I check with "fossil ui" and both files share the same content
 (good.pdf)

At least, if a file with a certain sha1 hash exists in a repo, a
malicious file with the same sha1 hash should never get in.


Or more correctly, "a *subsequent* file with the same sha1 hash..." If you
happened to commit the Trojan file first, the "good" commit would have been
the one to fail.


I'm not an expert in encryption and security: I agree that the
possibility of sha1 collision is not a good thing, but it's probably not
the end of the world and it doesn't make fossil unusable.

  side note: A collision is a lot easier to produce on a file like a pdf
 than on a source file.

"That's why pdf's are the classic model for showing
these attacks: it's easy to insert garbage in the
middle of a pdf that is invisible."

-- git mailing list citation

Regards,

--
Martin G.


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-23 Thread Martin Gagnon
On Thu, Feb 23, 2017 at 09:50:12AM -0800, Marc Simpson wrote:
> This may be of interest to some here, especially in light of previous
> SHA-1 related discussions on list:
> 
>   https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
> 

Also, Here's a related discussion from git mailing list:
  https://marc.info/?t=14878688461&r=1&w=2

Somebody tried those 2 colliding pdf's on a git repository:
  https://github.com/joeyh/supercollider

Seems that Git can store both of them, I believe it calculates the sha1
on a combination of the filename and the content or something like that.

So I was curious and I tried to put them on a fossil repository.


  Here's what I got (with repo-cksum enabled):
  --
$ ls
bad.pdf  good.pdf

$ sha1sum bad.pdf good.pdf
d00bbe65d80f6d53d5c15da7c6b4f0a655c5a86a  bad.pdf
d00bbe65d80f6d53d5c15da7c6b4f0a655c5a86a  good.pdf

$ fossil add bad.pdf
ADDED  bad.pdf

$ fossil add good.pdf
ADDED  good.pdf

$ fossil commit
[***] Current default user: mgagnon
vim "./ci-comment-A70B84D400D2.txt"
./bad.pdf contains binary data. Use --no-warnings or the "binary-glob" 
setting to disable this warning.
Commit anyhow (a=all/y/N)? a
New_Version: 510b26ef49be508003304840a9ea18894007ad51
ERROR: [good.pdf] is different on disk compared to the repository
NOTICE: Repository version of [good.pdf] stored in 
[file-3a8b62456795ffdb]
working checkout does not match what would have ended up in the 
repository:  2106b982989e5604ec91523ddd81c879 versus 
a388ff244a318ee5904ba276b754d84a
  -

Seems that repo-cksum gives extra protection against this kind of
collision (but it doesn't let the file go in...).

Then, I tried with repo-cksum disabled. 

  1- I add good.pdf and commit.
  2- I add bad.pdf and commit (it succeeds)
  3- I check with "fossil ui" and both files share the same content
 (good.pdf)

At least, if a file with a certain sha1 hash exists in a repo, a
malicious file with the same sha1 hash should never get in.
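
A quick hedged way to see what that checksum is catching (the colliding PDFs
share a SHA1 digest but differ under MD5 and SHA-256):

    import hashlib

    def digests(path):
        data = open(path, "rb").read()
        return {algo: hashlib.new(algo, data).hexdigest()
                for algo in ("sha1", "md5", "sha256")}

    for f in ("good.pdf", "bad.pdf"):
        print(f, digests(f))
    # The sha1 values match; the md5 and sha256 values differ.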

I'm not an expert in encryption and security: I agree that the
possibility of sha1 collision is not a good thing, but it's probably not
the end of the world and it doesn't make fossil unusable.

  side note: A collision is a lot easier to produce on a file like a pdf
 than on a source file. 

"That's why pdf's are the classic model for showing
these attacks: it's easy to insert garbage in the
middle of a pdf that is invisible."

-- git mailing list citation

Regards,

-- 
Martin G.


Re: [fossil-users] Google Security Blog: Announcing the first SHA1 collision

2017-02-23 Thread Kees Nuyt
[Default] On Thu, 23 Feb 2017 09:50:12 -0800, Marc Simpson
 wrote:

>This may be of interest to some here, especially in light of previous
>SHA-1 related discussions on list:
>
>  https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html

Interesting.

https://shattered.io/ says:


Who is capable of mounting this attack?

This attack required over 9,223,372,036,854,775,808 SHA1
computations. This took the equivalent processing power as 6,500
years of single-CPU computations and 110 years of single-GPU
computations.

How does this attack compare to the brute force one?

The SHAttered attack is 100,000 times faster than the brute force
attack that relies on the birthday paradox. The brute force
attack would require 12,000,000 GPU years to complete, and it is
therefore impractical. 
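
For scale (my own arithmetic, not from the announcement): the computation
count quoted above is exactly 2^63, and the GPU-year figures are consistent
with the quoted speedup:

    assert 2**63 == 9_223_372_036_854_775_808
    print(12_000_000 / 110)  # ~109,090 -- same order as "100,000 times faster"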


-- 
Regards,
Kees Nuyt
