Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format

Daniel Sahlberg Wed, 17 Jan 2024 23:43:57 -0800

@Karl Fogel <kfo...@red-bean.com>,  @Evgeny Kotkov
<evgeny.kot...@visualsvn.com>


Any chance for a comment on the questions in this thread?

I've also added my own comment below.

Kind regards,
Daniel



Den sön 14 jan. 2024 kl 00:56 skrev Nathan Hartman <hartman.nat...@gmail.com
>:

> On Fri, Jan 12, 2024 at 3:51 PM Johan Corveleyn <jcor...@gmail.com> wrote:
>
>> On Fri, Jan 12, 2024 at 12:37 PM Daniel Shahaf <d...@daniel.shahaf.name>
>> wrote:
>> ...
>> > Procedurally, the long hiatus is counterproductive.  Neither kfogel nor
>> > I had the context in our heads, and the cache misses took their toll in
>> > tuits and in wallclock time.  Furthermore, I have less spare time for
>> > dev@ discussions than I did when I cast the veto (= a year ago next
>> > Saturday).  Going forward it might be preferable for threads not to
>> > hibernate.
>>
>> I agree, but obviously the hibernation is not some deliberate action
>> by anyone. It's just that most of us here have less spare time for
>> dev@ discussions (and for SVN development) than before. Especially for
>> such complex matters, and especially when people feel there are
>> walking into a minefield. There are only a few active devs left, and
>> tuits are running low ...
>>
>> ...
>> > That being the case, I have considered whether merging the feature
>> > branch outweighs letting dev@ take a not-only-/pro forma/ role in
>> > design discussions.  I am of the opinion that it does not, and
>> > therefore I reäfirrm the veto.
>>
>> It has become more clear to me (I was only following tangentially)
>> that your veto is focused on the development methodology and the lack
>> of design discussion. Is that a valid reason for a veto? We are low on
>> resources, someone still finds time to make some progress, no one
>> blocks it on technical grounds, and then someone vetoes it because we
>> don't have enough resources?
>>
>> That puts us pretty much in deadlock, because we are too low on
>> resources. Or maybe I misunderstand?
>>
>> To be clear: I appreciate your input, Daniel, and your insistence on a
>> more thorough design discussion. I assume it's coming from a genuine
>> concern that we formulate problems well, and think hard about possible
>> solutions (focusing on the precise problem we are trying to solve).
>> But at the end of the day, if that design discussion doesn't happen
>> (or not enough to your satisfaction anyway), is that grounds for a
>> veto? For me it's a tough call, because on the one hand you have a
>> point, but on the other hand ... you're blocking _some_ progress
>> because the process behind it is not perfect (which is hard to do with
>> the 3.25 tuits we have left).
>>
>> > P.S.  Could that BRANCH-README please state what's the problem the
>> branch
>> > means to solve, i.e., the goal / acceptance test?  "Make it possible to
>> > «svn add» SHA-1 collisions"?
>>
>> I agree that would be a good step.
>>
>> I too find it a bit unclear what problem we're actually trying to
>> solve, apart from a vague feeling that SHA-1 will become more and more
>> broken over time, and that this will cause fatal injury to SVN (in its
>> WC, protocol, dump format, or repository). And perhaps the fact that
>> security auditors are becoming more and more triggered by seeing SHA-1
>> (even if they don't understand the way it is used and its
>> ramifications). Making it possible to 'svn add' SHA-1 collisions is
>> not it, I think.
>>
>> --
>> Johan
>>
>
>
> Johan's reply sums up my thoughts pretty closely.
>
> I would very much like to *avoid* all of the following: deadlock, bad
> feelings, and members of this small community leaving because of deadlocks
> or bad feelings.
>
> I agree that (at the very least), BRANCH-README should define what problem
> the branch aims to solve, and perhaps that's really the main thing we need
> to discuss and resolve.
>
> Johan touched on one issue with SHA1: regardless how it is actually used
> in SVN and whether it is adequate for those purposes, there is customer
> perception. I can imagine, for example, the IT dept of some big
> $corporation could blacklist SHA1 because it is considered broken for
> cryptographic purposes. But they could blacklist it for everything. Even
> though it is safe and effective for our use cases, try explaining that to
> an admin who is struggling to meet such a blanket policy.
>
> I would like to add another reason to think about a post-SHA1 future: I'm
> writing on mobile so I can't easily grep for things now, but could our
> dependencies eventually remove the SHA1 implementation? (I just saw
> something about removal of DSA from some famous lib not too long ago. SHA1
> could be next?)
>
> When would SHA1 disappear? I don't know, but I consider it plausible to
> happen in about 5 years.
>
> If SHA1 is removed in the future, there will need to be a mad dash to
> replace it. Or we'll have to add a new dependency to use an alternate
> implementation. Or we'll have to implement our own SHA1 or copy some code
> into SVN. All of these seem bad to me.
>
> Switching to a different hash is also a bad idea, I think, because it is
> likely to suffer the same problems as SHA1 later on, as cryptography
> research proceeds and newer hashes become declared broken.
>
> I'll try to describe what I think is a best case scenario: Support
> multi-hash in 1.15 in format 32 WCs. SHA1 can continue to be the default
> but we should be careful not to require a SHA1 implementation to exist.
> Furthermore, by default "svn checkout" continues to create format 31 WCs
> (this is implemented currently). When new (1.15 and up) servers talk to new
> clients, they'll have to negotiate the "best" common hash for the protocol.
> Over time, we can add other hashes. Over time, distros and package managers
> pick up 1.15. Someday down the line (5 years?), if SHA1 goes away, or an IT
> dept wants to avoid SHA1 for whatever reasons, most of the hard work of
> changing hashes will have been done already and most people will have the
> newer software on their system already. Changing hashes then becomes a
> trivial matter. The same will be true of any future hashes that become
> declared broken, requiring almost no additional work on our part. Notably,
> it will not be necessary to bump the WC or protocol formats because of
> hashes.
>

> Pros: Future-proofing against the real and perceived brokenness of any
> hash types.
>
> Cons: Requires a lot of work up front, which no one might volunteer to do.
>
> We should continue hashing out (pun intended) how to address the different
> concerns raised.
>
> Are there any technical reasons *not* to support other hashes going
> forward?
>
> Are there other pros or cons to supporting a scenario like I described?
>

As far as I understand, the point of multi-hash is to keep the WC format
between versions (so older clients can continue to use the WC). I need some
help to understand how that would work in practice. Let's say that 1.15
adds SHAABC, 1.16 adds SHAXYZ. Then 1.17 drops SHA1. But...
- A 1.17 client will only use SHAABC or SHAXYZ hashes.
- A 1.16 client can use SHA1, SHAABC and SHAXYZ hashes.
- A 1.15 client can only use SHA1 and SHAABC hashes.

How can these work together? A WC created in 1.17 can't be used by a 1.15
client and a WC created in 1.15 (with SHA1) can't be used by a 1.17 client.
How is this different from bumping the format? How do we detect this?

At least, we'd need some method of updating the hashes in the database,
akin the WC format upgrades in some versions (was it 1.8?).

Kind regards,
Daniel

Re: Switching from SHA1 to a checksum type without known collisions in 1.15 working copy format

Reply via email to