[fossil-users] merge after cherrypick plus edit does not identify GCA as I would like

2015-09-14 Thread Eric Rubin-Smith
See the transcript below for gory details.  The summary is:

1. create a new file on trunk and check it in.
2. edit the file and check in on a branch (let's call it "beta")
3. trunk decides it wants that particular change set from step (2), so
cherrypick it (assume in this example that other stuff is happening on the
beta branch that we don't want in trunk at the moment, so a normal merge is
not appropriate).
4. edit the same file on trunk and check it in
5. the "beta" branch now wants to merge the latest from trunk to continue
work

==> It's noted as a merge conflict

This is because the GCA calculation does not seem to incorporate the
cherrypick info (at least in this case).

Perhaps there is some deeper reason for this that I'm unaware of, but for
this case the behavior is suboptimal.
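
To make the complaint concrete, here is a minimal sketch of a whole-file three-way merge. It is not Fossil's merge code; `three_way` and the variable names are made up for illustration, and it treats VERSION as a single value rather than merging line by line.

```python
def three_way(base, ours, theirs):
    """Merge one value three ways; return (result, conflict_flag)."""
    if ours == theirs:
        return ours, False      # both sides agree
    if ours == base:
        return theirs, False    # only their side changed
    if theirs == base:
        return ours, False      # only our side changed
    return None, True           # both changed, differently: conflict


# Contents of VERSION at each step of the summary above.
original = "1.0"         # step 1, where beta forked from trunk
beta = "1.1b1"           # step 2
cherrypicked = "1.1b1"   # step 3, trunk after the cherrypick
trunk_tip = "1.1b1.01"   # step 4

# Ancestor as picked today: the fork point.
print(three_way(original, beta, trunk_tip))

# Hypothetical: treat the cherrypicked content as the common ancestor.
print(three_way(cherrypicked, beta, trunk_tip))
```

With the fork point as ancestor, both sides look changed and the merge conflicts. With the cherrypicked content as ancestor, only trunk has changed and its content wins cleanly, which is the outcome I was hoping for.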

Transcript illustrating the above:

eas@little:~$ fossil version
This is fossil version 1.33 [9c65b5432e] 2015-05-23 11:11:31 UTC
eas@little:~$ mkdir /tmp/fossil
eas@little:~$ cd !$
cd /tmp/fossil
eas@little:/tmp/fossil$ fossil new test.db
project-id: c3037e9c81eb4c3279dfc24f07579bfbe604ddee
server-id:  ba2bb96bf830fa680389b425fa09c5dcfd5370c2
admin-user: eas (initial password is "dc73fc")
eas@little:/tmp/fossil$ mkdir sandbox
eas@little:/tmp/fossil$ cd !$
cd sandbox
eas@little:/tmp/fossil/sandbox$ fossil open /tmp/fossil/test.db
project-name: 
repository:   /tmp/fossil/test.db
local-root:   /tmp/fossil/sandbox/
config-db:    /home/eas/.fossil
project-code: c3037e9c81eb4c3279dfc24f07579bfbe604ddee
checkout:     6bb0b6577411bd798631d137bf5b2d7d8fc3ac12 2015-09-14 15:28:16 UTC
tags:         trunk
comment:      initial empty check-in (user: eas)
check-ins:    1
eas@little:/tmp/fossil/sandbox$ echo 1.0 > VERSION
eas@little:/tmp/fossil/sandbox$ fossil add VERSION
ADDED  VERSION
eas@little:/tmp/fossil/sandbox$ fossil commit -m "Add version file"
New_Version: b6f302b927b0289feae9831c80f8b066f6e87d70
eas@little:/tmp/fossil/sandbox$ echo 1.1 > h^CRSION
eas@little:/tmp/fossil/sandbox$ set -o vi
eas@little:/tmp/fossil/sandbox$ echo 1.1b1 > VERSION
eas@little:/tmp/fossil/sandbox$ fossil commit --branch beta -m "Start beta branch."
New_Version: ac64ec791f7be8601848e4c50a87dc54262cc659
eas@little:/tmp/fossil/sandbox$ fossil update trunk
UPDATE VERSION
---
updated-to:   b6f302b927b0289feae9831c80f8b066f6e87d70 2015-09-14 15:28:52 UTC
tags:         trunk
comment:      Add version file (user: eas)
changes:      1 file modified.
 "fossil undo" is available to undo changes to the working checkout.
eas@little:/tmp/fossil/sandbox$ fossil merge --cherrypick ac64ec791f7be8601848e4c50a87dc54262cc659
UPDATE VERSION
 "fossil undo" is available to undo changes to the working checkout.

eas@little:/tmp/fossil/sandbox$
eas@little:/tmp/fossil/sandbox$ fossil commit -m "Cherrypick version number change."
New_Version: 8a4693e6ce2faa5cf3cd1e5a839b33ba7c590d02
eas@little:/tmp/fossil/sandbox$ echo 1.1b1.01 > VERSION
eas@little:/tmp/fossil/sandbox$ fossil commit -m "More work on the trunk."
New_Version: cee15c31915298ecce84eb0b5aa9b7520e3c8b61
eas@little:/tmp/fossil/sandbox$ fossil update beta
UPDATE VERSION
---
updated-to:   ac64ec791f7be8601848e4c50a87dc54262cc659 2015-09-14 15:29:19 UTC
tags:         beta
comment:      Start beta branch. (user: eas)
changes:      1 file modified.
 "fossil undo" is available to undo changes to the working checkout.
eas@little:/tmp/fossil/sandbox$ fossil merge trunk
MERGE VERSION
* 1 merge conflicts in VERSION
WARNING: 1 merge conflicts
 "fossil undo" is available to undo changes to the working checkout.
eas@little:/tmp/fossil/sandbox$ cat VERSION
<<< BEGIN MERGE CONFLICT: local copy shown first <<<
1.1b1
=== COMMON ANCESTOR content follows 
1.0
=== MERGED IN content follows ==
1.1b1.01
>>> END MERGE CONFLICT >
eas@little:/tmp/fossil/sandbox$ exit
___
fossil-users mailing list
fossil-users@lists.fossil-scm.org
http://lists.fossil-scm.org:8080/cgi-bin/mailman/listinfo/fossil-users


Re: [fossil-users] Why Hash

2015-09-14 Thread Ron W
On Mon, Sep 14, 2015 at 3:10 PM, Scott Robison 
wrote:
>
> I wasn't really thinking of who might want to do it, just that sha1 isn't
> being used for cryptographic security and that would be covered by other
> means (GPG for example).
>

The hashes can be important for verifying the integrity of the repository.
Even when not "signing" commits, a secure hash is still valuable. The more
secure the hash, the harder it is to hide corruption.

Also, the description of the "PGP command" setting says "Command used to
clear-sign manifests at check-in." This suggests that only the manifest
itself is signed. Therefore, the GPG signature relies on the hashes - in the
manifest - generated by Fossil.


Re: [fossil-users] Why Hash

2015-09-14 Thread Ron W
On Mon, Sep 14, 2015 at 1:46 PM, Warren Young  wrote:

> The question is, “Why does this UI web page have to *say* that it is a
> SHA-1 hash?”
>
> If this page just said “checkin ID,” what would be lost?
>

As far as VCS functionality, nothing.

On the other hand, many projects publish the hashes for their release
packages so that people can verify the package is correct.

The Fossil manifest's hash takes that to another level. Verify the
manifest, then verify each file listed in the manifest.
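
As a rough sketch of that two-step check (not Fossil's own code): `verify_checkin` and the `fetch` callable are names I made up, and the sketch assumes a SHA1 repo and a baseline manifest whose F cards read "F <name> <hash> ...". It ignores Fossil's escaping of special characters in file names.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def verify_checkin(manifest: bytes, checkin_id: str, fetch):
    """Return the names of files whose content does not match the manifest.

    fetch is a hypothetical callable mapping a file name to its bytes.
    Raises ValueError if the manifest itself does not hash to checkin_id.
    """
    # Step 1: the check-in ID must be the hash of the manifest.
    if sha1_hex(manifest) != checkin_id:
        raise ValueError("manifest does not match check-in ID")
    # Step 2: every file named on an F card must hash to the listed ID.
    bad = []
    for line in manifest.decode("utf-8").splitlines():
        fields = line.split(" ")
        if fields[0] == "F" and len(fields) >= 3:
            name, expected = fields[1], fields[2]
            if sha1_hex(fetch(name)) != expected:
                bad.append(name)
    return bad


# Tiny self-made example, not a real Fossil manifest.
files = {"VERSION": b"1.0\n"}
manifest = f"F VERSION {sha1_hex(files['VERSION'])}\n".encode()
print(verify_checkin(manifest, sha1_hex(manifest), files.__getitem__))
```

So one published hash, the check-in ID, is enough to check every file in the release.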


> What would be gained is that people wouldn’t be trying to work out how to
> match sha1sum commands to Fossil output,


fossil artifact id | sha1sum -
sha1sum path/to/file


> and Fossil would be free to switch to a different algorithm later if that
> seemed like a good idea.


Fossil still can switch hash algorithms. Existing repos probably remain
with SHA1, while new repos would use the new algorithm. Not impossible to
convert a repo, but all IDs would change. Any use of old IDs could utilize
tags generated during the conversion.

Even a mixed hash repo could exist if a version (or hash) card were
introduced.


Re: [fossil-users] Why Hash

2015-09-14 Thread Warren Young
On Sep 14, 2015, at 4:24 PM, Ron W  wrote:
> 
> On Mon, Sep 14, 2015 at 1:46 PM, Warren Young  wrote:
>  
> > What would be gained is that people wouldn’t be trying to work out how to 
> > match sha1sum commands to Fossil output,
> 
> fossil artifact id | sha1sum -
> sha1sum path/to/file

See, that just proves the point: the “SHA1 Hash” line on the /info page gives 
you a *checkin* ID, not an artifact ID.  (Yes, yes, I know there are artifact 
IDs below that, but we’re not talking about them.)

In fact, this whole artifact ID vs checkin ID distinction completely flew over 
my head until recently.  I had to re-read the file format wiki document again 
in the context of this discussion in order to finally grasp it.

I think I might have gotten over that hump a bit quicker if the UI was explicit 
about saying “checkin ID” and “artifact ID” instead of just saying, “Here’s 
some SHA-1 hashes, enjoy!”

> Fossil still can switch hash algorithms. Existing repos probably remain with 
> SHA1, while new repos would use the new algorithm. Not impossible to convert 
> a repo, but all IDs would change. Any use of old IDs could utilize tags 
> generated during the conversion.
> 
> Even a mixed hash repo could exist if a version (or hash) card were 
> introduced.

glibc-based Linux systems cope with this problem in /etc/shadow by tagging the 
hash: a prefix of $1$ means it’s the old MD5 hash that replaced the ancient 
crypt(3) algorithm long ago, whereas the Linux box nearest to you probably uses 
$6$ by default, meaning SHA-512.

man 3 crypt for details.
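
A minimal sketch of how a reader of such a field can tell the algorithm from the tag alone. The function name and the sample strings are made up; only the $1$, $5$ and $6$ identifiers are taken from crypt(3).

```python
# Prefix identifiers documented in crypt(3) on glibc systems.
SCHEMES = {"1": "MD5", "5": "SHA-256", "6": "SHA-512"}


def crypt_scheme(field: str) -> str:
    """Name the algorithm tagged on a password hash field."""
    parts = field.split("$")
    # A tagged field looks like "$<id>$<salt>$<hash>", so splitting on "$"
    # gives an empty first element followed by the identifier.
    if len(parts) >= 3 and parts[0] == "":
        return SCHEMES.get(parts[1], "unknown scheme " + parts[1])
    return "untagged (traditional crypt)"


# Made-up sample fields, not real password hashes.
for sample in ("$6$somesalt$abcdef", "$1$somesalt$abcdef", "ab0123456789."):
    print(sample, "->", crypt_scheme(sample))
```

The point is that old and new hashes coexist in one file, and the reader never has to guess which algorithm produced a given entry.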


Re: [fossil-users] Why Hash

2015-09-14 Thread Warren Young
On Sep 14, 2015, at 4:40 PM, Warren Young  wrote:
> 
> glibc-based Linux systems cope with this problem in /etc/shadow by tagging 
> the hash

I just learned that this isn’t a Linux-specific thing, that it is in fact a 
pseudostandard also used on the BSDs and in various other places:

  http://pythonhosted.org/passlib/modular_crypt_format.html



Re: [fossil-users] Why Hash

2015-09-14 Thread Stephan Beal
On Mon, Sep 14, 2015 at 7:46 PM, Warren Young  wrote:

> output, and Fossil would be free to switch to a different algorithm later
> if that seemed like a good idea.
>

Indeed, fossil's model allows any hash to be used, but it is not possible
to change the hash without a near-complete overhaul of fossil (and its
docs), nor without invalidating every repo in existence, so it's highly
unlikely to ever happen. Supporting two hash variants in one fossil binary
would likely prove to be problematic (and would require a major overhaul).


> And indeed, maybe it is a good idea, since SHA-1 is nearing its EOL for
> cryptographic use:
>
>   https://www.google.com/?q=sha-1%20end%20of%20life


Fossil does not use it in a cryptographic context, so i would argue that
that's not relevant for fossil's continued use. Fossil only uses sha-1 to
define/determine content identity. (There are long threads somewhere in the
list archives about the chances of hash collision. Management summary: not
likely to happen for many human generations.)

-- 
- stephan beal
http://wanderinghorse.net/home/stephan/
http://gplus.to/sgbeal
"Freedom is sloppy. But since tyranny's the only guaranteed byproduct of
those who insist on a perfect world, freedom will have to do." -- Bigby Wolf


Re: [fossil-users] Why Hash

2015-09-14 Thread Warren Young
On Sep 12, 2015, at 2:26 AM, Stephan Beal  wrote:
> 
> On Sat, Sep 12, 2015 at 12:57 AM, Warren Young  wrote:
> For instance, why even mention “SHA1 Hash” on the checkin details page in 
> fossil ui, from src/info.c?  Why not something more generic, like “checkin 
> ID”?
> 
> The checkin ID is the hash of the manifest for the checkin. 

Yes, I know that.  The question is not, “Why is the checkin ID a SHA-1 hash?”  
The question is, “Why does this UI web page have to *say* that it is a SHA-1 
hash?”

If this page just said “checkin ID,” what would be lost?

What would be gained is that people wouldn’t be trying to work out how to match 
sha1sum commands to Fossil output, and Fossil would be free to switch to a 
different algorithm later if that seemed like a good idea.

And indeed, maybe it is a good idea, since SHA-1 is nearing its EOL for 
cryptographic use: 

  https://www.google.com/?q=sha-1%20end%20of%20life


Re: [fossil-users] Multiple Projects in one Repo

2015-09-14 Thread Warren Young
On Sep 12, 2015, at 9:54 AM, Oliver Friedrich 
 wrote:
> 
> with nested repositories my administration overhead would exceed even the 
> single repository solution, right?

The alternative to managing just one .fossil file is managing just one directory 
full of .fossil files.  Is that really such a big difference?

Note that “fossil serve” works the same when pointed to a directory full of 
fossils as it does when pointed at a single fossil, with the exception that the 
URLs are all one directory deeper.

There are two annoyances currently involved in managing nested repositories, 
which could in principle go away: the need to explicitly open the 
sub-repositories every time you open the primary, and the need to pass --nested 
to bypass the checks Fossil does for a parent containing .fslckout (a.k.a. 
_FOSSIL_).

Git solves this problem more elegantly with the submodule:

  https://git-scm.com/book/en/v2/Git-Tools-Submodules

In short, the main project simply declares that it needs other projects by URL. 
When you clone the main project, Git also clones the submodules into the 
correct place within the parent’s tree.

I use nested checkouts myself, though probably for a different reason than 
you’re proposing.  I have a top-level repository with many branches for older 
versions of the software, and all versions need to share a single set of 
Fossil-managed resource files.  These shared resource files must be versioned 
separately from the main repository files, but the current version must always 
be present underneath the main repository branches.  If I have three branches 
checked out, I need three additional nested checkouts of the shared repository.

It would be very nice if I could just open each branch, and have the 
subproject’s repo opened in its correct place, automatically.


Re: [fossil-users] Why Hash

2015-09-14 Thread Scott Robison
On Mon, Sep 14, 2015 at 11:46 AM, Warren Young  wrote:

> On Sep 12, 2015, at 2:26 AM, Stephan Beal  wrote:
> >
> > On Sat, Sep 12, 2015 at 12:57 AM, Warren Young  wrote:
> > For instance, why even mention “SHA1 Hash” on the checkin details page
> in fossil ui, from src/info.c?  Why not something more generic, like
> “checkin ID”?
> >
> > The checkin ID is the hash of the manifest for the checkin.
>
> Yes, I know that.  The question is not, “Why is the checkin ID a SHA-1
> hash?”  The question is, “Why does this UI web page have to *say* that it
> is a SHA-1 hash?”
>
> If this page just said “checkin ID,” what would be lost?
>

Nothing would really be lost that I can imagine. That being said:


> What would be gained is that people wouldn’t be trying to work out how to
> match sha1sum commands to Fossil output, and Fossil would be free to switch
> to a different algorithm later if that seemed like a good idea.
>

Is this really a problem? Given that the checkin ID is generated from a
structured manifest file which is generated in part from sha1 hash values
from all included artifacts, it seems intractable to create a deliberately
colliding hash.


> And indeed, maybe it is a good idea, since SHA-1 is nearing its EOL for
> cryptographic use:
>
>   https://www.google.com/?q=sha-1%20end%20of%20life


Except fossil doesn't use it for cryptographic security. For secure
communications, sure, make the change. For "deterministic generation of
identifiers with low probability of collision" it still seems safe enough.
If people need more security, they should probably be using GPG to sign
commits.

If the powers that be want to make a change of algorithm for ID generation,
that'd be fine. I just don't see any urgency myself in non-cryptographic
applications.

-- 
Scott Robison


Re: [fossil-users] Why Hash

2015-09-14 Thread Scott Robison
On Mon, Sep 14, 2015 at 1:01 PM, Warren Young  wrote:

> On Sep 14, 2015, at 12:11 PM, Scott Robison 
> wrote:
> >
> > > Fossil would be free to switch to a different algorithm later if that
> seemed like a good idea.
> >
> > Is this really a problem? Given that the checkin ID is generated from a
> structured manifest file which is generated in part from sha1 hash values
> from all included artifacts, it seems intractable to create a deliberately
> colliding hash.
>
> If I were a black hat — and please realize that I have zero practice
> trying to be one, so assume that a real black hat would be as much better
> at this as Mario Andretti is better than me at driving really fast — and I
> wanted to attack someone else’s Fossil repo, I would consider its use of
> SHA-1 as at least “hopeful.”
>
> The first line of defense is the passwords of valid committers, which
> presumably contain much less than 160 bits of entropy.  All you need to do
> is find one weak password.  And if that seems like an impossible thing to
> you, you haven’t been paying attention to the computer security news.
>

Fair enough.


> So now you have checkin privileges on someone else’s Fossil repo.  Now
> what?  Obviously you could just commit evil code to the trunk, but it would
> be much neater if you could insert it into an arbitrary point in the
> checkin tree, if for no other reason than to hide it from the timeline
> page, to reduce your chances of getting caught.
>
> So yes, the question really does become, how difficult is it to forge a
> consistent yet bogus SHA-1 hash?  If the crypto folk are worried about it —
> and a more conservative bunch of computer scientists you will not find —
> I’d say there is probably cause to be worried.
>

Also fair enough. Though there would be the additional difficulty (though I
don't know how difficult it would be) to convince the canonical repository
to replace an old checkin with a crafted checkin. This seems unlikely to me
given that the receiving repo (as I understand it) will say "I already have
that ID, what about the next one".
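
Roughly what I have in mind, as a toy model only. This is not Fossil's sync code; `ArtifactStore` and its methods are invented for illustration, and my understanding of the real behavior may be wrong.

```python
import hashlib


class ArtifactStore:
    """Toy content-addressed store keyed by SHA-1 of the content."""

    def __init__(self):
        self._blobs = {}

    def receive(self, claimed_id: str, content: bytes) -> bool:
        """Store content under claimed_id; return True only if it was new."""
        if claimed_id in self._blobs:
            # "I already have that ID": the existing content is kept.
            return False
        if hashlib.sha1(content).hexdigest() != claimed_id:
            raise ValueError("content does not hash to claimed ID")
        self._blobs[claimed_id] = content
        return True

    def get(self, artifact_id: str) -> bytes:
        return self._blobs[artifact_id]


store = ArtifactStore()
good = b"honest check-in\n"
good_id = hashlib.sha1(good).hexdigest()
print(store.receive(good_id, good))
# Stand-in for a crafted artifact that collides with good_id.
print(store.receive(good_id, b"evil check-in\n"))
print(store.get(good_id))
```

In this model a colliding artifact that arrives second is simply ignored, so the attacker would have to get the crafted artifact in first, or tamper with the repository file directly.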


> Let me restate that last point, to be doubly clear: If Bruce “security
> theater” Schneier is worried about SHA-1, *I* am worried about SHA-1.
>
>   https://www.schneier.com/blog/archives/2005/02/sha1_broken.html
>   https://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html
>   https://www.schneier.com/blog/archives/2012/10/when_will_we_se.html
>   https://konklone.com/post/why-google-is-hurrying-the-web-to-kill-sha-1
>
> The first two links talk about an attack that made it possible to generate
> a hash collision with difficult-to-obtain levels of technology…in 2005.
> That’s 6 Moore’s Law generations ago, which comes to about a factor of 100
> in CPU cycles per dollar.
>
> The third link gives a budgetary estimate of what it took to attack SHA-1
> in 2012, with projections into the future that do not include an estimated
> rate of change in attack effectiveness.  Attacks never get weaker, only
> stronger.
>
> If you’re only thinking of maladjusted individuals and bottom-feeding
> criminal gangs doing this, you probably haven’t considered that there might
> be at least one major world government which would like to covertly insert
> a bit of code into a widely-used open source project.


I wasn't really thinking of who might want to do it, just that sha1 isn't
being used for cryptographic security and that would be covered by other
means (GPG for example).

Thanks for the thoughtful response vs the (all too often on the internet)
approach of questioning my parentage or intellect. :)

-- 
Scott Robison


Re: [fossil-users] Why Hash

2015-09-14 Thread Warren Young
On Sep 14, 2015, at 12:11 PM, Scott Robison  wrote:
> 
> > Fossil would be free to switch to a different algorithm later if that 
> > seemed like a good idea.
> 
> Is this really a problem? Given that the checkin ID is generated from a 
> structured manifest file which is generated in part from sha1 hash values 
> from all included artifacts, it seems intractable to create a deliberately 
> colliding hash.

If I were a black hat — and please realize that I have zero practice trying to 
be one, so assume that a real black hat would be as much better at this as 
Mario Andretti is better than me at driving really fast — and I wanted to 
attack someone else’s Fossil repo, I would consider its use of SHA-1 as at 
least “hopeful.”

The first line of defense is the passwords of valid committers, which 
presumably contain much less than 160 bits of entropy.  All you need to do is 
find one weak password.  And if that seems like an impossible thing to you, you 
haven’t been paying attention to the computer security news.
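
For a sense of scale, a back-of-the-envelope sketch. `entropy_bits` is a made-up helper, and it assumes every character is chosen uniformly at random, which is generous for human-chosen passwords.

```python
import math


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Upper bound on entropy of a random string, in bits."""
    return length * math.log2(alphabet_size)


# Six hex digits, the shape of the initial password "fossil new" prints.
print(entropy_bits(16, 6))
# Eight random characters drawn from a-z, A-Z, 0-9.
print(round(entropy_bits(62, 8), 1))
# A SHA-1 hash, for comparison.
print(entropy_bits(2, 160))
```

Even the random eight-character case comes in under 50 bits, nowhere near 160.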

So now you have checkin privileges on someone else’s Fossil repo.  Now what?  
Obviously you could just commit evil code to the trunk, but it would be much 
neater if you could insert it into an arbitrary point in the checkin tree, if 
for no other reason than to hide it from the timeline page, to reduce your 
chances of getting caught.

So yes, the question really does become, how difficult is it to forge a 
consistent yet bogus SHA-1 hash?  If the crypto folk are worried about it — and 
a more conservative bunch of computer scientists you will not find — I’d say 
there is probably cause to be worried.

Let me restate that last point, to be doubly clear: If Bruce “security theater” 
Schneier is worried about SHA-1, *I* am worried about SHA-1.

  https://www.schneier.com/blog/archives/2005/02/sha1_broken.html
  https://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html
  https://www.schneier.com/blog/archives/2012/10/when_will_we_se.html
  https://konklone.com/post/why-google-is-hurrying-the-web-to-kill-sha-1

The first two links talk about an attack that made it possible to generate a 
hash collision with difficult-to-obtain levels of technology…in 2005.  That’s 6 
Moore’s Law generations ago, which comes to about a factor of 100 in CPU cycles 
per dollar.
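
The arithmetic behind that factor, as a quick sketch. The doubling periods are the usual rule-of-thumb figures, not measurements, and `growth_factor` is a made-up helper.

```python
def growth_factor(years: float, months_per_doubling: float) -> float:
    """Growth after the given span if capacity doubles at a fixed period."""
    doublings = years * 12 / months_per_doubling
    return 2 ** doublings


years = 2015 - 2005
for months in (18, 24):
    doublings = years * 12 / months
    print(f"{months}-month doubling: {doublings:.1f} doublings, "
          f"factor of about {growth_factor(years, months):.0f}")
```

Exactly 6 doublings would be a factor of 64. Ten years at 18 months per doubling is closer to 6.7 doublings, which is where the factor of roughly 100 comes from; at a slower 24-month pace it would be 32.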

The third link gives a budgetary estimate of what it took to attack SHA-1 in 
2012, with projections into the future that do not include an estimated rate of 
change in attack effectiveness.  Attacks never get weaker, only stronger.

If you’re only thinking of maladjusted individuals and bottom-feeding criminal 
gangs doing this, you probably haven’t considered that there might be at least 
one major world government which would like to covertly insert a bit of code 
into a widely-used open source project.