Re: [gentoo-dev] Re: git security (SHA-1)
On 2014-09-20 at 21:20:34, Rich Freeman ri...@gentoo.org wrote:

On Sat, Sep 20, 2014 at 8:58 PM, Gordon Pettey petteyg...@gmail.com wrote:

You're following the wrong train down the wrong tracks. Git [0-9a-f]{40} is to CVS 1[.][1-9][0-9]+. You're arguing that CVS is more secure because its commits are sequential numbers.

Ulrich is well aware of that. His argument is that with cvs there is no security whatsoever in the scm, and so there is more interest in layering security on top. With git there is more of a tendency to rely on the less-than-robust commit signing system.

We could always just keep full manifests in the tree and be no worse off than with cvs.

And we would be no better off than with CVS. We'd have a huge repository with a lot of redundant space-eating data and the impossibility of sane merges or rebases.

--
Best regards,
Michał Górny
Re: [gentoo-dev] Re: git security (SHA-1)
On Sun, 21 Sep 2014, Michał Górny wrote:

Rich Freeman ri...@gentoo.org wrote:

Ulrich is well aware of that. His argument is that with cvs there is no security whatsoever in the scm, and so there is more interest in layering security on top. With git there is more of a tendency to rely on the less-than-robust commit signing system.

We could always just keep full manifests in the tree and be no worse off than with cvs.

And we would be no better off than with CVS. We'd have a huge repository with a lot of redundant space-eating data and the impossibility of sane merges or rebases.

Not necessarily. As long as you keep write access to the repository secure, you don't need anything special there. However, it's a different story when the tree is distributed via a mirror system that is not entirely under our control. Full manifests could be generated automatically (and signed with an infra key) when copying the tree from the repository to the master mirror.

Ulrich
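The manifest generation Ulrich describes could be sketched roughly as below. This is only an illustration under stated assumptions: the function name `build_manifest` and the `SHA256 <digest> <path>` line layout are hypothetical and do not reproduce Portage's actual Manifest format, and the signing step with an infra key is deliberately left out (it would be a separate step performed on the generated file).

```python
import hashlib
import os


def build_manifest(tree_root):
    """Return sorted lines of the form 'SHA256 <hexdigest> <relative path>'.

    Hypothetical layout for illustration; not the real Manifest format.
    """
    lines = []
    for dirpath, dirnames, filenames in os.walk(tree_root):
        # Prune VCS metadata so only distributed files are covered.
        dirnames[:] = [d for d in dirnames if d not in (".git", "CVS")]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            rel = os.path.relpath(path, tree_root).replace(os.sep, "/")
            lines.append(f"SHA256 {digest} {rel}")
    return sorted(lines)
```

Because the manifest is produced only at the copy-to-mirror step, it never enters the repository history, so it would not interfere with merges or rebases there.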
Re: [gentoo-dev] Re: git security (SHA-1)
Ulrich Mueller:

Full manifests could be generated automatically (and signed with an infra key) when copying the tree from the repository to the master mirror.

Would you like to implement it?
Re: [gentoo-dev] Re: git security (SHA-1)
On 2014-09-21 at 09:54:06, Ulrich Mueller u...@gentoo.org wrote:

On Sun, 21 Sep 2014, Michał Górny wrote:

Rich Freeman ri...@gentoo.org wrote:

Ulrich is well aware of that. His argument is that with cvs there is no security whatsoever in the scm, and so there is more interest in layering security on top. With git there is more of a tendency to rely on the less-than-robust commit signing system.

We could always just keep full manifests in the tree and be no worse off than with cvs.

And we would be no better off than with CVS. We'd have a huge repository with a lot of redundant space-eating data and the impossibility of sane merges or rebases.

Not necessarily. As long as you keep write access to the repository secure, you don't need anything special there. However, it's a different story when the tree is distributed via a mirror system that is not entirely under our control. Full manifests could be generated automatically (and signed with an infra key) when copying the tree from the repository to the master mirror.

Do you really consider keeping a key open for machine signing somewhat secure?

--
Best regards,
Michał Górny
Re: [gentoo-dev] Re: git security (SHA-1)
On Sun, 21 Sep 2014, Michał Górny wrote:

Do you really consider keeping a key open for machine signing somewhat secure?

You mean, as compared to manifests (or commits) signed by 250 different developers' keys?

Ulrich
Re: [gentoo-dev] Re: git security (SHA-1)
On Sun, Sep 21, 2014 at 2:13 PM, Ulrich Mueller u...@gentoo.org wrote:

On Sun, 21 Sep 2014, Michał Górny wrote:

Do you really consider keeping a key open for machine signing somewhat secure?

You mean, as compared to manifests (or commits) signed by 250 different developers' keys?

Ulrich

Unrelated to the git discussion: in the past we discussed co-signing, so that a developer signs using a short-term key, and infra co-signs using a long-term key if the developer's signature is valid at that time. Portage infra should rely on the infra key signature, while traceability remains available down to the developer.

I will take the opportunity of responding to write that my preference is to keep the manifest signature detached from the version management technology, with no git-specific feature usage, nor git-specific development (signed hrefs). It will enable much easier use of each technology, one for file management and the other for security, while enabling rebase and reorg without affecting integrity.

If we can establish co-signing I will be very happy.

Regards,
Alon
Re: [gentoo-dev] Re: git security (SHA-1)
Ulrich Mueller:

On Sun, 21 Sep 2014, Michał Górny wrote:

Do you really consider keeping a key open for machine signing somewhat secure?

You mean, as compared to manifests (or commits) signed by 250 different developers' keys?

That's the actual security problem in gentoo: 250 developers (which will not be fixed by SHA256 and not by an infra key).

I think this discussion is derailing and unrelated to practical security, but you keep talking _only_ about hashes instead of... well, security, which is not just about maths, but also about probability, resources, configuration and project structure.

If you keep pushing in this direction without an implementation that solves it, then you will just have no one care about the git migration any more.
Re: [gentoo-dev] Gentoo git workflows and the stabilization/keywording process
Hi!

On Fri, 19 Sep 2014, hasufell wrote:

Tobias Klausmann:

If this should really turn out to be a problem, then we could also:

4) Replace git's default merge driver by our own one that is better suited for ebuilds. This can be done per repository via .git/config and .gitattributes.

Certainly that would be even more helpful! Still, all of these scenarios cause merge commits

No.

1. git pull --rebase=preserve origin master
   => error: could not apply commit... commit-msg
2. fix conflicts via 'git mergetool' (e.g. meld or vimdiff with 3 panel view... very easy to see what happened)
3. finish rebase via 'git rebase --continue'
   => your unpushed keyword commit has been rewritten without a merge commit
4. push

See, this is why I asked: I was not aware of this (and have pointed out repeatedly that I'd be delighted to be educated).

That is pretty easy and takes you ~20s for a keyword merge. What's the problem?

The problem is that not everyone has deep knowledge of git. I asked because I wanted to know what (if anything) we can do about a problem I perceived.

When we do the migration, there _will_ be confusion and breakage and those who actually have deep knowledge will likely cringe a lot. Documentation is the way out of that.

Regards,
Tobias

--
printk("Penguin %d is stuck in the bottle.\n", i);
linux-2.0.38/arch/sparc/kernel/smp.c
Re: [gentoo-dev] Gentoo git workflows and the stabilization/keywording process
Tobias Klausmann:

When we do the migration, there _will_ be confusion and breakage and those who actually have deep knowledge will likely cringe a lot. Documentation is the way out of that.

https://wiki.gentoo.org/wiki/Gentoo_git_workflow

But so far, not many people have been particularly interested in the details of these things. I'm also not sure if the ML is the right way to figure out these details.
Re: [gentoo-dev] Gentoo git workflows and the stabilization/keywording process
On 2014-09-18 at 19:39:08, Tobias Klausmann klaus...@gentoo.org wrote:

Since we're causing at least mild upheaval process-wise, I thought I'd bring up a topic that will be exacerbated by the git migration if it's not really addressed.

AIUI, we try to avoid merge conflicts, unless the merge is a meaningful integration of divergent processes. However, one aspect of how ebuilds are written these days will cause a non-trivial amount of merge commits that are not actually useful in that sense. This is due to the way keywording and stabilization work on an ebuild level. Since keywords are all in one line, any merge tool will barf on two keywords being changed in disparate clones. I.e. if I change ~alpha->alpha while someone else changes ~amd64->amd64, we will have a merge conflict.

If someone stabilizes the package you have edited, then most likely you actually want to edit your commits and move the changes to a revbump.

If at all, I'd be more worried by a case when queued version bumps would lose keywords that were added in the meantime to older versions.

--
Best regards,
Michał Górny
Re: [gentoo-dev] Gentoo git workflows and the stabilization/keywording process
On Sun, 21 Sep 2014, hasufell wrote:

https://wiki.gentoo.org/wiki/Gentoo_git_workflow

But so far, not many people have been particularly interested in the details of these things. I'm also not sure if the ML is the right way to figure out these details.

Where else should this be discussed then? Discussion page of the wiki page?

| commit policy
| • atomic commits (one logical change)

A version bump plus cleaning up older ebuilds will be considered one logical change, I suppose?

| • commits may span across multiple ebuilds/directories if it's one
|   logical change
| • every commit on the left-most line of the history (that is, all
|   the commits following the first parent of each commit) must be gpg
|   signed by a gentoo dev
| • repoman must be run from all related ebuild directories (or
|   related category directories or top-level directory) on the tip of
|   the local master branch (as in: right before you push and also
|   after resolving push-conflicts)

Have you tested if running repoman in the top-level directory is realistic as part of the workflow?

| commit message format
| • all lines max 70-75 chars
| • first line brief explanation
| • second line always empty
| • optional detailed multiline explanation must start at the third
|   line
| • for commits that affect only a single package, prepend
|   CATEGORY/PN: to the first line

In some cases of long package names, this may be in conflict with the first item. dev-python/rax-default-network-flags-python-novaclient-ext: as a prefix doesn't leave much space for an explanation. ;)

| • for commits that affect only a single package, but also modify
|   eclasses/profiles/licenses as part of a logical change, also
|   prepend CATEGORY/PN: to the first line
| • for commits that affect only the profile directory, prepend
|   profiles: to the first line
| • for commits that affect only the eclass directory, prepend
|   ECLASSNAME.eclass: to the first line

Maybe just eclass: would be slightly more systematic? (Because for all others it is a directory.)

| • for commits that affect only licenses directory, prepend
|   licenses: to the first line

Same for commits that affect licenses and profiles?

| • for commits that affect only metadata directory, prepend
|   metadata: to the first line
| • mass commits that affect a whole category (or large parts of it)
|   may prepend CATEGORY: to the first line

All in all, this looks sane to me. Is it planned to enforce this policy by repoman?

Ulrich
Re: [gentoo-dev] Gentoo git workflows and the stabilization/keywording process
On 09/21/2014 01:21 PM, Michał Górny wrote:

On 2014-09-18 at 19:39:08, Tobias Klausmann klausman-abrp7r+bbdudnm+yrof...@public.gmane.org wrote:

Since we're causing at least mild upheaval process-wise, I thought I'd bring up a topic that will be exacerbated by the git migration if it's not really addressed.

AIUI, we try to avoid merge conflicts, unless the merge is a meaningful integration of divergent processes. However, one aspect of how ebuilds are written these days will cause a non-trivial amount of merge commits that are not actually useful in that sense. This is due to the way keywording and stabilization work on an ebuild level. Since keywords are all in one line, any merge tool will barf on two keywords being changed in disparate clones. I.e. if I change ~alpha->alpha while someone else changes ~amd64->amd64, we will have a merge conflict.

If someone stabilizes the package you have edited, then most likely you actually want to edit your commits and move the changes to a revbump.

If at all, I'd be more worried by a case when queued version bumps would lose keywords that were added in the meantime to older versions.

The case under discussion here was where the only edit made locally to the ebuild was, in fact, stabilizing the ebuild on a different architecture. In this case, the correct response would be to ensure that the final commit pushed (whether it be a merge commit or rebased) contains the stabilization for both arches, as there would be no need to revbump to add a stable keyword (in fact, I'd call that an incorrect resolution).

--
Jonathan Callen
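The resolution Jonathan describes, where the final commit carries the stabilization from both sides, amounts to a three-way merge resolved per architecture instead of per line. A minimal sketch follows; the function names `arch_of` and `merge_keywords` are hypothetical, the input is assumed to be the already-extracted KEYWORDS strings, and sorting by architecture name is a simplification of how keywords are usually ordered.

```python
def arch_of(keyword):
    """Strip a leading '~' (testing) or '-' (disabled) marker."""
    return keyword.lstrip("~-")


def merge_keywords(base, ours, theirs):
    """Three-way merge of KEYWORDS strings, decided per architecture."""
    base_map = {arch_of(k): k for k in base.split()}
    ours_map = {arch_of(k): k for k in ours.split()}
    theirs_map = {arch_of(k): k for k in theirs.split()}
    merged = []
    for arch in sorted(set(base_map) | set(ours_map) | set(theirs_map)):
        b = base_map.get(arch)
        o = ours_map.get(arch)
        t = theirs_map.get(arch)
        if o == t or t == b:
            choice = o  # both sides agree, or only our side changed
        elif o == b:
            choice = t  # only their side changed
        else:
            raise ValueError(f"conflicting changes for {arch}: {o} vs {t}")
        if choice is not None:  # None means the keyword was dropped
            merged.append(choice)
    return " ".join(merged)
```

For the example from the thread, `merge_keywords("~alpha ~amd64", "alpha ~amd64", "~alpha amd64")` yields `"alpha amd64"`; only a genuine disagreement about the same architecture is reported as a conflict.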
Re: [gentoo-dev] Gentoo git workflows and the stabilization/keywording process
Jonathan Callen wrote:

the correct response would be to ensure that the final commit pushed (whether it be a merge commit or rebased) contains the stabilization for both arches

I think this is one of the things to check in a post-receive or post-update hook.

What is the easiest way to access keywords out of an ebuild - which does *not* require sourcing the ebuild in a shell?

//Peter
Re: [gentoo-dev] Gentoo git workflows and the stabilization/keywording process
On Sun, 21 Sep 2014, Peter Stuge wrote:

Jonathan Callen wrote:

the correct response would be to ensure that the final commit pushed (whether it be a merge commit or rebased) contains the stabilization for both arches

I think this is one of the things to check in a post-receive or post-update hook.

What is the easiest way to access keywords out of an ebuild - which does *not* require sourcing the ebuild in a shell?

A quick scan of the tree shows that combinations of any of the following would have to be handled:

- leading whitespace
- double or single quotes, or no quotation marks at all
- continuation lines, with or without backslash
- comments following the keywords assignment
- keywords inherited from an eclass

Of course, there is no guarantee that devs don't invent something new that isn't in the above list. :)

So I'd say there is no sane way other than sourcing with bash, unless we want to restrict by policy what is allowed as KEYWORDS syntax.

Ulrich
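To illustrate the trade-off Ulrich points out, here is a rough sketch of a parser that accepts only a restricted syntax and gives up on everything else. The name `simple_keywords` and the exact restriction (a single one-line literal assignment) are assumptions for illustration, not an existing policy or tool; anything it rejects would still need real bash sourcing.

```python
import re

# One single-line assignment: optional indentation, optional matching quotes,
# optional trailing comment. Expansions, backslashes and continuation lines
# are intentionally not matched.
_ASSIGN_RE = re.compile(
    r"""^[ \t]*KEYWORDS=(?P<quote>["']?)(?P<value>[^"'$`\\#\n]*)(?P=quote)[ \t]*(?:#.*)?$""",
    re.MULTILINE,
)


def simple_keywords(ebuild_text):
    """Return the keyword list, or None if the ebuild needs real bash."""
    # No assignment (possibly eclass-provided) or several of them: give up.
    if len(re.findall(r"\bKEYWORDS\+?=", ebuild_text)) != 1:
        return None
    match = _ASSIGN_RE.search(ebuild_text)
    if match is None:
        return None
    words = match.group("value").split()
    # An unquoted value cannot contain several words in bash.
    if not match.group("quote") and len(words) > 1:
        return None
    return words
```

Returning `None` instead of guessing keeps the sketch honest about its limits: leading whitespace, the three quoting styles and trailing comments are handled, while continuation lines and eclass-inherited keywords are reported as unparseable.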
Re: [gentoo-dev] Gentoo git workflows and the stabilization/keywording process
Ulrich Mueller:

On Sun, 21 Sep 2014, hasufell wrote:

https://wiki.gentoo.org/wiki/Gentoo_git_workflow

But so far, not many people have been particularly interested in the details of these things. I'm also not sure if the ML is the right way to figure out these details.

Where else should this be discussed then? Discussion page of the wiki page?

| commit policy
| • atomic commits (one logical change)

A version bump plus cleaning up older ebuilds will be considered one logical change, I suppose?

I'd consider it two logical changes (e.g. imagine a user complaining about ebuild removal... you cannot easily revert it if it's not a separate commit). But I don't have a strong opinion on that and I'm not sure if we can enforce commit rules in such fine-grained details, can we? Do you think this should be added explicitly?

| • commits may span across multiple ebuilds/directories if it's one
|   logical change
| • every commit on the left-most line of the history (that is, all
|   the commits following the first parent of each commit) must be gpg
|   signed by a gentoo dev
| • repoman must be run from all related ebuild directories (or
|   related category directories or top-level directory) on the tip of
|   the local master branch (as in: right before you push and also
|   after resolving push-conflicts)

Have you tested if running repoman in the top-level directory is realistic as part of the workflow?

This is really just meant for stuff like mass commits, not regular ebuild stuff. Ask patrick, he's running repoman all the time, probably even top-level.

| commit message format
| • all lines max 70-75 chars
| • first line brief explanation
| • second line always empty
| • optional detailed multiline explanation must start at the third
|   line
| • for commits that affect only a single package, prepend
|   CATEGORY/PN: to the first line

In some cases of long package names, this may be in conflict with the first item. dev-python/rax-default-network-flags-python-novaclient-ext: as a prefix doesn't leave much space for an explanation. ;)

I'd say that's rare enough to warrant exceptions in that case. Or are you suggesting to drop CATEGORY/PN completely? I mean... it's not strictly necessary, since you can just look at the files that have been touched and run git log on an ebuild directory, but I think it will still make using the history easier, especially since mass-commits will not have CATEGORY/PN and can very easily be identified. It's also common practice in various overlays.

Do you want me to add this exceptional case as a valid reason to have more than 75 chars in the first line?

| • for commits that affect only a single package, but also modify
|   eclasses/profiles/licenses as part of a logical change, also
|   prepend CATEGORY/PN: to the first line
| • for commits that affect only the profile directory, prepend
|   profiles: to the first line
| • for commits that affect only the eclass directory, prepend
|   ECLASSNAME.eclass: to the first line

Maybe just eclass: would be slightly more systematic? (Because for all others it is a directory.)

Not sure either. This is again just a convenience thing for history readability, not strictly necessary information. But I'd personally go for the full eclass name.

| • for commits that affect only licenses directory, prepend
|   licenses: to the first line

Same for commits that affect licenses and profiles?

You are probably referring to license_groups. I'd say it matters more what the intention is and not so much what directories in particular were touched. If you want to add a new license and have to edit profiles/license_groups, I'd still just prepend licenses: .

Maybe I should add this as a general idea to the commit guideline. I think that's better than speccing the whole thing through, which will probably just confuse people, no?

| • for commits that affect only metadata directory, prepend
|   metadata: to the first line
| • mass commits that affect a whole category (or large parts of it)
|   may prepend CATEGORY: to the first line

All in all, this looks sane to me. Is it planned to enforce this policy by repoman?

Good question. I'm not sure if it's a good idea though. It might get in our way for corner cases and whatnot. Maybe as a polite warning it would be ok.

Also, if you have ideas of wording something better in particular, please share or just edit the wiki.
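The "polite warning" idea could look roughly like the sketch below, which checks only the mechanical parts of the quoted commit message format. Everything here is an assumption for illustration: `check_commit_message` is a hypothetical helper, not a repoman feature, and the `package` argument stands in for whatever would determine that a commit touches a single package.

```python
MAX_LINE_LENGTH = 75  # upper end of the "70-75 chars" guideline


def check_commit_message(message, package=None):
    """Return a list of warnings; an empty list means nothing was flagged."""
    warnings = []
    lines = message.rstrip("\n").split("\n")
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_LINE_LENGTH:
            warnings.append(
                f"line {number} is longer than {MAX_LINE_LENGTH} characters"
            )
    if len(lines) > 1 and lines[1] != "":
        warnings.append("second line must be empty")
    # Only checked when the caller says the commit affects a single package.
    if package is not None and not lines[0].startswith(f"{package}: "):
        warnings.append(f"first line should start with '{package}: '")
    return warnings
```

Returning warnings rather than raising leaves room for the corner cases mentioned above, such as a very long CATEGORY/PN prefix pushing the first line past the limit.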
Re: [gentoo-dev] Gentoo git workflows and the stabilization/keywording process
hasufell wrote:

A version bump plus cleaning up older ebuilds will be considered one logical change, I suppose?

I'd consider it two logical changes .. But I don't have a strong opinion on that

I do - I think this is really important. Having clean history makes a huge difference for anyone who wants to use that history.

One argument against those clean professional development practices that comes up over and over is that it takes more time, (mimimi I don't have time to be part of any solution) which is sometimes true - but since git makes committing so easy usually the difference isn't very big, and the payoff when you benefit in the future is quite significant.

<rant>
Of course, lots of people still do not care at all about what is only a potential benefit in an uncertain future. Personally I might prefer that they stop doing open source instead of wasting my time with their whining. You pay forward, that's the point.
</rant>

Getting back on track, it's likely that first-time git users will have to get used to committing more often than with other VCSes.

and I'm not sure if we can enforce commit rules in such fine-grained details, can we?

It's impossible to catch all cases, and there will always be disagreement between individual developers as to what is actually appropriate, useful and correct. I think explicit consensus is impractical. :\

Do you think this should be added explicitly?

I think keeping rules vague is probably the only thing that somehow scales. I really don't know. I get a feeling that a given individual either understands this concept of atomic changes, or they don't. I haven't seen someone who at first didn't understand it (with someone explaining it to them, of course) come around and get it sometime later. :(

//Peter
Re: [gentoo-dev] Gentoo git workflows and the stabilization/keywording process
On Sun, Sep 21, 2014 at 9:08 PM, Peter Stuge pe...@stuge.se wrote:

hasufell wrote:

A version bump plus cleaning up older ebuilds will be considered one logical change, I suppose?

I'd consider it two logical changes .. But I don't have a strong opinion on that

I do - I think this is really important. Having clean history makes a huge difference for anyone who wants to use that history.

One argument against those clean professional development practices that comes up over and over is that it takes more time, (mimimi I don't have time to be part of any solution) which is sometimes true - but since git makes committing so easy usually the difference isn't very big, and the payoff when you benefit in the future is quite significant.

++

A git commit is virtually instantaneous since it is entirely local.

Do you think this should be added explicitly?

I think keeping rules vague is probably the only thing that somehow scales.

++

I think we should start out with decent guidelines, and then move on from there. Nobody is going to die if some of our commits are sloppy out of the gate. One of our biggest strengths as a distro is the autonomy we give individual developers, and guidelines are usually more productive than rules. If they get abused, we can deal with it.

--
Rich
Re: [gentoo-dev] Gentoo git workflows and the stabilization/keywording process
On Monday 22 September 2014 00:52:14 hasufell wrote:

| • repoman must be run from all related ebuild directories (or
|   related category directories or top-level directory) on the tip of
|   the local master branch (as in: right before you push and also
|   after resolving push-conflicts)

Have you tested if running repoman in the top-level directory is realistic as part of the workflow?

This is really just meant for stuff like mass commits, not regular ebuild stuff. Ask patrick, he's running repoman all the time, probably even top-level.

It's not. For a category like dev-python I'm seeing runtimes near 10 minutes on a 3.4 GHz Xeon. Scaling that down to more common hardware it's even prohibitive on smaller categories ... and at our current commit frequency, mwehehehehe. hehe. Heh.

It'd be an infinite pull-rebase-repoman-yell cycle.
[gentoo-dev] Re: Gentoo git workflows and the stabilization/keywording process
Rich Freeman posted on Sun, 21 Sep 2014 21:46:14 -0400 as excerpted:

On Sun, Sep 21, 2014 at 9:08 PM, Peter Stuge pe...@stuge.se wrote:

hasufell wrote:

A version bump plus cleaning up older ebuilds will be considered one logical change, I suppose?

I'd consider it two logical changes ... But I don't have a strong opinion on that

I do - I think this is really important. Having clean history makes a huge difference for anyone who wants to use that history.

One argument against those clean professional development practices that comes up over and over is that it takes more time... but since git makes committing so easy usually the difference isn't very big, and the payoff when you benefit in the future is quite significant.

++

A git commit is virtually instantaneous since it is entirely local.

Unlike CVS, git commit != git push, and understanding that is vital to effective use of git /as/ /git/.

Commit is local and fast; it HAS to be to encourage single-logical-change commits. But you can separately commit one or a dozen or a dozen hundred logical changes as part of a single set or a few sets of commits and push them all at once, /just/ once.

Devs doing gentoo all day could easily do one or two pushes a day, with many commits in each. Those with less time might do the same work over several days or a week and might push just once or twice that week, if none of the changes are time-critical enough to be worth a more urgent push.

--
Duncan - List replies preferred. No HTML msgs.
Every nonfree program has a lord, a master -- and if you use the program, he is your master. Richard Stallman