[gentoo-dev] Re: gentoo git workflow

2014-09-14 Thread Duncan
C. Bergström posted on Mon, 15 Sep 2014 02:49:48 +0700 as excerpted:

> Pretty please do NOT allow "merge" commits.. they are the bane of evil
> for the long term ability to have any sane work-flow. Try browsing a
> commit history after a big merge commit.. or following the parent..

You just called the inventor of git and the workflow it was designed to 
manage insane.  If that's the case, arguably quite a bit more insanity 
would be a good thing! =:^)

Try git log --color --stat --graph on the mainline kernel in a terminal 
and read only the main merge-commit logs unless that merge is something 
of special interest you want more info on.  It actually makes following a 
full kernel cycle, including the commit window, drilling down to sub-
merges and individual commits only on 2-3 areas of interest while keeping 
a general awareness of developments in the rest of the kernel not only 
practical once again, but relatively easy.  Without seeing merge-commits 
it was a LOT harder.  I know, as I've done it both ways, and while I can 
get around in git to some extent, my git skills are definitely nothing 
special!  =:^)
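Duncan's suggested view can be tried on any repository with merge commits. A minimal sketch, using a throwaway repo built in a temp dir (the file names and identities are made up for the demo; on a real tree you'd just run the log command):

```shell
# Build a tiny repo with one feature branch and a merge commit, then
# browse only the merge commits -- the "read only the main merge-commit
# logs" view from the post.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email dev@example.com   # hypothetical identity
git config user.name "Example Dev"
echo base > file && git add file && git commit -qm "initial commit"
git checkout -qb feature
echo change > file && git commit -qam "feature: tweak file"
git checkout -q -                       # back to the starting branch
git merge -q --no-ff -m "Merge branch 'feature'" feature
# Only merge commits, one line each; add --stat --graph for the full view:
merges=$(git log --merges --oneline)
echo "$merges"
```

On the mainline kernel the same `git log --color --stat --graph` invocation shows one summary line per subsystem merge, which is what makes skimming a whole release cycle practical.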

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman




[gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Duncan
hasufell posted on Sun, 14 Sep 2014 13:50:32 + as excerpted:

> Jauhien Piatlicki:
>> 
>> Again, how will user check the integrity and authenticity if Manifests
>> are unsigned?

> There is no regression if this isn't solved.

> People who really care use emerge-webrsync.
> If we use the proposed solution, then there is an additional method via
> the User syncing repo, so it's a win.

Absolutely.  emerge-webrsync is the current "I care enough to worry about 
it" method, and this already adds the user-sync git repo as an "I care" 
option.  Leaving standard rsync users where they already are isn't a 
regression and shouldn't be a blocker.  Don't let the perfect be the 
enemy of the imperfect but better! =:^)





Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Tim Harder
On 2014-09-14 21:57, Kent Fredric wrote:
> I generate metadata for the perl-experimental overlay periodically as a
> snapshotted variation of the same, and the performance isn't so bad.

Overlays with few eclasses are much different from the main tree.
Anyway, egencache isn't bad; it's just significantly slower than the
alternatives, so it could be sped up quite a lot if necessary.

> However, what I suspect you *could* do with a push hook is regen metadata
> for only things that were modified in that commit, because I believe
> there's a way to regen metadata for only specific files now.

> ie:
>  modifications to cat/PN *would* trigger a metadata update, but only for
> that cat/PN
>  modifications to eclass/* would *NOT* trigger a metadata update as part of
> the push.

> And doing tree-wide "an eclass was changed" updates could be done with
> lower priority in an asynchronous cron job or something so as not to block
> workflow for several minutes/hours/whatever while some muppet sits there
> watching "git push" do nothing.

If we need to do piecewise regen it seems we would be better off just
sticking with the current scheduled cron job approach. Otherwise it
sounds like one could pull updates without having the correct metadata
for a significant portion of the tree.

Tim




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 13:30, Tim Harder  wrote:

> I haven't run portage metadata regen on a beefy machine lately, but I
> don't think it could keep up in all cases. Perhaps someone can prove me
> wrong.
>
> Anyway, things could definitely be sped up if portage merges a few speed
> tweaks used in pkgcore. Specifically, I think using some of the weakref
> and perhaps jitted attrs support along with the eclass caching hacks
> would give a 2-4x metadata regen speedup. Otherwise pkgcore could
> potentially be used to regen metadata as well or some other tuned regen
> tool.
>


I generate metadata for the perl-experimental overlay periodically as a
snapshotted variation of the same, and the performance isn't so bad.

However, what I suspect you *could* do with a push hook is regen metadata
for only things that were modified in that commit, because I believe
there's a way to regen metadata for only specific files now.

ie:
 modifications to cat/PN *would* trigger a metadata update, but only for
that cat/PN
 modifications to eclass/* would *NOT* trigger a metadata update as part of
the push.

And doing tree-wide "an eclass was changed" updates could be done with
lower priority in an asynchronous cron job or something so as not to block
workflow for several minutes/hours/whatever while some muppet sits there
watching "git push" do nothing.
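The filtering step Kent describes can be sketched in a few lines of shell. This is a hypothetical hook fragment, not anything Gentoo infra runs: the input list stands in for `git diff --name-only old..new`, and the egencache invocation at the end is an assumption, so it is left commented out.

```shell
# Given the files touched by a push, collect the unique cat/PN atoms
# whose ebuilds changed, and ignore eclass/ changes (those would be
# handled later by a lower-priority cron job, per the post).
changed="dev-lang/perl/perl-5.20.0.ebuild
eclass/perl-module.eclass
app-misc/foo/foo-1.0.ebuild
app-misc/foo/foo-1.1.ebuild"            # stand-in for: git diff --name-only old..new

atoms=$(echo "$changed" \
  | grep '^[^/]*/[^/]*/.*\.ebuild$' \
  | cut -d/ -f1-2 \
  | sort -u)
echo "$atoms"

# for atom in $atoms; do
#   egencache --update "$atom"          # hypothetical invocation
# done
```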

-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Tim Harder
On 2014-09-14 10:46, Michał Górny wrote:
> On 2014-09-14, at 15:40:06, Davide Pesavento wrote:
> > How long does the md5-cache regeneration process take? Are you sure it
> > will be able to keep up with the rate of pushes to the repo during
> > "peak hours"? If not, maybe we could use a time-based thing similar to
> > the current cvs->rsync synchronization.
> 
> This strongly depends on how much data is there to update. A few
> ebuilds are quite fast, eclass change isn't ;). I was thinking of
> something along the lines of, in pseudo-code speaking:
> 
>   systemctl restart cache-regen
> 
> That is, we start the regen on every update. If it finishes in time, it
> commits the new metadata. If another update occurs during regen, we
> just restart it to let it catch the new data.
> 
> Of course, if we can't spare the resources to do intermediate updates,
> we may as well switch to cron-based update method.

I don't see per-push metadata regen working entirely well in this case
if this is the only way we're generating the metadata cache for users to
sync. It's easy to imagine a plausible situation where a widely used
eclass change is made followed by commits less than a minute apart (or
shorter than however long it would take for metadata regen to occur) for
at least 30 minutes (rsync refresh period for most user-facing mirrors)
during a time of high activity.

I haven't run portage metadata regen on a beefy machine lately, but I
don't think it could keep up in all cases. Perhaps someone can prove me
wrong.

Anyway, things could definitely be sped up if portage merges a few speed
tweaks used in pkgcore. Specifically, I think using some of the weakref
and perhaps jitted attrs support along with the eclass caching hacks
would give a 2-4x metadata regen speedup. Otherwise pkgcore could
potentially be used to regen metadata as well or some other tuned regen
tool.

Tim




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 13:06, Peter Stuge  wrote:

> even after
> the commits.
>

I've even made commits in "detached HEAD" state (that is, without a
branch) and given them branches after the fact.

After all, branches aren't really "things"; they're just pointers to SHA1s
that get repointed to new SHA1s as part of "git commit".

Tags are also simply pointers, they just don't get updated by default.
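The branch-after-the-fact trick looks like this in practice (throwaway repo, made-up file names and identity):

```shell
# Commit in detached HEAD state -- no branch at all -- then attach a
# branch name to the resulting commit after the fact.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email dev@example.com   # hypothetical identity
git config user.name "Example Dev"
echo a > f && git add f && git commit -qm "initial"
git checkout -q --detach                # now in "detached HEAD"
echo b > f && git commit -qam "work without a branch"
sha=$(git rev-parse HEAD)
git branch after-the-fact "$sha"        # the branch is just a pointer
git for-each-ref --format='%(refname:short) %(objectname)' refs/heads
```

The `git branch` call writes nothing but a 41-byte ref file pointing at the SHA1, which is why branches are so cheap to create at any time.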



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Peter Stuge
Rich Freeman wrote:
> If you just want to do 15 standalone commits before you push you can
> do those sequentially easily enough.  A branch would be more
> appropriate for some kind of mini-project.
..
> That is the beauty of git - branches are really cheap.
> So are repositories

And commits.

Not only are branches cheap, they are also very easy to create, and
maybe most importantly they can be created at any time, even after
the commits.

It's quick and painless to create a bunch of commits which aren't
really closely related in sequence, and only later clean the whole
series of commits up while creating different branches for commits
which should actually be grouped rather than mixed all together.


//Peter



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Peter Stuge
Patrick Lauer wrote:
> > > That'd mean I need half a dozen checkouts just to emulate cvs, which
> > > somehow doesn't make much sense to me ...
> > 
> > Unlike CVS, git doesn't force you to work in "Keep millions of files in
> > uncommitted states" mode just to work on a codebase, due to the commit <->
> > replicate separation.
> 
> But that's the feature!

You can have millions of uncommitted files with git too. The person
who creates a commit always decides what changes in what files should
be included in that commit. (You don't even have to commit all the
changes within one file at the same time.)

There are some shortcuts for committing all uncommitted changes at
once but you don't have to do that. I frequently only commit little
bits of my currently uncommitted changes.
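Committing only a subset of the uncommitted changes is exactly what the staging area is for. A sketch with made-up file names and identity (Patrick's postgresql/python scenario in miniature):

```shell
# Two unrelated changes sit in the working tree; only one is staged
# and committed, the other stays uncommitted and untouched.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email dev@example.com   # hypothetical identity
git config user.name "Example Dev"
echo one > pg.ebuild && echo two > py.ebuild
git add pg.ebuild py.ebuild && git commit -qm "initial"
echo bump > pg.ebuild                   # long-running postgresql work
echo fix > py.ebuild                    # quick python fix
git add py.ebuild                       # stage only the python change
git commit -qm "py: quick fix"
status=$(git status --porcelain)        # pg.ebuild is still uncommitted
echo "$status"
```

`git add -p` takes this further and stages individual hunks, which is what Peter means by not committing all changes within one file at the same time.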


> I can work on bumping postgresql (takes about 1h walltime to compile and test 
> all versions) *and* work on a few tiny python packages while doing that. 
> Without breaking either process. Without multiple checkouts.

Same with git.


//Peter



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Rich Freeman
On Sun, Sep 14, 2014 at 7:21 PM, Patrick Lauer  wrote:
>
> iow, git doesn't allow people to work on more than one item at a time?
>
> That'd mean I need half a dozen checkouts just to emulate cvs, which somehow
> doesn't make much sense to me ...
>

Well, you can work on as many things as you like in git, but it
doesn't keep track of what changes have to do with what things if you
don't commit in-between.  So, you'll have a big list of changes in
your index, and you'll have to pick-and-choose what you commit at any
one time.

If you really want to work on many things "at once" the better way to
do it is to do a temporary branch per-thing, and when you switch
between things you switch between branches, and then move into master
things as they are done.

I assume you mean working on things that will take a while to
complete.  If you just want to do 15 standalone commits before you
push you can do those sequentially easily enough.  A branch would be
more appropriate for some kind of mini-project.

You can work on branches without pushing those to the master repo.
Or, if appropriate, a project team might choose to push their branch to
master, or to some other repo (like an overlay).  This would allow
collaborative work on a large commit, with a quick final merge into
the main tree.  That is the beauty of git - branches are really cheap.
So are repositories - if somebody wants to do all their work in github
and then push to the main tree, they can do that.

--
Rich



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Patrick Lauer:
> On Monday 15 September 2014 11:27:34 Kent Fredric wrote:
>> On 15 September 2014 11:21, Patrick Lauer  wrote:
>>> iow, git doesn't allow people to work on more than one item at a time?
>>>
>>> That'd mean I need half a dozen checkouts just to emulate cvs, which
>>> somehow
>>> doesn't make much sense to me ...
>>
>> Use the Stash. Or just commit items, then swap branches, and then discard
>> the commits sometime later before pushing.
>>
>> Unlike CVS, git doesn't force you to work in "Keep millions of files in
>> uncommitted states" mode just to work on a codebase, due to the commit <->
>> replicate separation.
> But that's the feature!
> 
> I can work on bumping postgresql (takes about 1h walltime to compile and test 
> all versions) *and* work on a few tiny python packages while doing that. 
> Without breaking either process. Without multiple checkouts.
> 
> I doubt stash would allow things to progress ... but it's a cute idea.
> 

Please read up about git branches.

I don't see anything particularly broken. People use git to work on 10+
different features at a time. It works.

Also, let's not derail this thread to git vs CVS, thanks.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 11:25, hasufell  wrote:

> Robin said
> > The Git commit-signing design explicitly signs the entire commit,
> including blob contents, to avoid this security problem.
>
> Is this correct or not?
>

I can verify a commit by hand with only the commit object and gpg, but
without any of the trees or parents.

https://gist.github.com/kentfredric/8448fe55ffab7d314ecb




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Patrick Lauer:
> On Sunday 14 September 2014 15:42:15 hasufell wrote:
>> Patrick Lauer:
 Are we going to disallow merge commits and ask devs to rebase local
 changes in order to keep the history "clean"?
>>>
>>> Is that going to be sane with our commit frequency?
>>
>> You have to merge or rebase anyway in case of a push conflict, so the
>> only difference is the method and the effect on the history.
>>
>> Currently... CVS allows you to run repoman on an outdated tree and push
>> broken ebuilds with repoman being happy. Git will not allow this.
> 
> iow, git doesn't allow people to work on more than one item at a time?
> 

Completely the opposite. You can work on 400 packages, accumulate the
changes, commit them, and push them in one go instead of writing
fragile scripts or Makefiles that do >400 pushes, fail at some point in
the middle because of a conflict, and then leave you trying to figure
out what you already pushed and what you haven't.

> That'd mean I need half a dozen checkouts just to emulate cvs, which somehow 
> doesn't make much sense to me ...
> 

checkouts? You probably mean that you have to rebase your changes in
case someone pushed before you. That makes perfect sense, because the
ebuild you just wrote might be broken by now, because someone changed
profiles/.

We are talking about a one-liner in the shell that will work in the
majority of cases. If it doesn't work (as in: a merge conflict), then
something is REALLY wrong and two people are working uncoordinated on
the same file at the same time.
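The push-race scenario and the one-liner fix can be played out end to end. Everything here is a made-up demo (bare "origin", two clones, hypothetical identities):

```shell
# Two developers commit concurrently; alice pushes first, so bob's
# push would be rejected.  The fix is a rebase, not a merge commit.
set -e
work=$(mktemp -d); cd "$work"
git init -q --bare origin.git
git clone -q origin.git alice
git -C alice config user.email alice@example.com
git -C alice config user.name Alice
( cd alice && echo base > profiles && git add profiles \
  && git commit -qm "initial" && git push -q origin HEAD )
git clone -q origin.git bob
git -C bob config user.email bob@example.com
git -C bob config user.name Bob
# Both commit locally; alice pushes first.
( cd bob && echo b > b.ebuild && git add b.ebuild && git commit -qm "bob: add b" )
( cd alice && echo a > a.ebuild && git add a.ebuild \
  && git commit -qm "alice: add a" && git push -q origin HEAD )
# bob's plain "git push" would now be rejected (non-fast-forward).
# The one-liner: replay local commits on top of what was pushed.
git -C bob pull -q --rebase && git -C bob push -q origin HEAD
merges=$(git -C bob rev-list --merges --count HEAD)   # 0: history stayed linear
echo "merge commits: $merges"
```

Because the two commits touch different files, the rebase applies cleanly; a conflict here would mean two people edited the same file uncoordinated, which is the "something REALLY wrong" case.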



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Patrick Lauer
On Monday 15 September 2014 11:27:34 Kent Fredric wrote:
> On 15 September 2014 11:21, Patrick Lauer  wrote:
> > iow, git doesn't allow people to work on more than one item at a time?
> > 
> > That'd mean I need half a dozen checkouts just to emulate cvs, which
> > somehow
> > doesn't make much sense to me ...
> 
> Use the Stash. Or just commit items, then swap branches, and then discard
> the commits sometime later before pushing.
> 
> Unlike CVS, git doesn't force you to work in "Keep millions of files in
> uncommitted states" mode just to work on a codebase, due to the commit <->
> replicate separation.
But that's the feature!

I can work on bumping postgresql (takes about 1h walltime to compile and test 
all versions) *and* work on a few tiny python packages while doing that. 
Without breaking either process. Without multiple checkouts.

I doubt stash would allow things to progress ... but it's a cute idea.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread W. Trevor King
On Sun, Sep 14, 2014 at 11:25:33PM +, hasufell wrote:
> So can we get this clear now.
> 
> Robin said
>
> > The Git commit-signing design explicitly signs the entire commit,
> > including blob contents, to avoid this security problem.
> 
> Is this correct or not?

That is false.  The commit signature explicitly signs the commit,
which includes the root tree hash.  That is the only connection
between the signature and the tree contents.

Cheers,
Trevor

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 11:21, Patrick Lauer  wrote:

> iow, git doesn't allow people to work on more than one item at a time?
>
> That'd mean I need half a dozen checkouts just to emulate cvs, which
> somehow
> doesn't make much sense to me ...
>

Use the Stash. Or just commit items, then swap branches, and then discard
the commits sometime later before pushing.

Unlike CVS, git doesn't force you to work in "Keep millions of files in
uncommitted states" mode just to work on a codebase, due to the commit <->
replicate separation.




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Rich Freeman:
> On Sun, Sep 14, 2014 at 6:56 PM, hasufell  wrote:
>> According to Robin, it's not about rebasing, it's about signing all
>> commits so that messing with the blob (even if it has the same sha-1)
>> will cause signature verification failure.
>>
> 
> The only thing that gets signed is the commit message, and the only
> thing that ties the commit message to the code is the sha1 of the
> top-level tree.  If you can attack sha1 either at any tree level or at
> the blob level you can defeat the signature.
> 

So can we get this clear now.

Robin said
> The Git commit-signing design explicitly signs the entire commit, including 
> blob contents, to avoid this security problem.

Is this correct or not?



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread W. Trevor King
On Sun, Sep 14, 2014 at 07:13:21PM -0400, Rich Freeman wrote:
> The only thing that gets signed is the commit message, and the only
> thing that ties the commit message to the code is the sha1 of the
> top-level tree.  If you can attack sha1 either at any tree level or at
> the blob level you can defeat the signature.
> 
> That is way better than nothing though - I think it is worth pursuing
> until somebody comes up with a way to upgrade git to more secure
> hashes.  Most projects don't gpg sign their trees at all, including
> linux.

I'm not worried about the attack (as I explained earlier in this
thread).  I'm just arguing for signing first-parent commits to master,
and not worrying about signatures on any side-branch commits.  So long
as the merge gets signed, you've got all the security you're going to
get.  Leaving the side-branch commits unchanged allows you to preserve
any non-dev commit hashes, which makes it easier for contributors to
verify that their changes have landed (the same way that GitHub is
checking to know when to automatically close pull requests).
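The first-parent line Trevor wants signed is what `git log --first-parent` walks: the merge commits appear, the side-branch commits stay hidden. A sketch with a made-up repo and identity (`git merge -S` would GPG-sign the merge itself, but is omitted here since it needs a configured key):

```shell
# Build master + one contributor branch, merge with --no-ff, then
# show only the first-parent history that a dev would sign.
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email dev@example.com   # hypothetical identity
git config user.name "Example Dev"
echo a > f && git add f && git commit -qm "initial"
git checkout -qb user-contrib
echo b > g && git add g && git commit -qm "user: side-branch work"
git checkout -q -
git merge -q --no-ff -m "Merge user-contrib (the commit that gets signed)" user-contrib
firstparent=$(git log --first-parent --oneline)
echo "$firstparent"
```

Since the merge names the side branch's tip by SHA1, signing the merge vouches for the whole branch while leaving the contributor's original commit hashes untouched.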

Cheers,
Trevor





Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 11:15, W. Trevor King  wrote:

> All cherry-pick and am do is apply one commit's diff to a different
> parent.  That changes the parent hash (which is stored in the commit
> body [1]), so old signatures won't apply to the new commit.  If there have
> been other tree changes between the initial parent and the new parent,
> the tree hash will also change, which would also break old signatures.
> None of that has anything to do with a malicious blob being pushed
> into the tree disguised as a same-hashed good blob.  Such a blob will
> *not* break any signatures, since GnuPG is *never hashing the blob
> contents* when signing commits [1,2].  You're only signing the commit
> object, not the tree and blob objects referenced by that commit.
>
> Cheers,
> Trevor
>


And given that the "security" against attacks is established by a chain of
custody from a signed commit through multiple unsigned child SHA1 objects,
an unsigned parent commit is no *less* secure than an unsigned tree or
file blob, so it doesn't make sense to me that "all" commits have to be
signed.  (Doing so doesn't give the security benefit we think it does.)

Thus, an "I signed this commit, establishing a chain of trust relying on
SHA1 integrity to the previous signed commit" is all that seems truly
necessary. Anything else is decreased utility with no increase in security.




[gentoo-dev] Last rites: net-misc/netcomics-cvs

2014-09-14 Thread Dion Moult
Masked for removal in 30 days. See bug #515028

Ancient and unmaintained.

-- 
Dion Moult



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Patrick Lauer
On Sunday 14 September 2014 15:42:15 hasufell wrote:
> Patrick Lauer:
> >> Are we going to disallow merge commits and ask devs to rebase local
> >> changes in order to keep the history "clean"?
> > 
> > Is that going to be sane with our commit frequency?
> 
> You have to merge or rebase anyway in case of a push conflict, so the
> only difference is the method and the effect on the history.
> 
> Currently... CVS allows you to run repoman on an outdated tree and push
> broken ebuilds with repoman being happy. Git will not allow this.

iow, git doesn't allow people to work on more than one item at a time?

That'd mean I need half a dozen checkouts just to emulate cvs, which somehow 
doesn't make much sense to me ...



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 10:56, hasufell  wrote:

> According to Robin, it's not about rebasing, it's about signing all
> commits so that messing with the blob (even if it has the same sha-1)
> will cause signature verification failure.
>

Correct me if I'm wrong, but wouldn't a SHA1 attack on the tree object or
file blobs be completely invisible to the commit SHA1?

As the signature only signs the content of the commit object, not any of
the nodes it refers to.

Granted, getting a tree/file object to replicate might be interesting.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread W. Trevor King
On Sun, Sep 14, 2014 at 10:56:33PM +, hasufell wrote:
> W. Trevor King:
> > On Sun, Sep 14, 2014 at 10:38:41PM +, hasufell wrote:
> >> So we'd basically end up using either "git cherry-pick" or "git
> >> am" for "pulling" user stuff, so that we also sign the blobs.
> > 
> > Rebasing the original commits doesn't protect you from the
> > birthday attach either, because the vulnerable hash is likely
> > going to still be in the rebased commit's tree.  All rebasing does
> > is swap the committer and drop the initial signature.
> 
> According to Robin, it's not about rebasing, it's about signing all
> commits so that messing with the blob (even if it has the same
> sha-1) will cause signature verification failure.

All cherry-pick and am do is apply one commit's diff to a different
parent.  That changes the parent hash (which is stored in the commit
body [1]), so old signatures won't apply to the new commit.  If there have
been other tree changes between the initial parent and the new parent,
the tree hash will also change, which would also break old signatures.
None of that has anything to do with a malicious blob being pushed
into the tree disguised as a same-hashed good blob.  Such a blob will
*not* break any signatures, since GnuPG is *never hashing the blob
contents* when signing commits [1,2].  You're only signing the commit
object, not the tree and blob objects referenced by that commit.
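What Trevor describes can be inspected directly: the signed payload is the commit object, which names the root tree only by its SHA1; blob contents never enter the signature computation. A throwaway-repo sketch (file name and identity made up):

```shell
# Dump the raw commit object -- the exact text GnuPG would sign -- and
# note that it references the tree by hash only, never blob contents.
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email dev@example.com   # hypothetical identity
git config user.name "Example Dev"
echo 'DESCRIPTION="demo"' > demo.ebuild
git add demo.ebuild && git commit -qm "add demo ebuild"
commitobj=$(git cat-file commit HEAD)   # tree, parents, author, message
echo "$commitobj"
tree=$(git rev-parse 'HEAD^{tree}')     # the only link to the contents
```

Swapping in a same-hashed malicious blob would leave `$commitobj`, and therefore any signature over it, byte-for-byte unchanged, which is the whole point of this subthread.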

Cheers,
Trevor

[1]: http://article.gmane.org/gmane.linux.gentoo.devel/77537
[2]: http://git.kernel.org/cgit/git/git.git/tree/commit.c?id=v2.1.0#n1076





Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Rich Freeman
On Sun, Sep 14, 2014 at 6:56 PM, hasufell  wrote:
> According to Robin, it's not about rebasing, it's about signing all
> commits so that messing with the blob (even if it has the same sha-1)
> will cause signature verification failure.
>

The only thing that gets signed is the commit message, and the only
thing that ties the commit message to the code is the sha1 of the
top-level tree.  If you can attack sha1 either at any tree level or at
the blob level you can defeat the signature.

That is way better than nothing though - I think it is worth pursuing
until somebody comes up with a way to upgrade git to more secure
hashes.  Most projects don't gpg sign their trees at all, including
linux.

--
Rich



[gentoo-dev] Last rites: app-text/pastebin

2014-09-14 Thread Dion Moult
Masked for removal in 30 days. Please see bug #434366

It has had no support for the new API since 2012. A good replacement for this
package, app-text/pastebinit, is already stabilized.

-- 
Dion Moult



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
W. Trevor King:
> On Sun, Sep 14, 2014 at 10:38:41PM +, hasufell wrote:
>> Yes, there is a possible attack vector mentioned in this comment
>> https://bugs.gentoo.org/show_bug.cgi?id=502060#c16
> 
> From that comment, point 1.2 is highly unlikely [1]:
> 
>   1. Attacker constructs an init.d script, regular part at the start,
>  malicious part at the end
>   1.1. This would be fairly simple, just construct two start()
>  functions, one of which is mundane, the other is malicious.
>   1.2. Both variants of the script have the same SHA1...
> 
>> So we'd basically end up using either "git cherry-pick" or "git am"
>> for "pulling" user stuff, so that we also sign the blobs.
> 
> Rebasing the original commits doesn't protect you from the birthday
> attack either, because the vulnerable hash is likely going to still be
> in the rebased commit's tree.  All rebasing does is swap the committer
> and drop the initial signature.
> 

According to Robin, it's not about rebasing, it's about signing all
commits so that messing with the blob (even if it has the same sha-1)
will cause signature verification failure.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread W. Trevor King
On Sun, Sep 14, 2014 at 10:38:41PM +, hasufell wrote:
> Yes, there is a possible attack vector mentioned in this comment
> https://bugs.gentoo.org/show_bug.cgi?id=502060#c16

From that comment, point 1.2 is highly unlikely [1]:

  1. Attacker constructs an init.d script, regular part at the start,
 malicious part at the end
  1.1. This would be fairly simple, just construct two start()
 functions, one of which is mundane, the other is malicious.
  1.2. Both variants of the script have the same SHA1...

> So we'd basically end up using either "git cherry-pick" or "git am"
> for "pulling" user stuff, so that we also sign the blobs.

Rebasing the original commits doesn't protect you from the birthday
attack either, because the vulnerable hash is likely going to still be
in the rebased commit's tree.  All rebasing does is swap the committer
and drop the initial signature.

Cheers,
Trevor

[1]: http://article.gmane.org/gmane.comp.version-control.git/210622





Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
W. Trevor King:
> On Sun, Sep 14, 2014 at 05:40:30PM +0200, Michał Górny wrote:
>> On 2014-09-15, at 03:15:14, Kent Fredric wrote:
>>> Only downside there is the way github pull reqs work is if the
>>> final SHA1's that hit tree don't match, the pull req doesn't
>>> close.
>>>
>>> Solutions:
>>>
>>> - A) Have somebody tasked with reaping old pull reqs with
>>> permissions granted. ( Uck )
>>> - B) Always use a merge of some kind to mark the pull req as dead
>>> ( for instance, an "ours" merge to mark the branch as deprecated )
>>>
>>> Both of those options are kinda ugly.
>>
>> If you merge a pull request, I suggest doing a proper 'git merge -S'
>> anyway to get a developer signature on top of all the changes.
> 
> Some previous package-tree-in-Git efforts suggested that only
> Gentoo-dev signatures were acceptable, and that those signatures would
> be required on every commit (not just the first-parent line) [1,2].  I
> don't see the point of that, so long as Gentoo devs are signing the
> first-parent line, but if folks still want Gentoo-dev signatures on
> every commit the ‘git merge -S’ approach will not work for closing
> PRs.
> 
> Cheers,
> Trevor
> 
> [1]: http://article.gmane.org/gmane.linux.gentoo.devel/77572
>  id:cagfcs_manfikevtj3cmcq1of-uqavebe2r1okykygwc5vom...@mail.gmail.com
> [2]: https://bugs.gentoo.org/show_bug.cgi?id=502060#c0
> 

Yes, there is a possible attack vector mentioned in this comment
https://bugs.gentoo.org/show_bug.cgi?id=502060#c16

So we'd basically end up using either "git cherry-pick" or "git am" for
"pulling" user stuff, so that we also sign the blobs.

Regular merges would still be possible for developer pull requests, but
that's probably not the primary use case anyway.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread W. Trevor King
On Sun, Sep 14, 2014 at 05:40:30PM +0200, Michał Górny wrote:
> On 2014-09-15, at 03:15:14, Kent Fredric wrote:
> > Only downside there is the way github pull reqs work is if the
> > final SHA1's that hit tree don't match, the pull req doesn't
> > close.
> > 
> > Solutions:
> > 
> > - A) Have somebody tasked with reaping old pull reqs with
> > permissions granted. ( Uck )
> > - B) Always use a merge of some kind to mark the pull req as dead
> > ( for instance, an "ours" merge to mark the branch as deprecated )
> > 
> > Both of those options are kinda ugly.
> 
> If you merge a pull request, I suggest doing a proper 'git merge -S'
> anyway to get a developer signature on top of all the changes.

Some previous package-tree-in-Git efforts suggested that only
Gentoo-dev signatures were acceptable, and that those signatures would
be required on every commit (not just the first-parent line) [1,2].  I
don't see the point of that, so long as Gentoo devs are signing the
first-parent line, but if folks still want Gentoo-dev signatures on
every commit the ‘git merge -S’ approach will not work for closing
PRs.

Cheers,
Trevor

[1]: http://article.gmane.org/gmane.linux.gentoo.devel/77572
 id:cagfcs_manfikevtj3cmcq1of-uqavebe2r1okykygwc5vom...@mail.gmail.com
[2]: https://bugs.gentoo.org/show_bug.cgi?id=502060#c0



signature.asc
Description: OpenPGP digital signature


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Peter Stuge
Michał Górny wrote:
> What I need others to do is provide the hosting for git repos.

I'm happy to set up repos on my git server with custom hooks and
accounts as needed.

It's probably not what we want long-term, but it might be useful as
proof of concept, so that infra only needs to do setup one time.

I even have some virtual hosting working: point an A record at the
right IP and it looks like only the desired repos are hosted there.

Gitweb, git-daemon and git over http and CAcert https with pretty URLs.


//Peter


pgp4M7ju1Sv1x.pgp
Description: PGP signature


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Ivan Viso Altamirano
I think the better option is to block rsync and force emerge-webrsync.
Sent from a phone


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread viv...@gmail.com
Il 14/09/2014 14:03, Michał Górny ha scritto:
> The rsync tree
> --
>
> We'd also propagate things to rsync. We'd have to populate it with old
> ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
> Manifests. So users won't notice much of a change.
>
If this changes all ChangeLogs, the first rsync from users will
generate a lot of traffic; the rsync network needs to be prepared.




Re: [gentoo-dev] gentoo git workflow

2014-09-14 Thread Andreas K. Huettel
> 
> However, rebasing changes *on* master, before they are pushed, is a good
> thing, because that kills non-fast-forward merges.
> 

Nontrivial rebases *on* master can be problematic because you're changing 
history. 

Imagine you pull some nice commits from a user. Then at some point you will 
have to rebase them before you push them. If this fails and requires manual 
interaction, the original version of the commits is lost (including 
signatures) and errors are not traceable. 

With a merge instead any manual intervention is clearly located in the merge 
commit and the authors of each change are uniquely identifiable.

-- 
Andreas K. Huettel
Gentoo Linux developer (council, kde)
dilfri...@gentoo.org
http://www.akhuettel.de/



Re: [gentoo-dev] gentoo git workflow

2014-09-14 Thread hasufell
"C. Bergström":
> Pretty please do NOT allow "merge" commits.. they are the bane of evil
> for the long term ability to have any sane work-flow.

It works pretty well for the linux kernel.

Of course, it's a matter of actually handling it. If people are unable to
properly handle tools and methods, anything can become the bane of evil,
no matter what tool you use.

> There's a big debate between merge vs rebase

I think most of those debates are nonsense. Both methods have their use
cases. But it matters when to use them. A lot of people don't even know
how to actually rebase, so they end up causing merge commits for
everything, which leads to a _very_ confusing history. Simply banning
that method is not a solution, in my opinion.

The solution is to make a clear policy or recommendations when to use
one of them.



Re: [gentoo-dev] gentoo git workflow

2014-09-14 Thread C. Bergström

On 09/15/14 02:34 AM, hasufell wrote:

William Hubbs:

On Sun, Sep 14, 2014 at 08:04:12PM +0200, Andreas K. Huettel wrote:

Deciding on a _commit policy_ should be fairly straightforward and we
already have one point
* gpg sign every commit (unless it's a merged branch, then we only care
about the merge commit)

+1

Merge commits only happen if we allow non-fast-forward merges. I would
personally be against allowing merge commits on the master branch.


Allowing fast-forward merges will break signature verification if you
fetched from a user repo.
If we don't allow merge commits, then _every_ commit has to be signed
by a gentoo dev (e.g. by using git-am). I don't see much sense in this.
It will rather complicate the workflow.

The currently proposed verification script skips branch 'B', so what
matters is the signature of the merge commit, which says "yes, I have
reviewed the user's branch(es) and it's fine".

Merging from branches holds useful information. A linear history isn't
necessarily easier to understand, so from me linear history gets a

-1

It just isn't really "git" to me. But it also requires people to know
when to avoid merge commits.


Rebases involving commits that are already pushed to master probably
shouldn't be allowed.


Of course, yes. That has to be documented in a gentoo developer git guide.
Pretty please do NOT allow "merge" commits.. they are the bane of evil
for the long-term ability to have any sane work-flow. Try browsing a
commit history after a big merge commit.. or following the parent..


lastly - the "merge" commit itself could be very confusing to some 
people when viewed in github. (At least personally I find them 
frequently unreadable)


After 5 years of git where I work, they are now banned (policy), and I
wish github would allow them to be banned (non-fast-forward) to avoid
mistakes.


There's a big debate between merge vs rebase.. I'm not trying to go into
the benefits of one workflow vs the other.. However, if rebase fails,
you can allow merge commits in the future.. The opposite isn't easily
accomplished without squashing history and losing stuff..





Re: [gentoo-dev] gentoo git workflow

2014-09-14 Thread hasufell
William Hubbs:
> On Sun, Sep 14, 2014 at 08:04:12PM +0200, Andreas K. Huettel wrote:
>>
>>> Deciding on a _commit policy_ should be fairly straightforward and we
>>> already have one point
>>> * gpg sign every commit (unless it's a merged branch, then we only care
>>> about the merge commit)
>>
>> +1
> 
> Merge commits only happen if we allow non-fast-forward merges. I would
> personally be against allowing merge commits on the master branch.
> 

Allowing fast-forward merges will break signature verification if you
fetched from a user repo.
If we don't allow merge commits, then _every_ commit has to be signed
by a gentoo dev (e.g. by using git-am). I don't see much sense in this.
It will rather complicate the workflow.

The currently proposed verification script skips branch 'B', so what
matters is the signature of the merge commit, which says "yes, I have
reviewed the user's branch(es) and it's fine".
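The verify-only-the-merge-commit idea maps directly onto git's first-parent traversal. A minimal sketch (the branch name 'B' is taken from the text above; signing itself is omitted since it needs GPG keys, and a real verification script would add --show-signature or git verify-commit for each listed commit):

```shell
set -e
d=$(mktemp -d); export HOME="$d"; cd "$d"
git config --global user.name dev
git config --global user.email dev@example.org
git config --global init.defaultBranch master

git init -q tree && cd tree
echo a > f && git add f && git commit -qm "dev commit"

# unsigned user branch 'B' with two commits
git checkout -qb B
echo b > f && git commit -qam "user commit 1"
echo c > f && git commit -qam "user commit 2"
git checkout -q master

# the reviewing dev merges; with a key this would be: git merge -S --no-ff B
git merge -q --no-ff -m "Merge branch 'B' (reviewed)" B

# a verification script only needs to walk the first-parent line,
# so the unsigned user commits inside the branch are skipped
git log --first-parent --format=%s
```

The last command lists only "Merge branch 'B' (reviewed)" and "dev commit": the user commits stay in history but never appear on the line that needs developer signatures.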

Merging from branches holds useful information. A linear history isn't
necessarily easier to understand, so from me linear history gets a

-1

It just isn't really "git" to me. But it also requires people to know
when to avoid merge commits.

> 
> Rebases involving commits that are already pushed to master probably
> shouldn't be allowed.
> 

Of course, yes. That has to be documented in a gentoo developer git guide.



Re: [gentoo-dev] gentoo git workflow

2014-09-14 Thread William Hubbs
On Sun, Sep 14, 2014 at 08:04:12PM +0200, Andreas K. Huettel wrote:
> 
> > Deciding on a _commit policy_ should be fairly straightforward and we
> > already have one point
> > * gpg sign every commit (unless it's a merged branch, then we only care
> > about the merge commit)
> 
> +1

Merge commits only happen if we allow non-fast-forward merges. I would
personally be against allowing merge commits on the master branch.

> 
> > More things to consider for commit policy are:
> > * commit message format (line length, maybe prepend category/PN?)
> 
> this could be done in part by repoman... having a meaningful shortlog
> would be nice.

I don't see how repoman could do anything about this, but here is a
good description of how to write git commit messages [1].

> > * do we expect repoman to run successfully for every commit (I'd say no)?
> 
> commit no, push yes?

+1, every time we push that should indicate a successful repoman run.

> > * additional information that must be provided

I'm not sure what additional information is being referred to.

> > * when to force/avoid merge commits

I would be against merge commits on the master branch; everything
should be a fast-forward merge.

> my take- disallow (by policy) nontrivial rebases by third parties, encourage 
> trivial rebases

Rebases involving commits that are already pushed to master probably
shouldn't be allowed.

However, rebasing changes *on* master, before they are pushed, is a good
thing, because that kills non-fast-forward merges.

William


signature.asc
Description: Digital signature


Re: [gentoo-dev] gentoo git workflow

2014-09-14 Thread Andreas K. Huettel

> Deciding on a _commit policy_ should be fairly straightforward and we
> already have one point
> * gpg sign every commit (unless it's a merged branch, then we only care
> about the merge commit)

+1

> More things to consider for commit policy are:
> * commit message format (line length, maybe prepend category/PN?)

this could be done in part by repoman... having a meaningful shortlog would be 
nice.

> * do we expect repoman to run successfully for every commit (I'd say no)?

commit no, push yes?

> * additional information that must be provided
> * when to force/avoid merge commits

my take- disallow (by policy) nontrivial rebases by third parties, encourage 
trivial rebases



-- 
Andreas K. Huettel
Gentoo Linux developer (council, kde)
dilfri...@gentoo.org
http://www.akhuettel.de/


signature.asc
Description: This is a digitally signed message part.


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread James Cloos
> "MG" == Michał Górny  writes:

MG> This means we don't have to wait till someone figures out the perfect
MG> way of converting the old CVS repository. You don't need that history
MG> most of the time, and you can play with CVS to get it if you really do.
MG> In any case, we would likely strip the history anyway to get a small
MG> repo to work with.

+1 on that.  The cvs repo can be converted to an historical git repo on
a slower timeframe, and remain available as cvs until then.

That old-vs-fresh concept worked fine for other projects (including Linux).

-JimC
-- 
James Cloos  OpenPGP: 0x997A9F17ED7DAEA6



Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Dirkjan Ochtman
On Sun, Sep 14, 2014 at 6:10 PM, hasufell  wrote:
> Let's try it with push access for every developer.

+1.

I'm pretty strongly opposed to leaving the history behind. I'd tend to
agree with Rich when he says that history conversion is pretty much a
solved problem, anyway.

Cheers,

Dirkjan



Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Rich Freeman:
> On Sun, Sep 14, 2014 at 10:56 AM, Michał Górny  wrote:
>> Dnia 2014-09-14, o godz. 10:33:03
>>
>> With git, we can finally do stuff like preparing everything and pushing
>> in one go. Rebasing or merging will be much easier then, since
>> the effective push rate will be smaller than current commit rate.
> 
> While I agree that the ability to consolidate commits will definitely
> help with the commit rate, I'm not sure it will make a big difference.
> It will turn a kde stablereq from 300 commits into 1, and do the same
> for things like package moves and such.  However, I suspect that the
> vast majority of our commits are things like bumps on individual
> packages that will always be individual commits.  Maybe insofar as one
> person does a bunch of them they can be pushed at the same time,
> but...
> 
> Looking at https://github.com/rich0/gentoo-gitmig-2014-02-21 it seems
> like we get about 150 commits/day on busy days.  I suspect that isn't
> evenly distributed, but you may be right that it will just work out.
> 

If the push frequency becomes so high that people barely get stuff
pushed because of conflicts, then we simply have to say goodbye to the
central repository workflow and have to establish a hierarchy where only
a handful of people have direct push access and the rest is worked out
through pull requests to project leads or dedicated reviewers.

So the merging and rebasing work would then be done by fewer people
instead of every single developer.

But given that project leads currently may or may not be active, I'm not
sure that I'd vote for such a workflow. And I don't think we need that
yet (although an enforced review workflow is of course superior in many
ways).

Let's try it with push access for every developer.



Re: [gentoo-dev] gentoo git workflow

2014-09-14 Thread Rich Freeman
On Sun, Sep 14, 2014 at 11:11 AM, hasufell  wrote:
>
> The only hard part is that people have to know the differences between
> merging/rebasing, fast-forward merges, non-fast-forward merges etc. and
> when and when not to do them.
>
> 'git rebase' is a powerful thing, but also pretty good to mess up your
> local history if used wrong.
>
> I think we can write up a gentoo-specific guide in 2-3 weeks.
>

Sounds good.  I think one thing we need to get over with the whole git
migration is the fact that it isn't going to be perfect.  We probably
will find minor errors in the migration itself, little glitches in the
back-end stuff, problems in the proposed workflow, and so on.  We're
just going to have to adapt.  We've been using cvs for eons and have
learned to ignore its shortcomings and have well-polished workflows.

It isn't like there are 500 devs doing commits every day.  We're a
reasonably tight community and we're just going to have to work
together to get over the inevitable bumps.

It may make sense to just start out with guidelines in the beginning,
and then we can turn them into rules when problems actually come up.
Once upon a time there wasn't a hard rule about changelog entries for
removals/etc, and the world didn't end, but we decided that having the
rule made more sense than not having it.  With git we should expect
more of the same - we won't get it 100% right out of the gate.

--
Rich



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Rich Freeman
On Sun, Sep 14, 2014 at 11:42 AM, hasufell  wrote:
> Patrick Lauer:
>>> Are we going to disallow merge commits and ask devs to rebase local
>>> changes in order to keep the history "clean"?
>>
>> Is that going to be sane with our commit frequency?
>>
>
> You have to merge or rebase anyway in case of a push conflict, so the
> only difference is the method and the effect on the history.
>
> Currently... CVS allows you to run repoman on an outdated tree and push
> broken ebuilds with repoman being happy. Git will not allow this.
>

Repoman is going to be a challenge here.  With cvs every package is
its own private repository with its own private history and cvs only
cares if there is a collision within the scope of a single file.

With git your commit is against the whole tree.  So, even though it is
trivial to merge, independent commits against two different packages
do collide and need to be rebased or merged.

Repoman can run against a single package fairly quickly, so assuming
we still allow that we could do a pull/rebase/repman/push workflow
even if people are doing commits every few minutes.  On the other
hand, if you're doing a package move or eclass change or some other
change that affects 300 packages, just doing the rebase might cost you
a few minutes (due to actual collisions), and running repoman against
the whole thing before doing a push isn't going to be practical.
Somebody doing a tree-wide commit would almost certainly have to run
repoman before the final rebase/merge, push that out, and then maybe
do another repoman after-the-fact and maybe clean up any issues.  For
all intents and purposes that is what we're doing today anyway, since
repoman+cvs doesn't offer any kind of tree-wide consistency guarantees
unless you're checking out based on a timestamp or something like
that.
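The single-package workflow described above might look like the following in practice. The repository layout is a stand-in, and the repoman invocation is shown only as a comment, since repoman is Gentoo-specific and not actually run here:

```shell
set -e
d=$(mktemp -d); export HOME="$d"; cd "$d"
git config --global user.name dev
git config --global user.email dev@example.org
git config --global init.defaultBranch master

git init -q --bare central.git
git clone -q central.git tree 2>/dev/null
cd tree
mkdir -p app-misc/foo
echo "EAPI=5" > app-misc/foo/foo-1.ebuild
git add . && git commit -qm "app-misc/foo: initial import"
git push -q origin master

# local work: bump the package
echo "EAPI=5" > app-misc/foo/foo-2.ebuild
git add . && git commit -qm "app-misc/foo: version bump"

# sync with the tree, QA-check only the touched package, then push
git pull -q --rebase origin master
# (cd app-misc/foo && repoman full)   # fast: scans one package, not the tree
git push -q origin master
```

The per-package QA step keeps the cycle short; a tree-wide change would instead run the full check before the final rebase and push, as discussed above.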

--
Rich



Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Rich Freeman
On Sun, Sep 14, 2014 at 10:56 AM, Michał Górny  wrote:
> Dnia 2014-09-14, o godz. 10:33:03
>
> With git, we can finally do stuff like preparing everything and pushing
> in one go. Rebasing or merging will be much easier then, since
> the effective push rate will be smaller than current commit rate.

While I agree that the ability to consolidate commits will definitely
help with the commit rate, I'm not sure it will make a big difference.
It will turn a kde stablereq from 300 commits into 1, and do the same
for things like package moves and such.  However, I suspect that the
vast majority of our commits are things like bumps on individual
packages that will always be individual commits.  Maybe insofar as one
person does a bunch of them they can be pushed at the same time,
but...

Looking at https://github.com/rich0/gentoo-gitmig-2014-02-21 it seems
like we get about 150 commits/day on busy days.  I suspect that isn't
evenly distributed, but you may be right that it will just work out.

>>
>> Actually doing the conversion is basically a solved problem.  If this
>> were actually the blocker I'd be all for just sticking the history in
>> a different repo and starting from scratch with a new one.
>
> Was the resulting tree actually verified? How long does the conversion
> take? Can it be incremental, i.e. convert most of it, lock CVS, convert
> the remaining new commits?

The tree has been verified.  The verification approaches so far are
neither 100% thorough nor realtime in operation.  However, I think we
have a working migration process and I don't really see the need to do
a double-check at the time of the actual migration.

ferringb was able to do conversions in about 20min with a decent SSD
and a 32-core system.  His migration scripts can migrate categories in
parallel.  I haven't personally tried to run them myself, but I
believe robbat2 and patrick have experimented with them.  If there is
revived interest I can see if I can set them up to run in a chroot
with some documentation so that anybody can run it and satisfy
themselves that it works, assuming somebody else doesn't have such a
chroot ready to go.  If finding a host to run it on is a problem I'm
sure we could get the Trustees to spring for some time on EC2 or
whatever.  There is no reason that this couldn't be as simple as
extracting a tarball, bind-mounting a cvs repo inside, and firing off
the scripts.

I do not believe it can be made to be incremental.  But, the runtime
should be in keeping with your hour-or-two of downtime suggestion.  I
suspect a fair bit of the downtime will be taken just to transfer the
copy of the cvsroot to the migration server, and transfer the resulting
git tree to wherever it needs to go and get all the back-end scripts
running/etc.

>
> Are you willing to champion that, then? :)
>

Well, I'm in for what it matters.  I don't have root on any infra
boxes if that is what you're looking for.  :)

--
Rich



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Michał Górny
Dnia 2014-09-15, o godz. 03:15:14
Kent Fredric  napisał(a):

> On 15 September 2014 02:40, Michał Górny  wrote:
> 
> > However, I'm wondering if it would be possible to restrict people from
> > accidentally committing straight into github (e.g. merging pull
> > requests there instead of to our main server).
>
> => Github is just a read only mirror, any pull reqs submitted there will be
> fielded and pushed to gentoo directly.
> 
> Only downside there is the way github pull reqs work is if the final SHA1's
> that hit tree don't match, the pull req doesn't close.
> 
> Solutions:
> 
> - A) Have somebody tasked with reaping old pull reqs with permissions
> granted. ( Uck )
> - B) Always use a merge of some kind to mark the pull req as dead ( for
> instance, an "ours" merge to mark the branch as deprecated )
> 
> Both of those options are kinda ugly.

If you merge a pull request, I suggest doing a proper 'git merge -S'
anyway to get a developer signature on top of all the changes.

-- 
Best regards,
Michał Górny


signature.asc
Description: PGP signature


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Patrick Lauer:
>> Are we going to disallow merge commits and ask devs to rebase local
>> changes in order to keep the history "clean"?
> 
> Is that going to be sane with our commit frequency?
> 

You have to merge or rebase anyway in case of a push conflict, so the
only difference is the method and the effect on the history.

Currently... CVS allows you to run repoman on an outdated tree and push
broken ebuilds with repoman being happy. Git will not allow this.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Patrick Lauer
On Sunday 14 September 2014 15:40:06 Davide Pesavento wrote:
> On Sun, Sep 14, 2014 at 2:03 PM, Michał Górny  wrote:
> > We have main developer repo where developers work & commit and are
> > relatively happy. For every push into developer repo, automated magic
> > thingie merges stuff into user sync repo and updates the metadata cache
> > there.
> 
> How long does the md5-cache regeneration process take? Are you sure it
> will be able to keep up with the rate of pushes to the repo during
> "peak hours"? If not, maybe we could use a time-based thing similar to
> the current cvs->rsync synchronization.

Best case, only one package is affected: a few seconds.
Worst case, someone touches an eclass like eutils; then it expands to
something on the order of one or two CPU-hours.
 
> Are we going to disallow merge commits and ask devs to rebase local
> changes in order to keep the history "clean"?

Is that going to be sane with our commit frequency?



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 02:40, Michał Górny  wrote:

> However, I'm wondering if it would be possible to restrict people from
> accidentally committing straight into github (e.g. merging pull
> requests there instead of to our main server).
>


Easy.

Put the Gentoo repo in its own group.
Don't give anyone any kinds of permissions on it.
Have only one approved account for the purpose of pushing commits.
Have a post-push hook that replicates to github as that approved account

=> Github is just a read only mirror, any pull reqs submitted there will be
fielded and pushed to gentoo directly.
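The replication step described above could be, for instance, a post-receive hook on the canonical server doing a mirror push. In this sketch a local bare repository stands in for the GitHub mirror, and all paths are illustrative:

```shell
set -e
d=$(mktemp -d); export HOME="$d"; cd "$d"
git config --global user.name dev
git config --global user.email dev@example.org
git config --global init.defaultBranch master

git init -q --bare main.git      # the canonical, Gentoo-hosted repo
git init -q --bare mirror.git    # stand-in for the read-only github mirror

# after every accepted push, replicate all refs to the mirror
cat > main.git/hooks/post-receive <<'EOF'
#!/bin/sh
git push --mirror --quiet ../mirror.git
EOF
chmod +x main.git/hooks/post-receive

# developers only ever push to the canonical repo
git clone -q main.git work 2>/dev/null
( cd work && echo hi > f && git add f && git commit -qm "first commit" \
    && git push -q origin master )

git -C mirror.git log --oneline master    # the mirror followed automatically
```

Since nobody has write access to the mirror itself, it can never diverge from the canonical repository.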

Only downside there is the way github pull reqs work is if the final SHA1's
that hit tree don't match, the pull req doesn't close.

Solutions:

- A) Have somebody tasked with reaping old pull reqs with permissions
granted. ( Uck )
- B) Always use a merge of some kind to mark the pull req as dead ( for
instance, an "ours" merge to mark the branch as deprecated )

Both of those options are kinda ugly.



-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


[gentoo-dev] gentoo git workflow

2014-09-14 Thread hasufell
Rich Freeman:
> 
> This is one of the blockers.  We haven't actually decided how we want
> to use git.
> 

There are IMO 3 main things to consider for a git workflow:
* commit policy
* branching model
* remote model
(and history format somewhere implicitly)

Deciding on a _commit policy_ should be fairly straightforward and we
already have one point
* gpg sign every commit (unless it's a merged branch, then we only care
about the merge commit)

More things to consider for commit policy are:
* commit message format (line length, maybe prepend category/PN?)
* do we expect repoman to run successfully for every commit (I'd say no)?
* additional information that must be provided
* when to force/avoid merge commits

Deciding on _branching model_ should be pretty easy here too. We are
mainly working on master and there may be developer-specific branches etc.
History does not need to be linear.
Creating additional branches is up to developers and there are no
specific rules about that.

The _remote model_ is: use a central repository with every developer
having push access. I think this is pretty reasonable for our use case,
although I'd love to see a linux-like workflow with enforced reviews
that propagate through project members/leads. But I'm not sure we need
that much overhead, except for non-trivial stuff like eclasses where we
already require reviews (well, more or less).


The only hard part is that people have to know the differences between
merging/rebasing, fast-forward merges, non-fast-forward merges etc. and
when and when not to do them.

'git rebase' is a powerful thing, but also pretty good to mess up your
local history if used wrong.
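As an aside on recoverability: even when a rebase mangles local history, the reflog usually still has the old head. A minimal illustration, where the "failed rebase" is simulated with a hard reset:

```shell
set -e
d=$(mktemp -d); export HOME="$d"; cd "$d"
git config --global user.name dev
git config --global user.email dev@example.org
git config --global init.defaultBranch master

git init -q tree && cd tree
echo 1 > f && git add f && git commit -qm "commit 1"
echo 2 > f && git commit -qam "commit 2"
echo 3 > f && git commit -qam "commit 3"
good=$(git rev-parse HEAD)

# simulate a botched rewrite that threw away the last two commits
git reset -q --hard HEAD~2

# the commits are gone from the branch, but the reflog still knows them
git reflog --format='%h %gs' | head -n 3
git reset -q --hard "$good"    # back to the pre-"rebase" state
```

This only rescues local mistakes; it is no substitute for the push-side protections discussed in the thread.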

I think we can write up a gentoo-specific guide in 2-3 weeks.



Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Michał Górny
Dnia 2014-09-14, o godz. 10:33:03
Rich Freeman  napisał(a):

> > Of course, that assumes infra is
> > going to cooperate quickly or someone else is willing to provide the
> > infra for it.
> 
> The infra components to a git infrastructure are one of the main
> blockers at this point.  I don't really see cooperation as the issue -
> just lack of manpower or interest.

By 'cooperating' I simply meant offering the necessary resources
in a reasonable time.

> >
> > 1. send announcement to devs to explain how to use git,
> 
> This is one of the blockers.  We haven't actually decided how we want
> to use git.
> 
> Sure, everybody knows how to use git.  The problem is that there are a
> dozen different ways we COULD use git, and nobody has picked the ONE
> way we WILL use it.
> 
> This isn't as trivial as you might think.  We have a fairly high
> commit rate and with a single repository that means that in-between a
> pull-merge/rebase-push there is a decent chance of another commit that
> will make the resulting push a non-fast-forward.
> 
> People love to point out linux and its insane commit rate.  The thing
> is, the mainline git repo with all those commits has exactly one
> committer - Linus himself.  They don't have one big repo with one
> master branch that everybody pushes to.  At least, that is my
> understanding (and there are certainly others here who are more
> involved with kernel development).

It's hard to talk about commit rate when we combine crippled CVS with
awfully stupid two-part repoman committing. This forces us to commit
everything immediately, and makes some of us not commit anything
at all anymore...

With git, we can finally do stuff like preparing everything and pushing
in one go. Rebasing or merging will be much easier then, since
the effective push rate will be smaller than current commit rate.

> > On top of user sync repo rsync is propagated. The rsync tree is populated
> > with all old ChangeLogs copied from CVS (stored in 30M git repo), new
> > ChangeLogs are generated from git logs and Manifests are expanded.
> 
> So, I don't really have a problem with your design.  I still question
> whether we still need to be generating changelogs - they seem
> incredibly redundant.  But, if people really want a redundant copy of
> the git log, whatever...

I don't want them either. However, I'm pretty sure people will bikeshed
this to death if we kill them... especially since rsync has no git log.
Not that many users make real use of ChangeLogs, considering how
useless the messages there often are...

> > Main developer repo
> > ---
> >
> > I was able to create a start git repository that takes around 66M
> > as a git pack (this is how much you will have to fetch to start working
> > with it). The repository is stripped clean of history and ChangeLogs,
> > and has thin Manifests only.
> >
> > This means we don't have to wait till someone figures out the perfect
> > way of converting the old CVS repository. You don't need that history
> > most of the time, and you can play with CVS to get it if you really do.
> > In any case, we would likely strip the history anyway to get a small
> > repo to work with.
> 
> We already have a migration process that converts the old CVS
> repository, generating both a shallow repository that lacks history
> and a full repository that contains all of history. Additionally,
> these two are consistent - that is the last branch of the full
> repository has the same commit ID as the base of the shallow
> repository.  Basically we generate the full history and then trim out
> 99% of it so that the commit in the shallow repository points to a
> parent that isn't in the packed repository.
> 
> Actually doing the conversion is basically a solved problem.  If this
> were actually the blocker I'd be all for just sticking the history in
> a different repo and starting from scratch with a new one.

Was the resulting tree actually verified? How long does the conversion
take? Can it be incremental, i.e. convert most of it, lock CVS, convert
the remaining new commits?

> > I think we should also merge gentoo-news & glsa & herds.xml into
> > the repository. They all reference Gentoo packages at a particular
> > state in time, and it would be much nicer to have them synced properly.
> >
> 
> I can see the pros/cons here, but I don't personally have an issue
> with merging them.  As has been brought up elsewhere herds.xml may
> just go away.
> 
> If somebody can come up with a set of hooks/scripts that will create
> the various trees and the only thing that is left is to get infra to
> host them, I think we can make real progress.  I don't think this is
> something that needs to take a long time.  The pieces are mostly there
> - they just have to be assembled.

Are you willing to champion that, then? :)

-- 
Best regards,
Michał Górny


signature.asc
Description: PGP signature


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Michał Górny
Dnia 2014-09-14, o godz. 15:23:24
Jauhien Piatlicki  napisał(a):

> Another question: will it be possible to maintain a copy of tree on github to 
> make contributions for users simpler (similarly to e.g. science overlay)? 
> (Can it somehow be combined with proposed signing mechanism?)

Yes. I'm planning to have a mirror on github and bitbucket,
and auto-pushing to both.

However, I'm wondering if it would be possible to restrict people from
accidentally committing straight into github (e.g. merging pull
requests there instead of to our main server).

In fact, I would start my experiments straight into github if not
the fact that they don't allow us to set our own update hooks.

-- 
Best regards,
Michał Górny


signature.asc
Description: PGP signature


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Michał Górny
Dnia 2014-09-14, o godz. 15:40:06
Davide Pesavento  napisał(a):

> On Sun, Sep 14, 2014 at 2:03 PM, Michał Górny  wrote:
> > We have main developer repo where developers work & commit and are
> > relatively happy. For every push into developer repo, automated magic
> > thingie merges stuff into user sync repo and updates the metadata cache
> > there.
> 
> How long does the md5-cache regeneration process take? Are you sure it
> will be able to keep up with the rate of pushes to the repo during
> "peak hours"? If not, maybe we could use a time-based thing similar to
> the current cvs->rsync synchronization.

This strongly depends on how much data is there to update. A few
ebuilds are quite fast, an eclass change isn't ;). I was thinking of
something along the lines of the following, in pseudo-code:

  systemctl restart cache-regen

That is, we start the regen on every update. If it finishes in time, it
commits the new metadata. If another update occurs during regen, we
just restart it to let it catch the new data.

Of course, if we can't spare the resources for intermediate updates,
we may as well switch to a cron-based update method.
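As a rough illustration of the restart-on-push idea above, the receiving side could look like the sketch below. Everything here is hypothetical (the unit name, paths, and the commit helper); `egencache` is Portage's real metadata-cache generation tool. The hook just restarts the service, so a regen still running against stale data is killed and redone against the new tree.

```shell
#!/bin/sh -e
# Write the two pieces out so the sketch is self-contained.
tmp=$(mktemp -d)

# post-receive hook for the receiving repo (sketch):
cat > "$tmp/post-receive" <<'EOF'
#!/bin/sh
systemctl restart cache-regen.service
EOF
chmod +x "$tmp/post-receive"

# the hypothetical cache-regen.service unit (sketch):
cat > "$tmp/cache-regen.service" <<'EOF'
[Unit]
Description=Regenerate md5-cache and commit it to the user sync repo

[Service]
Type=oneshot
ExecStart=/usr/bin/egencache --update --repo gentoo --jobs 4
# hypothetical helper that commits the refreshed md5-cache
ExecStartPost=/usr/local/bin/commit-metadata-cache
EOF
```

Because the unit is `Type=oneshot`, `systemctl restart` on a still-running regen stops it and starts it over, which is exactly the "restart to catch the new data" behaviour described above.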

> [...]
> > In any case, we would likely strip the history anyway to get a small
> > repo to work with.
> >
> > I have prepared a basic git update hook that keeps master clean
> > and attached it to the bug [1]. It enforces basic policies, prevents
> > forced updates and checks GPG signatures on left-most history line. It
> > can also be extended to do more extensive tree checks.
> 
> Are we going to disallow merge commits and ask devs to rebase local
> changes in order to keep the history "clean"?

I don't think we should cripple git. Just to be clear, 'accidental'
merges won't happen because the automatic merges are unsigned
and the 'update' hook will refuse them.

The developers will have to either rebase and re-sign the commits, or
use a signed merge commit, whichever makes more sense in the particular
context.

Signed merge commits will also allow merging user-submitted changes
while preserving original history.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Michał Górny
On 2014-09-14, at 15:17:41, Ulrich Mueller wrote:

> > On Sun, 14 Sep 2014, Michał Górny wrote:
> 
> > I think we should also merge gentoo-news & glsa & herds.xml into the
> > repository. They all reference Gentoo packages at a particular state
> > in time, and it would be much nicer to have them synced properly.
> 
> Not a good idea, because we may want to grant commit access to these
> repos for people who are not necessarily ebuild devs.

We may want to grant those people metadata.xml access too.

If you really are that distrustful of our contributors, I believe we
can do per-path filtering in the 'update' hook, or use a pull-request
or intermediate-repository based workflow.

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Michał Górny
On 2014-09-14, at 15:09:25, Jauhien Piatlicki wrote:

> 14.09.14 14:03, Michał Górny написав(ла):
> > Hi,
> > 
> > I'm quite tired of promises and all that perfectionist non-sense which
> > locks us up with CVS for next 10 years of bikeshed. Therefore, I have
> > prepared a plan how to do git migration, and I believe it's doable in
> > less than 2 weeks (plus the testing). Of course, that assumes infra is
> > going to cooperate quickly or someone else is willing to provide the
> > infra for it.
> > 
> 
> as always, nice effort, but I foresee lots of bikeshedding in this thread. )

Yes. I'm planning to ignore most of the bikeshedding and take only
serious answers into consideration. Otherwise, we will be stuck with CVS.

> > This means we don't have to wait till someone figures out the perfect
> > way of converting the old CVS repository. You don't need that history
> > most of the time, and you can play with CVS to get it if you really do.
> > In any case, we would likely strip the history anyway to get a small
> > repo to work with.
> 
> Is it so difficult to convert CVS history?

It may be difficult to convert it properly, especially considering
the splitting of ebuild+Manifest commits. Then we would need to somehow
check whether it was converted properly. I don't even want to waste my
time on this. IMO the history doesn't have that much value.

> > The rsync tree
> > --
> > 
> > We'd also propagate things to rsync. We'd have to populate it with old
> > ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
> > Manifests. So users won't notice much of a change.
> > 
> 
> How will the user check ebuild integrity with thick Manifests using rsync?

The same way he currently does :).

> > The remaining issue is signing of stuff. We could supposedly sign
> > Manifests but IMO it's a waste of resources considering how poor
> > the signing system is for non-git repos.
> 
> Again, how will the user check the integrity and authenticity if Manifests
> are unsigned?

As far as I'm concerned, the user can use the user git tree to get proper
signatures, or any other method that already has proper signing support.

If someone wants proper GPG support in rsync, he can work on that.

-- 
Best regards,
Michał Górny




[gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Rich Freeman
On Sun, Sep 14, 2014 at 8:03 AM, Michał Górny  wrote:
>
> I'm quite tired of promises and all that perfectionist nonsense which
> locks us up with CVS for the next 10 years of bikeshedding.

While I tend to agree with the sentiment, I don't think you're
actually targeting the problems that aren't already solved here.

> Of course, that assumes infra is
> going to cooperate quickly or someone else is willing to provide the
> infra for it.

The infra components to a git infrastructure are one of the main
blockers at this point.  I don't really see cooperation as the issue -
just lack of manpower or interest.

>
> I can provide some testing repos once someone is willing to provide
> the hardware.

We already have plenty of testing repos (well, minus all the back-end stuff).

>
> 1. send announcement to devs to explain how to use git,

This is one of the blockers.  We haven't actually decided how we want
to use git.

Sure, everybody knows how to use git.  The problem is that there are a
dozen different ways we COULD use git, and nobody has picked the ONE
way we WILL use it.

This isn't as trivial as you might think.  We have a fairly high
commit rate, and with a single repository that means that in between a
pull-merge/rebase-push there is a decent chance of another commit that
will make the resulting push a non-fast-forward.
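For what it's worth, that race is easy to reproduce in a throwaway sandbox, and the usual recovery is a `git pull --rebase` followed by retrying the push. A sketch (all repo and file names made up):

```shell
#!/bin/sh -e
# Two clones of one shared repo; the second push is rejected as a
# non-fast-forward and only succeeds after rebasing onto the new tip.
tmp=$(mktemp -d); cd "$tmp"
git init -q --bare origin.git
g() { git -c user.name=t -c user.email=t@example.org "$@"; }

g clone -q origin.git a 2>/dev/null     # clone of an empty repo warns
( cd a && g commit -q --allow-empty -m base && g push -q origin HEAD )
g clone -q origin.git b

# Both developers commit; A pushes first.
( cd a && echo a > a.txt && g add a.txt && g commit -q -m from-a \
       && g push -q origin HEAD )
( cd b && echo b > b.txt && g add b.txt && g commit -q -m from-b )

# B's push is now a non-fast-forward and gets rejected...
( cd b && g push -q origin HEAD 2>/dev/null ) && echo pushed || echo rejected

# ...so B rebases onto the new upstream tip and retries.
( cd b && g pull -q --rebase && g push -q origin HEAD ) && echo pushed
```

With Gentoo's commit rate, this rebase-and-retry loop would simply be a routine part of pushing to a single shared master.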

People love to point out linux and its insane commit rate.  The thing
is, the mainline git repo with all those commits has exactly one
committer - Linus himself.  They don't have one big repo with one
master branch that everybody pushes to.  At least, that is my
understanding (and there are certainly others here who are more
involved with kernel development).

>
> 2. lock CVS out to read-only,
>
> 3. create all the git repos, get hooks rolling,
>
> 4. enable R/W access to the repos.
>
> With some luck, no more than 2 hours downtime.

I agree that the actual conversion itself can be done quickly.

> On top of user sync repo rsync is propagated. The rsync tree is populated
> with all old ChangeLogs copied from CVS (stored in 30M git repo), new
> ChangeLogs are generated from git logs and Manifests are expanded.

So, I don't really have a problem with your design.  I still question
whether we need to keep generating ChangeLogs - they seem incredibly
redundant.  But if people really want a redundant copy of the git log,
whatever...

> Main developer repo
> ---
>
> I was able to create a start git repository that takes around 66M
> as a git pack (this is how much you will have to fetch to start working
> with it). The repository is stripped clean of history and ChangeLogs,
> and has thin Manifests only.
>
> This means we don't have to wait till someone figures out the perfect
> way of converting the old CVS repository. You don't need that history
> most of the time, and you can play with CVS to get it if you really do.
> In any case, we would likely strip the history anyway to get a small
> repo to work with.

We already have a migration process that converts the old CVS
repository, generating both a shallow repository that lacks history
and a full repository that contains all of it. Additionally, the two
are consistent - that is, the tip of the full repository has the same
commit ID as the base of the shallow repository.  Basically, we
generate the full history and then trim out 99% of it, so that the
first commit in the shallow repository points to a parent that isn't
in the packed repository.

Actually doing the conversion is basically a solved problem.  If this
were actually the blocker I'd be all for just sticking the history in
a different repo and starting from scratch with a new one.

>
> I think we should also merge gentoo-news & glsa & herds.xml into
> the repository. They all reference Gentoo packages at a particular
> state in time, and it would be much nicer to have them synced properly.
>

I can see the pros/cons here, but I don't personally have an issue
with merging them.  As has been brought up elsewhere herds.xml may
just go away.

If somebody can come up with a set of hooks/scripts that will create
the various trees and the only thing that is left is to get infra to
host them, I think we can make real progress.  I don't think this is
something that needs to take a long time.  The pieces are mostly there
- they just have to be assembled.

--
Rich



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Kent Fredric
On 15 September 2014 00:03, Michał Górny  wrote:

> This means we don't have to wait till someone figures out the perfect
> way of converting the old CVS repository. You don't need that history
> most of the time, and you can play with CVS to get it if you really do.
>

Once somebody works this out, you can also simply make it available as a
"replacement" ref.

See 'git replace'

This would mean, essentially, that you could push a ref called
'refs/replace/oldcvs' with the value "firstsha1 oldcvssha1". Anyone who
wanted it could manually fetch it, and anyone who did would get the
full history in all its glory, with git transparently pretending that
the history was always there anyway.

No rebasing required, and available on a need-to-know basis :)
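A throwaway sketch of that trick (repo names and commit messages are made up; `git replace --graft`, available in newer git, creates the replacement ref for us):

```shell
#!/bin/sh -e
tmp=$(mktemp -d); cd "$tmp"
g() { git -c user.name=t -c user.email=t@example.org "$@"; }

# "full" stands in for a converted repo holding the old CVS history.
g init -q full
( cd full && g commit -q --allow-empty -m 'ancient CVS history' )

# "shallow" is the fresh, history-free tree that everyone clones.
g init -q shallow
cd shallow
g commit -q --allow-empty -m 'new root commit'

# Fetch the old history, then declare its tip the parent of our root.
g fetch -q ../full HEAD
g replace --graft HEAD FETCH_HEAD

# git log now transparently walks into the grafted history as well.
g log --format=%s
```

Users who never fetch the replace ref just see the short history; nothing about their clone changes.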

-- 
Kent

*KENTNL* - https://metacpan.org/author/KENTNL


Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Davide Pesavento
On Sun, Sep 14, 2014 at 3:55 PM, hasufell  wrote:
> Davide Pesavento:
>>> In any case, we would likely strip the history anyway to get a small
>>> repo to work with.
>>>
>>> I have prepared a basic git update hook that keeps master clean
>>> and attached it to the bug [1]. It enforces basic policies, prevents
>>> forced updates and checks GPG signatures on left-most history line. It
>>> can also be extended to do more extensive tree checks.
>>
>> Are we going to disallow merge commits and ask devs to rebase local
>> changes in order to keep the history "clean"?
>>
>
> I'd say it doesn't make sense to create merge commits for conflicts that
> arise from someone having pushed before you.
>
> Merge commits should only be there if they give useful information.
>

I totally agree. But is there a way to automatically enforce this?

> Also... if you merge from a _user_ who is untrusted and allow a
> fast-forward merge, then the signature verification fails. That means
> for such pull requests you either have to use "git am" or "git merge
> --no-ff".
>

Right. In that case you can either sign the merge commit or amend the
user's commit and sign it yourself (re-signing could be needed anyway
if you have to rebase).

Thanks,
Davide



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Jauhien Piatlicki:
> 
> Or well, have our own pull requests review tool.
> 
> 

That's also only a secondary problem. Mirroring on github/bitbucket/
whatever to allow user contributions should be fairly straightforward.

In addition, the usual git workflow via e-mail/ML would become more
popular (either via git-style patches or plain pull-request information
with branch/commit/repository).

So I'd suggest focusing on the git migration first.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Davide Pesavento:
>> Main developer repo
>> ---
>>
>> I was able to create a start git repository that takes around 66M
>> as a git pack (this is how much you will have to fetch to start working
>> with it). The repository is stripped clean of history and ChangeLogs,
>> and has thin Manifests only.
>>
>> This means we don't have to wait till someone figures out the perfect
>> way of converting the old CVS repository. You don't need that history
>> most of the time, and you can play with CVS to get it if you really do.
> 
> +1
> 

+1

>> In any case, we would likely strip the history anyway to get a small
>> repo to work with.
>>
>> I have prepared a basic git update hook that keeps master clean
>> and attached it to the bug [1]. It enforces basic policies, prevents
>> forced updates and checks GPG signatures on left-most history line. It
>> can also be extended to do more extensive tree checks.
> 
> Are we going to disallow merge commits and ask devs to rebase local
> changes in order to keep the history "clean"?
> 

I'd say it doesn't make sense to create merge commits for conflicts that
arise from someone having pushed before you.

Merge commits should only be there if they give useful information.

Also... if you merge from a _user_ who is untrusted and allow a
fast-forward merge, then the signature verification fails. That means
for such pull requests you either have to use "git am" or "git merge
--no-ff".



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread hasufell
Jauhien Piatlicki:
> 
> Again, how will the user check the integrity and authenticity if Manifests
> are unsigned?
> 

While this is an issue to be solved, it shouldn't be a blocker for the
git migration.

There is no regression if this isn't solved. There is no sane automated
method for verifying signed Manifests yet (that should happen at the PM
level), and signing them isn't even enforced throughout the tree.
Moreover, I highly doubt there is any user who runs around ebuild
directories checking Manifest signatures by hand.

People who really care use emerge-webrsync.
If we use the proposed solution, then there is an additional method via
the User syncing repo, so it's a win.

We can put more effort into solving this for rsync mirrors later, but
I'd rather focus on the git migration.



Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Davide Pesavento
On Sun, Sep 14, 2014 at 2:03 PM, Michał Górny  wrote:
> We have main developer repo where developers work & commit and are
> relatively happy. For every push into developer repo, automated magic
> thingie merges stuff into user sync repo and updates the metadata cache
> there.

How long does the md5-cache regeneration process take? Are you sure it
will be able to keep up with the rate of pushes to the repo during
"peak hours"? If not, maybe we could use a time-based thing similar to
the current cvs->rsync synchronization.

[...]
> Main developer repo
> ---
>
> I was able to create a start git repository that takes around 66M
> as a git pack (this is how much you will have to fetch to start working
> with it). The repository is stripped clean of history and ChangeLogs,
> and has thin Manifests only.
>
> This means we don't have to wait till someone figures out the perfect
> way of converting the old CVS repository. You don't need that history
> most of the time, and you can play with CVS to get it if you really do.

+1

> In any case, we would likely strip the history anyway to get a small
> repo to work with.
>
> I have prepared a basic git update hook that keeps master clean
> and attached it to the bug [1]. It enforces basic policies, prevents
> forced updates and checks GPG signatures on left-most history line. It
> can also be extended to do more extensive tree checks.

Are we going to disallow merge commits and ask devs to rebase local
changes in order to keep the history "clean"?

Thanks a lot,
Davide



Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Ulrich Mueller
> On Sun, 14 Sep 2014, Johannes Huber wrote:

> On Sunday, 14 September 2014, at 15:17:41, Ulrich Mueller wrote:
>> > On Sun, 14 Sep 2014, Michał Górny wrote:
>> > I think we should also merge gentoo-news & glsa & herds.xml into the
>> > repository. They all reference Gentoo packages at a particular state
>> > in time, and it would be much nicer to have them synced properly.
>> 
>> Not a good idea, because we may want to grant commit access to
>> these repos for people who are not necessarily ebuild devs.

> This could be solved by a pull requests review tool (gerrit,
> reviewboard, gitlab etc).

A second argument is that gentoo-x86 is large enough as it is, and we
shouldn't make it even larger by merging in things that are not
strictly necessary. glsa especially has a non-negligible size.

Ulrich




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Jauhien Piatlicki
On 14.09.14 at 15:25, "C. Bergström" wrote:
> On 09/14/14 08:24 PM, Jauhien Piatlicki wrote:
>> On 14.09.14 at 15:23, Jauhien Piatlicki wrote:
>>> Another question: will it be possible to maintain a copy of tree on github 
>>> to make contributions for users simpler (similarly to e.g. science 
>>> overlay)? (Can it somehow be combined with proposed signing mechanism?)
>>>
>>>
>> Or well, have our own pull requests review tool.
> NIH? What would be the benefit of that? Before going down this path, I
> think there are some good tools around which could at least serve as a base
> to fork from, rather than starting a ground-up project.
> 
> Sorry to jump in the middle of the conversation, but I know 1st hand how much 
> is involved here.
> 

I was not precise. By "our own" I mean hosted by us, not by github. )






Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread C. Bergström

On 09/14/14 08:24 PM, Jauhien Piatlicki wrote:

On 14.09.14 at 15:23, Jauhien Piatlicki wrote:

Another question: will it be possible to maintain a copy of tree on github to 
make contributions for users simpler (similarly to e.g. science overlay)? (Can 
it somehow be combined with proposed signing mechanism?)



Or well, have our own pull requests review tool.
NIH? What would be the benefit of that? Before going down this path, I 
think there are some good tools around which could at least serve as a base 
to fork from, rather than starting a ground-up project.


Sorry to jump in the middle of the conversation, but I know 1st hand how 
much is involved here.




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Jauhien Piatlicki
On 14.09.14 at 15:23, Jauhien Piatlicki wrote:
> Another question: will it be possible to maintain a copy of tree on github to 
> make contributions for users simpler (similarly to e.g. science overlay)? 
> (Can it somehow be combined with proposed signing mechanism?)
> 
> 

Or, well, have our own pull-request review tool.






Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Jauhien Piatlicki
Another question: will it be possible to maintain a copy of the tree on 
github to make contributions simpler for users (similarly to e.g. the 
science overlay)? (Can it somehow be combined with the proposed signing 
mechanism?)






Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Johannes Huber
On Sunday, 14 September 2014, at 15:17:41, Ulrich Mueller wrote:
> > On Sun, 14 Sep 2014, Michał Górny wrote:
> > I think we should also merge gentoo-news & glsa & herds.xml into the
> > repository. They all reference Gentoo packages at a particular state
> > in time, and it would be much nicer to have them synced properly.
> 
> Not a good idea, because we may want to grant commit access to these
> repos for people who are not necessarily ebuild devs.
> 
> Ulrich

This could be solved by a pull-request review tool (Gerrit, ReviewBoard,
GitLab, etc.).

-- 
Johannes Huber (johu)
Gentoo Linux Developer / KDE Team
GPG Key ID F3CFD2BD



[gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Ulrich Mueller
> On Sun, 14 Sep 2014, Michał Górny wrote:

> I think we should also merge gentoo-news & glsa & herds.xml into the
> repository. They all reference Gentoo packages at a particular state
> in time, and it would be much nicer to have them synced properly.

Not a good idea, because we may want to grant commit access to these
repos for people who are not necessarily ebuild devs.

Ulrich




Re: [gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Jauhien Piatlicki
Hi,

On 14.09.14 at 14:03, Michał Górny wrote:
> Hi,
> 
> > I'm quite tired of promises and all that perfectionist nonsense which
> > locks us up with CVS for the next 10 years of bikeshedding. Therefore, I
> > have prepared a plan for how to do the git migration, and I believe it's
> > doable in less than 2 weeks (plus testing). Of course, that assumes infra
> > is going to cooperate quickly or someone else is willing to provide the
> > infra for it.
> 

as always, nice effort, but I foresee lots of bikeshedding in this thread. )

> This means we don't have to wait till someone figures out the perfect
> way of converting the old CVS repository. You don't need that history
> most of the time, and you can play with CVS to get it if you really do.
> In any case, we would likely strip the history anyway to get a small
> repo to work with.
> 

Is it so difficult to convert CVS history?

> 
> The rsync tree
> --
> 
> We'd also propagate things to rsync. We'd have to populate it with old
> ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
> Manifests. So users won't notice much of a change.
> 

How will the user check ebuild integrity with thick Manifests using rsync?

> The remaining issue is signing of stuff. We could supposedly sign
> Manifests but IMO it's a waste of resources considering how poor
> the signing system is for non-git repos.
> 

Again, how will the user check the integrity and authenticity if Manifests 
are unsigned?

Also, it would be a good idea to add automatic signature checking to portage 
for overlays that use signing (or is it already done?).

--
Jauhien






[gentoo-dev] My masterplan for git migration (+ looking for infra to test it)

2014-09-14 Thread Michał Górny
Hi,

I'm quite tired of promises and all that perfectionist nonsense which
locks us up with CVS for the next 10 years of bikeshedding. Therefore,
I have prepared a plan for how to do the git migration, and I believe
it's doable in less than 2 weeks (plus testing). Of course, that
assumes infra is going to cooperate quickly or someone else is willing
to provide the infra for it.

I can provide some testing repos once someone is willing to provide
the hardware.


What needs to be done
-

I can do most of the scripting. What I need others to do is provide
the hosting for git repos. We can't use public services like github
since they don't allow us to set our own update hook, so we can't
enforce signing policies etc.

Once basic infra is ready, I think the following is the best way to
switch:

1. send announcement to devs to explain how to use git,

2. lock CVS out to read-only,

3. create all the git repos, get hooks rolling,

4. enable R/W access to the repos.

With some luck, no more than 2 hours downtime.


The infra
-

The general idea is based on a 3-level structure that's an extension of
how Funtoo works. The following pretty picture explains it:

  +----------------+
  | developer repo | - - - - - - - - - - -,
  +----------------+                      v
      |               +------------------------------+
      |               | cache, DTDs and other extras |
      v               +------------------------------+
  +----------------+                      |
  | user sync repo | <--------------------'
  +----------------+ - - - - - - - - - - -,
      |                                   v
      |               +-----------------------------+
      |               | ChangeLogs, thick Manifests |
      v               +-----------------------------+
  +----------------+                      |
  |     rsync      | <--------------------'
  +----------------+

Text version:

We have the main developer repo where developers work & commit and are
relatively happy. For every push into the developer repo, an automated
magic thingie merges the changes into the user sync repo and updates
the metadata cache there.

The user sync repo is for power users that want to fetch via git. It's
quite fast and efficient for frequent updates, and also saves space by
being free of ChangeLogs.

On top of the user sync repo, the rsync tree is propagated. It is
populated with all the old ChangeLogs copied from CVS (stored in a 30M
git repo), new ChangeLogs generated from git logs, and expanded
Manifests.


Main developer repo
---

I was able to create a starting git repository that takes around 66M
as a git pack (this is how much you will have to fetch to start working
with it). The repository is stripped clean of history and ChangeLogs,
and has thin Manifests only.

This means we don't have to wait till someone figures out the perfect
way of converting the old CVS repository. You don't need that history
most of the time, and you can play with CVS to get it if you really do.
In any case, we would likely strip the history anyway to get a small
repo to work with.

I have prepared a basic git update hook that keeps master clean
and attached it to the bug [1]. It enforces basic policies, prevents
forced updates and checks GPG signatures on left-most history line. It
can also be extended to do more extensive tree checks.
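For illustration only, a drastically simplified stand-in for such an 'update' hook (not the version attached to the bug) might look like this, using git's `%G?` signature-status format, followed by a quick check in a throwaway repo:

```shell
#!/bin/sh -e
tmp=$(mktemp -d)
cat > "$tmp/update-hook" <<'EOF'
#!/bin/sh
# update hook args: <refname> <old-sha1> <new-sha1>
refname=$1 oldrev=$2 newrev=$3
[ "$refname" = refs/heads/master ] || exit 0
# refuse forced (non-fast-forward) updates
if [ "$(git merge-base "$oldrev" "$newrev")" != "$oldrev" ]; then
    echo "non-fast-forward push to master refused" >&2; exit 1
fi
# require a good, trusted signature (%G? = G) on first-parent commits
for c in $(git rev-list --first-parent "$oldrev..$newrev"); do
    if [ "$(git log -1 --format=%G? "$c")" != G ]; then
        echo "commit $c lacks a trusted GPG signature" >&2; exit 1
    fi
done
EOF
chmod +x "$tmp/update-hook"
hook=$tmp/update-hook

# Quick check: an unsigned commit must be refused.
git init -q "$tmp/repo"; cd "$tmp/repo"
g() { git -c user.name=t -c user.email=t@example.org "$@"; }
g commit -q --allow-empty -m one
old=$(g rev-parse HEAD)
g commit -q --allow-empty -m two
"$hook" refs/heads/master "$old" "$(g rev-parse HEAD)" 2>/dev/null \
    && echo accepted || echo refused
```

The real hook would carry more policy, but the shape — walk only the first-parent line, reject anything whose signature status isn't "good and trusted" — is the same.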

For GPG signing, I relied upon gpg to do the right thing. That is, git
checks the signatures and we accept only trusted signatures. So
an external tool (gentoo-keys) needs to work with gpg to import, trust
and revoke developer keys.

I think we should also merge gentoo-news & glsa & herds.xml into
the repository. They all reference Gentoo packages at a particular
state in time, and it would be much nicer to have them synced properly.

[1]: https://bugs.gentoo.org/show_bug.cgi?id=502060


User syncing repo
-

IMO this will be the most useful syncing method. The user syncing repo
is updated automatically for developer repo commits, and afterwards
the md5-cache is regenerated and committed. Other repositories (like
DTDs, GLSAs and more, if you dislike the previous idea) are also merged
into it.

This repo is still free of ChangeLogs (since git logs are more
efficient) and has thin Manifests. It's the space-efficient Gentoo
variant. And commits are signed so users can verify the trust.


The rsync tree
--

We'd also propagate things to rsync. We'd have to populate it with old
ChangeLogs, new ChangeLog entries (autogenerated from git) and thick
Manifests. So users won't notice much of a change.

The remaining issue is signing of stuff. We could supposedly sign
Manifests, but IMO it's a waste of resources considering how poor
the signing system is for non-git repos.

-- 
Best regards,
Michał Górny

