Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-09 Thread Donnie Berkholz
On 20:55 Sat 06 Aug , Robin H. Johnson wrote:
 Everything you have mentioned here was previously covered in the 
 discussions about Git conversion models. Please consult the history of 
 this list, as well as the -scm list. Additionally, a large discussion 
 about the pros and cons of all 3 models (package per repo, category 
 per repo, single repo) was had at the GSoC mentor summit last year, 
 and a number of the core Git developers were involved in the 
 discussion.

While noting the above [1 and its thread], I'd also like to point out 
that git submodules are conceptually a good fit but the implementation 
is lacking. Two examples:

- Creating new submodules requires administrative rights on the server. 
You can't just add one and push it up. This could conceivably be fixed 
by a hook that ran a specific privileged command to add a submodule, but 
I'm not really sure how or whether it's currently possible given the 
times available to run hooks.

- What we'd really want with submodules is to have the primary object 
storage shared in the master repo rather than in the submodule. That way 
we'd benefit from compression across packages, and furthermore, package 
moves wouldn't duplicate history.

If you're interested in fixing the above problems as well as the ones 
that exist regardless of repo format (linked on the main tracker bug 
[2]), then submodules could become a better option.

-- 
Thanks,
Donnie

Donnie Berkholz
Council Member / Sr. Developer
Gentoo Linux
Blog: http://dberkholz.com

1. http://archives.gentoo.org/gentoo-scm/msg_98932c55ec10fcc5445ab950e62b12dc.xml
2. https://bugs.gentoo.org/show_bug.cgi?id=333531




Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-08 Thread Andreas K. Huettel
On Saturday 06 August 2011, 23:57:13, Fabio Erculiani wrote:
 I really love the idea of being able to atomically push updates across
 multiple CPVs.
 This is also what KDE, GNOME, and many other teams are waiting for.
 Having multiple repos means no atomicity and at this point, I would
 rather prefer CVS (omg!).

Exactly. This is why I would also vote for a single tree and single modern vcs.

In addition, I would like to propose that we keep the number of required 
home-made addons and scripts to a minimum. As long as we have straight cvs or 
straight git, every tool developed for these systems just works. As soon as we 
start assembling our tree with a huge self-made infrastructure, we're all 
confined to our own tools for every operation that steps over the newly created 
repository limits.


-- 
Andreas K. Huettel
Gentoo Linux developer - kde, sci, arm, tex
dilfri...@gentoo.org
http://www.akhuettel.de/



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-08 Thread Paweł Hajdan, Jr.
On 8/8/11 7:42 AM, Andreas K. Huettel wrote:
 On Saturday 06 August 2011, 23:57:13, Fabio Erculiani wrote:
 I really love the idea of being able to atomically push updates
 across multiple CPVs. This is also what KDE, GNOME, and many other
 teams are waiting for. Having multiple repos means no atomicity and
 at this point, I would rather prefer CVS (omg!).
 
 Exactly. This is why I would also vote for a single tree and single
 modern vcs.

+1 here. I'm curious what problems multiple repos would solve, or
is it just that it's cool and Fedora/other distros do it?

 In addition, I would like to propose that we keep the number of
 required home-made addons and scripts to a minimum. As long as we
 have straight cvs or straight git, every tool developed for these
 systems just works. As soon as we start assembling our tree with a
 huge self-made infrastructure, we're all confined to our own tools
 for every operation that steps over the newly created repository
 limits.

+1 here too. Vanilla git + repoman is cool. If we add a wrapper on top
of that to assemble the rsync tree, it becomes much more complex,
even more so than our current CVS, it seems.





Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-07 Thread Fabian Groffen
On 06-08-2011 16:36:00 +0100, Markos Chandras wrote:
 I like your proposal but please clarify the following two questions
 
 1) Each package requires a new repository. Who is responsible for creating
 it? Should developers do that themselves, or should they ping infra?

I would prefer all ebuild devs to be able to create new packages
(repos), like they can right now.

 2) Assuming the repository for a new package was created. Who is
 responsible for including this in the rsync generation file?

The dev in question that wants it to be added to the rsync tree.

 I think your approach requires centralised administration to ensure
 minimal incidents in the infrastructure mechanisms.

Absolutely.  Typically the rsync generation file is in its own repo, and
requires as much centralisation as the current CVS tree, or the proposed
git variant of that tree.


-- 
Fabian Groffen
Gentoo on a different level



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-07 Thread Fabian Groffen
On 07-08-2011 00:07:41 +0530, Nirbheek Chauhan wrote:
 On Sat, Aug 6, 2011 at 7:43 PM, Fabian Groffen grob...@gentoo.org wrote:
  In short, the repo-per-package model means that each package
  (my-cat/package) is a separate repository in some VCS.
  Instead of having a huge tree that will only grow forever (gx86),
  packages are just in their own repository.
 
 I had mixed feelings while reading your email. The idea is certainly
 very intriguing, but there's a few things that make it a no-go for me:
 
 1. One of the big things I've been looking forward to with git is the
 ability to do atomic commits across the tree. Addition of GNOME
 releases, pkgmove changes across the tree, changing ebuild/eclass
 behaviour, etc. without inconsistency or praying that my connection
 doesn't get dropped in the middle of a hundred interrelated commits.
 Without this feature, I think some arch teams and GNOME/KDE teams will
 become sad.

I see this being possible by making a single commit to the rsync tree
generation script.

I also consider alternatives possible, as touched upon by James Cloos in
this thread where large projects like GNOME and KDE have a single
repository for all/most of their ebuilds, and perhaps even eclasses.
Repo-per-package may be too fine-grained for projects like these, and
being flexible here is not going to be any problem AFAICT.

 2. The ability to do feature commits across the whole tree instead
 of hundreds of tiny commits everywhere. This combined with the
 ChangeLog generation will save a lot of time and space. This will
 especially benefit arch teams, but I've felt the need for this
 numerous times myself. Example: we moved to using .xz tarballs for
 GNOME, and that touched a lot of ebuilds, and it was extremely
 time-consuming to repeat echangelog && repoman commit per-package.

Consensus is that echangelog is eventually going to disappear, IIRC, and
repoman commit probably can be done on the entire tree/repo, with the
help of sub-repos, or when you have a repo for full GNOME.

Whether you script a loop or make a single call to repoman, you always
have to pay for running repoman, since it's your QA tool and you're
not supposed to skip or bypass it.

 3. Adding packages from overlays via `cherry-pick` or `git am` will
 become extremely tedious. If thin manifests are implemented, a series
 of patches + a simple merge hook will be all you need to move
 KDE/GNOME releases from the overlay to the tree. Without a single
 tree, you need to go back to the current way of doing things.

With my proposal you wouldn't do this.  You would simply add a line in
the rsync tree script for including that package.  Most probably the
package would already live on g.g.o or something Gentooish, so it
wouldn't move at all, it would just be included.

In case you would have a repo with multiple packages, you would just
tell the script to now also include the directory where your package
lives.
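
To make the idea concrete, here is a minimal dry-run sketch of such a tree
list.  The list format (destination, repository, optional subdirectory), the
plan_tree helper and the example.org URLs are all assumptions made up for
illustration; the proposal itself does not specify any of them.

```shell
#!/bin/sh
# Dry-run sketch of an rsync-tree assembly list.  Nothing is cloned or
# copied; each entry is only turned into a line describing the plan.
# Format per line (hypothetical): <cat/pkg destination> <repo> [subdir]
plan_tree() {
    # Reads the list on stdin, prints one planned step per entry.
    while read -r dest repo subdir; do
        case $dest in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        # A missing subdir means the repo root is the package itself.
        echo "${dest} <- ${repo} (${subdir:-.})"
    done
}

plan_tree <<'EOF'
# destination           repository                   subdirectory
mail-client/mutt        git://example.org/mutt.git
gnome-base/gnome-shell  git://example.org/gnome.git  gnome-shell
EOF
```

Including a package, or moving it to another category, would then be a
one-line change to this list, while a package that lives in a larger repo
only needs its subdirectory named.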

 4. We'll need to write extra tools to keep the user's cat/pkg list
 up-to-date; adding and removing repositories as needed, etc. This is
 added complexity for which we'll need volunteers (we've been facing a
 manpower shortage already...)

I don't understand this.  Users don't see anything of this change.
Developers could use subtrees, forests, or just only what they care
about.

 5. The total size of the tree will increase a *lot* since all these
 repositories will no longer share data. The current gentoo-x86 tree
 stored in git without history takes only ~25MB because ebuilds are
 extremely redundant. The space requirements will balloon once we need
 to store 15,000 repositories. And arch teams will have to store *all*
 of them, often on devices with very low space.

I'm not too concerned about disk space.  Cloning a repo as-needed should
be fairly fast, and even arch teams won't need all 15,000 repositories.
It's easy to throw away repos for packages no longer necessary too.
For the limited disk-space arches, the specialised rsync trees do come
in handy though.


-- 
Fabian Groffen
Gentoo on a different level



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-07 Thread Fabian Groffen
On 06-08-2011 16:17:32 -0400, James Cloos wrote:
 Your idea is a step in the right direction, but the ideal config would
 have a top level portage.git with sub-modules for each category, as well
 as for eclass, licenses, profiles and scripts.  Each category.git should
 have sub-modules for each package therein.

I believe the size of a repo (how much it contains) should depend on
what it is.  Some packages (like e.g. Mutt) live very well on their own,
I understand larger projects like GNOME and KDE prefer to have many
sub-components in one repo.

I don't necessarily think there should be a clear hierarchy, although
subtrees may require that.

 Within the profiles.git it *might* be reasonable for each directory in
 arch/ also to be a sub-module.  Or not.  That should be discussed.
 
 And the bureaucracy should be minimal.  Adding, changing or removing a
 submodule from its parent repo should only require a call for consensus
 among the devs, and not be pushed through a small set of devs on some
 given team.

Currently, all devs can add and remove (with notice) packages, so I
don't see why that would require a consensus with this model, suddenly.

 It may also be useful for the process which generates metadata/ to push
 out to a repo, too, just before syncing out to the rsync mirrors.

I don't understand what you mean by this.  Can you elaborate?

 Having each package in its own repo is a great idea.  But a simple
 recursive git pull to update the whole thing is highly desirable.
 Git submodules fit the bill perfectly.

I assumed something like this would be possible, to be able to get
everything easily.


-- 
Fabian Groffen
Gentoo on a different level



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-07 Thread Fabian Groffen
On 06-08-2011 22:42:33 +0200, Krzysztof Pawlik wrote:
 To be honest I don't like that idea. I don't see any benefits from doing so:
  - tree generation is dynamic - actually I think this is a disadvantage, it has
 a nice potential to eat a lot of resources on master rsync server, also having
 different flavours of the tree only brings in added complexity

To be honest, I don't see any problem there.  The rsync master server is
a modern machine.  Generating multiple trees hardly takes more, since
all repos in use are shared, of course.
With the prefix rsync tree generation [1] in mind, I think the extra
cost timewise isn't too bad either.

 So:
  - having it all in single repository means that I need to care only about one
 thing, not around 14956 of them

subtrees would help you here

  - git was designed to be efficient with large repositories, use this ability

I'm not claiming git is inefficient.  I think our current model is not
very flexible.  An alternative like the one I proposed solves certain
problems that currently exist within Gentoo.


[1] http://stats.prefix.freens.org/timing-rsync0.png

-- 
Fabian Groffen
Gentoo on a different level



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-07 Thread Fabian Groffen
On 06-08-2011 20:55:05 +, Robin H. Johnson wrote:
 On Sat, Aug 06, 2011 at 04:13:52PM +0200, Fabian Groffen wrote:
  In this email, I step away from the current model that Gentoo uses for
  the gentoo-x86 repository.  Instead, I consider a repo-per-package
  model, as in use by e.g. Fedora [1] and Debian [2].
 Everything you have mentioned here was previously covered in the
 discussions about Git conversion models. Please consult the history of
 this list, as well as the -scm list. Additionally, a large discussion
 about the pros and cons of all 3 models (package per repo, category per
 repo, single repo) was had at the GSoC mentor summit last year, and a
 number of the core Git developers were involved in the discussion.

I see now my previous search wasn't complete.  Please correct me if I'm
wrong, but I have the impression the previous discussions looked at
repo-per-package just from a storage point of view, not from a
functional point of view.  The git overhead for repo-per-package is
admittedly quite undesirable.

 Problems:
 - atomic/well-ordered commits that span packages, eclasses and profiles/
   directories. (Esp. committing to eclasses and then packages
   afterwards).

This can be done with a single commit to the rsync tree script, and it
doesn't necessarily need git repos.


-- 
Fabian Groffen
Gentoo on a different level



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-07 Thread Michał Górny
On Sun, 7 Aug 2011 11:12:47 +0200
Fabian Groffen grob...@gentoo.org wrote:

  Problems:
  - atomic/well-ordered commits that span packages, eclasses and
  profiles/ directories. (Esp. committing to eclasses and then
  packages afterwards).
 
 This can be done with a single commit to the rsync tree script, and it
 doesn't necessarily need git repos.

And have you considered the function PoV on this?

With clean git repo: few commits, git push

With your split-tree: a lot of commits to random packages, potentially
using random VCS-es, a lot of pushes, hacking some magical rsync stuff
and finally guessing what went wrong this time

-- 
Best regards,
Michał Górny




Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-07 Thread Fabian Groffen
On 07-08-2011 11:21:51 +0200, Michał Górny wrote:
 Fabian Groffen grob...@gentoo.org wrote:
  This can be done with a single commit to the rsync tree script, and it
  doesn't necessarily need git repos.
 
 And have you considered the function PoV on this?
 
 With clean git repo: few commits, git push
 
 With your split-tree: a lot of commits to random packages, potentially
 using random VCS-es, a lot of pushes, hacking some magical rsync stuff
 and finally guessing what went wrong this time

Ideally, only one VCS would be in use.  For the current situation there
is both CVS and git, though.

With some experience from the Prefix rsync tree generation (CVS + SVN),
I can tell the magic is quite absent, and I've seen no guessing what
went wrong this time.

I have considered it.


-- 
Fabian Groffen
Gentoo on a different level



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-07 Thread Rich Freeman
On Sun, Aug 7, 2011 at 5:12 AM, Fabian Groffen grob...@gentoo.org wrote:
 On 06-08-2011 20:55:05 +, Robin H. Johnson wrote:
 Problems:
 - atomic/well-ordered commits that span packages, eclasses and profiles/
   directories. (Esp. committing to eclasses and then packages
   afterwards).

 This can be done with a single commit to the rsync tree script, and it
 doesn't necessarily need git repos.


What exactly are you thinking about here?  How about this use case:

I have a list of 150 packages/versions.  I want to make all of them go
from ~x86 to x86 at the same time.

If they're all in one git repo, then I can use a script or whatever to
go through every one at leisure and rekeyword them.  Then I can do a
repoman scan on the entire repository for an hour or two if I want.
When I'm happy I can commit everything atomically.
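
As a rough sketch of the single-repo workflow just described: the helper
below rewrites ~x86 to x86 on the KEYWORDS line of every ebuild passed to
it.  The name stabilize_x86 and the sed expression are illustrative
assumptions on my part, not existing Gentoo tooling.

```shell
#!/bin/sh
# Sketch: mark ebuilds stable on x86 by editing their KEYWORDS line.
# stabilize_x86 is a made-up helper for illustration only.
stabilize_x86() {
    for ebuild in "$@"; do
        tmp=$(mktemp) || return 1
        # Only touch the KEYWORDS= line, and only the exact token ~x86
        # (a token such as ~x86-fbsd is left alone).
        sed -E '/^KEYWORDS=/ s/(["[:space:]])~x86(["[:space:]])/\1x86\2/' \
            "$ebuild" > "$tmp" && cat "$tmp" > "$ebuild"
        rm -f "$tmp"
    done
}
```

Run over the whole list, followed by a repoman scan and one commit over the
checkout, all 150 changes would land together in a single commit.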

How do you envision doing this by just making a single commit to the
rsync tree script if the files are in multiple repos?  Right now that
rsync tree is pulling in all those files already - in the ~x86
version.  Do you propose cloning all the repos, fixing the arch flag
in the new repos, and then re-pointing the rsync tree atomically?
That would work, but any commits to the 150 packages by others in the
meantime would get lost, and it seems a bit painful to do it this way.
 I can see how you could atomically add or remove 150 packages
entirely, but not how you can tweak individual versions of packages
without a fair bit of pain.  Admittedly, you could have some clever
solution in mind that I'm just not grokking.

The other thing that was tossed out is having multiple repos, but
putting things like kde/gnome in their own bigger repos.  I'm not sure
this is going to work, since it only works for those particular
situations.  A package can only be in one repo, so you can't have one
repo for kde, and another repo for everything that uses qt, and
another for everything that uses pulseaudio, or whatever.  Atomic
changes to many packages could be required for any number of unforeseen
reasons.

I can see the elegance of allowing the portage tree to be a collage of
packages from different sources, but I'm not convinced we really need
this.  Users can already accomplish this on their end with overlays.
It seems like we're just making the portage tree an overlay of its
own.  I'm not sure what it really buys us.  Just using git in the
first place already simplifies distributed development.  If you took
this idea to an extreme you might not have the rsync server assemble
the tree at all, but just push out the official list as a
recommended list of overlays, and let the users put their own trees
together (with the ability to override parts of it).

Rich



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-07 Thread Fabian Groffen
On 07-08-2011 07:05:03 -0400, Rich Freeman wrote:
 What exactly are you thinking about here?  How about this use case:
 
 I have a list of 150 packages/versions.  I want to make all of them go
 from ~x86 to x86 at the same time.
 
 If they're all in one git repo, then I can use a script or whatever to
 go through every one at leisure and rekeyword them.  Then I can do a
 repoman scan on the entire repository for an hour or two if I want.
 When I'm happy I can commit everything atomically.
 
 How do you envision doing this by just making a single commit to the
 rsync tree script if the files are in multiple repos?  Right now that
 rsync tree is pulling in all those files already - in the ~x86
 version.  Do you propose cloning all the repos, fixing the arch flag
 in the new repos, and then re-pointing the rsync tree atomically?
 That would work, but any commits to the 150 packages by others in the
 meantime would get lost, and it seems a bit painful to do it this way.
  I can see how you could atomically add or remove 150 packages
 entirely, but not how you can tweak individual versions of packages
 without a fair bit of pain.  Admittedly, you could have some clever
 solution in mind that I'm just not grokking.

Not sure.  You could branch I guess.  It takes more work, undoubtedly.

 The other thing that was tossed out is having multiple repos, but
 putting things like kde/gnome in their own bigger repos.  I'm not sure
 this is going to work, since it only works for those particular
 situations.  A package can only be in one repo, so you can't have one
 repo for kde, and another repo for everything that uses qt, and
 another for everything that uses pulseaudio, or whatever.  Atomic
 changes to many packages could be required for any number of unforeseen
 reasons.

This indeed makes it difficult.

 I can see the elegance of allowing the portage tree to be a collage of
 packages from different sources, but I'm not convinced we really need
 this.  Users can already accomplish this on their end with overlays.
 It seems like we're just making the portage tree an overlay of its
 own.  I'm not sure what it really buys us.  Just using git in the
 first place already simplifies distributed development.  If you took
 this idea to an extreme you might not have the rsync server assemble
 the tree at all, but just push out the official list as a
 recommended list of overlays, and let the users put their own trees
 together (with the ability to override parts of it).

I don't feel users should be playing with these things in general.  I
see the tree assembling thing more as a technical way to deal with some
given legacy and limitations.  Admittedly, it isn't perfect, and many
people seem to intend to do things with a git-based tree that cannot be
done with CVS now, and an assembled tree wouldn't really support them out
of the box either.


-- 
Fabian Groffen
Gentoo on a different level



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-07 Thread Nathan Phillip Brink
On Sat, Aug 06, 2011 at 04:13:52PM +0200, Fabian Groffen wrote:
 - tree generation is dynamic
   + easy to move packages around, their category is specified by the
 tree configuration, the repository the package lives in doesn't change,
 probably overlays, betagarden, graveyard, sunset, etc. can all go
 - per package branches
   + instead of developing in overlays, simply branches could be used,
 such that a single place is sufficient for each package

Recreating the overlay experience with many repos sounds
difficult. Many overlays include multi-component packages or changes
to interdependent packages. Using per-package branching instead of
overlays would complicate this, with a user (or layman) having to
search each package's repository for branches associated with a
particular overlay when trying to guess which overlay a package should
be pulled from.

The current behavior of PORTDIR_OVERLAY is quite well-defined and
easier to understand. It even allows overlays to gracefully fall
behind in keeping their packages up to date. For example, when a fix
in an overlay is committed to gentoo-x86 as a new ebuild revision, the
overlay maintainer can forget that he has a stale version of the
package without harming anyone because portage chooses the newest
package. It seems that the traditional overlay idea -- where overlays
overlay gentoo-x86 and each other -- can't quite exist with per-package
branches. To recreate this idea, you'd need to have one checkout per
package per repo (including overlays) and you'd still use
PORTDIR_OVERLAY.

I sorta like how overlays work currently ;-).

-- 
binki

Look out for missing or extraneous apostrophes!




Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-06 Thread Markos Chandras
On 08/06/2011 03:13 PM, Fabian Groffen wrote:
 All,
 
I think this post belongs to either -project or -scm MLs but anyway

 When we migrate away from CVS for gentoo-x86 (gx86), as it looks
 now,

I like your proposal but please clarify the following two questions

1) Each package requires a new repository. Who is responsible for creating
it? Should developers do that themselves, or should they ping infra?

2) Assuming the repository for a new package was created. Who is
responsible for including this in the rsync generation file?

I think your approach requires centralised administration to ensure
minimal incidents in the infrastructure mechanisms.

-- 
Regards,
Markos Chandras / Gentoo Linux Developer / Key ID: B4AFF2C2



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-06 Thread Nirbheek Chauhan
Hey,

On Sat, Aug 6, 2011 at 7:43 PM, Fabian Groffen grob...@gentoo.org wrote:
 In this email, I step away from the current model that Gentoo uses for
 the gentoo-x86 repository.  Instead, I consider a repo-per-package
 model, as in use by e.g. Fedora [1] and Debian [2].

 In short, the repo-per-package model means that each package
 (my-cat/package) is a separate repository in some VCS.
 Instead of having a huge tree that will only grow forever (gx86),
 packages are just in their own repository.


I had mixed feelings while reading your email. The idea is certainly
very intriguing, but there's a few things that make it a no-go for me:

1. One of the big things I've been looking forward to with git is the
ability to do atomic commits across the tree. Addition of GNOME
releases, pkgmove changes across the tree, changing ebuild/eclass
behaviour, etc. without inconsistency or praying that my connection
doesn't get dropped in the middle of a hundred interrelated commits.

Without this feature, I think some arch teams and GNOME/KDE teams will
become sad.

2. The ability to do feature commits across the whole tree instead
of hundreds of tiny commits everywhere. This combined with the
ChangeLog generation will save a lot of time and space. This will
especially benefit arch teams, but I've felt the need for this
numerous times myself. Example: we moved to using .xz tarballs for
GNOME, and that touched a lot of ebuilds, and it was extremely
time-consuming to repeat echangelog && repoman commit per-package.

3. Adding packages from overlays via `cherry-pick` or `git am` will
become extremely tedious. If thin manifests are implemented, a series
of patches + a simple merge hook will be all you need to move
KDE/GNOME releases from the overlay to the tree. Without a single
tree, you need to go back to the current way of doing things.

4. We'll need to write extra tools to keep the user's cat/pkg list
up-to-date; adding and removing repositories as needed, etc. This is
added complexity for which we'll need volunteers (we've been facing a
manpower shortage already...)

5. The total size of the tree will increase a *lot* since all these
repositories will no longer share data. The current gentoo-x86 tree
stored in git without history takes only ~25MB because ebuilds are
extremely redundant. The space requirements will balloon once we need
to store 15,000 repositories. And arch teams will have to store *all*
of them, often on devices with very low space.

The per-package model looks very neat and tidy in some respects, but
the loss of a common git repository is too great, IMO.

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-06 Thread James Cloos
Your idea is a step in the right direction, but the ideal config would
have a top level portage.git with sub-modules for each category, as well
as for eclass, licenses, profiles and scripts.  Each category.git should
have sub-modules for each package therein.

Within the profiles.git it *might* be reasonable for each directory in
arch/ also to be a sub-module.  Or not.  That should be discussed.

And the bureaucracy should be minimal.  Adding, changing or removing a
submodule from its parent repo should only require a call for consensus
among the devs, and not be pushed through a small set of devs on some
given team.

It may also be useful for the process which generates metadata/ to push
out to a repo, too, just before syncing out to the rsync mirrors.

Having each package in its own repo is a great idea.  But a simple
recursive git pull to update the whole thing is highly desirable.
Git submodules fit the bill perfectly.

This would require re-doing the cvs→git conversion, but it’d be worth it.

-JimC
-- 
James Cloos cl...@jhcloos.com OpenPGP: 1024D/ED7DAEA6



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-06 Thread Krzysztof Pawlik

On 06/08/11 16:13, Fabian Groffen wrote:
 There probably are drawbacks to this system as well.  I, however, only
 see big advantages for the moment.
 Comments, thoughts, ideas welcome.

To be honest I don't like that idea. I don't see any benefits from doing so:
 - history per package - huh? git log for specific path/file works, pulling all
the history for whole repository is one-time thing, does not happen often,
Nirbheek already pointed out some history-sharing issues

 - tree generation is dynamic - actually I think this is a disadvantage, it has
a nice potential to eat a lot of resources on master rsync server, also having
different flavours of the tree only brings in added complexity

 - per package branches - I like overlays, I couldn't care less about branches
for single packages :)
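
To illustrate the first point, a small self-contained demo of per-package
history inside one repository.  The package paths and commit messages are
placeholders, and the commit helper exists only for this sketch.

```shell
#!/bin/sh
# Demo: history of a single package directory in a single-tree repo.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .

commit() {  # commit <path> <message>: append a line and commit the file
    mkdir -p "$(dirname "$1")"
    echo "$2" >> "$1"
    git add "$1"
    git -c user.name=demo -c user.email=demo@example.org commit -q -m "$2"
}

commit mail-client/mutt/mutt-1.ebuild "mutt: initial import"
commit app-editors/vim/vim-1.ebuild "vim: initial import"
commit mail-client/mutt/mutt-1.ebuild "mutt: fix dependency"

# Path-limited log: only commits touching mail-client/mutt are listed.
git log --format=%s -- mail-client/mutt
```

The last command lists the two mutt commits, newest first, and leaves out
the vim commit.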

So:
 - having it all in single repository means that I need to care only about one
thing, not around 14956 of them
 - git was designed to be efficient with large repositories, use this ability
 - KISS

-- 
Krzysztof Pawlik  nelchael at gentoo.org  key id: 0xF6A80E46
desktop-misc, java, vim, kernel, python, apache...





Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-06 Thread Robin H. Johnson
On Sat, Aug 06, 2011 at 04:13:52PM +0200, Fabian Groffen wrote:
 When we migrate away from CVS for gentoo-x86 (gx86), as it looks now,
 the same structure will be kept as we have in CVS now.  Policies to
 reject merge commits and only allow rebases on e.g. the Git
 infrastructure will even more closely match the central and
 server-based way of working Gentoo is used to now.
The discussion about rejecting merges was never completed IIRC. I think
there may be some very valid cases where we need merges still (esp the
big atomic commit cases from KDE/GNOME), but they should still be used
sparingly. Additionally, the rebase approach has the problem that if
developers have pushed to multiple places and then rebase to push to the
main tree, everybody else has to hard-reset their trees, so I don't know
if that will actually fly.

 In this email, I step away from the current model that Gentoo uses for
 the gentoo-x86 repository.  Instead, I consider a repo-per-package
 model, as in use by e.g. Fedora [1] and Debian [2].
Everything you have mentioned here was previously covered in the
discussions about Git conversion models. Please consult the history of
this list, as well as the -scm list. Additionally, a large discussion
about the pros and cons of all 3 models (package per repo, category per
repo, single repo) was had at the GSoC mentor summit last year, and a
number of the core Git developers were involved in the discussion.

Problems:
- atomic/well-ordered commits that span packages, eclasses and profiles/
  directories. (Esp. committing to eclasses and then packages
  afterwards).
- Massive space overhead: Every .git directory requires a minimum of 25
  inodes [1], covering at least 100KiB. We have 15k packages in the tree
  right now. Assuming there is no tail-packing in use, that's a minimum
  of 1.5GiB on .git overhead.
- Massive space overhead(2): Having a repo per package also removes ANY
  git compression advantage that would be gained where ebuilds between
  packages are substantially similar. The _complete_ history packfile
  for the tree right now is under 1GiB [2].
- Pain in branching/forking: instead of being able to just have your own
  local clone of the single git repo, a user wanting to work on multiple
  packages together would need to have repos for ALL of them. No
  pull/merge ability at all.
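
The first space estimate can be checked with integer arithmetic.  A minimal
sketch using only the figures quoted above (15,000 packages, 25 inodes and
100 KiB per .git directory); the helper name git_overhead is illustrative.

```shell
#!/bin/sh
# Rough lower bound on per-repository .git overhead.
# git_overhead <packages> <KiB per .git dir> <inodes per .git dir>
git_overhead() {
    packages=$1
    kib_per_repo=$2
    inodes_per_repo=$3

    total_kib=$((packages * kib_per_repo))
    total_mib=$((total_kib / 1024))          # integer division, rounds down
    total_inodes=$((packages * inodes_per_repo))
    echo "${total_mib} MiB, ${total_inodes} inodes"
}

git_overhead 15000 100 25
```

This prints "1464 MiB, 375000 inodes", i.e. roughly 1.4 GiB, in line with
the order of magnitude given in the bullet above.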

[1] Git space usage testcase:
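# Create a one-file repo, repack it, then report disk usage and entry count
mkdir foo && cd foo && git init \
 && touch bar && git add bar && git commit -m '.' bar \
 && git gc && du .git --exclude '*.sample' && find .git ! -name '*.sample' | wc -l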

[2] Packfile size:
The final proposal regarding packfile size was that we were going to
partition older history using grafts, similar to when Linus moved the
kernel into Git, and had a graft available of the old history. Initial
packfile size was under 50MiB.

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Trustee & Infrastructure Lead
E-Mail : robb...@gentoo.org
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85



Re: [gentoo-dev] [RFC] gentoo-x86 migration to repo-per-package

2011-08-06 Thread Fabio Erculiani
I really love the idea of being able to atomically push updates across
multiple CPVs.
This is also what KDE, GNOME, and many other teams are waiting for.
Having multiple repos means no atomicity and at this point, I would
rather prefer CVS (omg!).

-- 
Fabio Erculiani