Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-19 Thread Jan Kundrát

Rémi Cardona wrote:

> - svn uses a lot of disk space


Could you please elaborate? Are you referring to the checkout size 
(which is about twice the actual size because SVN stores two copies of a 
file in the checkout to be able to perform diffs against latest revision 
without contacting the server) or something else?
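
The overhead described above can be measured rather than guessed. The following is a minimal Python sketch, assuming a checkout whose administrative data lives in directories named `.svn`; the function name `checkout_sizes` is purely illustrative and not part of any SVN tooling.

```python
import os

def checkout_sizes(root):
    """Return (working_bytes, admin_bytes) for a checkout rooted at `root`.

    Files under any `.svn` directory count as administrative data
    (pristine copies and metadata); everything else is working data.
    """
    working = admin = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # A directory is administrative if any path component is ".svn".
        in_admin = ".svn" in os.path.relpath(dirpath, root).split(os.sep)
        for name in filenames:
            size = os.path.getsize(os.path.join(dirpath, name))
            if in_admin:
                admin += size
            else:
                working += size
    return working, admin
```

If the two numbers come out roughly equal, the "about twice the actual size" estimate holds for that checkout.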


Cheers,
-jkt

--
cd /local/pub && more beer > /dev/mouth
--
gentoo-dev@gentoo.org mailing list



[gentoo-dev] Re: [RFC] Some sync control

2007-01-19 Thread Steve Long
I appreciate that source control is needed to maintain files over a period
of time and to roll back changes. Does that happen with ebuilds?




[gentoo-dev] Re: [RFC] Some sync control

2007-01-19 Thread Steve Long
Side point: I am now aware that there is a better way to do this (pkgcore's
cache/template.py and sql_template.py), thanks to ferringb.




Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-19 Thread Alec Warner
Jan Kundrát wrote:
> Rémi Cardona wrote:
> > - svn uses a lot of disk space
>
> Could you please elaborate? Are you referring to the checkout size
> (which is about twice the actual size because SVN stores two copies of a
> file in the checkout to be able to perform diffs against latest revision
> without contacting the server) or something else?
>
> Cheers,
> -jkt

SVN is a big beast on the server.

http://www.gentoo.org/proj/en/infrastructure/cvs-migration.xml



Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-18 Thread sanchan
Markus Ullmann wrote:
> This was one of the big reasons. They (and we maybe as well) have people
> there with 56k/64k dialup connections. Checking out the whole thing
> would take ages.

I can confirm we have people with 56k dial up :-) Checking out portage every day
for a developer on 56k takes already a lot of time using cvs. If I had to check
out the whole history every day I'd never become a gentoo developer.

> And the last thing was the idea about distribution. There is one
> centrally maintained tree and people commit to it all day. So the
> chance of getting conflicts in pushes if one is on tour for three days
> would be very likely and so the distributed part of the VCs wouldn't be
> helpful.

I agree.
-- 
Sandro (sanchan)



Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-18 Thread Greg KH
On Thu, Jan 18, 2007 at 07:52:23PM +0100, sanchan wrote:
> Markus Ullmann wrote:
> > This was one of the big reasons. They (and we maybe as well) have people
> > there with 56k/64k dialup connections. Checking out the whole thing
> > would take ages.
>
> I can confirm we have people with 56k dial up :-) Checking out portage every
> day for a developer on 56k takes already a lot of time using cvs. If I had to
> check out the whole history every day I'd never become a gentoo developer.

Once you have the original tree, syncing with git for the update would
be very small and workable with a 56k dialup.  It's just the original
sync would be a pain, but again, probably no more than the original cvs
checkout was.

thanks,

greg k-h



[gentoo-dev] Re: [RFC] Some sync control

2007-01-18 Thread Steve Long
Thanks for all the comments about the different SCM systems.

I'm a bit confused about all the portage tree stuff. Since a couple of us
were discussing a QA db on this list, I've been working on a script to pull
the info from the /usr/portage/ hierarchy. There's just under 25,000
ebuilds, which are maintained by about 100 devs (not sure of exact number,
taken from a forum post.) I guess what I'm asking is why this isn't just a
database. You have a live tree, what's wrong with a live db?

Please note, I'm not talking about applications like portage or pkgcore,
just the ebuild text files, which I understand have one maintainer?
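
For reference, the kind of script mentioned above could look roughly like the following minimal Python sketch. It assumes the usual category/package/*.ebuild layout of the tree; the function name `count_ebuilds` is made up for illustration and this is not the actual script.

```python
import os
from collections import Counter

def count_ebuilds(tree="/usr/portage"):
    """Count *.ebuild files per category in a category/package/ layout."""
    per_category = Counter()
    for dirpath, _dirnames, filenames in os.walk(tree):
        parts = os.path.relpath(dirpath, tree).split(os.sep)
        # Ebuilds are assumed to live exactly two levels down:
        # category/package/*.ebuild
        if len(parts) != 2:
            continue
        per_category[parts[0]] += sum(f.endswith(".ebuild") for f in filenames)
    return per_category
```

Summing the values, e.g. `sum(count_ebuilds().values())`, gives the total number of ebuilds, which is the sort of figure a QA database would be loaded from.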





[gentoo-dev] Re: [RFC] Some sync control

2007-01-17 Thread Markus Ullmann

Donnie Berkholz wrote:
> Greg KH wrote:
> > What were the reasons he cited?
>
> Given that ports is pretty similar to our gentoo-x86, I'd guess about
> the same ones mentioned at
> http://dev.gentoo.org/~antarus/projects/gleps/glep-0666.txt -- I quote
> from there:
>
> 1. Git currently requires you to check out the whole repository.
>    This includes *all of the history*.
> 2. Git cannot update portions of the repository, it can only update
>    the entire thing.

This was one of the big reasons. They (and we maybe as well) have people 
there with 56k/64k dialup connections. Checking out the whole thing 
would take ages.


Second thing was that absolutely none of the scripts would be able to 
handle it and they would have to be rewritten from ground up whereas 
most of them would work with svn if you just change the binary path (or 
symlink it even)


> The conversion to GIT from CVS was also lengthy
> (approximately two weeks) although many projects attempted a switch
> this summer and tools have improved in speed.

This one was the third. At the time they tried, the conversion could not 
be suspended, so cvs would have to be taken offline for a really long time.


And the last thing was the idea about distribution. There is one 
centrally maintained tree and people commit to it all day. So the 
chance of getting conflicts in pushes if one is on tour for three days 
would be very likely and so the distributed part of the VCs wouldn't be 
helpful.


Jokey




Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-17 Thread Donnie Berkholz

Markus Ullmann wrote:
> And the last thing was the idea about distribution. There is one
> centrally maintained tree and people commit to it all day. So the
> chance of getting conflicts in pushes if one is on tour for three days
> would be very likely and so the distributed part of the VCs wouldn't be
> helpful.


The other points were valid, but if it works anything like Gentoo, I 
think this is BS. Sure, everyone commits to the same tree, but not to 
the same lines of the same file. Unless all they do over in BSD-land is 
global seds all day long, I don't see this scenario.


Thanks,
Donnie



Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-17 Thread Ciaran McCreesh
On Wed, 17 Jan 2007 00:30:39 -0800 Donnie Berkholz
[EMAIL PROTECTED] wrote:
| Markus Ullmann wrote:
|  And the last thing was the idea about distribution. There is one 
|  centrally maintained tree and people commit to it all day. So the 
|  chance of getting conflicts in pushes if one is on tour for three
|  days would be very likely and so the distributed part of the VCs
|  wouldn't be helpful.
| 
| The other points were valid, but if it works anything like Gentoo, I 
| think this is BS. Sure, everyone commits to the same tree, but not to 
| the same lines of the same file. Unless all they do over in BSD-land
| is global seds all day long, I don't see this scenario.

You mean like when eight or so archs keyword something for a security
bug within a few hours of each other? Or when three or four archs go
stable with a new KDE or Gnome release on the same day?

-- 
Ciaran McCreesh
Mail: ciaranm at ciaranm.org
Web : http://ciaranm.org/
Paludis, the secure package manager : http://paludis.pioto.org/





Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-17 Thread Robin H. Johnson
On Wed, Jan 17, 2007 at 09:03:59AM +0100, Markus Ullmann wrote:
> > 1. Git currently requires you to check out the whole repository.
> >    This includes *all of the history*.
> > 2. Git cannot update portions of the repository, it can only update
> >    the entire thing.
>
> This was one of the big reasons. They (and we maybe as well) have people
> there with 56k/64k dialup connections. Checking out the whole thing
> would take ages.
See lower down in the GLEP where it states that upstream are working on
it, and such features would be completed sooner if Gentoo added some
manpower. I do however personally expect them to be ready by mid-2007
already.

> Second thing was that absolutely none of the scripts would be able to
> handle it and they would have to be rewritten from ground up whereas
> most of them would work with svn if you just change the binary path (or
> symlink it even)
I disagree with this statement. There are several mapping scripts that
provide interfaces for the old CVS commands as close as possible
(exceedingly close actually).

> > The conversion to GIT from CVS was also lengthy
> > (approximately two weeks) although many projects attempted a switch
> > this summer and tools have improved in speed.
> This one was the third. At the time they tried, the conversion could not
> be suspended, so cvs would have to be taken offline for a really long time.
Upstream has moved beyond this point. If we were to convert to GIT right
now, it is intelligent enough to be able to start the conversion with a
snapshot, and then add the changes between the snapshot being taken, and
the final point after the initial conversion is complete.

> And the last thing was the idea about distribution. There is one
> centrally maintained tree and people commit to it all day. So the
> chance of getting conflicts in pushes if one is on tour for three days
> would be very likely and so the distributed part of the VCs wouldn't be
> helpful.
I refute this statement. You are no more or less likely to get conflicts
than with CVS (ignoring the fact that GIT has smarter merge
algorithms). If you do a CVS checkout, go away for 3 days, and then try
to commit, CVS will require you to update and resolve conflicts before
accepting your commit. GIT is no different, except that you can at least
have multiple revisions of your changes locally while working on them.

-- 
Robin Hugh Johnson
Gentoo Linux Developer
E-Mail : [EMAIL PROTECTED]
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85




Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-17 Thread Robin H. Johnson
On Wed, Jan 17, 2007 at 08:38:34AM +, Ciaran McCreesh wrote:
> | The other points were valid, but if it works anything like Gentoo, I
> | think this is BS. Sure, everyone commits to the same tree, but not to
> | the same lines of the same file. Unless all they do over in BSD-land
> | is global seds all day long, I don't see this scenario.
> You mean like when eight or so archs keyword something for a security
> bug within a few hours of each other? Or when three or four archs go
> stable with a new KDE or Gnome release on the same day?
You get conflicts with CVS already in that case, it's not going to
increase the number of conflicts in any way.

On a different note, if we were so inclined, GIT would actually allow us
to specify that the KEYWORDS line is always safe to perform additive merges
on. I wouldn't trust such behavior myself, but the option is there.
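
To make the idea concrete, here is a minimal Python sketch of how a per-architecture three-way merge of a KEYWORDS value could behave. It is an illustration only, not git's merge machinery; the helper names `parse_keywords` and `merge_keywords` are hypothetical.

```python
def parse_keywords(line):
    """Map arch -> stability prefix ('', '~' or '-') for a KEYWORDS value."""
    result = {}
    for token in line.split():
        prefix = token[0] if token[0] in "~-" else ""
        result[token[len(prefix):]] = prefix
    return result

def merge_keywords(base, ours, theirs):
    """Three-way merge of KEYWORDS values, resolved per architecture."""
    b, o, t = (parse_keywords(x) for x in (base, ours, theirs))
    merged = {}
    for arch in sorted(set(b) | set(o) | set(t)):
        vb, vo, vt = b.get(arch), o.get(arch), t.get(arch)
        if vo == vt:
            value = vo            # both sides agree
        elif vo == vb:
            value = vt            # only "theirs" changed this arch
        elif vt == vb:
            value = vo            # only "ours" changed this arch
        else:
            raise ValueError(f"conflicting changes for {arch}")
        if value is not None:     # None means the arch was dropped
            merged[arch] = value
    return " ".join(prefix + arch for arch, prefix in merged.items())
```

With a base of `"~amd64 ~sparc x86"`, one side stabilizing amd64 and the other stabilizing sparc, `merge_keywords` returns `"amd64 sparc x86"`. Two sides changing the same arch in different ways still raise an error, which is the case that would need a human.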

-- 
Robin Hugh Johnson
Gentoo Linux Developer
E-Mail : [EMAIL PROTECTED]
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85




Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-17 Thread Ciaran McCreesh
On Wed, 17 Jan 2007 01:06:31 -0800 Robin H. Johnson
[EMAIL PROTECTED] wrote:
| On Wed, Jan 17, 2007 at 08:38:34AM +, Ciaran McCreesh wrote:
|  | The other points were valid, but if it works anything like
|  | Gentoo, I think this is BS. Sure, everyone commits to the same
|  | tree, but not to the same lines of the same file. Unless all they
|  | do over in BSD-land is global seds all day long, I don't see this
|  | scenario.
|  You mean like when eight or so archs keyword something for a
|  security bug within a few hours of each other? Or when three or
|  four archs go stable with a new KDE or Gnome release on the same
|  day?
|
| You get conflicts with CVS already in that case, it's not going to
| increase the number of conflicts in any way.

Except that with CVS, you just update that one directory, which isn't
particularly painful even for all the arch people who live in backwards
countries with wet string internet connections.

-- 
Ciaran McCreesh
Mail: ciaranm at ciaranm.org
Web : http://ciaranm.org/
Paludis, the secure package manager : http://paludis.pioto.org/





Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-17 Thread Robin H. Johnson
On Wed, Jan 17, 2007 at 09:14:41AM +, Ciaran McCreesh wrote:
> | You get conflicts with CVS already in that case, it's not going to
> | increase the number of conflicts in any way.
> Except that with CVS, you just update that one directory, which isn't
> particularly painful even for all the arch people who live in backwards
> countries with wet string internet connections.

Please see my posting that I made before Donnie's (message id
[EMAIL PROTECTED]), in which I
state:
  See lower down in the GLEP where it states that upstream are working on
  it, and such features would be completed sooner if Gentoo added some
  manpower. I do however personally expect them to be ready by mid-2007
  already.

I fully agree that right now GIT is not suitable as you cannot do partial
checkouts in time or directory dimensions. But it really is coming in the
future.

After the initial checkout (which sucks on wet-string+cans as well), GIT
actually uses less bandwidth than CVS, because it doesn't need to send
entire files back to the server to get diffs. It just uses rsync (where
available) to pull over the new files (the actual revision data files
never change once committed).
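
The reason the stored files never change is that each object is named by a hash of its contents. The Python sketch below shows how git names a blob (SHA-1 over a short header plus the data) and why a sync then reduces to a set difference. The function `objects_to_fetch` is a hypothetical simplification; real git transports negotiate commits and send packs rather than comparing flat lists of IDs.

```python
import hashlib

def blob_id(data: bytes) -> str:
    """Name git gives a file's contents: SHA-1 of 'blob <size>\\0' + data."""
    header = b"blob " + str(len(data)).encode() + b"\0"
    return hashlib.sha1(header + data).hexdigest()

def objects_to_fetch(remote_ids, local_ids):
    """Objects are immutable, so only IDs missing locally need transferring."""
    return set(remote_ids) - set(local_ids)
```

Because an existing ID can never refer to new content, anything already present locally is skipped outright, and the cost of an update scales with what changed rather than with the size of the tree.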

-- 
Robin Hugh Johnson
Gentoo Linux Developer
E-Mail : [EMAIL PROTECTED]
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85




[gentoo-dev] Re: [RFC] Some sync control

2007-01-17 Thread Holger Hoffstaette
On Tue, 16 Jan 2007 16:12:21 +, Steve Long wrote:

> Robin H. Johnson wrote:
> > My personal view (not infra) on it, is that I'm mostly negative about
> > changing VCS at all - I would prefer not to change, because the status
> > quo works very well as it is. If a change is going to be made, it should
> > be taken as a chance to resolve as many different issues at one time as
> > possible, and for that reason I favour GIT over SVN.
>
> <noob alert> I'm looking for a distributed SCM atm, and have come down to
> git, bzr, svn or arch. (darcs looks nice but adds haskell dependency.)

As others have said, svn is centralized, and the working models of
distributed and centralized systems differ greatly. Process is a very
important factor for such a decision.
That being said, mercurial (http://www.selenic.com/mercurial/wiki/) was
not yet mentioned but might be a good choice because just like portage it
is written in python (with C core for performance).

There's a Google TechTalk video about it:
http://video.google.com/videoplay?docid=-7724296011317502612

-h





Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-17 Thread Greg KH
On Tue, Jan 16, 2007 at 05:36:34PM -0800, Donnie Berkholz wrote:
> The conversion to GIT from CVS was also lengthy (approximately two
> weeks) although many projects attempted a switch this summer and
> tools have improved in speed.

Yes, the speed has increased a _lot_ now.  In fact yesterday someone
tweaked the tools even more and has reduced the time it takes to import
the entirety of the glibc cvs tree into git to a mere 45 minutes (with
all branches).  So this isn't really an issue anymore either.

> Note: Both history and repository slicing are in the works for GIT,
> but there is no date of completion for them.

Is there ever a completion date for open source projects? :)

thanks,

greg k-h



Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-17 Thread Rémi Cardona
Steve Long wrote:
> Robin H. Johnson wrote:
> > My personal view (not infra) on it, is that I'm mostly negative about
> > changing VCS at all - I would prefer not to change, because the status
> > quo works very well as it is. If a change is going to be made, it should
> > be taken as a chance to resolve as many different issues at one time as
> > possible, and for that reason I favour GIT over SVN.
>
> <noob alert> I'm looking for a distributed SCM atm, and have come down to
> git, bzr, svn or arch. (darcs looks nice but adds haskell dependency.) I'd
> really like to know which one gentoo-devs prefer and why (without starting
> a flame- if this is OT then np.) I'm leaning to git simply because it's
> used for the kernel, which seems like a project that would really stretch a
> VCS.

As others have pointed out, SVN isn't distributed but it's very stable
and very user friendly (much more than git imho, but YMMV).

KDE migrated to svn about a year ago, and Gnome folks did it too a few
weeks ago. One thing that might bite us when doing any migration (svn,
git, whatever) is the CVS data.

Because atomic commits don't exist in CVS, the scripts rely on
commit/modification dates to recreate atomic commits in svn/git.
Unfortunately, in some not-so-rare cases, it can definitely mess things
up, and Gnome folks took about 6 months to get rid of these issues.

In any case, I think it'd be best to contact admins from
Gnome/KDE/FreeDesktop/kernel/... to see how to handle issues on the
server side, e.g.:

- git uses very little space but a lot of CPU
- svn uses a lot of disk space
- add your favorite statement here

Are those statements true for portage? Do they actually matter for us?

*Conclusion* We'd need to try a migration of snapshots to see how much
load it would be to migrate gentoo-x86 from CVS onto something else.

Cheers,

Rémi

PS, my daddy's SCM can beat the crap out of your daddy's SCM anytime! ;)




Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-17 Thread Robin H. Johnson
On Thu, Jan 18, 2007 at 12:09:30AM +0100, Rémi Cardona wrote:
> Because atomic commits don't exist in CVS, the scripts rely on
> commit/modification dates to recreate atomic commits in svn/git.
> Unfortunately, in some not-so-rare cases, it can definitely mess things
> up, and Gnome folks took about 6 months to get rid of these issues.
You should use dates AND usernames.
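
As a rough illustration of that heuristic, the Python sketch below groups per-file revisions into changesets when author and log message match and the timestamps fall within a short window. The `FileRev` record, the function name and the 300-second window are assumptions made for the example; this is not the algorithm of any particular conversion tool.

```python
from collections import namedtuple

# Hypothetical record for one per-file CVS revision.
FileRev = namedtuple("FileRev", "path author log timestamp")

def group_changesets(revs, window=300):
    """Group per-file revisions into changesets.

    A revision joins an existing changeset when author and log message
    match and it is within `window` seconds of the previous revision in
    that changeset; a file never appears twice in one changeset.
    """
    changesets = []
    open_sets = {}  # (author, log) -> most recent changeset for that key
    for rev in sorted(revs, key=lambda r: r.timestamp):
        key = (rev.author, rev.log)
        current = open_sets.get(key)
        if (current is None
                or rev.timestamp - current[-1].timestamp > window
                or any(r.path == rev.path for r in current)):
            # Start a new changeset: no match, too old, or file repeated.
            current = []
            changesets.append(current)
            open_sets[key] = current
        current.append(rev)
    return changesets
```

Keying on the author as well as the date keeps two developers who commit in the same minute from being merged into one changeset, which is exactly the failure mode that dates alone invite.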

> Are those statements true for portage? Do they actually matter for us?
>
> *Conclusion* We'd need to try a migration of snapshots to see how much
> load it would be to migrate gentoo-x86 from CVS onto something else.
Could you please actually look at the rest of the links in this thread?
Antarus did exactly this for his Summer-of-Code work. The original plan
was testing migrations to SVN, GIT and Mercurial. I believe that
Mercurial flunked very early, and I do not know the details.

-- 
Robin Hugh Johnson
Gentoo Linux Developer
E-Mail : [EMAIL PROTECTED]
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85




[gentoo-dev] Re: [RFC] Some sync control

2007-01-16 Thread Steve Long
Robin H. Johnson wrote:
> My personal view (not infra) on it, is that I'm mostly negative about
> changing VCS at all - I would prefer not to change, because the status
> quo works very well as it is. If a change is going to be made, it should
> be taken as a chance to resolve as many different issues at one time as
> possible, and for that reason I favour GIT over SVN.

<noob alert> I'm looking for a distributed SCM atm, and have come down to
git, bzr, svn or arch. (darcs looks nice but adds haskell dependency.) I'd
really like to know which one gentoo-devs prefer and why (without starting
a flame- if this is OT then np.) I'm leaning to git simply because it's
used for the kernel, which seems like a project that would really stretch a
VCS.




[gentoo-dev] Re: [RFC] Some sync control

2007-01-16 Thread Markus Ullmann

Steve Long wrote:
> <noob alert> I'm looking for a distributed SCM atm, and have come down to
> git, bzr, svn or arch.

svn is centralized ;)

> I'm leaning to git simply because it's used for the kernel, which seems
> like a project that would really stretch a VCS.

Well the kernel is quite large but doesn't use that many different 
things, so it heavily depends on what you do.


> Robin H. Johnson wrote:
> > My personal view (not infra) on it, is that I'm mostly negative about
> > changing VCS at all - I would prefer not to change, because the status
> > quo works very well as it is.

Well it works, no question on that. But there's still room for 
enhancements ;)


> > If a change is going to be made, it should be taken as a chance to
> > resolve as many different issues at one time as possible, and for
> > that reason I favour GIT over SVN.

I've talked to a friend of mine recently. He's a FreeBSD dev and he said 
they tried git for their ports tree (which is basically the same as what 
we're talking about) and it was more or less a big pain for multiple 
reasons.

He said he'd personally take svn after that experience.

Jokey




Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-16 Thread Greg KH
On Wed, Jan 17, 2007 at 01:53:12AM +0100, Markus Ullmann wrote:
> I've talked to a friend of mine recently. He's a FreeBSD dev and he said
> they tried git for their ports tree (which is basically the same as what
> we're talking about) and it was more or less a big pain for multiple
> reasons.
> He said he'd personally take svn after that experience.

What were the reasons he cited?

thanks,

greg k-h



Re: [gentoo-dev] Re: [RFC] Some sync control

2007-01-16 Thread Donnie Berkholz

Greg KH wrote:
> On Wed, Jan 17, 2007 at 01:53:12AM +0100, Markus Ullmann wrote:
> > I've talked to a friend of mine recently. He's a FreeBSD dev and he said
> > they tried git for their ports tree (which is basically the same as what
> > we're talking about) and it was more or less a big pain for multiple
> > reasons.
> >
> > He said he'd personally take svn after that experience.
>
> What were the reasons he cited?


Given that ports is pretty similar to our gentoo-x86, I'd guess about 
the same ones mentioned at 
http://dev.gentoo.org/~antarus/projects/gleps/glep-0666.txt -- I quote 
from there:


I think migration for many would be frustrating and detailed guides
for doing things in GIT would be the norm for quite some time.  GIT
also has some other issues:

1. Git currently requires you to check out the whole repository.
   This includes *all of the history*.

2. Git cannot update portions of the repository, it can only update
   the entire thing.

3. Due to git's choice of packing format (which does save a lot of
   space), the operations are quite CPU intensive.  Either the GIT
   server gets overwhelmed by the raw number of clients using it or
   the slower clients (arm, mips, sparc, hppa...basically anything not
   x86, ppc, ppc64, amd64) get screwed by the raw amount of CPU and
   RAM necessary to unpack a checkout from these packs.

4. git-daemon (and git over ssh) both are very stupid when it comes
   to generating packs for transfer, since often two or three
   fetches can be going on but the packs are not shared between
   fetches.  This only makes the already shaky server performance
   even worse, as the same packs are generated N times instead of
   once.

If GIT gets repository slicing (ability to check out and update
slices of the repository) as well as history slicing (only take the
last six months of history, for example) I think it would be a
better candidate.  The conversion to GIT from CVS was also lengthy
(approximately two weeks) although many projects attempted a switch
this summer and tools have improved in speed.

Note: Both history and repository slicing are in the works for GIT,
but there is no date of completion for them.
