Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On 2014-09-16 at 12:30:59, Alan McKinnon alan.mckin...@gmail.com wrote:
> On 16/09/2014 12:18, hasufell wrote:
>> Ulrich Mueller:
>>> ChangeLogs are aimed at users
>>
>> Did any1 ask them if they care?
>
> I'm a user and I don't care. I use diff. I only go to the Changelog when I can't determine the maintainer's intent from diff and the ebuild content. That happens maybe once a year.

And if you need to do that, it usually means that someone didn't add a useful comment to the ebuild :).

-- 
Best regards,
Michał Górny
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On 2014-09-16 at 10:52:13, W. Trevor King wk...@tremily.us wrote:
>   $ git pull --depth=1
>
> for subsequent syncs. pym/_emerge/actions.py currently hardcodes ‘git pull’ for the latter, and doesn't seem to have any code for the former. On the other hand, it wouldn't be too terrible to force users to shallow their history manually whenever they felt like it.

This isn't a good idea at all :). For git, --depth=1 fetching is the same thing as --depth=1 clone. This way, you refetch everything rather than just getting the update.

Instead, plain 'pull' is more appropriate to just get the new objects. However, we may want to strip the history afterwards to reduce the clone size.

-- 
Best regards,
Michał Górny
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On 2014-09-16 at 10:18:35, hasufell hasuf...@gentoo.org wrote:
> Ulrich Mueller:
>> ChangeLogs are aimed at users
>
> Did any1 ask them if they care?

A bit off-topic, but asking such a question usually makes some developers point out that they are users too and they do care :).

However, in this particular context it could be helpful to do some research. As people have pointed out, users may actually prefer using some git web interface or git clone to get the details.

-- 
Best regards,
Michał Górny
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Wed, 17 Sep 2014, Michał Górny wrote:
> On 2014-09-16 at 10:18:35, hasufell hasuf...@gentoo.org wrote:
>> Ulrich Mueller:
>>> ChangeLogs are aimed at users
>>
>> Did any1 ask them if they care?
>
> A bit off-topic but asking such a question usually makes some developers point out that they are users too and they do care :).
>
> However, in this particular context it could be helpful to do some research. As people have pointed out, users may actually prefer using some git web-interface or git clone to get the details.

There are two questions that need to be answered. The first is whether we want to keep a ChangeLog that is separate from commit messages. The second (and it can only be answered after the first one) is how the information should be transmitted to users. It doesn't make sense to mix the two questions, or to answer the second one before the first.

The original intention back in 2002 was to have a ChangeLog containing information somewhat complementary to commit messages. This can still be found in skel.ChangeLog:

| This changelog is targeted to users. This means that the comments
| should be well explained and written in clean English.

And:

| Any details about what exactly changed in the code should be added
| as a message when the changes are committed to cvs, not in this
| file.

That different messages can be used was acknowledged in the November 2011 council meeting (and yours truly had voted in favour of it):

| The Council agreed that developers are free to use different
| messages for ChangeLog and commit, but they are responsible for the
| messages, and the Council still expects appropriate messages to be
| used.

So the research that needs to be done first is to find out how often our ChangeLog entries differ from the commit log. If it turns out that they are identical in 99 % of all cases, then it obviously makes no sense to maintain the same information in two places, and ChangeLogs should be abandoned. (For my own commits, I would estimate that messages are different for 20 % of commits.)

Only when this has been answered should we discuss how the information should be formatted and how users should obtain it. Some ideas:

- We could have an echangelog replacement (integrated into repoman?) for nice formatting of commit messages.

- If we abandon separately maintained ChangeLog files, then there should be some means for correcting mistakes in commit messages. Maybe git notes could be used?

- There is certainly room for improvement in how news about a package is communicated to users, apart from elog messages and GLEP 42 news items. Maybe readme.gentoo.eclass could be extended to optionally install a NEWS.gentoo file along with README.gentoo?

Ulrich
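The measurement Ulrich asks for could be estimated with a small script once ChangeLog entries have been paired with their commit messages. The sketch below is only an illustration: the tab-separated input file and the function name `count_differing` are assumptions, and producing the pairing from the CVS history is not shown.

```sh
#!/bin/sh
# Hypothetical helper: count how often ChangeLog and commit messages differ.
# Assumed input ($1): one record per line, "ChangeLog message<TAB>commit message".
# Prints: <number differing> <total records> <percentage differing>
count_differing() {
    awk -F '\t' '
        { total++; if ($1 != $2) differ++ }
        END {
            if (total == 0) { print "0 0 0"; exit }
            printf "%d %d %.0f\n", differ, total, 100 * differ / total
        }
    ' "$1"
}
```

A result close to zero would support dropping the separate files; a result like the 20 % estimated above would suggest the two logs carry different information.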
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Wed, Sep 17, 2014 at 5:56 AM, Ulrich Mueller u...@gentoo.org wrote:
> So the research that needs to be done first is to find out how often our ChangeLog entries differ from the commit log. If it turns out that they are identical in 99 % of all cases, then it obviously makes no sense to maintain the same information in two places, and ChangeLogs should be abandoned. (For my own commits, I would estimate that messages are different for 20 % of commits.)

When I do commits the commit message is scripted to be identical to the changelog message. I doubt I have a single commit where they differed, unless I went back to modify a changelog to fix a typo or something. They're all intended to be readable by anybody.

> Only when this has been answered, we should discuss how the information should be formatted and how users should obtain it.

So, this will be on the Council agenda. By all means go out and dig up whatever info you think will be useful for making a decision, but I don't want to put this off hoping that somebody else will do it. I don't think it is essential to determine whether changelog messages are different from commit messages in practice. If a majority of council members disagree we can defer the decision.

> Some ideas:

No objection to any of your ideas per se, but I don't want to make any of them blockers for a git migration. I think getting off of cvs is orthogonal to improving our documentation and communications. There are a million ways we could be spending our effort on Gentoo, but I don't think that making our commit messages more nicely formatted/etc is something worthy of a rule (i.e. something all devs are forced to contribute to). 95% of it is noise, so if there is a message that really needs to get out to users it should go into something like a news item that is distributed BEFORE the change is made.

If documentation improvements are built into a new echangelog-like tool, I think that will greatly help adoption, but again I don't want to hold up the git migration for this. The git migration has been a moving target forever because there has always been just one more thing that needs to be done, and most who have gotten involved have gotten frustrated/bored/whatever and moved on. How many FOSS projects of our scale are still using cvs anyway?

-- 
Rich
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Wed, Sep 17, 2014 at 10:36:45AM +0200, Michał Górny wrote:
> On 2014-09-16 at 10:52:13, W. Trevor King wrote:
>>   $ git pull --depth=1
>>
>> for subsequent syncs. pym/_emerge/actions.py currently hardcodes ‘git pull’ for the latter, and doesn't seem to have any code for the former. On the other hand, it wouldn't be too terrible to force users to shallow their history manually whenever they felt like it.
>
> This isn't a good idea at all :). For git, --depth=1 fetching is the same thing as --depth=1 clone. This way, you refetch everything rather than just getting the update.

You don't refetch everything, but the pull fails because it doesn't know how the original and new shallow commits are related (so it can't (ff-)merge them). It works if you fetch and reset (skipping the merge). Here's my test script:

  #!/bin/sh

  rm -rf upstream local-1 local-2
  mkdir upstream
  (
    cd upstream
    git init
    echo 'Some project' >README
    git add README
    git commit -m 'Start the project'
    for X in 1 2 3 4 5 6 7 8 9
    do
      echo ${X} >${X}
      git add ${X}
    done
    git commit -m 'Add a bunch of dummy files'
  )
  git clone --depth 1 file://${PWD}/upstream local-1
  (
    cd upstream
    echo 'Build with ...' >>README
    git commit -am 'Add building instructions'
    echo 'Test with ...' >>README
    git commit -am 'Add testing instructions'
  )
  (
    cd local-1
    git fetch --depth 1
    git reset --hard origin/master
    git --no-pager log --oneline --decorate
    du -s .
    git reflog expire --expire=now --all
    git gc --aggressive --prune=now
    du -s .
  )
  git clone --depth 1 file://${PWD}/upstream local-2
  (
    cd local-2
    du -s .
  )

and here are some excerpts of its output:

* The shallow fetch only pulls in three objects (the new README, tree, and commit):

    remote: Counting objects: 3, done.
    remote: Compressing objects: 100% (2/2), done.
    remote: Total 3 (delta 1), reused 0 (delta 0)
    Unpacking objects: 100% (3/3), done.
    From file:///tmp/upstream
     + 73f6253...5abbe64 master -> origin/master (forced update)
    HEAD is now at 5abbe64 Add testing instructions

* After the shallow fetch and reset, local-1 has a shallow history:

    5abbe64 (grafted, HEAD, origin/master, origin/HEAD, master) Add testing instructions

* But it still has references to the old master in the reflog, and that takes up some space:

    168 .

* Expiring the reflog and garbage collecting gets us that space back (although in practice, I'd just let Git expire these automatically in the course of time):

    Counting objects: 12, done.
    Delta compression using up to 2 threads.
    Compressing objects: 100% (2/2), done.
    Writing objects: 100% (12/12), done.
    Total 12 (delta 0), reused 9 (delta 0)
    140 .

* A fresh shallow clone gets all 12 objects (not just the three new ones):

    Cloning into 'local-2'...
    remote: Counting objects: 12, done.
    remote: Compressing objects: 100% (2/2), done.
    remote: Total 12 (delta 0), reused 0 (delta 0)
    Receiving objects: 100% (12/12), done.
    Checking connectivity... done.

* And takes up as much space as our garbage-collected local-1:

    140 .

Again, I'm happy to leave it to users to manually

  $ git fetch --depth 1
  $ git reset --hard origin/master

to shorten their history, but I expect many will not bother, and then get annoyed as that unpurged history takes up more and more space ;). In any case, we don't have to resolve this before the transition.

Cheers,
Trevor

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy
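The manual fetch-and-reset step from this message could be wrapped in a small helper. This is a sketch rather than anything Portage provides: the function name `shallow_sync` is hypothetical, and it resets to the branch's configured upstream (`@{upstream}`) instead of hardcoding origin/master. Note that `git reset --hard` throws away local modifications, so it only suits a tree that is used read-only.

```sh
#!/bin/sh
# Hypothetical helper: update an existing clone while keeping its history
# one commit deep, using fetch + reset instead of pull (which would try to
# merge two unrelated shallow commits).
shallow_sync() {
    # $1: directory of an existing clone whose branch tracks an upstream
    git -C "$1" fetch --quiet --depth 1 || return 1
    # Move the branch and working tree to the fetched tip; discards local changes.
    git -C "$1" reset --quiet --hard '@{upstream}'
}
```

The old objects stay reachable from the reflog until they expire, so the disk usage only drops after garbage collection, as the `du` figures above show.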
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Mon, 15 Sep 2014, Rich Freeman wrote: I'll add this to the next Council agenda. I think this is ripe for discussion. The last discussion of this really wasn't aimed at git anyway. Some of the arguments back then were that a) ChangeLogs are aimed at users, so they don't necessarily contain the same information as the commit log (i.e. could be seen as NEWS files) and b) that it should be possible to edit them, for example, to correct typos or wrong bug# references. I fail to see how anything of this would depend on the underlying VCS used. Ulrich pgpVnJdUsiCfH.pgp Description: PGP signature
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
Ulrich Mueller: ChangeLogs are aimed at users Did any1 ask them if they care?
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On 16/09/2014 12:18, hasufell wrote: Ulrich Mueller: ChangeLogs are aimed at users Did any1 ask them if they care? I'm a user and I don't care. I use diff. I only go to the Changelog when I can't determine the maintainers intent from diff and the ebuild content. That happens maybe once a year. -- Alan McKinnon alan.mckin...@gmail.com
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Tue, Sep 16, 2014 at 6:18 AM, hasufell hasuf...@gentoo.org wrote: Ulrich Mueller: ChangeLogs are aimed at users Did any1 ask them if they care? I'm sure somebody will reply and say that they care. It still seems like a lot of overhead to me for a very one-off workflow. Maybe if portage automatically output the relevant changelog entries in pretend mode we could pretend that they're news or something like that. Most likely, if you stick something important in the changelog it will be read by maybe 0.1% of our users before emerging the package. Maybe if you're lucky 20% of people running into some kind of breakage will read the changelog after the fact. I imagine that 19.5% of those 20% would check the git log if the changelog didn't exist. If we actually move to a model where many users actually sync their trees from git, then I'd expect the changelogs to be even less useful. After all, git will actually tell you what changed since your last sync. -- Rich
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
Rich Freeman: On Tue, Sep 16, 2014 at 6:18 AM, hasufell hasuf...@gentoo.org wrote: Ulrich Mueller: ChangeLogs are aimed at users Did any1 ask them if they care? I'm sure somebody will reply and say that they care. It still seems like a lot of overhead to me for a very one-off workflow. Maybe if portage automatically output the relevant changelog entries in pretend mode we could pretend that they're news or something like that. Most likely, if you stick something important in the changelog it will be read by maybe 0.1% of our users before emerging the package. Maybe if you're lucky 20% of people running into some kind of breakage will read the changelog after the fact. I imagine that 19.5% of those 20% would check the git log if the changelog didn't exist. If we actually move to a model where many users actually sync their trees from git, then I'd expect the changelogs to be even less useful. After all, git will actually tell you what changed since your last sync. And git allows you to _properly_ check for changes, because all changes are in one history, so you don't have to grep 3+ ChangeLogs (e.g. in eclasses, profiles and licenses) in order to know what happened. Even easier... related changes might just go in one commit and when you look for it you'll also see the other files that have been modified as part of a version bump (e.g. a useflag mask or whatever). The only place I actually look for changes is the gentoo-commits ML which is kind of the poor version of a git history.
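As a rough illustration of "git will tell you what changed since your last sync", a sync wrapper could remember the old HEAD and list the new commits touching one package. This is only a sketch: the function name `sync_and_show` and the example path are made up, and `--full-diff` is used so that the other files changed in the same commit (a use flag mask, for instance) are listed as well.

```sh
#!/bin/sh
# Hypothetical helper: pull, then list commits since the previous sync that
# touch a given path, together with every file changed in those commits.
sync_and_show() {
    # $1: repository directory, $2: path of interest, e.g. app-editors/vim
    old=$(git -C "$1" rev-parse HEAD) || return 1
    git -C "$1" pull --quiet --ff-only || return 1
    git -C "$1" --no-pager log --oneline --name-only --full-diff \
        "$old..HEAD" -- "$2"
}
```

Because everything lives in one history, a single call covers changes that would otherwise be spread over several ChangeLog files.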
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Tue, 16-09-2014 at 07:26 -0400, Rich Freeman wrote:
> On Tue, Sep 16, 2014 at 6:18 AM, hasufell hasuf...@gentoo.org wrote:
>> Ulrich Mueller:
>>> ChangeLogs are aimed at users
>>
>> Did any1 ask them if they care?
>
> I'm sure somebody will reply and say that they care. It still seems like a lot of overhead to me for a very one-off workflow. Maybe if portage automatically output the relevant changelog entries in pretend mode we could pretend that they're news or something like that.
>
> Most likely, if you stick something important in the changelog it will be read by maybe 0.1% of our users before emerging the package. Maybe if you're lucky 20% of people running into some kind of breakage will read the changelog after the fact. I imagine that 19.5% of those 20% would check the git log if the changelog didn't exist.
>
> If we actually move to a model where many users actually sync their trees from git, then I'd expect the changelogs to be even less useful. After all, git will actually tell you what changed since your last sync.
>
> --
> Rich

Maybe one option would be to kill ChangeLogs and provide a script that lets people fetch the git messages and reformat them in a way similar to the current ChangeLog files. That way people would still be able to save this information for the future (for example, if they won't have an internet connection later) and read it simply with less.

With this option, we wouldn't need to provide and distribute ChangeLogs, but people wanting them would still be able to generate them (for example, just after updating the portage tree).
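A script of the kind Pacho describes could be little more than a `git log` format string. The following is a sketch under assumptions: the function name `git_changelog` is invented, and the layout only approximates the existing ChangeLog style (date, author, then the commit subject) rather than reproducing it exactly.

```sh
#!/bin/sh
# Hypothetical helper: print a ChangeLog-like listing for one path from the
# git history, newest entry first.
git_changelog() {
    # $1: repository directory, $2: path inside it, e.g. a package directory
    git -C "$1" --no-pager log --date=short \
        --format='  %ad; %an <%ae>:%n  %s%n' -- "$2"
}
```

Redirecting the output to a file gives something that can be kept for offline use and read with less.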
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Tue, Sep 16, 2014 at 9:44 AM, Pacho Ramos pa...@gentoo.org wrote: Maybe one option would be to kill Changelogs and provide a script to let people get git messages and reformat them in a way similar as current ChangeLog files, that way people will still be able to save this information for the future (if they won't have internet conection later for example) and read it simply with less for example. With this option, we won't need to provide Changelogs and distribute them but people wanting to have them will still be able to generate them if wanted (for example, just after updating portage tree) Or they could just clone the git tree, and they can look at per-file logs anytime they want to. I mean, sure, we COULD do this stuff. But, why? It isn't like kernel.org has some tool that lets kernel users generate per-file changelog histories just in case they don't want to use git. If somebody wants to build a tool like this by all means go ahead and do it. I just don't see it as something that should be a migration pre-requisite. That's just my opinion though. -- Rich
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
El mar, 16-09-2014 a las 09:55 -0400, Rich Freeman escribió: On Tue, Sep 16, 2014 at 9:44 AM, Pacho Ramos pa...@gentoo.org wrote: Maybe one option would be to kill Changelogs and provide a script to let people get git messages and reformat them in a way similar as current ChangeLog files, that way people will still be able to save this information for the future (if they won't have internet conection later for example) and read it simply with less for example. With this option, we won't need to provide Changelogs and distribute them but people wanting to have them will still be able to generate them if wanted (for example, just after updating portage tree) Or they could just clone the git tree, and they can look at per-file logs anytime they want to. I mean, sure, we COULD do this stuff. But, why? It isn't like kernel.org has some tool that lets kernel users generate per-file changelog histories just in case they don't want to use git. If somebody wants to build a tool like this by all means go ahead and do it. I just don't see it as something that should be a migration pre-requisite. That's just my opinion though. -- Rich I don't consider it a pre-requisite either, was only trying to give an option to still tell people how to get a ChangeLog similar to current ones easily (as looks like they are used a lot per the past discussion :/) I remember something similar was done in the past when gnome stuff moved to git: https://wiki.gnome.org/Git/ChangeLog But I guess once we get habituated to simply review something equivalent to https://git.gnome.org/browse/ not many people will miss the old Changelogs ;)
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Tue, Sep 16, 2014 at 05:35:08PM +, Duncan wrote: W. Trevor King posted on Mon, 15 Sep 2014 13:33:46 -0700 as excerpted: On Mon, Sep 15, 2014 at 01:29:44PM -0700, W. Trevor King wrote: I don't see any benefit to using rsync vs. a shallow clone as the transmission protocol. Other than the fact that before you dropped it you'd need to push a ‘emerge sync’ that could handle either rsync or Git, stabilize that Portage, and then wait for folks to adopt it. Portage already handles it. =:^) Oh, lovely :). Looks like that landed in 2.2.0 with 47e8d22d (Add support for multiple repositories in `emerge --sync`, 2013-07-23). There are older Portages in the tree though (back to 2.1.6.7_p1), so you'd still want to wait until those were gone before dropping rsync. Also, I don't see a way to say “use Git to sync, but keep a shallow repository”. Ideally, we'd want: $ git clone --depth=1 git://git.gentoo.org/gentoo-portage.git for the initial clone (modulo whatever URI), and: $ git pull --depth=1 for subsequent syncs. pym/_emerge/actions.py currently hardcodes ‘git pull’ for the latter, and doesn't seem to have any code for the former. On the other hand, it wouldn't be too terrible to force users to shallow their history manually whenever they felt like it. Cheers, Trevor -- This email may be signed or encrypted with GnuPG (http://www.gnupg.org). For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy signature.asc Description: OpenPGP digital signature
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Tue, Sep 16, 2014 at 10:52:13AM -0700, W. Trevor King wrote: Oh, lovely :). Looks like that landed in 2.2.0 with 47e8d22d (Add support for multiple repositories in `emerge --sync`, 2013-07-23). Actually, ‘git pull’ support in one form or another dates back to ba797c11 (Add --sync support for `git pull`, 2008-12-11), which landed in v2.2_rc18. There are older Portages in the tree though (back to 2.1.6.7_p1), so you'd still want to wait until those were gone before dropping rsync. The ‘git pull’ support was also backported to the 2.1.6.7_p1 series with d3c42937 (Add --sync support for `git pull`, 2008-12-11), which landed in v2.1.6.1, so I doubt any Portage users lack pull support. I'm not sure about folks using other package managers though. Cheers, Trevor -- This email may be signed or encrypted with GnuPG (http://www.gnupg.org). For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy signature.asc Description: OpenPGP digital signature
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Tue, 16 Sep 2014 10:52:13 -0700, W. Trevor King wk...@tremily.us wrote:
> On Tue, Sep 16, 2014 at 05:35:08PM +, Duncan wrote:
>> W. Trevor King posted on Mon, 15 Sep 2014 13:33:46 -0700 as excerpted:
>>> On Mon, Sep 15, 2014 at 01:29:44PM -0700, W. Trevor King wrote:
>>>> I don't see any benefit to using rsync vs. a shallow clone as the transmission protocol.
>>>
>>> Other than the fact that before you dropped it you'd need to push a ‘emerge sync’ that could handle either rsync or Git, stabilize that Portage, and then wait for folks to adopt it.
>>
>> Portage already handles it. =:^)
>
> Oh, lovely :). Looks like that landed in 2.2.0 with 47e8d22d (Add support for multiple repositories in `emerge --sync`, 2013-07-23). There are older Portages in the tree though (back to 2.1.6.7_p1), so you'd still want to wait until those were gone before dropping rsync. Also, I don't see a way to say “use Git to sync, but keep a shallow repository”. Ideally, we'd want:
>
>   $ git clone --depth=1 git://git.gentoo.org/gentoo-portage.git
>
> for the initial clone (modulo whatever URI), and:
>
>   $ git pull --depth=1
>
> for subsequent syncs. pym/_emerge/actions.py currently hardcodes ‘git pull’ for the latter, and doesn't seem to have any code for the former. On the other hand, it wouldn't be too terrible to force users to shallow their history manually whenever they felt like it.
>
> Cheers,
> Trevor

The depth option will be added to the new portage plugin-sync system, which is now in the final stages of development. There will be two new repos.conf options:

- one for new repo installs, e.g. for git clone: --depth=1
- another for sync options, e.g. for git pull: --rebase origin master

The latter will allow changes to a repo in a different branch to be updated from the master branch. They will be repo-specific options, not global.

The new system will allow emerge-webrsync type repos to be synced via emerge --sync instead of the emerge-webrsync command. It will also add an svn type and support for a layman type, which layman already has code for. Layman is just waiting for the new sync system to land in portage's master branch before enabling it to be installed.

It will also allow third-party sync modules to be created and easily installed. So a squashfs sync type could be created and installed for those repos where a squashfs tree is offered.

-- 
Brian Dolbec dolsen
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
Il 16/09/2014 20:02, Duncan ha scritto: Rich Freeman posted on Tue, 16 Sep 2014 09:55:31 -0400 as excerpted: Or they could just clone the git tree, and they can look at per-file logs anytime they want to. Give me ro access to a current git repo and I'll *VERY* happily leave changelogs to history along with 8-track tapes and 5.25-inch floppies! =:^) I was strongly in favor of keeping changelogs (and mandating proper add/ change/deletion entries) the last time the topic came up, but that was in the context of (web)?rsync being the only viable user sync method and thus changelogs being the only user-local-accessible record. With user- git-repo access, I'll /very/ (very, very, very...) happily leave rsync behind for git, and changelogs along with it! =:^) yes, this probably is the same for everyone, and if it's not it should be anyway.
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On 09/16/2014 05:18 AM, hasufell wrote:
> Ulrich Mueller:
>> ChangeLogs are aimed at users
>
> Did any1 ask them if they care?

If the tree switches to git and there's an option within Portage/emerge to fetch via git instead of rsync, then I'd rather rely on `git log` than a bunch of scattered files.
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On 14-09-2014 16:56:24 +0200, Michał Górny wrote: Rich Freeman ri...@gentoo.org napisał(a): So, I don't really have a problem with your design. I still question whether we still need to be generating changelogs - they seem incredibly redundant. But, if people really want a redundant copy of the git log, whatever... I don't want them too. However, I'm pretty sure people will bikeshed this to death if we kill them... Especially that rsync has no git log. Not that many users make real use of ChangeLogs, esp. considering how useless messages often are there... Council had some discussions on this topic: http://www.gentoo.org/proj/en/council/meeting-logs/2008-summary.txt http://www.gentoo.org/proj/en/council/meeting-logs/20111011-summary.txt Conclusion back then was that ChangeLog files need to stay. -- Fabian Groffen Gentoo on a different level signature.asc Description: Digital signature
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Mon, Sep 15, 2014 at 09:53:43AM +0200, Fabian Groffen wrote: On 14-09-2014 16:56:24 +0200, Michał Górny wrote: Rich Freeman ri...@gentoo.org napisał(a): So, I don't really have a problem with your design. I still question whether we still need to be generating changelogs - they seem incredibly redundant. But, if people really want a redundant copy of the git log, whatever... I don't want them too. However, I'm pretty sure people will bikeshed this to death if we kill them... Especially that rsync has no git log. Not that many users make real use of ChangeLogs, esp. considering how useless messages often are there... Council had some discussions on this topic: http://www.gentoo.org/proj/en/council/meeting-logs/2008-summary.txt http://www.gentoo.org/proj/en/council/meeting-logs/20111011-summary.txt Conclusion back then was that ChangeLog files need to stay. I would have no problem with the council revisiting/changing this. I tend to agree that the ChangeLogs in the portage tree will be obsoleted when we switch to git because git's logging facilities are much easier to use than those in CVS. Not to mention how much smaller the portage tree would be without ChangeLogs. William signature.asc Description: Digital signature
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On 09/15/14 15:30, William Hubbs wrote: On Mon, Sep 15, 2014 at 09:53:43AM +0200, Fabian Groffen wrote: On 14-09-2014 16:56:24 +0200, Michał Górny wrote: Rich Freeman ri...@gentoo.org napisał(a): So, I don't really have a problem with your design. I still question whether we still need to be generating changelogs - they seem incredibly redundant. But, if people really want a redundant copy of the git log, whatever... I don't want them too. However, I'm pretty sure people will bikeshed this to death if we kill them... Especially that rsync has no git log. Not that many users make real use of ChangeLogs, esp. considering how useless messages often are there... Council had some discussions on this topic: http://www.gentoo.org/proj/en/council/meeting-logs/2008-summary.txt http://www.gentoo.org/proj/en/council/meeting-logs/20111011-summary.txt Conclusion back then was that ChangeLog files need to stay. I would have no problem with the council revisiting/changing this. I tend to agree that the ChangeLogs in the portage tree will be obsoleted when we switch to git because git's logging facilities are much easier to use than those in CVS. Not to mention how much smaller the portage tree would be without ChangeLogs. William If the argument is that there are no Changelogs in rsync, then let's write git hooks to generate them when the repository is mirrored to the rsync host. The only problem I see is with this is then adding ChangeLog to the manifest and gpg signing it which has to be done at the developer's side. But, I think the tree that users get from rsync should have the logs. Having *both* a ChangeLog file and git log is redundant. -- Anthony G. Basile, Ph.D. Gentoo Linux Developer [Hardened] E-Mail: bluen...@gentoo.org GnuPG FP : 1FED FAD9 D82C 52A5 3BAB DC79 9384 FA6E F52D 4BBA GnuPG ID : F52D4BBA
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Mon, Sep 15, 2014 at 3:55 PM, Anthony G. Basile bluen...@gentoo.org wrote: On 09/15/14 15:30, William Hubbs wrote: I would have no problem with the council revisiting/changing this. I tend to agree that the ChangeLogs in the portage tree will be obsoleted when we switch to git because git's logging facilities are much easier to use than those in CVS. Not to mention how much smaller the portage tree would be without ChangeLogs. William If the argument is that there are no Changelogs in rsync, then let's write git hooks to generate them when the repository is mirrored to the rsync host. The only problem I see is with this is then adding ChangeLog to the manifest and gpg signing it which has to be done at the developer's side. But, I think the tree that users get from rsync should have the logs. Having *both* a ChangeLog file and git log is redundant. I'll add this to the next Council agenda. I think this is ripe for discussion. The last discussion of this really wasn't aimed at git anyway. -- Rich
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
Dnia 2014-09-15, o godz. 15:55:35 Anthony G. Basile bluen...@gentoo.org napisał(a): On 09/15/14 15:30, William Hubbs wrote: On Mon, Sep 15, 2014 at 09:53:43AM +0200, Fabian Groffen wrote: On 14-09-2014 16:56:24 +0200, Michał Górny wrote: Rich Freeman ri...@gentoo.org napisał(a): So, I don't really have a problem with your design. I still question whether we still need to be generating changelogs - they seem incredibly redundant. But, if people really want a redundant copy of the git log, whatever... I don't want them too. However, I'm pretty sure people will bikeshed this to death if we kill them... Especially that rsync has no git log. Not that many users make real use of ChangeLogs, esp. considering how useless messages often are there... Council had some discussions on this topic: http://www.gentoo.org/proj/en/council/meeting-logs/2008-summary.txt http://www.gentoo.org/proj/en/council/meeting-logs/20111011-summary.txt Conclusion back then was that ChangeLog files need to stay. I would have no problem with the council revisiting/changing this. I tend to agree that the ChangeLogs in the portage tree will be obsoleted when we switch to git because git's logging facilities are much easier to use than those in CVS. Not to mention how much smaller the portage tree would be without ChangeLogs. If the argument is that there are no Changelogs in rsync, then let's write git hooks to generate them when the repository is mirrored to the rsync host. The only problem I see is with this is then adding ChangeLog to the manifest and gpg signing it which has to be done at the developer's side. But, I think the tree that users get from rsync should have the logs. Having *both* a ChangeLog file and git log is redundant. Can't we just kill rsync then? The whole ChangeLog seems to take more effort than the actual benefit it gives. -- Best regards, Michał Górny signature.asc Description: PGP signature
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Mon, Sep 15, 2014 at 10:18:39PM +0200, Michał Górny wrote:
> On 2014-09-15, at 15:55:35, Anthony G. Basile wrote:
>> If the argument is that there are no ChangeLogs in rsync, then let's write git hooks to generate them when the repository is mirrored to the rsync host. The only problem I see with this is then adding ChangeLog to the manifest and gpg signing it, which has to be done on the developer's side. But, I think the tree that users get from rsync should have the logs. Having *both* a ChangeLog file and git log is redundant.
>
> Can't we just kill rsync then? The whole ChangeLog seems to take more effort than the actual benefit it gives.

I'm +1 for killing rsync and having everyone use Git. With shallow (--depth) clones for folks who don't care about the history, and deep clones for those who do (and you can change your mind both ways), I think everyone gets what they want without messing around with a Git → rsync conversion layer. Of course, it would be nice if the CVS → Git migration added any ChangeLog notes to the associated commit message to avoid losing information, but I imagine it would be hard to automate that and still get readable commit messages ;). I don't see any benefit to using rsync vs. a shallow clone as the transmission protocol.

Cheers, Trevor

--
This email may be signed or encrypted with GnuPG (http://www.gnupg.org). For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy
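To make the "shallow or deep, and you can change your mind" point concrete, here is a minimal sketch of the git invocations involved, written as Python helpers that only build the argument lists. The helper names are made up for illustration and are not Portage code. The flags themselves (`clone --depth`, `fetch --unshallow`, `fetch --depth`) are standard git options.

```python
from typing import List


def shallow_clone_cmd(url: str, dest: str, depth: int = 1) -> List[str]:
    """Initial sync for users who do not want history."""
    return ["git", "clone", f"--depth={depth}", url, dest]


def unshallow_cmd(repo: str) -> List[str]:
    """Turn an existing shallow clone into a full one."""
    return ["git", "-C", repo, "fetch", "--unshallow"]


def set_depth_cmd(repo: str, depth: int = 1) -> List[str]:
    """Deepen or shorten the history of a shallow clone to `depth` commits.

    Objects that are no longer reachable stay on disk until git prunes them.
    """
    return ["git", "-C", repo, "fetch", f"--depth={depth}"]
```

Each list can be handed to `subprocess.run(cmd, check=True)`. The repository URL and destination are whatever the sync configuration would supply.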
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Mon, Sep 15, 2014 at 01:29:44PM -0700, W. Trevor King wrote:
> I don't see any benefit to using rsync vs. a shallow clone as the transmission protocol.

Other than the fact that before you dropped it you'd need to push an ‘emerge --sync’ that could handle either rsync or Git, stabilize that Portage, and then wait for folks to adopt it. That's going to slow down your migration a bit ;). I think an rsync-able version is a better choice for the migration, but since it's not destined to live long (in my view), I don't think it really matters what goes into it.

Cheers, Trevor
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Mon, Sep 15, 2014 at 4:18 PM, Michał Górny mgo...@gentoo.org wrote:
> Can't we just kill rsync then? The whole ChangeLog seems to take more effort than the actual benefit it gives.

I'm not sure ditching rsync entirely is necessary - it might be more trouble than it is worth as it is a very effective simple way to distribute the tree. However, I'm not really opposed to it either.

However, I do really question whether we need changelogs in rsync. It seems like many projects are going away from these - or doing what the kernel is doing and just dumping a git log into them.

I don't think we need to try to shoehorn the old changelogs into our git history - I'd just leave them in the tree for migration and then prune them post-migration.

Oh, in case it is useful to know, a full historical git bundle is about 1.2GB, and a clone+checkout of the bundle uses about 2.1GB of space. A compressed cvs tarball with the full history is about 575MB in comparison, though I see it has grown by about 50MB in the last six months. Bottom line is that non-shallow checkouts will need a decent amount of space. Then again, my tmpfs /usr/portage uses 735M just by itself.

--
Rich
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
Rich Freeman:
> I'm not sure ditching rsync entirely is necessary - it might be more trouble than it is worth as it is a very effective simple way to distribute the tree. However, I'm not really opposed to it either.

The few people I personally know who use Gentoo never use rsync for syncing, because I told them to never use it, unless they want random backdoors in their tree.
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On 09/15/14 16:49, Rich Freeman wrote:
> On Mon, Sep 15, 2014 at 4:18 PM, Michał Górny mgo...@gentoo.org wrote:
>> Can't we just kill rsync then? The whole ChangeLog seems to take more effort than the actual benefit it gives.
>
> I'm not sure ditching rsync entirely is necessary - it might be more trouble than it is worth as it is a very effective simple way to distribute the tree. However, I'm not really opposed to it either.

I can live with git only, but I'm not sure what would happen if we tried this. There are lots of users and scripts out there that assume rsync. That's one cold shower.

> However, I do really question whether we need changelogs in rsync. It seems like many projects are going away from these - or doing what the kernel is doing and just dumping a git log into them. I don't think we need to try to shoehorn the old changelogs into our git history - I'd just leave them in the tree for migration and then prune them post-migration.

We could just push out the word that ChangeLogs are going away and that users have to read the git repo instead. That might be the easiest solution. I do have users that quote my ChangeLogs though.

> Oh, in case it is useful to know, a full historical git bundle is about 1.2GB, and a clone+checkout of the bundle uses about 2.1GB of space. A compressed cvs tarball with the full history is about 575MB in comparison, though I see it has grown by about 50MB in the last six months. Bottom line is that non-shallow checkouts will need a decent amount of space. Then again, my tmpfs /usr/portage uses 735M just by itself.
>
> --
> Rich

--
Anthony G. Basile, Ph.D.
Gentoo Linux Developer [Hardened]
E-Mail: bluen...@gentoo.org
GnuPG FP : 1FED FAD9 D82C 52A5 3BAB DC79 9384 FA6E F52D 4BBA
GnuPG ID : F52D4BBA
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On 15-09-2014 15:58:00 -0400, Rich Freeman wrote:
>> If the argument is that there are no ChangeLogs in rsync, then let's write git hooks to generate them when the repository is mirrored to the rsync host. The only problem I see with this is then adding ChangeLog to the manifest and gpg signing it, which has to be done on the developer's side. But, I think the tree that users get from rsync should have the logs. Having *both* a ChangeLog file and git log is redundant.
>
> I'll add this to the next Council agenda. I think this is ripe for discussion. The last discussion of this really wasn't aimed at git anyway.

Not sure if you've read the discussions that were held on this topic. The Council decided (with git and auto-generation of ChangeLogs in mind) that it must remain possible to amend, update or change ChangeLogs, for instance to fix misleading typos, and more. ChangeLogs are meant for our users. For this reason repoman was changed to update the ChangeLog automatically on commit, if no changes to this file had been made.

--
Fabian Groffen
Gentoo on a different level
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Sunday, 14 September 2014, 15:17:41, Ulrich Mueller wrote:
> On Sun, 14 Sep 2014, Michał Górny wrote:
>> I think we should also merge gentoo-news, glsa and herds.xml into the repository. They all reference Gentoo packages at a particular state in time, and it would be much nicer to have them synced properly.
>
> Not a good idea, because we may want to grant commit access to these repos for people who are not necessarily ebuild devs.
>
> Ulrich

This could be solved by a pull request review tool (gerrit, reviewboard, gitlab etc).

--
Johannes Huber (johu)
Gentoo Linux Developer / KDE Team
GPG Key ID F3CFD2BD
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Sun, 14 Sep 2014, Johannes Huber wrote:
> On Sunday, 14 September 2014, 15:17:41, Ulrich Mueller wrote:
>> On Sun, 14 Sep 2014, Michał Górny wrote:
>>> I think we should also merge gentoo-news, glsa and herds.xml into the repository. They all reference Gentoo packages at a particular state in time, and it would be much nicer to have them synced properly.
>>
>> Not a good idea, because we may want to grant commit access to these repos for people who are not necessarily ebuild devs.
>
> This could be solved by a pull request review tool (gerrit, reviewboard, gitlab etc).

Second argument is that gentoo-x86 is large enough as it is, and we shouldn't make it even larger by merging in things that are not strictly necessary. Especially glsa has a non-negligible size.

Ulrich
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On 2014-09-14, at 15:17:41, Ulrich Mueller u...@gentoo.org wrote:
> On Sun, 14 Sep 2014, Michał Górny wrote:
>> I think we should also merge gentoo-news, glsa and herds.xml into the repository. They all reference Gentoo packages at a particular state in time, and it would be much nicer to have them synced properly.
>
> Not a good idea, because we may want to grant commit access to these repos for people who are not necessarily ebuild devs.

We may want to give those people access to metadata.xml too. If you really are that distrustful of our contributors, I believe we can do per-path filtering in the 'update' hook, or use a pull-request or intermediate-repository based workflow.

--
Best regards, Michał Górny
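The per-path filtering mentioned above could be sketched as below. This is only the policy check, under loudly stated assumptions: the `ACL` table, the account names and the directory prefixes are hypothetical, and nothing here reflects an actual infra setup. A server-side `update` hook would obtain the changed paths from something like `git diff --name-only <oldrev> <newrev>` and reject the push by exiting non-zero if the returned list is not empty.

```python
from typing import Dict, Iterable, List

# Hypothetical policy: account name -> path prefixes that account may touch.
ACL: Dict[str, List[str]] = {
    "news-editor": ["metadata/news/"],
    "security-editor": ["metadata/glsa/"],
    "ebuild-dev": [""],  # the empty prefix matches every path
}


def rejected_paths(user: str, changed: Iterable[str],
                   acl: Dict[str, List[str]] = ACL) -> List[str]:
    """Return the changed paths that `user` is not allowed to modify."""
    allowed = acl.get(user, [])  # unknown accounts may touch nothing
    return [path for path in changed
            if not any(path.startswith(prefix) for prefix in allowed)]
```

Keeping the check as a pure function over path lists means it can be tested without a repository, and the same table could later back a pull-request based workflow.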
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On 2014-09-14, at 10:33:03, Rich Freeman ri...@gentoo.org wrote:
>> Of course, that assumes infra is going to cooperate quickly or someone else is willing to provide the infra for it.
>
> The infra components to a git infrastructure are one of the main blockers at this point. I don't really see cooperation as the issue - just lack of manpower or interest.

By 'cooperating' I simply meant offering the necessary resources in a reasonable time.

>> 1. send announcement to devs to explain how to use git,
>
> This is one of the blockers. We haven't actually decided how we want to use git. Sure, everybody knows how to use git. The problem is that there are a dozen different ways we COULD use git, and nobody has picked the ONE way we WILL use it.
>
> This isn't as trivial as you might think. We have a fairly high commit rate and with a single repository that means that in-between a pull-merge/rebase-push there is a decent chance of another commit that will make the resulting push a non-fast-forward.
>
> People love to point out linux and its insane commit rate. The thing is, the mainline git repo with all those commits has exactly one committer - Linus himself. They don't have one big repo with one master branch that everybody pushes to. At least, that is my understanding (and there are certainly others here who are more involved with kernel development).

It's hard to talk about commit rate when we combine crippled CVS with awfully stupid two-part repoman committing. This forces us to commit everything immediately, and makes some of us stop committing anything at all... With git, we can finally do stuff like preparing everything and pushing in one go. Rebasing or merging will be much easier then, since the effective push rate will be smaller than the current commit rate.

>> On top of the user sync repo, rsync is propagated. The rsync tree is populated with all old ChangeLogs copied from CVS (stored in a 30M git repo), new ChangeLogs are generated from git logs and Manifests are expanded.
>
> So, I don't really have a problem with your design. I still question whether we still need to be generating changelogs - they seem incredibly redundant. But, if people really want a redundant copy of the git log, whatever...

I don't want them either. However, I'm pretty sure people will bikeshed this to death if we kill them... Especially since rsync has no git log. Not that many users make real use of ChangeLogs, esp. considering how useless the messages there often are...

>> Main developer repo
>> -------------------
>>
>> I was able to create a start git repository that takes around 66M as a git pack (this is how much you will have to fetch to start working with it). The repository is stripped clean of history and ChangeLogs, and has thin Manifests only.
>>
>> This means we don't have to wait till someone figures out the perfect way of converting the old CVS repository. You don't need that history most of the time, and you can play with CVS to get it if you really do. In any case, we would likely strip the history anyway to get a small repo to work with.
>
> We already have a migration process that converts the old CVS repository, generating both a shallow repository that lacks history and a full repository that contains all of history. Additionally, these two are consistent - that is the last branch of the full repository has the same commit ID as the base of the shallow repository. Basically we generate the full history and then trim out 99% of it so that the commit in the shallow repository points to a parent that isn't in the packed repository.
>
> Actually doing the conversion is basically a solved problem. If this were actually the blocker I'd be all for just sticking the history in a different repo and starting from scratch with a new one.

Was the resulting tree actually verified? How long does the conversion take? Can it be incremental, i.e. convert most of it, lock CVS, convert the remaining new commits?

>> I think we should also merge gentoo-news, glsa and herds.xml into the repository. They all reference Gentoo packages at a particular state in time, and it would be much nicer to have them synced properly.
>
> I can see the pros/cons here, but I don't personally have an issue with merging them. As has been brought up elsewhere herds.xml may just go away.
>
> If somebody can come up with a set of hooks/scripts that will create the various trees and the only thing that is left is to get infra to host them, I think we can make real progress. I don't think this is something that needs to take a long time. The pieces are mostly there - they just have to be assembled.

Are you willing to champion that, then? :)

--
Best regards, Michał Górny
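The consistency property quoted above (the shallow repository's base commit keeps the same ID even though its parent is not in the pack) follows from how git names objects: a commit ID is the SHA-1 of the commit text, and that text records the parent only by ID. A small sketch of this, with a made-up identity, timestamp and messages as placeholders:

```python
import hashlib
from typing import List

# Placeholder identity and timestamp, purely for illustration.
WHO = "A. Developer <dev@example.org> 1410700000 +0000"


def git_object_id(kind: str, body: bytes) -> str:
    """Object ID as git computes it: SHA-1 over '<kind> <size>\\0' plus the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: List[str], message: str) -> bytes:
    """Text of a commit object; parents appear only as hex IDs."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {WHO}", f"committer {WHO}", "", message]
    return "\n".join(lines).encode()


EMPTY_TREE = git_object_id("tree", b"")  # git's well-known empty tree
last_cvs = git_object_id("commit", commit_body(EMPTY_TREE, [], "last CVS-era commit\n"))
base = git_object_id("commit", commit_body(EMPTY_TREE, [last_cvs], "base of the shallow repo\n"))
```

Computing `base` needs only the string `last_cvs`, never the parent object itself, so trimming the older history out of the pack leaves the base commit's ID unchanged.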
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Sun, Sep 14, 2014 at 10:56 AM, Michał Górny mgo...@gentoo.org wrote:
> On 2014-09-14, at 10:33:03
>
> With git, we can finally do stuff like preparing everything and pushing in one go. Rebasing or merging will be much easier then, since the effective push rate will be smaller than current commit rate.

While I agree that the ability to consolidate commits will definitely help with the commit rate, I'm not sure it will make a big difference. It will turn a kde stablereq from 300 commits into 1, and do the same for things like package moves and such. However, I suspect that the vast majority of our commits are things like bumps on individual packages that will always be individual commits. Maybe insofar as one person does a bunch of them they can be pushed at the same time, but...

Looking at https://github.com/rich0/gentoo-gitmig-2014-02-21 it seems like we get about 150 commits/day on busy days. I suspect that isn't evenly distributed, but you may be right that it will just work out.

>> Actually doing the conversion is basically a solved problem. If this were actually the blocker I'd be all for just sticking the history in a different repo and starting from scratch with a new one.
>
> Was the resulting tree actually verified? How long does the conversion take? Can it be incremental, i.e. convert most of it, lock CVS, convert the remaining new commits?

The tree has been verified. The verification approaches so far are neither 100% thorough nor realtime in operation. However, I think we have a working migration process and I don't really see the need to do a double-check at the time of the actual migration.

ferringb was able to do conversions in about 20min with a decent SSD and a 32-core system. His migration scripts can migrate categories in parallel. I haven't personally tried to run them myself, but I believe robbat2 and patrick have experimented with them.

If there is revived interest I can see if I can set them up to run in a chroot with some documentation so that anybody can run it and satisfy themselves that it works, assuming somebody else doesn't have such a chroot ready to go. If finding a host to run it on is a problem I'm sure we could get the Trustees to spring for some time on EC2 or whatever. There is no reason that this couldn't be as simple as extracting a tarball, bind-mounting a cvs repo inside, and firing off the scripts.

I do not believe it can be made to be incremental. But, the runtime should be in keeping with your hour-or-two of downtime suggestion. I suspect a fair bit of the downtime will be taken just to transfer the copy of the cvsroot to the migration server, and transfer the resulting git tree to wherever it needs to go and get all the back-end scripts running/etc.

> Are you willing to champion that, then? :)

Well, I'm in, for what it matters. I don't have root on any infra boxes if that is what you're looking for. :)

--
Rich
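As a rough back-of-the-envelope check on the non-fast-forward worry, one can model pushes as a Poisson process and ask how likely it is that someone else pushes during one developer's pull-rebase-push window. This is only a sketch: it assumes one push per commit and an even spread over the day, which the message above already suspects is not the case, and the function name is made up.

```python
import math


def p_conflicting_push(commits_per_day: float, window_seconds: float) -> float:
    """Probability that at least one other push lands within the window,
    assuming pushes arrive as a Poisson process at a constant rate."""
    rate_per_second = commits_per_day / 86400.0
    return 1.0 - math.exp(-rate_per_second * window_seconds)
```

With 150 commits/day and a one-minute window, `p_conflicting_push(150, 60)` comes out at roughly 10%. Bursty commit patterns would push that up at busy hours, while consolidating commits into fewer pushes would bring it down.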
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
Rich Freeman:
> On Sun, Sep 14, 2014 at 10:56 AM, Michał Górny mgo...@gentoo.org wrote:
>> On 2014-09-14, at 10:33:03
>>
>> With git, we can finally do stuff like preparing everything and pushing in one go. Rebasing or merging will be much easier then, since the effective push rate will be smaller than current commit rate.
>
> While I agree that the ability to consolidate commits will definitely help with the commit rate, I'm not sure it will make a big difference. It will turn a kde stablereq from 300 commits into 1, and do the same for things like package moves and such. However, I suspect that the vast majority of our commits are things like bumps on individual packages that will always be individual commits. Maybe insofar as one person does a bunch of them they can be pushed at the same time, but...
>
> Looking at https://github.com/rich0/gentoo-gitmig-2014-02-21 it seems like we get about 150 commits/day on busy days. I suspect that isn't evenly distributed, but you may be right that it will just work out.

If the push frequency becomes so high that people barely get stuff pushed because of conflicts, then we simply have to say goodbye to the central repository workflow and establish a hierarchy where only a handful of people have direct push access and the rest is worked out through pull requests to project leads or dedicated reviewers. So the merging and rebasing work would then be done by fewer people instead of every single developer.

But given that currently project leads may or may not be active, I'm not sure that I'd vote for such a workflow. And I don't think we need that yet (although an enforced review workflow is of course superior in many ways).

Let's try it with push access for every developer.
Re: [gentoo-dev] Re: My masterplan for git migration (+ looking for infra to test it)
On Sun, Sep 14, 2014 at 6:10 PM, hasufell hasuf...@gentoo.org wrote:
> Let's try it with push access for every developer.

+1.

I'm pretty strongly opposed to leaving the history behind. I'd tend to agree with Rich when he says that history conversion is pretty much a solved problem, anyway.

Cheers, Dirkjan