Re: [PATCH] proposed v3 source format using .git.tar.gz
Joey Hess [EMAIL PROTECTED] writes:

...

> However, stashing away uncommitted changes and not including them in the build violates least surprise. I'd expect to see them either committed automatically, or the current error forcing me to resolve them before building. The advantage to auto-committing, of course, is that you don't have to know how to use git (or debcommit) to build a package that uses it.

Erroring out looks to be the most robust thing to do. Otherwise we can start to get people often not committing their changes properly themselves.

...

>> 4) aj suggested in this thread to add a Source-Depends field which could be used to specify the dependencies needed to unpack the package. I guess that could prove useful, but I really would like to avoid that all packages need to specify it (even though that might be solvable with substvars defined by the plugin). OTOH if dpkg uses an internal mechanism to map format to dependencies it would be more difficult for other programs like apt to get to this information. Or is this all over-engineering and the plugin should check its pre-requisites itself and note the dependencies in the error message like the current code does.
>
> One approach would be for dpkg to build a dpkg-dev-git package that includes the git format (and depends on git-core), and so on, then Format: 3.0 (foo) could be converted to dpkg-dev-foo.

Couldn't dpkg add the needed packages automatically as build-depends? This looks more logical to me.

--
O T A V I O  S A L V A D O R
E-mail: [EMAIL PROTECTED] UIN: 5906116 GNU/Linux User: 239058 GPG ID: 49A5F855 Home Page: http://otavio.ossystems.com.br
"Microsoft sells you Windows ... Linux gives you the whole house."

-- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:

> I have a sourcev3 branch with my changes at git://kitenet.net/dpkg, and have also attached a diff to this mail. I feel that this is ready for review and hopefully merging into dpkg now. Looking forward to your comments.

I've now added this branch to the official dpkg repository on alioth with the intention to work on it. I've at least fixed it up so that it works with the current code base.

After thinking a bit about this proposal I have the following suggestions for changes that I would like to put up for discussion:

1) I don't really like the current behaviour when there are uncommitted changes in the package directory. I would suggest as default behaviour creating a commit containing these changes. This would eliminate the need for people having to commit changes if they don't really care. The most elegant solution would probably be to create the commit, clone it and then do a git reset HEAD^ in the package directory. Don't know if that is robust enough, though. Prompting the user for the commit message would probably be best but would break if people try to run the program non-interactively.

2) Independently of the default behaviour on pack, we should definitely add a command-line option for the user to choose between the three possibilities: (a) error out, (b) create a commit, (c) create a commit interactively.

3) About the plugin interface: I was considering whether it would be better to move the tar generation into the plugin itself. This would allow other plugins more flexibility (e.g. generating more than one file). My masterplan includes making source formats 1.0 and 2.0 plugins internally ;) This would of course require moving the tar generating and compressing code to a module that can then be used by the plugins.

4) aj suggested in this thread to add a Source-Depends field which could be used to specify the dependencies needed to unpack the package. I guess that could prove useful, but I really would like to avoid that all packages need to specify it (even though that might be solvable with substvars defined by the plugin). OTOH if dpkg uses an internal mechanism to map format to dependencies it would be more difficult for other programs like apt to get to this information. Or is this all over-engineering, and should the plugin check its pre-requisites itself and note the dependencies in the error message like the current code does?

Regards,
--
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/
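Editorial sketch: the three behaviours proposed in point 2, together with the commit-then-reset idea from point 1, can be pictured as a small planning function. This is only an illustration and not code from the sourcev3 branch; the policy names, the default commit message and the function names are assumptions made up for this example. Only the git commands themselves (git add -A, git commit, git reset HEAD^) are real.

```python
from typing import List

# Hypothetical option values; the thread does not name them.
POLICIES = ("error", "commit", "commit-interactive")


def plan_pack(dirty: bool, policy: str,
              message: str = "automatic commit before building source package") -> List[List[str]]:
    """Return the git commands to run before cloning the tree into the package.

    `dirty` says whether the package directory has uncommitted changes
    (for instance, whether `git status --porcelain` printed anything).
    """
    if policy not in POLICIES:
        raise ValueError(f"unknown policy: {policy!r}")
    if not dirty:
        return []  # nothing to do, the tree can be cloned as it is
    if policy == "error":
        raise RuntimeError("uncommitted changes in the package directory")
    stage = ["git", "add", "-A"]
    if policy == "commit":
        commit = ["git", "commit", "-m", message]
    else:
        # No -m: git opens an editor, so this breaks non-interactive runs.
        commit = ["git", "commit"]
    return [stage, commit]


def plan_restore(committed: bool) -> List[List[str]]:
    """After cloning, undo the temporary commit but keep the edits in the tree."""
    return [["git", "reset", "HEAD^"]] if committed else []
```

With the "commit" policy a dirty tree yields a stage and a commit command, and plan_restore(True) then yields the git reset HEAD^ step, which moves HEAD back while leaving the working files untouched.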
Re: [PATCH] proposed v3 source format using .git.tar.gz
Frank Lichtenheld wrote:

> I've now added this branch to the official dpkg repository on alioth with the intention to work on it. I've at least fixed it up so that it works with the current code base.

Excellent. I had kept it merged to master, but haven't checked that it's not bit-rotted lately.

> After thinking a bit about this proposal I have the following suggestions for changes that I would like to put up for discussion:
>
> 1) I don't really like the current behaviour when there are uncommitted changes in the package directory. I would suggest as default behaviour creating a commit containing these changes. This would eliminate the need for people having to commit changes if they don't really care. The most elegant solution would probably be to create the commit, clone it and then do a git reset HEAD^ in the package directory. Don't know if that is robust enough, though.

Sounds like git stash? However, stashing away uncommitted changes and not including them in the build violates least surprise. I'd expect to see them either committed automatically, or the current error forcing me to resolve them before building. The advantage to auto-committing, of course, is that you don't have to know how to use git (or debcommit) to build a package that uses it.

> Prompting the user for the commit message would probably be best but would break if people try to run the program non-interactively.

I don't think it's a good idea to prompt for a commit message.

> 2) Independently of the default behaviour on pack, we should definitely add a command-line option for the user to choose between the three possibilities: (a) error out, (b) create a commit, (c) create a commit interactively.

Not sure what you mean here?

> 3) About the plugin interface: I was considering whether it would be better to move the tar generation into the plugin itself. This would allow other plugins more flexibility (e.g. generating more than one file). My masterplan includes making source formats 1.0 and 2.0 plugins internally ;) This would of course require moving the tar generating and compressing code to a module that can then be used by the plugins.

That would of course be fine. I didn't want to touch doing that in my branch for obvious reasons. :-)

> 4) aj suggested in this thread to add a Source-Depends field which could be used to specify the dependencies needed to unpack the package. I guess that could prove useful, but I really would like to avoid that all packages need to specify it (even though that might be solvable with substvars defined by the plugin). OTOH if dpkg uses an internal mechanism to map format to dependencies it would be more difficult for other programs like apt to get to this information. Or is this all over-engineering and the plugin should check its pre-requisites itself and note the dependencies in the error message like the current code does.

One approach would be for dpkg to build a dpkg-dev-git package that includes the git format (and depends on git-core), and so on, then Format: 3.0 (foo) could be converted to dpkg-dev-foo.

--
see shy jo
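Editorial sketch: the mapping suggested at the end of this message, from a Format: 3.0 (foo) field to a dpkg-dev-foo package, is simple enough to show in a few lines. The function name and the exact pattern accepted for the bracketed name are assumptions for this illustration; the thread only gives the "3.0 (foo)" to "dpkg-dev-foo" rule.

```python
import re
from typing import Optional

# Assumed shape of the field: "3.0 (name)", with a package-name-like token
# in the brackets. The thread does not specify which characters are allowed.
_FORMAT_RE = re.compile(r"^3\.0 \(([a-z0-9][a-z0-9+.-]*)\)$")


def helper_package(format_field: str) -> Optional[str]:
    """Map a Format: value to the package that would provide its plugin.

    Returns None for formats that carry no bracketed variant, such as "1.0".
    """
    match = _FORMAT_RE.match(format_field.strip())
    return f"dpkg-dev-{match.group(1)}" if match else None
```

A tool such as apt could apply the same rule without asking dpkg, which is the property Frank was worried about losing with an internal-only mapping.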
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Tue, Oct 16, 2007 at 01:42:06PM -0400, Joey Hess wrote:

> Phillip Susi wrote:
>> Joey Hess wrote:
>>> A sample dpkg source package built using this is at http://kitenet.net/~joey/tmp/git-demo/. This demo package includes only the last 200 commits to the dpkg git repo, so it's more than 1 mb *smaller* than dpkg's normal .tar.gz!
>>
>> What was removed from the source tree when importing it into git to save this space?
>
> Like I said, I included only the last 200 commits in the git repo.

Note that he was talking about the size of the working tree, not the git repository. The distribution tarball contains more than a checkout from dpkg's git repository, though, since it does contain the files copied and/or generated by autoreconf. However, this shouldn't really make a huge difference in size.

Regards,
--
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Wed, Oct 17, 2007 at 05:24:10PM -0400, Phillip Susi wrote:

> Exactly... it seemed to be an 8 MB difference though, which would account for why the git repo was smaller; it started with 8 MB less files. My point is that git doesn't magically make the same set of files plus their history smaller than just the original set of files. When I poked around with du a bit it looks like the missing space in the git repo is mostly those .po files. Are these auto generated? If so, why are they included in the source package?

You are most likely speaking about the .gmo files, not the .po files.

Regards,
--
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/
Re: [PATCH] proposed v3 source format using .git.tar.gz
Manoj Srivastava writes (Re: [PATCH] proposed v3 source format using .git.tar.gz):

> On Mon, 15 Oct 2007 17:55:13 +0100, Ian Jackson [EMAIL PROTECTED] said:
>
>> Manoj Srivastava writes (Re: [PATCH] proposed v3 source format using .git.tar.gz):
>>> Well, this is tricky. I am not sure how the NMU'er communicates with the developer; I assume it is by sending in a diff. If so, this works with an arch checked out dir, and unmodified dpkg.
>>
>> Ideally the NMUer would simply upload and would not need to send a diff to the BTS. The maintainer would fetch the source from the archive and would be able to commit the NMUer's changes and then merge etc. appropriately.
>
> This works better for the distributed VCS's with the model that every checkout contains a copy of the whole repository. With a distributed model where every checkout does not pull in a copy of the repo, this means the NMU'er would have to have write access to the repo (unlikely), or create their own public repo with tagged version of the software, or send in a diff.

I was talking about the case where the NMUer is RCS-naive. They download the source, edit it, test it, and upload it, all using the standard tools (apt-get source, dpkg-source, dpkg-buildpackage etc.).

Obviously this means that the NMUer's download, and their corresponding upload, have to contain a working tree. By this I mean it has to contain, or imply in a way that the tools can construct, both a complete set of the actual checked-out source code, and also an indication of what the version was that was checked out (the information that CVS puts in the CVS/Entries file) so that it can be merged properly later.

Ian.
Re: [PATCH] proposed v3 source format using .git.tar.gz
Phillip Susi wrote:

> Joey Hess wrote:
>> A sample dpkg source package built using this is at http://kitenet.net/~joey/tmp/git-demo/. This demo package includes only the last 200 commits to the dpkg git repo, so it's more than 1 mb *smaller* than dpkg's normal .tar.gz!
>
> What was removed from the source tree when importing it into git to save this space?

Like I said, I included only the last 200 commits in the git repo.

--
see shy jo
Re: [PATCH] proposed v3 source format using .git.tar.gz
Manoj Srivastava writes (Re: [PATCH] proposed v3 source format using .git.tar.gz):

> Well, this is tricky. I am not sure how the NMU'er communicates with the developer; I assume it is by sending in a diff. If so, this works with an arch checked out dir, and unmodified dpkg.

Ideally the NMUer would simply upload and would not need to send a diff to the BTS. The maintainer would fetch the source from the archive and would be able to commit the NMUer's changes and then merge etc. appropriately.

Ian.
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Tue, Oct 09, 2007 at 06:58:19PM +0100, Ian Jackson wrote:

> [...] Goals I would suggest:
>
> * Abolish dpatch (and similar excrescences) and specifically to get back to the point where a Debian source package can be unpacked to the point of seeing the source code without having to execute any of it.

Really, that's probably the most valuable part of this, even if not the most interesting -- having a sane way to unpack source packages to the *actual* working tree makes it much more sane to do analysis of the source, hack on it, whatever.

And something that works for a pure tarball of a .git directory all the way to unpacked .c files seems like it should certainly be general enough to achieve that. That seems (to me) like it means:

 - keep the perl module structure Joey's created and expect to use it with other ways of dealing with patches internally to a source package (quilt, bzr, darcs, whatever)
 - finalise the remaining tweaks: drop the bracketed (git) from the Format: field and handle it some other way? add a Source-Depends: field?

> * Make it possible (once more) for NMUers to make a change to a [...] to acquire the source, inspect it, edit it, build it, test it, and upload it, using only tools which either do not depend on the RCS or which entirely hide it, without disrupting or being disrupted by the revision control system.

It seems... remarkable that making the source package format more dependent on the revision control system would make NMUers and others more able to ignore it.

The remaining big question seems to be whether to have Debian source packages include the working tree directly so people don't need git to get at it; but that seems to me something that can be decided by policy mechanisms outside dpkg.

So, afaics, the dpkg maintainers should:

 - add Source-Depends: (I'm biased :)
 - upload dpkg with modular support to unstable
 - upload git/bzr support as part of either dpkg or the git/bzr packages, with appropriate autogenerated Source-Depends:

and ftpmaster should start accepting git/bzr source packages to experimental so we can get some practical experience with the format, and decide whether to have .git.tgz or .git+.orig+.xdelta .tgz's or whatever to unstable.

I'd expect we'd either wait for lenny to release, or an updated dpkg with Format: 3.0 support to be in an etch point release before accepting such packages in unstable either way, but better to get started sooner, afaics.

Cheers,
aj
Re: [PATCH] proposed v3 source format using .git.tar.gz
Ian Jackson wrote:

> Manoj Srivastava writes (Re: [PATCH] proposed v3 source format using .git.tar.gz):
>> What exactly is the goal of this dpkg addition?
>
> This is a sensible question to ask. Goals I would suggest:

I find myself wondering the same thing. It seems to me that one of the main functions that the debian source format fulfills is essentially that of a version control system. It allows multiple versions to coexist in the archive, provides a change log to track the history, has tools to examine changes across revisions in detail (debdiff), and so on. While less refined than VCS like git, svn, et al, the debian source format does manage to provide the core functions of a VCS. Therefore, I ask, why would you pack one VCS (git) inside another (deb src)?

> * Enable all people who work with a Debian source package to do so with the benefits of the distributed revision control system in use, which includes smart merging, and so forth;
>
> * Specifically, to enable the above for NMUers in such a way that a minimum of additional work is needed by the maintainer to merge changes.
>
> * Abolish dpatch (and similar excrescences) and specifically to get back to the point where a Debian source package can be unpacked to the point of seeing the source code without having to execute any of it.
>
> * Make life easier for derived distributions by making it possible for them to merge from us, and us from them, using all of the usual features of the RCSs in use.
>
> * Make it possible (once more) for NMUers to make a change to a package without having to learn and interact with a revision control system, even if the maintainers are using one. Ie, make it possible to acquire the source, inspect it, edit it, build it, test it, and upload it, using only tools which either do not depend on the RCS or which entirely hide it, without disrupting or being disrupted by the revision control system.
>
> * When an RCS-agnostic NMUer has done their work, still give the benefit of the RCS to the maintainer (and others) when merging the NMUer's work.

This is a nice set of goals, and if we are OK with leaving behind the current source package format to achieve this, then it seems to me that using git (or possibly another VCS) is a good way to do this, but if you are going to use git, then _really_ use it. Convert the archive over into a bunch of git repositories, one for each package, and be done with it. Why go into it half assed by packaging git inside the old format?
Re: [PATCH] proposed v3 source format using .git.tar.gz
* Phillip Susi [Wed, 10 Oct 2007 14:25:46 -0400]: Why go into it half assed by packaging git inside the old format? Because otherwise the change won't happen (TM). -- Adeodato Simó dato at net.com.org.es Debian Developer adeodato at debian.org Guy on cell: Yeah, I mean she's not easy to talk to, because, you know, she'll be like, What did you do this weekend? and I'll say, Nothing, but really I was fucking some other girl. -- http://www.overheardinnewyork.com/archives/003179.html -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
Re: [PATCH] proposed v3 source format using .git.tar.gz
Adeodato Simó wrote:

> * Phillip Susi [Wed, 10 Oct 2007 14:25:46 -0400]:
>> Why go into it half assed by packaging git inside the old format?
>
> Because otherwise the change won't happen (TM).

Why is that a bad thing? What good does it do to have the git repo packed inside the source archive? How is that any better than just using git yourself and leaving the archives alone?
Re: [PATCH] proposed v3 source format using .git.tar.gz
Phillip Susi wrote:

> Why is that a bad thing? What good does it do to have the git repo packed inside the source archive?

http://kitenet.net/~joey/blog/entry/an_evolutionary_change_to_the_Debian_source_package_format/

--
see shy jo, over and over, and out
Re: [PATCH] proposed v3 source format using .git.tar.gz
Hi,

On Wed, 10 Oct 2007, Phillip Susi wrote:

> Adeodato Simó wrote:
>> * Phillip Susi [Wed, 10 Oct 2007 14:25:46 -0400]:
>>> Why go into it half assed by packaging git inside the old format?
>>
>> Because otherwise the change won't happen (TM).
>
> Why is that a bad thing? What good does it do to have the git repo packed inside the source archive? How is that any better than just using git yourself and leaving the archives alone?

Because Debian is all about cooperation and making the git repository available is an essential step in the process. We currently use alioth.debian.org for that purpose but it's not related to our standard packaging process, and the logical next step is either Joey's idea (upload the git repository as source) or someone in ftpmaster implementing a direct connection between a git repository and incoming, so that we can upload packages through a git repository (and thus have a real canonical git repository for a given package).

We can rarely afford to design from scratch and must take into account various parameters... such as the fact that nobody has done the second variant yet, while Joey did the first one.

Cheers,
--
Raphaël Hertzog
Premier livre français sur Debian GNU/Linux : http://www.ouaza.com/livre/admin-debian/
Re: [PATCH] proposed v3 source format using .git.tar.gz
Raphael Hertzog wrote:

> Because Debian is all about cooperation and making the git repository available is an essential step in the process. We currently use alioth.debian.org for that purpose but it's not related to our standard packaging process and the logic to go further is either the idea of Joey (upload git repository as source) or someone in ftpmaster that implements a direct connection between a git repository and incoming, so that we can upload packages through a git repository (and thus have a real canonical git repository for a given package).

Connecting the git repository to the ftp archive is a good idea, but that is not what this thread is about. This thread is about packaging the git repository directly into the source archive, and I do not see any benefit to that. Why not keep the existing source ftp archive as is, but connect it with the git repo so that a change in the release branch of the git repo automatically generates the new source archive, rather than tar up the whole repo and make that the archive?
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Tue, 9 Oct 2007 14:17:17 +1000, Anthony Towns [EMAIL PROTECTED] said:

> So that leaves:
>
>> I still think that shipping a full working dir, with no dpkg changes, seems to be the way to go, along with a tla grab file, which I think I should consider putting into the package itself (if I can work around the chicken-and-egg issue: adding a grab file changes the source revision, which means the grab file should change, which means a new revision is needed).
>
> If you're just distributing a snapshot, rather than a full repository as Joey's basically proposing, why can't your grab file be autogenerated? ie,
>
>  1. hack on the source, merge changes, blahblah, finish, tag
>  2. do a checkout from version control
>  3. autogenerate anything necessary
>  4. create source package
>  5. build
>  6. upload
>
> If you're using pristine-tar to create a pristine .orig.tgz from your repo (rather than keeping one around), that needs to be autogenerated at step 3 too, afaics. Worst case you could check the autogenerated files into a parallel repository and use a config or something, afaics.

I can (and do) autogenerate the grab file -- and I guess I can add it to the source package after I check things out of the version control. I guess I was quibbling over having stuff in the source package that was not in my version control and not generated by dpkg and friends -- but even I can see it is a pretty weak quibble.

Anyway, thanks for the clarifications: I'll just re-start shipping a full working dir in the source tree, along with a grab file for registration; the overhead is pretty minimal compared to that of the full repo that git ships; and if people can deal with .git dirs, they can deal with {arch} and .arch-id dirs as well.

Which concludes my involvement in this thread.

manoj
--
He flung himself on his horse and rode madly off in all directions.
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: [PATCH] proposed v3 source format using .git.tar.gz
Firstly I'd just like to say that I think this is a fantastic direction to be heading in. I look forward very much to the demise of dpatch :-).

I do however very much share Colin's view about the desirability of preserving the .orig.tar.gz's, the ability to unpack a Debian source package with non-Debian tools, and the ability to unpack a source package without needing to install a suitably recent one of fourteen possible revision control systems :-).

On Sun, Oct 07, 2007 at 10:18:17PM +1000, Anthony Towns wrote:

> (I have a strong adverse reaction to duplicated information, so shipping the working tree in .git format and .orig.tar.gz format irks me, particularly if it's required)

Like Colin, I can quite understand this point of view.

I'd like to make a completely crazy suggestion. How about we ship the .orig.tar.gz, plus an rsync batched update (with a suitably early rsync version) which turns the unpacked source into working tree plus revision history?

Ian.
Re: [PATCH] proposed v3 source format using .git.tar.gz
Manoj Srivastava writes (Re: [PATCH] proposed v3 source format using .git.tar.gz):

> What exactly is the goal of this dpkg addition?

This is a sensible question to ask. Goals I would suggest:

* Enable all people who work with a Debian source package to do so with the benefits of the distributed revision control system in use, which includes smart merging, and so forth;

* Specifically, to enable the above for NMUers in such a way that a minimum of additional work is needed by the maintainer to merge changes.

* Abolish dpatch (and similar excrescences) and specifically to get back to the point where a Debian source package can be unpacked to the point of seeing the source code without having to execute any of it.

* Make life easier for derived distributions by making it possible for them to merge from us, and us from them, using all of the usual features of the RCSs in use.

* Make it possible (once more) for NMUers to make a change to a package without having to learn and interact with a revision control system, even if the maintainers are using one. Ie, make it possible to acquire the source, inspect it, edit it, build it, test it, and upload it, using only tools which either do not depend on the RCS or which entirely hide it, without disrupting or being disrupted by the revision control system.

* When an RCS-agnostic NMUer has done their work, still give the benefit of the RCS to the maintainer (and others) when merging the NMUer's work.

Ian.
Re: [PATCH] proposed v3 source format using .git.tar.gz
Ian Jackson wrote:

> How about we ship the .orig.tar.gz, plus an rsync batched update (with a suitably early rsync version) which turns the unpacked source into working tree plus revision history?

I'm afraid that because it consists of many small gzipped components, .git is not amenable to being efficiently binary deltaed, so you'll still end up with approximately doubled data. This is probably true of many revision control backends, though not all .. you might be able to do it with CVS.

It might be possible to start with the pristine source, check it into git, and apply a set of git packs that merges the resulting repository forward to match the maintainer's git repository. However, I think this could only work if the maintainer's git repository began with importing that same pristine source[1]. Which means throwing away your git repo for each new upstream version and starting afresh, which doesn't seem very practical.

--
see shy jo

[1] git's sha1sums are AIUI based on the entire history of the repo, so you can't go back and change history
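Editorial sketch: footnote [1] can be illustrated with a few lines of Python. A git object id is the SHA-1 of a short header plus the object body, and a commit body names its parent commits by id, so a different starting commit changes every id that follows. The author line, the commit messages and the use of git's empty tree for every commit are placeholders chosen for this illustration; real commits carry real trees, names and timestamps.

```python
import hashlib
from typing import List

# Id of git's empty tree, used here so no real file contents are needed.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
# Placeholder identity and timestamp, not taken from any real repository.
WHO = "A U Thor <author@example.com> 0 +0000"


def git_object_id(kind: str, body: bytes) -> str:
    """SHA-1 over "<kind> <length>\\0" followed by the body, as git computes it."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree: str, parents: List[str], message: str) -> str:
    """Build a minimal commit body and return its object id."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {WHO}", f"committer {WHO}", "", message]
    return git_object_id("commit", ("\n".join(lines) + "\n").encode())


# Two histories that differ only in their very first commit.
root_a = commit_id(EMPTY_TREE, [], "import upstream 1.0")
root_b = commit_id(EMPTY_TREE, [], "import pristine tarball 1.0")
child_a = commit_id(EMPTY_TREE, [root_a], "packaging change")
child_b = commit_id(EMPTY_TREE, [root_b], "packaging change")
print(child_a != child_b)  # identical change, different ids
```

The two child commits are the same apart from their parent line, yet their ids differ, which is why packs built on the maintainer's history cannot simply be replayed on top of a freshly imported pristine source.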
Re: [PATCH] proposed v3 source format using .git.tar.gz
FWIW, I listed my goals and reasons for working on this in the blog post linked to in the head of this thread.

I feel that I should bow out of this thread here. I've presented an idea, a working implementation, and addressed the issues with it to the best of my ability. Far too many times in this project I've seen a good idea be indefinitely delayed or killed when everyone piles on and nitpicks it to death. This idea is in danger of that happening.

If the dpkg maintainers decide to add support for this format to dpkg, I'll be happy to work with them to make any further fixes needed to my patch. (My git repo has a couple more fixes in it BTW.)

--
see shy jo
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Tue, 9 Oct 2007 16:42:38 -0400, Joey Hess [EMAIL PROTECTED] said:

> FWIW, I listed my goals and reasons for working on this in the blog post linked to in the head of this thread. I feel that I should bow out of this thread here. I've presented an idea, a working implementation, and addressed the issues with it to the best of my ability. Far too many times in this project I've seen a good idea be indefinitely delayed or killed when everyone piles on and nitpicks it to death. This idea is in danger of that happening.

I do apologize if my quest for understanding your proposal sounded like nitpicking; that was not my intent. I truly did not understand what I needed to do while using arch (and it turns out no changes are really required in dpkg for arch).

manoj
feeling obtuse
--
Suicide is the sincerest form of self-criticism. Donald Kaul
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Tue, 9 Oct 2007 18:58:19 +0100, Ian Jackson [EMAIL PROTECTED] said:

I am going to comment on this with my "I use arch" hat on.

> Manoj Srivastava writes (Re: [PATCH] proposed v3 source format using .git.tar.gz):
>> What exactly is the goal of this dpkg addition?
>
> This is a sensible question to ask. Goals I would suggest:

Thanks for clarifying.

> * Enable all people who work with a Debian source package to do so with the benefits of the distributed revision control system in use, which includes smart merging, and so forth;

This, of course, means you have to have the distributed SCM system installed and configured, and perhaps a bit of configuration work done. Shipping an arch working dir, with {arch} and .arch-ids, allows people to see the log history, and, after they have registered the repository this was checked out from, to do diffs and so on. Commits won't be possible unless they have commit access to the distributed repo; but they can tag/branch to their local repo, and ask the developer to pull from there. This requires no dpkg change.

> * Specifically, to enable the above for NMUers in such a way that a minimum of additional work is needed by the maintainer to merge changes.

Sure. Tag the checked out tree to a repo you have commit rights to, ask developers to pull from there.

> * Abolish dpatch (and similar excrescences) and specifically to get back to the point where a Debian source package can be unpacked to the point of seeing the source code without having to execute any of it.

All for it.

> * Make life easier for derived distributions by making it possible for them to merge from us, and us from them, using all of the usual features of the RCSs in use.

OK.

> * Make it possible (once more) for NMUers to make a change to a package without having to learn and interact with a revision control system, even if the maintainers are using one. Ie, make it possible to acquire the source, inspect it, edit it, build it, test it, and upload it, using only tools which either do not depend on the RCS or which entirely hide it, without disrupting or being disrupted by the revision control system.

Hmm, OK. Well, as long as people ignore the extra directories, shipping an arch checked out dir will allow people to work with plain old make, etc, with no changes to dpkg.

> * When an RCS-agnostic NMUer has done their work, still give the benefit of the RCS to the maintainer (and others) when merging the NMUer's work.

Well, this is tricky. I am not sure how the NMU'er communicates with the developer; I assume it is by sending in a diff. If so, this works with an arch checked out dir, and unmodified dpkg.

So, in conclusion, I can happily say that no change in dpkg is needed to help arch-using developers accomplish these goals; they just need to stop stripping out the {arch} and .arch-id directories to accomplish all these. Silencing Lintian would be a good start.

manoj
--
If I am elected, the concrete barriers around the WHITE HOUSE will be replaced by tasteful foam replicas of ANN MARGARET!
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 10:59:18PM -0500, Manoj Srivastava wrote:
> On Mon, 8 Oct 2007 02:55:37 +0200, Frank Lichtenheld [EMAIL PROTECTED] said:
>> Let's not exaggerate. At least for git the repository will usually be smaller or only a little larger than the working directory. It will probably compress worse though.

> How is this magic done? If I have several dozen feature branches, all feeding back and forth, and have made lots and lots of changes in my sources, how does git preserve all this information without a commensurate increase in size? This makes the information theory geek in me very very skeptical.

By already using compression in the repository and by aggressively storing data as deltas against earlier versions (both for binary and textual data).

> Or are you talking about typical usage, and is that why people go around making shallow copies to cut down on the size of the shipped repo?

Shallow copies are not a very typical thing to do, IME.

Gruesse,
--
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/
Re: [PATCH] proposed v3 source format using .git.tar.gz
Joey Hess [EMAIL PROTECTED] writes:
> Frank Lichtenheld wrote:
>> This should probably error out. Aren't v3 packages always native in the sense tested here?

> Not necessarily. I wanted to leave the option open to use wig-n-pen to construct mixed source packages that maybe use vcs for debian/ and pristine source for the rest + a diff.gz, or something like that. I think the code will basically handle unpacking such a mongrel, although there are no tools to create one.
>
> -- see shy jo

Shouldn't we allow any number of files of any kind in the dsc, with dpkg-source unpacking or applying each of them in turn? For example you could have:

Files:
 yyy foobar.orig.tar.bz2
 yyy images.tar
 yyy debian.git.tar.gz
 yyy security.diff.gz

dpkg-source would unpack the orig.tar.bz2 first, then add the images.tar, merge the debian.git and last apply the security patch.

Dpkg-source should record the files it used to construct a source dir in debian/something so that subsequent source builds can recreate the procedure. When building source, the last entry should be modified where possible, or a new diff.gz added otherwise. That means dpkg should unpack foobar.orig.tar.bz2, images.tar and debian.git.tar.gz and then create a new security.diff.gz in this case.

Tools like svn-buildpackage could create a new debian.svn.tar.gz file before building source, and dpkg could skip adding an empty diff.gz to the end of the dsc in such a case. For many projects you would then end up with:

Files:
 yyy foobar.orig.tar.gz
 yyy debian.svn.tar.gz

(or whatever VCS is used).

MfG
Goswin
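The layered unpacking proposed in this message can be illustrated with a minimal Python sketch. It only plans the order of steps and does not touch any files. The action names and the suffix table are hypothetical and chosen for illustration; nothing here reflects actual dpkg-source behaviour.

```python
from typing import List, Tuple

# Hypothetical suffix-to-action table. More specific suffixes come first so
# that "debian.git.tar.gz" is not mistaken for a plain overlay tarball.
_ACTIONS = [
    (".orig.tar.gz", "unpack-upstream"),
    (".orig.tar.bz2", "unpack-upstream"),
    (".git.tar.gz", "merge-vcs"),
    (".svn.tar.gz", "merge-vcs"),
    (".diff.gz", "apply-patch"),
    (".tar.gz", "overlay-tar"),
    (".tar", "overlay-tar"),
]


def plan_unpack(files: List[str]) -> List[Tuple[str, str]]:
    """Return (action, filename) pairs in the order listed in the .dsc."""
    plan = []
    for name in files:
        for suffix, action in _ACTIONS:
            if name.endswith(suffix):
                plan.append((action, name))
                break
        else:
            # No suffix matched: refuse rather than guess.
            raise ValueError(f"don't know how to unpack {name!r}")
    return plan


if __name__ == "__main__":
    dsc_files = [
        "foobar.orig.tar.bz2",
        "images.tar",
        "debian.git.tar.gz",
        "security.diff.gz",
    ]
    for action, name in plan_unpack(dsc_files):
        print(f"{action:16} {name}")
```

The order of the Files list is preserved on purpose, since the proposal relies on each file being applied on top of the result of the previous one.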
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Mon, Oct 08, 2007 at 12:59:52PM +0200, Frank Lichtenheld wrote:
> On Sun, Oct 07, 2007 at 10:59:18PM -0500, Manoj Srivastava wrote:
>> How is this magic done? If I have several dozen feature branches, all feeding back and forth, and have made lots and lots of changes in my sources, how does git preserve all this information without a commensurate increase in size? This makes the information theory geek in me very very skeptical.

> By already using compression in the repository and by aggressively storing data as delta against earlier versions (both for binary and textual data).

For reference, a current clone I have of Linus' linux-2.6 repository with full history and working tree is 489M of which 194M is .git.

--
You grabbed my hand and we fell into it, like a daydream - or a fever.
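The linux-2.6 figures quoted in this message can be turned into a quick check of the "repository smaller than the working directory" claim. This is only arithmetic on the two numbers given, assuming that the 489M total is exactly the .git directory plus the checked-out files.

```python
# Sizes in MB as quoted in the message; the split is an assumption:
# total = .git directory + checked-out working tree.
total_mb = 489
git_mb = 194

worktree_mb = total_mb - git_mb
ratio = git_mb / worktree_mb

print(f"working tree: {worktree_mb}M, .git: {git_mb}M")
print(f".git is about {ratio:.0%} of the working tree size")
```

Under that assumption the full history takes roughly two thirds of the space of a single checkout.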
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Mon, 8 Oct 2007 12:59:52 +0200, Frank Lichtenheld [EMAIL PROTECTED] said:

> On Sun, Oct 07, 2007 at 10:59:18PM -0500, Manoj Srivastava wrote:
>> On Mon, 8 Oct 2007 02:55:37 +0200, Frank Lichtenheld [EMAIL PROTECTED] said:
>>> Let's not exaggerate. At least for git the repository will usually be smaller or only a little larger than the working directory. It will probably compress worse though.

>> How is this magic done? If I have several dozen feature branches, all feeding back and forth, and have made lots and lots of changes in my sources, how does git preserve all this information without a commensurate increase in size? This makes the information theory geek in me very very skeptical.

> By already using compression in the repository and by aggressively storing data as delta against earlier versions (both for binary and textual data).

Well, arch does this in the repo: base versions and cacherevs are tar.gz files, and then it stores deltas from the most recent base version or cached revisions (I generally cache every 20th revision).

In any case, I think the kinds of actions taken by joey's and Colin's patches are probably not things that we'll have to do to support shipping an arch working directory in the source package; if we have {arch} and .arch-id dirs in the source, the end user has access to the distributed version control system as soon as they register the archive location mentioned in the control file entry.

I am not sure how the pristine-tar bit fits into the picture yet.

manoj
--
Eighty percent of married men cheat in America. The rest cheat in Europe. Jackie Mason
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Mon, Oct 08, 2007 at 09:16:52AM -0500, Manoj Srivastava wrote:
> In any case, I think the kinds of actions taken by joey's and Colin's patches are probably not things that we'll have to do to support shipping an arch working directory in the source package; if we have {arch} and .arch-id dirs in the source, the end user has access to the distributed version control system;

Joey's thing lets you do a clean tarball that only contains the git (or bzr, or darcs) information, and recreates the working directory by a checkout. For CVS the equivalent would be shipping the CVSROOT, for rcs the equivalent would be shipping only the ,v files.

If you don't have git, you can't do *anything* with a .git.tar.gz source package. If you unpack it by hand, all you get is the .git directory -- no debian/control, no debian/rules, nothing.

You could do something similar with darcs/git/bzr atm simply by shipping the .git, _darcs or .bzr directories as part of your source package -- that's discouraged atm because it's duplicate information that bloats the source package, but it's entirely possible -- some ifupdown uploads have included the _darcs directory, eg.

Ultimately, it turns the source package into a snapshot of not just the current codebase, but the history as well -- or in the case of a shallow tree, the recent history.

What's the point of that? There may not be any -- if you're just packaging something that's completely straightforward, just adding a pointer to the official repository is probably the most sensible thing to do anyway; whether that be a subversion url or a tla grab file, or something else, and you can already do that.

Where it starts becoming relevant (afaics) is when there's a Debian-specific patch history (either due to it being a native package, complicated packaging, or significant patches against upstream) and we want the archive, as the primary way we distribute the source, to include a real change history rather than a simple snapshot. You can do that to some extent via all the dpatch tools, but they're kludgy in various ways; having the source format allow for an actual repository from a real VCS solves that in a really powerful way.

> I am not sure how the pristine-tar bit fits into the picture yet.

I don't think it really does; though it makes it possible to confirm that the point in the repo that claims to match some upstream release, really does match the official tarball of that release from upstream, which might have some use. pristine-tar seems mostly useful for generating a v1 source package purely from a remote repository; this allows you to turn a repository _into_ a (v3) source package.

Cheers,
aj
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Tue, 9 Oct 2007 01:10:00 +1000, Anthony Towns [EMAIL PROTECTED] said:

> On Mon, Oct 08, 2007 at 09:16:52AM -0500, Manoj Srivastava wrote:
>> In any case, I think the kinds of actions taken by joey's and Colin's patches are probably not things that we'll have to do to support shipping an arch working directory in the source package; if we have {arch} and .arch-id dirs in the source, the end user has access to the distributed version control system;

> Joey's thing lets you do a clean tarball that only contains the git (or bzr, or darcs) information, and recreates the working directory by a checkout.

Well, an additional factor is that git/bzr/darcs contains all the data required in the .git/.bzr/.darcs directories to recreate all the sources, and do the diffs, etc, which is not the case with arch -- arch does not follow the model where every checkout is a repo; so the checked-out dirs do not have all the info (you refer to the repo for the rest). Unless you use {arch}/++pristine trees, which I have not used in years.

[Snip bunches of git/bzr/darcs material]

> What's the point of that? There may not be any -- if you're just packaging something that's completely straightforward, just adding a pointer to the official repository is probably the most sensible thing to do anyway; whether that be a subversion url or a tla grab file, or something else, and you can already do that.

Right. I am not sure what I package is always trivial, though.

> Where it starts becoming relevant (afaics) is when there's a Debian-specific patch history (either due to it being a native package, complicated packaging, or significant patches against upstream) and we want the archive, as the primary way we distribute the source, to include a real change history rather than a simple snapshot.

This seems to fit my use case; I often have large feature branches that only sporadically get merged back upstream. The question is, how do I do this if I use arch as a version control system? I can, of course, start shipping a cacherev + patches, but that can be large; and might not mean much unless I also ship all the feature branches and upstream branch at the same time; which can blow up badly: see the ps for details.

If we just look at lenny, and I want to provide people with full details of all changes that have been made in various feature branches and upstream and debian packaging for lenny (etch is somewhat larger), I get:

--8<---------------cut here---------------start------------->8---
3.0M  fvwm--autotools--2.5.18/
368K  fvwm--autotools--2.5.21/
88K   fvwm--autotools--2.5.23/
3.0M  fvwm--debian--2.5.18/
356K  fvwm--debian--2.5.21/
5.3M  fvwm--debian--2.5.23/
3.1M  fvwm--devo--2.5.18/
392K  fvwm--devo--2.5.21/
1.7M  fvwm--devo--2.5.23/
3.0M  fvwm--terminal-emulator--2.5.18/
360K  fvwm--terminal-emulator--2.5.21/
1.5M  fvwm--terminal-emulator--2.5.23/
2.9M  fvwm--upstream--2.5.18/
344K  fvwm--upstream--2.5.21/
1.5M  fvwm--upstream--2.5.23/
600K  debian-dir--fvwm--0.1/
27M   total
--8<---------------cut here---------------end--------------->8---

What I ship currently:

--8<---------------cut here---------------start------------->8---
132   /usr/local/src/arch/done/fvwm_2.5.23-2.diff.gz
8     /usr/local/src/arch/done/fvwm_2.5.23-2.dsc
3244  /usr/local/src/arch/done/fvwm_2.5.23.orig.tar.gz
3.3M  total.
--8<---------------cut here---------------end--------------->8---

This is almost an order of magnitude increase in size, which I find hard to justify.

I still think that shipping a full working dir, with no dpkg changes, seems to be the way to go, along with a tla grab file, which I think I should consider putting into the package itself (if I can work around the chicken-and-egg issue: adding a grab file changes the source revision, which means the grab file should change, which means a new revision is needed).

>> I am not sure how the pristine-tar bit fits into the picture yet.

> I don't think it really does; though it makes it possible to confirm that the point in the repo that claims to match some upstream release, really does match the official tarball of that release from upstream, which might have some use. pristine-tar seems mostly useful for generating a v1 source package purely from a remote repository; this allows you to turn a repository _into_ a (v3) source package.

Thanks for the clarification.

manoj

ps: This is from my lenny archive

1.8M   angband--autotools--3.0/
1.8M   angband--debian--3.0/
1.8M   angband--devo--3.0/
1000K  angband-doc--devel--3.0/
1.7M   angband--upstream--3.0/
292K   c2man--configure--2.0/
292K   c2man--devo--2.0/
296K   c2man--manpage-fix--2.0/
248K   c2man--upstream--2.0/
952K   calc--debian--2.0/
956K   calc--devo--2.0/
904K   calc--upstream--2.0/
148K   checkpolicy--devo--1.32/
128K   checkpolicy--devo--1.34/
176K
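The "almost an order of magnitude" comparison for fvwm in this message can be checked from the quoted du figures. The following Python sketch simply sums the listed sizes; it assumes that bare numbers in the second listing are KiB and that "K"/"M" are du's binary units, which the message itself does not state.

```python
def to_kib(size: str) -> float:
    """Convert a du-style size such as '3.0M' or '368K' to KiB.

    Bare numbers are assumed to already be in KiB.
    """
    units = {"K": 1, "M": 1024}
    if size[-1] in units:
        return float(size[:-1]) * units[size[-1]]
    return float(size)


# Per-branch sizes for fvwm, copied from the listing in the message.
branches = [
    "3.0M", "368K", "88K", "3.0M", "356K", "5.3M", "3.1M", "392K",
    "1.7M", "3.0M", "360K", "1.5M", "2.9M", "344K", "1.5M", "600K",
]
# Files currently shipped: .diff.gz, .dsc, .orig.tar.gz.
shipped = ["132", "8", "3244"]

branch_total = sum(to_kib(s) for s in branches)
shipped_total = sum(to_kib(s) for s in shipped)
ratio = branch_total / shipped_total

print(f"all branches: {branch_total / 1024:.1f}M")
print(f"shipped now:  {shipped_total / 1024:.1f}M")
print(f"ratio:        {ratio:.1f}x")
```

The sums agree with the totals quoted in the listings (about 27M against 3.3M), which puts the increase at a little over eight times.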
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Mon, Oct 08, 2007 at 03:59:05PM -0500, Manoj Srivastava wrote:
>> Where it starts becoming relevant (afaics) is when there's a Debian-specific patch history (either due to it being a native package, complicated packaging, or significant patches against upstream) and we want the archive, as the primary way we distribute the source, to include a real change history rather than a simple snapshot.

> This seems to fit my use case; I often have large feature branches that only sporadically get merged back upstream.

Right, but the caveat is important too -- we have to _also_ want the archive to include the real change history. Maybe when things get complicated enough that there are often large branches that sporadically get merged back, that part's no longer worth the hassle:

> This is almost an order of magnitude increase in size, which I find hard to justify.

As far as cases where there are enough changes to make a repo interesting, but not so many that shipping a repo as the standard source becomes huge and clunky, it's possible that arch just isn't a useful tool for the job -- repo registration alone would be pretty annoying, and it's not like there aren't plenty of other VCS options for that case anyway. Subversion (or SVK) isn't an option either, afaics, eg, and I doubt CVS or RCS would work well either.

So that leaves:

> I still think that shipping a full working dir, with no dpkg changes, seems to be the way to go, along with a tla grab file, which I think I should consider putting into the package itself (if I can work around the chicken-and-egg issue: adding a grab file changes the source revision, which means the grab file should change, which means a new revision is needed).

If you're just distributing a snapshot, rather than a full repository as Joey's basically proposing, why can't your grab file be autogenerated? ie,

 1. hack on the source, merge changes, blahblah, finish, tag
 2. do a checkout from version control
 3. autogenerate anything necessary
 4. create source package
 5. build
 6. upload

If you're using pristine-tar to create a pristine .orig.tgz from your repo (rather than keeping one around), that needs to be autogenerated at step 3 too, afaics.

Worst case you could check the autogenerated files into a parallel repository and use a config or something, afaics.

Cheers,
aj
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
> I've been working on making dpkg-source support a new source package format based upon git.

Oh, one question that comes to mind: how does this affect checking for non-free stuff in past revisions? If 3.1-4 had some non-free files that get reimplemented for 3.2-1, do we (a) expect the maintainer to do a no-history upload for 3.2-1; (b) check that this happens somehow; (c) not worry about it as long as it's only in the history; (d) something else?

Verifying that not just the current tree is DFSG-free, but all the history is too seems potentially difficult.

Cheers,
aj
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 02:56:47PM +1000, Anthony Towns wrote:
> On Sat, Oct 06, 2007 at 10:37:48PM +, Colin Watson wrote:
>> The second possibility seems to me to be more flexible, though, and probably not all that hard to implement: build both a .tar.gz (containing the working tree) and a .$VCS.tar.gz, and teach 'dpkg-source -x' to unpack the tree given at least one of these. This would allow various interesting possibilities such as:

> Would this be better in any way than having a web interface that provides an autogenerated version-1 source package? Presume it's a url like:
>
>   http://v1source.qa.debian.org/i/ifupdown/ifupdown_0.6.8.dsc

Autogenerated source packages won't (presumably, certainly not without some special arrangements) be mirrored on useful services like www.mirrorservice.org that let you peek inside tarballs without opening them, and seem difficult for people to mirror locally in general since it would put a lot of stress on v1source.qa.debian.org which I expect would be a lot less beefy than the regular Debian mirror network.

I'm quite attached to being able to peek inside source packages quickly by sshing over to the local mirror I keep at home which grabs everything overnight so that I don't have to wait for it to download; particularly so for large source packages.

>> * Derivative distributions who are slow to upgrade their dpkg-source could still interoperate to some degree.

> They'd need to pull sources from the autogenerated url; though they'd still probably have Build-Depends: issues if they're not updating packages generally.

Oh, I was referring more to the buildd base system and archive maintenance code too; dak needs to be updated in order to accept format 3.0 source packages, for instance.

Cheers,
--
Colin Watson [EMAIL PROTECTED]
Re: [PATCH] proposed v3 source format using .git.tar.gz
* Joey Hess:
> I have a sourcev3 branch with my changes at git://kitenet.net/dpkg, and have also attached a diff to this mail. I feel that this is ready for review and hopefully merging into dpkg now. Looking forward to your comments.

What about empty directories?

I really think you need to work off a clone (or a cleaned-up cp -al'ed copy). For instance, you do not necessarily want to upload the reflog, or unreachable objects. The git configuration stored inside .git is probably uninteresting, too.

But it's still a nice idea, I think.
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sat, Oct 06, 2007 at 10:09:22PM -0400, Joey Hess wrote:
> Colin Watson wrote:
> (So, FWIW, I'm not sold on git. Not sold at all yet. But it was a good choice for this implementation for several reasons.)

(I don't think bzr is perfect either, of course; the lack of shallow branches (see below) is one flaw that's very relevant to this application. If there were a distributed VCS that were clearly better than the others in every respect, we'd probably all know about it ...)

>> Still, this work looks pretty cool, and I'd like to be able to make use of it despite avoiding git whenever I can. I noticed that you'd helpfully structured your changes such that it would be possible to plug in a different revision control system, so I wrote a module to support bzr.

> Nice. The FAQ has some questions aimed at adding other revision control systems, could you try to answer those in the context of bzr? In particular, is the data that would be shipped in the source package the same data that bzr normally reads from untrusted sources, thus ensuring that using it this way is equally (in)secure as using bzr to pull data over the network? (Note that this wasn't 100% true for git and I have had to put in several workarounds.)

I believe so; bzr has hooks but AFAICS they're only exposed to plugins (i.e. code that goes in /usr or in ~/.bazaar/plugins) rather than being something that lives in the .bzr directory. I don't know of anything executable in .bzr.

I intentionally used 'bzr branch' to create the data that will be shipped, which is the same command used to branch from a network repository, so I believe that if there is a security flaw in this implementation then it would also be a security flaw in bzr itself. The only things I really needed to tweak were to remove a couple of bits of metadata which aren't useful in this context: branch-name ended up with blah.bzr.tar.gz.tmp or something like that in it, and it'll be detected from the unpacked directory name if it doesn't exist; and parent is just the directory 'bzr branch' branched from.

> And is the data format stable and/or one that bzr has a history of supporting old versions of in a way that ensures backwards compatibility?

The data format has changed a few times, but so far bzr has an excellent history of continuing to support old versions. Some data formats (dating from 0.8 or so) are marked as unsupported and you have to use 'bzr upgrade' before doing anything else. Everything else at worst nags you to run 'bzr upgrade'. I think they may have dropped support for some very old formats that basically only some early bzr developers used.

> Also, will the bzr repos always contain the full history, or is there an equivalent to git shallow clones? How big do they tend to be?

I don't have as comfortable an answer here. There's no equivalent to git shallow clones yet (only a design, http://bazaar-vcs.org/HistoryHorizon; so this will probably get fixed one day). At present the .bzr tends if anything to be a little bigger than the source. I think due to historical performance issues people tend not to be using bzr much on very large trees yet, so I'm hoping this won't be an issue for a while; whereas the git backend has the immediate prospect of linux-2.6.git.tar.gz. ;-)

>> * Some source packages want to ship non-VCS-managed files. It's very common for source packages to include autogenerated objects like configure, Makefile.in, etc. Whether to check these into a VCS is a somewhat religious matter (as acknowledged by the gettext info documentation, for instance), and personally I lean towards checking them in (with a few exceptions) just because it makes it easier to see when they change and keep an eye out for oddities, but I know that a lot of developers prefer to keep these outside their VCS. Shipping a working tree would make it easier to handle cases like this.

> Hmm, I hadn't considered that this might be a problem. I don't know if I'd want to write the code to do this, but shipping a partial working tree consisting of just those files would be enough to solve this.

That ought to be relatively straightforward; just list all the files that the VCS knows about and unlink them. It seemed untidy though. Maybe put them in a separate directory (.bzr-extra-files or something) which is copied over after unpack, and make it a dpkg-source -b option rather than the default behaviour?

FWIW, I was thinking much more of native packages here; non-native packages already tend to just import the upstream tarball which usually contains generated files, which is probably why this hasn't been a problem for things like git-buildpackage. If nothing else, there are several native packages in the d-i tree alone that don't have configure et al in Subversion.

Alternatively, if people don't agree with me that we should ship the working tree by default, maybe it could be an option for the few packages
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 11:55:49AM +, Colin Watson wrote:
> Of course, a number of packages accidentally ship .svn directories and so on anyway, though I suppose there's a difference between officially blessed by dpkg and warned against by lintian ...

That has to be the understatement of the year ;)

Gruesse,
--
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 08:45:08AM +, Colin Watson wrote:
> I'm quite attached to being able to peek inside source packages quickly by sshing over to the local mirror I keep at home which grabs everything overnight so that I don't have to wait for it to download; particularly so for large source packages.

How is that better than running apt-get source against your local mirror, though?

Alternatively, is it really a problem to have your local mirror autogenerate v1 source packages in the same way v1source.qa.d.o presumably would?

(I have a strong adverse reaction to duplicated information, so shipping the working tree in .git format and .orig.tar.gz format irks me, particularly if it's required)

>>> * Derivative distributions who are slow to upgrade their dpkg-source could still interoperate to some degree.

>> They'd need to pull sources from the autogenerated url; though they'd still probably have Build-Depends: issues if they're not updating packages generally.

> Oh, I was referring more to the buildd base system and archive maintenance code too; dak needs to be updated in order to accept format 3.0 source packages, for instance.

Well, you'd need an entirely new .dsc to use a v3 source package with an un-updated dak (or launchpad or whatever), that didn't contain the .git.tar.gz (or whatever) elements at all, so I don't personally see a lot of difference between just generating a new .dsc and generating a new .dsc and .tar.gz.

(It might be just me, but I'm getting the feeling that implementing WigPen via this v3 format is probably easier than implementing it via the v2 format...)

I might be off my rocker, but I'm not seeing any reason why we couldn't allow uploads of v3 format packages to experimental while blocking them for unstable etc, presuming dpkg somewhere supported them.

Cheers,
aj
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 10:18:17PM +1000, Anthony Towns wrote:
> On Sun, Oct 07, 2007 at 08:45:08AM +, Colin Watson wrote:
>> I'm quite attached to being able to peek inside source packages quickly by sshing over to the local mirror I keep at home which grabs everything overnight so that I don't have to wait for it to download; particularly so for large source packages.

> How is that better than running apt-get source against your local mirror, though?
>
> Alternatively, is it really a problem to have your local mirror autogenerate v1 source packages in the same way v1source.qa.d.o presumably would?

Of course, one possibility is to go in the opposite direction: having a v3 source repository that will automatically create v1 (or even v2) packages and upload them to the main archive.

[...]

> (It might be just me, but I'm getting the feeling that implementing WigPen via this v3 format is probably easier than implementing it via the v2 format...)

Could you please explain what the difference between WigPen and the v2 format is? I've seen them as identical so far.

> I might be off my rocker, but I'm not seeing any reason why we couldn't allow uploads of v3 format packages to experimental while blocking them for unstable etc, presuming dpkg somewhere supported them.

Gruesse,
--
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/
Re: [PATCH] proposed v3 source format using .git.tar.gz
Anthony Towns wrote:
> Maybe providing a feature on packages.debian.org (or similar) to download sources in simple, non-VC, tarball format would make this a complete non-issue though?

pristine-tar could be used for this, it would just need source packages to put the delta somewhere standardised (under debian/), and would need some standardised way to get to the upstream source branch in git.

> So the logic there would be:
>
>   if there's an upstream tag, then
>     generate an .orig.tgz
>     if there's a pristine-tar info, hax0r it to be pristine
>     generate a .diff.gz
>     if the .diff failed goto bailout
>     generate a .dsc containing the orig and diff

It's not generally possible to generate a .diff.gz that expresses all the changes that might be in a git repository.

> Repo formats that bzr in etch can unpack could be denoted by
>
>   Source-Depends: dpkg-bzr (>= 0.11)
>
> while repo formats that require bzr from lenny or later could be denoted by:
>
>   Source-Depends: dpkg-bzr (>= 0.18)

I was thinking about Source-Depends too, the main problem seems to be that it would need to be supported in apt-get source too. I wonder if we could just use build-depends.

--
see shy jo
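The export logic quoted in this message, together with the objection that a .diff.gz cannot always express a repository's changes, can be written down as a small decision function. This is a hedged Python sketch: the function name, its boolean inputs and the wording of the steps are hypothetical, and the quoted logic does not say what "bailout" does, so it is modelled here as a final step that simply reports failure.

```python
from typing import List


def plan_v1_export(has_upstream_tag: bool,
                   has_pristine_tar_delta: bool,
                   diff_expressible: bool) -> List[str]:
    """Return the steps for exporting a v1 source package from a repository.

    diff_expressible stands for "the Debian changes can be represented as
    a plain .diff.gz"; binary or mode-only changes would make it False.
    """
    steps: List[str] = []
    if not has_upstream_tag:
        # Assumption: without an upstream tag there is nothing to diff
        # against, so this case is treated the same as a failed diff.
        return ["bailout"]

    steps.append("generate .orig.tar.gz from upstream tag")
    if has_pristine_tar_delta:
        steps.append("apply pristine-tar delta to make the tarball pristine")
    if not diff_expressible:
        steps.append("bailout")
        return steps

    steps.append("generate .diff.gz")
    steps.append("generate .dsc containing orig and diff")
    return steps


if __name__ == "__main__":
    for step in plan_v1_export(True, True, True):
        print(step)
```

The middle case, where the tarball is generated but the diff cannot be, is the one the reply in the message is concerned with.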
Re: [PATCH] proposed v3 source format using .git.tar.gz
Anthony Towns wrote:
> Oh, one question that comes to mind: how does this affect checking for non-free stuff in past revisions? If 3.1-4 had some non-free files that get reimplemented for 3.2-1, do we (a) expect the maintainer to do a no-history upload for 3.2-1; (b) check that this happens somehow; (c) not worry about it as long as it's only in the history; (d) something else?
>
> Verifying that not just the current tree is DFSG-free, but all the history is too seems potentially difficult.

Yes, the faq discusses this problem. This is why shallow repos are IMHO important and non-shallow repos should only be uploaded with caution.

--
see shy jo
Re: [PATCH] proposed v3 source format using .git.tar.gz
Florian Weimer wrote:
> What about empty directories?

Do you mean empty directories under .git, or empty directories stored *in* git (can't be done, use a .gitignore in the directory)?

> I really think you need to work off a clone (or a cleaned-up cp -al'ed copy). For instance, you do not necessarily want to upload the reflog, or unreachable objects. The git configuration stored inside .git is probably uninteresting, too.

I think if you read my code you'll see that I've dealt with these problems (Frank pointed out the reflog issue earlier in this thread), and of course it *does* build from a cleaned, cp'd copy, and run git-gc, and sanitise the .git/config, and...

--
see shy jo
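The clean-up steps named in this message (drop the reflog and unreachable objects, run git-gc, sanitise .git/config) can be sketched in Python as follows. This is not the code from the sourcev3 branch: the function names and the allowlist of config keys are assumptions made for illustration, and the commands are only constructed, not executed.

```python
from typing import List

# Assumption: only these core settings are kept; everything else,
# including all subsections such as [remote "origin"], is dropped.
SAFE_KEYS = {
    ("core", "repositoryformatversion"),
    ("core", "filemode"),
    ("core", "bare"),
}


def cleanup_commands(copy_dir: str) -> List[List[str]]:
    """Git commands that expire the reflog and prune unreachable objects."""
    git = ["git", "-C", copy_dir]
    return [
        git + ["reflog", "expire", "--expire=now", "--all"],
        git + ["gc", "--prune=now"],
    ]


def sanitise_config(text: str) -> str:
    """Return a git config containing only the allowlisted keys."""
    section = ""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            parts = stripped[1:].split("]")[0].split()
            # A header with a subsection has two parts and is never kept.
            section = parts[0].lower() if len(parts) == 1 else ""
            continue
        key = stripped.split("=", 1)[0].strip().lower()
        if (section, key) in SAFE_KEYS:
            kept.append((section, line))

    out: List[str] = []
    current = None
    for sec, line in kept:
        if sec != current:
            out.append(f"[{sec}]")
            current = sec
        out.append(line)
    return "".join(item + "\n" for item in out)


if __name__ == "__main__":
    for cmd in cleanup_commands("/tmp/pkg-copy"):
        print(" ".join(cmd))
```

An allowlist is used rather than a blocklist because a config file from an uploaded package should be treated as untrusted input, and new risky settings are easier to miss than to enumerate.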
Re: [PATCH] proposed v3 source format using .git.tar.gz
Colin Watson wrote:
> FWIW, I was thinking much more of native packages here; non-native packages already tend to just import the upstream tarball which usually contains generated files, which is probably why this hasn't been a problem for things like git-buildpackage. If nothing else, there are several native packages in the d-i tree alone that don't have configure et al in Subversion.

Or these files could be checked into a copy of the repo that is used to build the source package, and not checked into the main vcs. This is not unlike those same packages in d-i shipping the generated files in their .diff.gz, if you look at diff as just another vcs..

--
see shy jo
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 10:05:08AM -0400, Joey Hess wrote:
> Colin Watson wrote:
>> FWIW, I was thinking much more of native packages here; non-native packages already tend to just import the upstream tarball which usually contains generated files, which is probably why this hasn't been a problem for things like git-buildpackage. If nothing else, there are several native packages in the d-i tree alone that don't have configure et al in Subversion.

> Or these files could be checked into a copy of the repo that is used to build the source package, and not checked into the main vcs. This is not unlike those same packages in d-i shipping the generated files in their .diff.gz, if you look at diff as just another vcs..

This is true.

--
Colin Watson [EMAIL PROTECTED]
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 10:18:17PM +1000, Anthony Towns wrote:
> On Sun, Oct 07, 2007 at 08:45:08AM +, Colin Watson wrote:
> > I'm quite attached to being able to peek inside source packages quickly by sshing over to the local mirror I keep at home which grabs everything overnight so that I don't have to wait for it to download; particularly so for large source packages.
> How is that better than running apt-get source against your local mirror, though?

Faster for some cases involving huge packages where I don't want to transfer the whole thing over wireless. Doesn't require complex apt configuration to point to the right package if what I want isn't the current version in the release I'm running. Etc.

> Alternatively, is it really a problem to have your local mirror autogenerate v1 source packages in the same way v1source.qa.d.o presumably would?

I suppose that would be possible (if the code were properly packaged, integrated into debmirror, etc.), though it sounds like a big chunk of resources on my rather underpowered mirror server. (Yes, that's my problem, but I'm sure I'm not the only one.) I also can't see general mirrors like mirrorservice.org doing this kind of highly distro-specific thing, so we'd still lose handy "look at a single file within this package on the web" tools unless we reimplemented them on debian.org systems. Those sorts of things are very useful for big source packages.

> (I have a strong adverse reaction to duplicated information, so shipping the working tree in .git format and .orig.tar.gz format irks me, particularly if it's required)

I do understand this reaction though ...

> > Oh, I was referring more to the buildd base system and archive maintenance code too; dak needs to be updated in order to accept format 3.0 source packages, for instance.
> Well, you'd need an entirely new .dsc to use a v3 source package with an un-updated dak (or launchpad or whatever), that didn't contain the .git.tar.gz (or whatever) elements at all, so I don't personally see a lot of difference between just generating a new .dsc and generating a new .dsc and .tar.gz.

True; I was thinking that a quick hack to permit v3 while still basically just unpacking .tar.gz and .diff.gz would be easier than full support for a derivative distribution that wasn't paying a whole lot of attention, but maybe it doesn't make that much difference.

--
Colin Watson [EMAIL PROTECTED]
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Fri, 5 Oct 2007 19:16:13 -0400, Joey Hess [EMAIL PROTECTED] said:
> I've been working on making dpkg-source support a new source package format based upon git. The idea is that a source package has only a .dsc and a .git.tar.gz, which is just a git repo.
> My implementation adds a new 3.0 version source format. A 3.0 format debian source package can consist of any files allowed by formats 1 and 2, but may also contain .$VCS.tar.gz files.
> To build a version 3 source package, a new field is needed in debian/control:

I do not yet grok git, so could someone tell me what this means in terms of, say, CVS or arch? What does a $CVS.tar.gz file contain when we are using CVS?

manoj
--
We don't like their sound. Groups of guitars are on the way out.
Decca Recording Company, turning down the Beatles, 1962
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 10:05:32AM -0500, Manoj Srivastava wrote:
> On Fri, 5 Oct 2007 19:16:13 -0400, Joey Hess [EMAIL PROTECTED] said:
> > My implementation adds a new 3.0 version source format. A 3.0 format debian source package can consist of any files allowed by formats 1 and 2, but may also contain .$VCS.tar.gz files.
> > To build a version 3 source package, a new field is needed in debian/control:
> I do not yet grok git, so could someone tell me what this means in terms of, say, CVS or arch? What does a $CVS.tar.gz file contain when we are using CVS?

For CVS it would need to contain the repository (i.e. all the RCS files); for arch I don't know enough about it to say.

Gruesse,
--
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, 07 Oct 2007, Frank Lichtenheld wrote:
> > (It might be just me, but I'm getting the feeling that implementing WigPen via this v3 format is probably easier than implementing it via the v2 format...)
> Could you please explain what the difference between WigPen and v2 format is? I've seen them as identities so far.

I don't know either. But I'd like to chime in with a few words.

I like Joey's idea and I'd also like to improve our source packages. I think we need to step back a bit and maybe try to come up with a more generic design encompassing wigpen and Joey's work. But it's not as easy as it seems because we have many different requirements, as shown by Colin and others. Furthermore, the data flow is considerably different when we integrate a VCS into the picture. I'm not even sure that we should really call v3 a 'source package'.

The goals of wigpen were IIRC:
1/ support of other compression mechanisms
2/ support of multiple tarballs (glibc case)
3/ automatic support of debian/patches

(1) should be a no-brainer.
(2) is not clear: what would multiple tarballs mean with a VCS repository?
(3) patches are auto-applied at source extraction time. In a VCS, what does that mean? In Joey's work, all Debian changesets are in the master branch, which is auto-extracted if I understand correctly (I haven't read the code, only the discussion here). What about cases where multiple branches are stored? (One for upstream, one for Debian.)

Also, it seems important to keep the possibility of always generating a plain source package from any VCS-based source package. But we might need some information to be able to do that properly, exactly like we need new information if we ever want to support generation of v2 source packages. Is there some ground to create something common for those two use cases?

(Sorry, everything is still a bit blurry in my mind, and while I was preparing myself to maybe hack on wigpen as my next dpkg-related project, this discussion took me by surprise :-))

Cheers,
--
Raphaël Hertzog
Premier livre français sur Debian GNU/Linux :
http://www.ouaza.com/livre/admin-debian/
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 05:25:00PM +0200, Raphael Hertzog wrote:
> (Sorry, everything is still a bit blur in my mind and while I was preparing myself to maybe hack on wigpen as my next dpkg related project, this discussion took me by surprise :-))

Btw, if someone has too much free time and doesn't mind writing documentation, a deb-source.5 (or dsc.5) manpage similar to what we have for binary packages in deb.5 would be great stuff. Especially if it would document both V1 and V2.

Gruesse,
--
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 10:05:32AM -0500, Manoj Srivastava wrote:
> On Fri, 5 Oct 2007 19:16:13 -0400, Joey Hess [EMAIL PROTECTED] said:
> > I've been working on making dpkg-source support a new source package format based upon git. The idea is that a source package has only a .dsc and a .git.tar.gz, which is just a git repo.
> > My implementation adds a new 3.0 version source format. A 3.0 format debian source package can consist of any files allowed by formats 1 and 2, but may also contain .$VCS.tar.gz files.
> > To build a version 3 source package, a new field is needed in debian/control:
> I do not yet grok git, so could someone tell me what this means in terms of, say, CVS or arch? What does a $CVS.tar.gz file contain when we are using CVS?

I think this only really makes sense for distributed revision control systems.

For arch, the .arch.tar.gz would contain the {arch} directory, perhaps with a few adjustments similar to those being made in the git and bzr modules.

--
Colin Watson [EMAIL PROTECTED]
Re: [PATCH] proposed v3 source format using .git.tar.gz
Hi,

OK, commenting on this with my "I use arch" hat on. If I understand correctly, we are proposing shipping a working directory in the .deb; and not shipping an orig.tar.gz nor a diff.gz file. I like the idea; and I think I can support nested arch packages (submodules in .git speak), based on the examples I have seen of joey's patch and Colin's for bzr -- I just need some more information about what exactly some of these git commands do.

sub prep_tar:
    make sure we have an ./{arch} directory.
    Look for nested submodules:
        $tree_root=$($TLA tree-root);
        @nested=`$TLA inventory -t --nested $tree_root`;
    ** Why are we checking for uncommitted files here? I would think that people would have done an export to actually build packages **
    for each tree_root and nested; do
        run $TLA CHANGES
        map { $list{${NESTED_PATH}/$_} = 1; } join ,, `$TLA inventory -s`
    done
    For all files in exclude list, go and set values in %list to 0 (or delete the key)
    ** I have no idea what the prune and shallow copy commands do, or the arch equivalent **

sub post_unpack_tar:
    make sure we have an ./{arch} directory.
    Look for nested submodules:
        $tree_root=$($TLA tree-root);
        @nested=`$TLA inventory -t --nested $tree_root`;
    ** arch hooks are per user, not per repo -- iirc **
    ** what does git-config do? or bzr checkout? **

Actually, at this point I am beginning to question my understanding of the proposal. If we are shipping a working tree, what is this step doing? Is this an svn update equivalent?

manoj
--
If a computer can't directly address all the RAM you can use, it's just a toy.
anonymous comp.sys.amiga posting, non-sequitur
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
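As a rough illustration of the file-list step in the prep_tar pseudocode above (inventory every tree root, prefix nested paths, then drop anything on the exclude list), here is a minimal Python sketch. The function name and the dictionary input are assumptions made for illustration only; it does not call tla and is not taken from any patch in this thread.

```python
import posixpath


def build_file_list(inventories, excluded):
    """Combine per-tree inventories into one sorted list of paths to pack.

    inventories: hypothetical mapping of tree path (relative to the top-level
                 tree root, "." for the root itself) to the file names that
                 tree's inventory reported.
    excluded:    iterable of paths that must not end up in the tarball.
    """
    listed = set()
    for nested_path, files in inventories.items():
        for name in files:
            # Prefix each file with the path of the tree it came from.
            listed.add(posixpath.normpath(posixpath.join(nested_path, name)))
    # Equivalent to deleting the excluded keys from %list in the pseudocode.
    listed -= {posixpath.normpath(path) for path in excluded}
    return sorted(listed)


if __name__ == "__main__":
    print(build_file_list({".": ["Makefile", "ucf"], "sub": ["a.c"]}, ["ucf"]))
```

A set is used instead of the hash of 0/1 flags because only membership matters once the exclude list has been applied.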
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, 7 Oct 2007 09:54:39 -0400, Joey Hess [EMAIL PROTECTED] said:
> Anthony Towns wrote:
> > Oh, one question that comes to mind: how does this affect checking for non-free stuff in past revisions? If 3.1-4 had some non-free files that get reimplemented for 3.2-1, do we (a) expect the maintainer to do a no-history upload for 3.2-1; (b) check that this happens somehow; (c) not worry about it as long as it's only in the history; (d) something else? Verifying that not just the current tree is DFSG-free, but all the history is too seems potentially difficult.
> Yes, the faq discusses this problem. This is why shallow repos are IMHO important and non-shallow repos should only be uploaded with caution.

What does this mean in non-git context?

manoj
--
Don't get even -- get odd!
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, 7 Oct 2007 15:44:47 +, Colin Watson [EMAIL PROTECTED] said:
> On Sun, Oct 07, 2007 at 10:05:32AM -0500, Manoj Srivastava wrote:
> > On Fri, 5 Oct 2007 19:16:13 -0400, Joey Hess [EMAIL PROTECTED] said:
> > > I've been working on making dpkg-source support a new source package format based upon git. The idea is that a source package has only a .dsc and a .git.tar.gz, which is just a git repo.
> > > My implementation adds a new 3.0 version source format. A 3.0 format debian source package can consist of any files allowed by formats 1 and 2, but may also contain .$VCS.tar.gz files.
> > > To build a version 3 source package, a new field is needed in debian/control:
> > I do not yet grok git, so could someone tell me what this means in terms of, say, CVS or arch? What does a $CVS.tar.gz file contain when we are using CVS?
> I think this only really makes sense for distributed revision control systems.
> For arch, the .arch.tar.gz would contain the {arch} directory, perhaps with a few adjustments similar to those being made in the git and bzr modules.

Hmm. If I have just the ./{arch} directory, and none of the files, then arch thinks the files have just been deleted; and you can't just check out stuff, since the tree is up to date. Ah. Baz undo restores all the files, cool.

The problem here is that the repository in question _has_ to be registered by the user running this; so all the users would have to register the arch repository in question before unpacking the source tarball in order to tell baz/tla how to get access to the repo. Is this going to be an issue?

I would prefer to instead ship a grab file for arch instead of the {arch} directory, since the latter really buys us nothing over the grab file (since we are requiring the distributed source dir and network access to unpack source packages).

Consider this grab file:
--8<---cut here---start--->8---
Archive-Name: [EMAIL PROTECTED]
Archive-Location: http://arch.debian.org/arch/private/srivasta
Target-Revision: packages--debian--1.0
Target-Directory: manoj-packages
Target-Config: configs/ucf/debian/ucf-3.003
--8<---cut here---end--->8---

    tla register-archive --present-ok $values-of-Archive-Location-field
    tla grab path/to/the/grab-file
    cd $value-of-field-Target-Directory/package-name/*
    (room for standardization here)

manoj

--8<---cut here---start--->8---
__ baz status
* looking for [EMAIL PROTECTED]/ucf--devel--3.0--patch-1 to compare with
* comparing to [EMAIL PROTECTED]/ucf--devel--3.0--patch-1
D .arch-ids
D examples
D examples/.arch-ids
D t
D t/.arch-ids
D .arch-ids/COPYING.id
D .arch-ids/ChangeLog.id
D .arch-ids/Makefile.id
D .arch-ids/lcf.1.id
D .arch-ids/lcf.id
D .arch-ids/ucf.1.id
D .arch-ids/ucf.conf.5.id
D .arch-ids/ucf.conf.id
D .arch-ids/ucf.id
D .arch-ids/ucfq.1.id
D .arch-ids/ucfq.id
D .arch-ids/ucfr.1.id
D .arch-ids/ucfr.id
D COPYING
D ChangeLog
D Makefile
D examples/.arch-ids/=id
D examples/.arch-ids/ChangeLog.id
D examples/.arch-ids/postinst.id
D examples/.arch-ids/postrm.id
D examples/ChangeLog
D examples/postinst
D examples/postrm
D lcf
D lcf.1
D t/.arch-ids/=id
D ucf
D ucf.1
D ucf.conf
D ucf.conf.5
D ucfq
D ucfq.1
D ucfr
D ucfr.1
__ baz update
* tree is already up to date
--8<---cut here---end--->8---

--
Time is money and money can't buy you love and I love your outfit
T.H.U.N.D.E.R. #1
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
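To make the grab-file idea in the message above concrete, here is a small Python sketch that reads the "Field: value" lines of such a file and assembles the two tla invocations Manoj lists. The helper names (parse_grab_file, unpack_commands) are hypothetical, the tla arguments are copied from the message rather than checked against tla itself, and nothing is executed.

```python
def parse_grab_file(text):
    """Parse 'Field: value' lines into a dict (hypothetical helper)."""
    fields = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and anything that is not a field
        # Split on the first colon only, so URLs in the value stay intact.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def unpack_commands(fields, grab_path):
    """Return the commands from the message as argument lists; not run here."""
    return [
        ["tla", "register-archive", "--present-ok", fields["Archive-Location"]],
        ["tla", "grab", grab_path],
    ]


GRAB = """\
Archive-Name: [EMAIL PROTECTED]
Archive-Location: http://arch.debian.org/arch/private/srivasta
Target-Revision: packages--debian--1.0
Target-Directory: manoj-packages
Target-Config: configs/ucf/debian/ucf-3.003
"""

if __name__ == "__main__":
    parsed = parse_grab_file(GRAB)
    for command in unpack_commands(parsed, "path/to/the/grab-file"):
        print(" ".join(command))
```

Returning argument lists instead of running them keeps the sketch usable on a machine without tla, and leaves the question of whether automatic registration is safe to the caller.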
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 10:52:45AM -0500, Manoj Srivastava wrote:
> What does this mean in non-git context?

I think truncating the patch-log history is unimportant for Arch, but any ++pristine-trees should definitely be nuked prior to packing.
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 11:10:41AM -0500, Manoj Srivastava wrote:
> Hmm. If I have just the ./{arch} directory, and none of the files, then arch thinks the files have just been deleted; and you can't just check out stuff, since the tree is up to date. Ah. Baz undo restores all the files, cool.

I presume you could ship all the normal files in one tarball, the .arch-ids and {arch} directories in another, and the debian/ directory in a third.

That would give the NMUer a full working tree to run $TLA diff in. Only shipping a grab file would burden the end user with a need for http access and no guarantee that the source will be available.

> The problem here is that the repository in question _has_ to be registered by the user running this; so all the users would have to register the arch repository in question before unpacking the source tarball in order to tell baz/tla how to get access to the repo. Is this going to be an issue?

It shouldn't be too difficult to add an --autoregister switch to tla grab, though I don't know how safe it'd be.
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 10:49:48AM -0500, Manoj Srivastava wrote:
> OK, commenting on this with my "I use arch" hat on. If I understand correctly, we are proposing shipping a working directory in the .deb; and not shipping an orig.tar.gz nor a diff.gz file. I like

You probably mean source package here and not .deb. Also the original proposal just means shipping the repository data, since most DVCS can easily create a working directory from that.

> the idea; and I think I can support nested arch packages (submodules in .git speak), based on the examples I have seen of joey's patch and Colin's for bzr -- I just need some more information about what exactly some of these git commands do.
>
> sub prep_tar:
>     make sure we have an ./{arch} directory.
>     Look for nested submodules:
>         $tree_root=$($TLA tree-root);
>         @nested=`$TLA inventory -t --nested $tree_root`;
>     ** Why are we checking for uncommitted files here? I would think that people would have done an export to actually build packages **

The whole idea of the proposal is to NOT create an export.

>     for each tree_root and nested; do
>         run $TLA CHANGES
>         map { $list{${NESTED_PATH}/$_} = 1; } join ,, `$TLA inventory -s`
>     done
>     For all files in exclude list, go and set values in %list to 0 (or delete the key)
>     ** I have no idea what the prune and shallow copy commands do, or the arch equivalent **

git gc --prune deletes old data that isn't needed anymore. This is needed since all other git commands never change or overwrite data (file data, that is; this is obviously not true for meta data), they only add some.

> sub post_unpack_tar:
>     make sure we have an ./{arch} directory.
>     Look for nested submodules:
>         $tree_root=$($TLA tree-root);
>         @nested=`$TLA inventory -t --nested $tree_root`;
>     ** arch hooks are per user, not per repo -- iirc **
>     ** what does git-config do? or bzr checkout? **

git-config is just a CLI interface to the .git/config file. Since we only ship the repository we need to create the working tree. This is what git/bzr checkout do.

> Actually, at this point I am beginning to question my understanding of the proposal. If we are shipping a working tree, what is this step doing? Is this an svn update equivalent?

No, that would be git fetch/pull (and probably something similarly named in bzr).

Gruesse,
--
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/
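The pack and unpack steps Frank describes (prune objects that are no longer needed, ship only the repository, recreate the working tree after unpacking) might look roughly like the following git and tar invocations, written here as Python argument lists. This is a sketch under assumptions, not what the sourcev3 branch actually runs: the function names and the tarball name are hypothetical, `git gc --prune=now` is the present-day spelling of the prune step, and `git reset --hard` is only one of several ways to repopulate a working tree from a shipped .git directory.

```python
def pack_steps(clean_copy, tarball="pkg.git.tar.gz"):
    """Commands to shrink a cleaned copy of the repo and archive its .git dir.

    clean_copy is assumed to be a throwaway copy of the repository, so that
    pruning never touches the maintainer's own working repository.
    """
    return [
        ["git", "-C", clean_copy, "gc", "--prune=now"],
        ["tar", "-czf", tarball, "-C", clean_copy, ".git"],
    ]


def unpack_steps(tarball, target_dir):
    """Commands to extract the repository and recreate the working tree."""
    return [
        ["tar", "-xzf", tarball, "-C", target_dir],
        # Only .git was shipped; this writes out the files of HEAD.
        ["git", "-C", target_dir, "reset", "--hard", "HEAD"],
    ]


if __name__ == "__main__":
    for step in pack_steps("/tmp/pkg-clean") + unpack_steps("pkg.git.tar.gz", "/tmp/pkg-1.0"):
        print(" ".join(step))
```

Note that neither list contacts a remote: everything needed to rebuild the working tree travels inside the tarball, which is the point Frank makes about shipping only repository data.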
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, 7 Oct 2007 20:33:58 +0200, Frank Lichtenheld [EMAIL PROTECTED] said:
> On Sun, Oct 07, 2007 at 10:49:48AM -0500, Manoj Srivastava wrote:
> > OK, commenting on this with my "I use arch" hat on. If I understand correctly, we are proposing shipping a working directory in the .deb; and not shipping an orig.tar.gz nor a diff.gz file. I like
> You probably mean source package here and not .deb. Also the original proposal just means shipping the repository data, since most DVCS can easily create a working directory from that.

Hmm. The repository data, as far as I can tell, means the name of the archive, and the location. Do you really mean we are not shipping any, say, foo.c file in the sources, just a location where you can get the foo.c file from, at a particular version?

> The whole idea of the proposal is to NOT create an export.

If we are not creating an export, and we are only shipping the repository data, how come there needs to be a check for uncommitted files? If the changes are uncommitted, that means the repo does not know about them; and if we only ship the repository data, we are not shipping stuff not in the repo. What am I missing?

> > for each tree_root and nested; do
> >     run $TLA CHANGES
> >     map { $list{${NESTED_PATH}/$_} = 1; } join ,, `$TLA inventory -s`
> > done
> > For all files in exclude list, go and set values in %list to 0 (or delete the key)
> > ** I have no idea what the prune and shallow copy commands do, or the arch equivalent **
> git gc --prune deletes old data that isn't needed anymore. This is needed since all other git commands never change or overwrite data (file data, this is obviously not true for meta data), they only add some.

I am unsure what this means in terms of arch.

> > sub post_unpack_tar:
> >     make sure we have an ./{arch} directory.
> >     Look for nested submodules:
> >         $tree_root=$($TLA tree-root);
> >         @nested=`$TLA inventory -t --nested $tree_root`;
> >     ** arch hooks are per user, not per repo -- iirc **
> >     ** what does git-config do? or bzr checkout? **
> git-config is just a CLI interface to the .git/config file. Since we only ship the repository we need to create the working tree. This is what git/bzr checkout do.

Well, I do not see how this is done in arch. If you are not shipping the working tree, all I can see shipping for arch is the URI of the repo. I am pretty sure this is not what you mean, since then any arch based source would be three lines or so, and would need network access to unpack the source tree.

> > Actually, at this point I am beginning to question my understanding of the proposal. If we are shipping a working tree, what is this step doing? Is this an svn update equivalent?
> No, that would be git fetch/pull (and probably something similar named in bzr)

I don't think I know what this means when you are using arch.

manoj
--
Earn cash in your spare time -- blackmail your friends.
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, 7 Oct 2007 12:24:46 -0400, Clint Adams [EMAIL PROTECTED] said:
> On Sun, Oct 07, 2007 at 11:10:41AM -0500, Manoj Srivastava wrote:
> > Hmm. If I have just the ./{arch} directory, and none of the files, then arch thinks the files have just been deleted; and you can't just check out stuff, since the tree is up to date. Ah. Baz undo restores all the files, cool.
> I presume you could ship all the normal files in one tarball, the .arch-ids and {arch} directories in another, and the debian/ directory in a third.

Err, and why am I doing this? Why am I not shipping my working directory as a tarball, complete, instead of breaking it up (apparently arbitrarily) into three parts?

> That would give the NMUer a full working tree to run $TLA diff in. Only shipping a grab file would burden the end user with a need for http access and no guarantee that the source will be available.

How is git reconstituting the files if there is no network access? Are they shipping all the bits needed to get a full working dir without any network access?

> > The problem here is that the repository in question _has_ to be registered by the user running this; so all the users would have to register the arch repository in question before unpacking the source tarball in order to tell baz/tla how to get access to the repo. Is this going to be an issue?
> It shouldn't be too difficult to add an --autoregister switch to tla grab, though I don't know how safe it'd be.

Caveat emptor, I think, given that some repository access seems to be required for unpacking a version 3 source package. This is not something I would do in an un-constrained environment.

manoj
--
It is impossible to make anything foolproof, because fools are so ingenious.
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, 7 Oct 2007 12:14:39 -0400, Clint Adams [EMAIL PROTECTED] said:
> On Sun, Oct 07, 2007 at 10:52:45AM -0500, Manoj Srivastava wrote:
> > What does this mean in non-git context?
> I think truncating the patch-log history is unimportant for Arch, but any ++pristine-trees should definitely be nuked prior to packing.

OK, that's fair. I use revision libs, so I never have pristine trees in my working dir anyway.

manoj
--
Linux is obsolete (Andrew Tanenbaum)
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: [PATCH] proposed v3 source format using .git.tar.gz
Manoj Srivastava [EMAIL PROTECTED] writes:
> How is git reconstituting the files if there is no network access? Are they shipping all the bits needed to get a full working dir without any network access?

As I understand it, yes, that's the basic idea.

--
Russ Allbery ([EMAIL PROTECTED]) http://www.eyrie.org/~eagle/
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 02:19:36PM -0500, Manoj Srivastava wrote:
> Err, and why am I doing this? Why am I not shipping my working directory as a tarball, complete instead of breaking it up (apparently arbitrarily) into three parts?

As opposed to an .orig.tar.gz and all the debian/, {arch}/, and .arch-ids/ components in the .diff.gz?

> How is git reconstituting the files if there is no network access? Are they shipping all the bits needed to get a full working dir without any network access?

Yes. The .git/ (or .bzr/) directory contains the entire (or abridged, in the case of these shallow clones) history so you can check out any of the covered revisions.

This would be akin to you including a cachedrev of an arbitrary version followed by all the subsequent patches.tar.gz files, except that I believe git et al. are meant to be more space-efficient.
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 02:16:12PM -0500, Manoj Srivastava wrote:
> On Sun, 7 Oct 2007 20:33:58 +0200, Frank Lichtenheld [EMAIL PROTECTED] said:
> > On Sun, Oct 07, 2007 at 10:49:48AM -0500, Manoj Srivastava wrote:
> > You probably mean source package here and not .deb. Also the original proposal just means shipping the repository data, since most DVCS can easily create a working directory from that.
> Hmm. The repository data, as far as I can tell, means the name of the archive, and the location. Do you really mean we are not shipping any, say, foo.c file in the sources, just a location where you can get the foo.c file from, at a particular version?

bzr and git always ship the complete repository with each working directory. This is why they are called distributed. Arch seems to be some weird thing in between truly central and truly distributed VCS.

> > The whole idea of the proposal is to NOT create an export.
> If we are not creating an export, and we are only shipping the repository data, how come there needs to be a check for uncommitted files? If the changes are uncommitted, that means the repo does not know about it; and if we only ship the repository data, we are not shipping stuff not in the repo. What am I missing?

They might be uncommitted because the maintainer forgot to commit them. The only question is whether we should abort, commit the changes, or ignore the changes. There is no technical problem with any of these cases.

Gruesse,
--
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/
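The three choices Frank names for a tree with uncommitted changes (abort, commit, or ignore) can be sketched as a small policy function. Everything here is hypothetical: the enum, the function name and the returned action strings are invented for illustration and are not part of dpkg-source or of any patch in this thread.

```python
from enum import Enum


class DirtyPolicy(Enum):
    ABORT = "abort"    # refuse to build until the maintainer resolves it
    COMMIT = "commit"  # commit the changes, then pack the repository
    IGNORE = "ignore"  # pack the repository as committed, leave changes out


def plan_for_dirty_tree(uncommitted, policy):
    """Return (action, files_to_commit) for the given list of changed files."""
    if not uncommitted:
        return ("pack", [])
    if policy is DirtyPolicy.ABORT:
        raise RuntimeError(
            "uncommitted changes: " + ", ".join(sorted(uncommitted))
        )
    if policy is DirtyPolicy.COMMIT:
        return ("commit-then-pack", sorted(uncommitted))
    # IGNORE: the repository is packed as-is, so the changes are not shipped.
    return ("pack", [])


if __name__ == "__main__":
    print(plan_for_dirty_tree(["debian/rules"], DirtyPolicy.COMMIT))
```

Keeping the decision separate from the VCS calls means the same policy could sit in front of a git, bzr or arch backend, each of which would have its own way of listing and committing changes.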
Re: [PATCH] proposed v3 source format using .git.tar.gz
Hi,

On Sun, 7 Oct 2007 15:49:55 -0400, Clint Adams [EMAIL PROTECTED] said:
> On Sun, Oct 07, 2007 at 02:19:36PM -0500, Manoj Srivastava wrote:
> > On Sun, 7 Oct 2007 12:24:46 -0400, Clint Adams [EMAIL PROTECTED] said:
> > > I presume you could ship all the normal files in one tarball, the .arch-ids and {arch} directories in another, and the debian/ directory in a third.
> > Err, and why am I doing this? Why am I not shipping my working directory as a tarball, complete instead of breaking it up (apparently arbitrarily) into three parts?
> As opposed to an .orig.tar.gz and all the debian/, {arch}/, and .arch-ids/ components in the .diff.gz ?

Umm, I was asking about why the normal and the arch-ids and {arch} directories are being separated, and the ./debian dir as well. The idea of the wig pen was so that we no longer used diff as a version control system, or were able to use more than one tar ball for the source.

How is this working in this proposal? I do not ship the orig.tar.gz file, but I ship an orig.arch.tar.gz file with the upstream branch? Then I mostly duplicate this by shipping a working dir, and also somehow ship a delta that recreates the orig.tar.gz file from the upstream branch I am shipping?

> > How is git reconstituting the files if there is no network access? Are they shipping all the bits needed to get a full working dir without any network access?
> Yes. the .git/ (or .bzr/ ) directory contains the entire (or abridged in the case of these shallow clones) history so you can check out any of the covered revisions.

A history as in RCS-like history, with patches, as opposed to the patch-log that is what the {arch} directories contain?

> This would be akin to you including a cachedrev of an arbitrary version followed by all the subsequent patches.tar.gz files, except that I believe git et al. are meant to be more space-efficient.

wow. gulp.

OK, so for arch I suppose I just ship a working dir, period, and people need network access to get the older versions, unless people want terabytes of the archive in every source version.

manoj
--
Mind your own business, Mr. Spock. I'm sick of your halfbreed interference.
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, 7 Oct 2007 22:04:21 +0200, Frank Lichtenheld [EMAIL PROTECTED] said: On Sun, Oct 07, 2007 at 02:16:12PM -0500, Manoj Srivastava wrote: On Sun, 7 Oct 2007 20:33:58 +0200, Frank Lichtenheld [EMAIL PROTECTED] said: On Sun, Oct 07, 2007 at 10:49:48AM -0500, Manoj Srivastava wrote: You probably mean source package here and not .deb. Also the original proposal just means shipping the repository data, since most DVCS can easily create a working directory from that. Hmm. The repository data, as far as I can tell, means the name of the archive, and the location. Do you really mean we are not shipping any, say, foo.c file in the sources, just a locatio where you can get the foo.c file from, at a particular version? bzr and git always ship the complete repository with each working directory. This is why they are called distributed. Arch seems to be some weird thing in between truly central and truly distributed VCS. I am not sure I see this. Arch repositories are distributed, and you can pull, branch, and tag off any repository out there in the meta-verse. But every directory also has a semi permanent URI; and checking pout a branch locally does not end up with you downloading the terabytes of stuff in the repo out there. This might be because you can have more than one project in a repo; my repo contains CVS emacs, unicode emacs, as well as most of the SELinux packages, etc, and I mirror partially to arch.d.o. I would hate to see all of emacs in the local dir of people who just want to check out devotee. So arch does have a different mechanism of doing distributed repositories; but the repositories are distributed in the sense that I control one repo, but branches in my repo are children of other repositories, and can be merged and tagged back and from, The whole idea of the proposal is to NOT create an export. If we are not creating and export, and we are only shipping the repository data, how come there needs to be a check for uncommitted files? 
If the changes are uncommitted, that means the repo does not know about it; and if we only ship the repository data, we are not shipping stuff not in the repo. What am I missing? They might be uncommitted because the maintainer forgot to commit them. The only question is whether we should abort, commit the changes, or ignore the changes. There is no technical problem with either of these cases. Well, as a developer, I would rather that someone else running dpkg source on a package not try to commit to my repo, since it shall fail. Assuming we consider trying to support arch-like distributed version control systems in the new dpkg; it might well be that the current approach is too focussed on git/bzr type version control to work well with arch. manoj -- DEATH: The penultimate commercial transaction finalized by probate. Bernard Rosenberg Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/ 1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C -- To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
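The three options named above (abort, commit, or ignore) can be written down as a small policy function. This is only an illustrative sketch in Python, not code from the patch: the names `DirtyPolicy`, `uncommitted_paths` and `plan_for_dirty_tree` are hypothetical, and it assumes the list of changed files comes from the output of a modern `git status --porcelain`.

```python
from enum import Enum


class DirtyPolicy(Enum):
    """Hypothetical names for the three behaviours discussed in the thread."""
    ABORT = "abort"
    COMMIT = "commit"
    IGNORE = "ignore"


def uncommitted_paths(porcelain_output):
    """Extract paths from text in `git status --porcelain` format.

    Each line is two status characters, a space, then the path.
    Renames ("old -> new") are not split here; this is a sketch.
    """
    paths = []
    for line in porcelain_output.splitlines():
        if len(line) > 3:
            paths.append(line[3:])
    return paths


def plan_for_dirty_tree(paths, policy):
    """Decide what a source build should do given the uncommitted paths."""
    if not paths:
        return "build"
    if policy is DirtyPolicy.ABORT:
        raise RuntimeError("uncommitted changes: " + ", ".join(paths))
    if policy is DirtyPolicy.COMMIT:
        return "commit-then-build"
    return "build-without-changes"
```

Note that the COMMIT branch is exactly the case objected to above: someone other than the maintainer running it would be committing to a repository they may not be able to write to.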
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 06:24:15PM -0500, Manoj Srivastava wrote: On Sun, 7 Oct 2007 22:04:21 +0200, Frank Lichtenheld [EMAIL PROTECTED] said: bzr and git always ship the complete repository with each working directory. This is why they are called distributed. Arch seems to be some weird thing in between truly central and truly distributed VCS. I am not sure I see this. Arch repositories are distributed, and you can pull, branch, and tag off any repository out there in the meta-verse. But every directory also has a semi permanent URI; and checking out a branch locally does not end up with you downloading the terabytes of stuff in the repo out there. Let's not exaggerate. At least for git the repository will usually be smaller or only a little larger than the working directory. It will probably compress worse though. This might be because you can have more than one project in a repo; my repo contains CVS emacs, unicode emacs, as well as most of the SELinux packages, etc, and I mirror partially to arch.d.o. I would hate to see all of emacs in the local dir of people who just want to check out devotee. So arch does have a different mechanism of doing distributed repositories; but the repositories are distributed in the sense that I control one repo, but branches in my repo are children of other repositories, and can be merged and tagged back and forth, Out of interest, which of the following actions would need remote access? log view (including diffs between revisions) annotation/blame view creating a new commit/revision/tag reverting a dirty working tree to a clean one For git/bzr, the answer is usually no to all of these. If you have a shallow copy in git, the answers to the first two become yes, since you will need to convert it to a full copy first. [...] Assuming we consider trying to support arch-like distributed version control systems in the new dpkg; it might well be that the current approach is too focussed on git/bzr type version control to work well with arch.
It most probably is. Gruesse, -- Frank Lichtenheld [EMAIL PROTECTED] www: http://www.djpig.de/
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Mon, 8 Oct 2007 02:55:37 +0200, Frank Lichtenheld [EMAIL PROTECTED] said: On Sun, Oct 07, 2007 at 06:24:15PM -0500, Manoj Srivastava wrote: On Sun, 7 Oct 2007 22:04:21 +0200, Frank Lichtenheld [EMAIL PROTECTED] said: bzr and git always ship the complete repository with each working directory. This is why they are called distributed. Arch seems to be some weird thing in between truly central and truly distributed VCS. I am not sure I see this. Arch repositories are distributed, and you can pull, branch, and tag off any repository out there in the meta-verse. But every directory also has a semi permanent URI; and checking out a branch locally does not end up with you downloading the terabytes of stuff in the repo out there. Lets not exagerate. At least for git the repository will usually be smaller or only little larger than the working directory. It will probably compress worse though. How is this magic done? If I have several dozen feature branches, all feeding back and forth, and have made lots and lots of changes in my sources, how does git preserve all this information without a commensurate increase in size? This makes the information theory geek in me very very skeptical. Or are you talking about typical usage, and is that why people go around making shallow copies to cut down on the size of the shipped repo? This might be because you can have more than one project in a repo; my repo contains CVS emacs, unicode emacs, as well as most of the SELinux packages, etc, and I mirror partially to arch.d.o. I would hate to see all of emacs in the local dir of people who just want to check out devotee. So arch does have a different mechanism of doing distributed repositories; but the repositories are distributed in the sense that I control one repo, but branches in my repo are children of other repositories, and can be merged and tagged back and from, Out of interest, which of the following actions would need remote access? 
log view (including diffs between revisions) The ./{arch} directory does contain logs. Diffs between revisions require access to the repository (or the local cache library, if that contains the revision we want to diff with or from) annotation/blame view Same thing; you need access to the repo since the code for the other revisions is not in the checked out directory. creating a new commit/revision/tag Committing it would require access to the repo. reverting a dirty working tree to a clean one I think you are talking about reverting local changes to the latest revision from the repository. Well, that needs access to the repo or a local cache. For git/bzr, the answer is usually no to all of these. If you have a shallow copy in git, the answers to the first two become yes, since you will need to convert it to a full copy first. For arch, the answer is yes to all these cases. [...] Assuming we consider trying to support arch-like distributed version control systems in the new dpkg; it might well be that the current approach is too focussed on git/bzr type version control to work well with arch. It most probably is. As far as I can tell, most of the things being done for git are not required if I ship a working directory for arch ({arch} and .arch-ids); and the only other thing required would be to also ship what lives in the grab file in the control file; so people can know where to register the archive location from to get access to the other information. If people wanted to provide changes, all that is needed is for them to tag the developers branch, hack, and ask the developers to pull from their branch (people have done that for ucf and devotee in the past). What exactly is the goal of this dpkg addition? With arch, I can ship a full working copy; and as long as people have the repository registered, they have full access to older revisions and feature branches and all. Would shipping the full working dir get by the requirement of shipping the diff.gz?
If so, we can support arch with no changes to dpkg whatsoever. manoj -- You never hesitate to tackle the most difficult problems. Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/ 1024D/BF24424C print 4966 F272 D093 B493 410B 924B 21BA DABB BF24 424C
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sun, Oct 07, 2007 at 09:45:20AM -0400, Joey Hess wrote: Anthony Towns wrote: So the logic there would be: if there's an upstream tag, then generate an .orig.tgz if there's a pristine-tar info, hax0r it to be pristine generate a .diff.gz if the .diff failed goto bailout generate a .dsc containing the orig and diff It's not generally possible to generate a .diff.gz that expresses all the changes that might be in a git repository. Right, but it is possible to detect that, and bailout to generating a .tar.gz, no? Repo formats that bzr in etch can unpack could be denoted by Source-Depends: dpkg-bzr (= 0.11) I was thinking about Source-Depends too, the main problem seems to be that it would need to be supported in apt-get source too. I wonder if we could just use build-depends. apt-get source support could just be a warning This package cannot be unpacked without installed. Using Build-Depends: would make it pretty complicated to know which bits were needed for unpacking, if that's all you're trying to do. Cheers, aj
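The "try orig plus diff, otherwise bail out to a tarball" logic above might look roughly like the following. This is a sketch under assumptions, not anything from the patch: the function name `plan_source_files` is hypothetical, the bailout target is assumed to be the .git.tar.gz from the original proposal, and whether the diff can express all changes is taken as an input rather than computed. Epochs in version numbers are ignored.

```python
def plan_source_files(name, version, has_upstream_tag, diff_ok):
    """Return the file names a build would emit under the fallback scheme.

    has_upstream_tag: an upstream tag exists, so an .orig tarball can be made.
    diff_ok: a plain .diff.gz can express every change since that tag.
    """
    # Upstream version is everything before the last "-" (Debian revision).
    upstream = version.rsplit("-", 1)[0]
    if has_upstream_tag and diff_ok:
        return [
            f"{name}_{upstream}.orig.tar.gz",
            f"{name}_{version}.diff.gz",
            f"{name}_{version}.dsc",
        ]
    # Bailout: ship the repository tarball instead (assumed target).
    return [f"{name}_{version}.git.tar.gz", f"{name}_{version}.dsc"]
```

For example, `plan_source_files("foo", "1.0-2", True, False)` falls through to the repository tarball because the diff step failed.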
Re: [PATCH] proposed v3 source format using .git.tar.gz
Here's an updated patch, full diff from head again, with:

- use git-config --null
- git-config --filename only needs a full path if not run from a git WC
- import the VCS module so it can check if the VCS is available
- fix all commands that spawn a subshell
- delete the reflog

-- see shy jo

diff --git a/debian/dpkg-dev.install b/debian/dpkg-dev.install
index 49e3835..ee65dbf 100644
--- a/debian/dpkg-dev.install
+++ b/debian/dpkg-dev.install
@@ -56,3 +56,4 @@ usr/share/man/*/dpkg-shlibdeps.1
 usr/share/man/*/*/dpkg-source.1
 usr/share/man/*/dpkg-source.1
 usr/share/perl5/Dpkg/BuildOptions.pm
+usr/share/perl5/Dpkg/Source
diff --git a/man/dpkg-source.1 b/man/dpkg-source.1
index 9bf9ff3..14c17c3 100644
--- a/man/dpkg-source.1
+++ b/man/dpkg-source.1
@@ -55,6 +55,10 @@
 will look for the original source tarfile or the original source directory
 .IB directory .orig
 depending on the \fB\-sX\fP arguments.
+
+
+If the source package is being built as a version 3 source package using
+a VCS, no upstream tarball or original source directory is needed.
 .TP
 .BR \-h , \-\-help
 Show the usage message and exit.
@@ -109,7 +113,9 @@ This option negates a previously set
 .BR \-i [\fIregexp\fP]
 You may specify a perl regular expression to match files you want
 filtered out of the list of files for the diff. (This list is
-generated by a find command.) \fB\-i\fR by itself enables the option,
+generated by a find command.) (If the source package is being built as a
+version 3 source package using a VCS, this is instead used to
+ignore uncommitted files.) \fB\-i\fR by itself enables the option,
 with a default that will filter out control files and directories of the
 most common revision control systems, backup and swap files and Libtool
 build output directories. There can only be one active regexp, of multiple
@@ -162,6 +168,9 @@
 will not overwrite existing tarfiles or directories. If this is desired then
 .BR \-sA , \-sP , \-sK , \-sU and \-sR
 should be used instead.
+.PP
+If the source package is being built as a version 3 source package using
+a VCS, these options do not make sense, and will be ignored.
 .TP
 .BR \-sk
 Specifies to expect the original source as a tarfile, by default
diff --git a/scripts/Dpkg/Source/VCS/git.pm b/scripts/Dpkg/Source/VCS/git.pm
new file mode 100644
index 000..431fab3
--- /dev/null
+++ b/scripts/Dpkg/Source/VCS/git.pm
@@ -0,0 +1,257 @@
+#!/usr/bin/perl
+#
+# git support for dpkg-source
+#
+# Copyright © 2007 Joey Hess [EMAIL PROTECTED].
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+package Dpkg::Source::VCS::git;
+
+use strict;
+use warnings;
+use Cwd;
+use File::Find;
+use Dpkg;
+use Dpkg::Gettext;
+
+push (@INC, $dpkglibdir);
+require 'controllib.pl';
+
+# Remove variables from the environment that might cause git to do
+# something unexpected.
+delete $ENV{GIT_DIR};
+delete $ENV{GIT_INDEX_FILE};
+delete $ENV{GIT_OBJECT_DIRECTORY};
+delete $ENV{GIT_ALTERNATE_OBJECT_DIRECTORIES};
+delete $ENV{GIT_WORK_TREE};
+
+sub import {
+	foreach my $dir (split(/:/, $ENV{PATH})) {
+		if (-x "$dir/git") {
+			return 1;
+		}
+	}
+	main::error(sprintf(_g("This source package can only be unpacked using git, which is not in the PATH.")));
+}
+
+sub sanity_check {
+	my $srcdir=shift;
+
+	if (! -d "$srcdir/.git") {
+		main::error(sprintf(_g("source directory is not the top directory of a git repository (%s/.git not present), but Format git was specified"), $srcdir));
+	}
+	if (-s "$srcdir/.gitmodules") {
+		main::error(sprintf(_g("git repository %s uses submodules. This is not yet supported."), $srcdir));
+	}
+
+	# Symlinks from .git to outside could cause unpack failures, or
+	# point to files they shouldn't, so check for and don't allow.
+	if (-l "$srcdir/.git") {
+		main::error(sprintf(_g("%s is a symlink"), "$srcdir/.git"));
+	}
+	my $abs_srcdir=Cwd::abs_path($srcdir);
+	find(sub {
+		if (-l $_) {
+			if (Cwd::abs_path(readlink($_)) !~ /^\Q$abs_srcdir\E(\/|$)/) {
+				main::error(sprintf(_g("%s is a symlink to outside %s"), $File::Find::name, $srcdir));
+			}
+		}
+	}, "$srcdir/.git");
+
+	return 1;
+}
+
+# Returns a hash of arrays of git config values.
+sub read_git_config {
+	my $file=shift;
+
+	my %ret;
+	open(GIT_CONFIG, '-|', "git-config", "--file", $file, "--null", "-l") ||
+		main::subprocerr("git-config");
+	my ($key, $value);
+	while (<GIT_CONFIG>) {
Re: [PATCH] proposed v3 source format using .git.tar.gz
Russ Allbery wrote: It's a little disturbing to have content in parentheses be significant in a format based on RFC 822, although we have broken this rule elsewhere (most notably in dependency fields, of course). If it helps, the (git) comment is only used in debian/control, it's not put in the dsc files. I'd be just as happy to use [git] or even Foo: git. -- see shy jo
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote: I've been working on making dpkg-source support a new source package format based upon git. The idea is that a source package has only a .dsc and a .git.tar.gz, which is just a git repo. Is a .gitdiff.tar.gz possible, so the archive doesn't need to have the full git repo replaced by each upload? ie, something like Files: foo_1.0-1.git.tar.gz foo_1.0-2.gitdiff.tar.gz so that a small patch only adds a small file to the archive rather than replacing a large one? This means you can't build the package by hand with standard unix tools -- at the very least you need git installed, and if other VC systems are to be supported, you need them too. Changes in repository formats will presumably result in versioned dependencies too. This is slightly worse than the case for existing patch management tools in that most of those can be dealt with by hand; though cdbs and to a lesser extent debhelper can't be quite as easily replicated I guess. Once the unpack is done, I don't see any reason why you can't do an NMU in the traditional way, so presuming dpkg-source -x or apt-get source handles the unpack automatically, I don't think it necessarily imposes any new requirements on NMUers. Maybe providing a feature on packages.debian.org (or similar) to download sources in simple, non-VC, tarball format would make this a complete non-issue though? Would it make sense to have the source format look more like: Format: 3.0 Source: dpkg ... Source-Depends: git-dpkg (= 3.14159) Source-Hooks: /usr/bin/git-dpkg ... Files: ... foo_1.2.git.tar.gz and have the git specific functionality be provided by a /usr/bin/git-dpkg binary (with standardised arguments) from the git-dpkg package? That would let you smoothly deal with repository changes and implementing new interfaces, and also let us limit the allowable formats for the archive reasonably simply. 
You could drop the Source-Hooks: line, and just have dpkg-source know to associate *.git.tar.gz with /usr/lib/dpkg/source/git, and trust the package will provide it. Bonus points: rather than debian/rules clean, create a diff, build, have dpkg do debian/rules clean, commit any uncommitted changes with the commit message being the changes from the changelog, create a .git.tgz, build for git-source-format packages. Cheers, aj signature.asc Description: Digital signature
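The idea of having dpkg-source associate *.git.tar.gz with /usr/lib/dpkg/source/git could be sketched as a lookup from file name to helper path. This is only an illustration in Python: the function name `helper_for` and the set of known VCS names are assumptions, and only the directory and the git example come from the message above.

```python
import re

# Directory named in the message above; each helper is assumed to be
# a file in it named after the VCS.
HELPER_DIR = "/usr/lib/dpkg/source"

# Hypothetical whitelist; limiting it is what lets the archive restrict
# the allowable formats.
KNOWN_VCS = {"git", "bzr"}


def helper_for(filename):
    """Return the helper path for a *.VCS.tar.gz file, or None."""
    match = re.search(r"\.([a-z0-9]+)\.tar\.gz$", filename)
    if match is None or match.group(1) not in KNOWN_VCS:
        # Plain tarballs such as foo_1.0.tar.gz or foo_1.0.orig.tar.gz
        # need no VCS helper.
        return None
    return f"{HELPER_DIR}/{match.group(1)}"
```

With this, `helper_for("foo_1.2.git.tar.gz")` gives `/usr/lib/dpkg/source/git`, while an ordinary `.orig.tar.gz` maps to no helper at all.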
Re: [PATCH] proposed v3 source format using .git.tar.gz
Anthony Towns wrote: Is a .gitdiff.tar.gz possible, so the archive doesn't need to have the full git repo replaced by each upload? ie, something like Files: foo_1.0-1.git.tar.gz foo_1.0-2.gitdiff.tar.gz so that a small patch only adds a small file to the archive rather than replacing a large one? I think it's possible, the gitdiff might use git packs against a prior repo. That would be a nice enhancement to what I have done. This means you can't build the package by hand with standard unix tools -- at the very least you need git installed, and if other VC systems are to be supported, you need them too. Yes, as I mention in the faq I think this is an acceptable tradeoff to get away from having to use diff. Changes in repository formats will presumably result in versioned dependencies too. I don't think that dpkg should add vcs formats that we don't have a good expectation of remaining supported by newer versions of the tools going forward (so svn repos are out). There's a bit of discussion of this in the faq. I think that git has a pretty good track record and has incentive to keep compatibility support since this format is used over the wire by git (eg, with http urls). If the format changes in a non-backwards compatible way, we could have source packages built on unstable that cannot be extracted on stable, which I also think is suboptimal, but hard to completely avoid. This is slightly worse than the case for existing patch management tools in that most of those can be dealt with by hand; though cdbs and to a lesser extent debhelper can't be quite as easily replicated I guess. Neither could packages using quilt before it was available in stable or dbs before it was. Once the unpack is done, I don't see any reason why you can't do an NMU in the traditional way, so presuming dpkg-source -x or apt-get source handles the unpack automatically, I don't think it necessarily imposes any new requirements on NMUers.
Basically, you have to know how to git commit your changes before building the NMU, and that's all. As a bonus, it's rather easier to generate NMU patchsets. :-) Maybe providing a feature on packages.debian.org (or similar) to download sources in simple, non-VC, tarball format would make this a complete non-issue though? pristine-tar could be used for this, it would just need source packages to put the delta somewhere standardised (under debian/), and would need some standardised way to get to the upstream source branch in git. Would it make sense to have the source format look more like: Format: 3.0 Source: dpkg ... Source-Depends: git-dpkg (= 3.14159) Source-Hooks: /usr/bin/git-dpkg ... Files: ... foo_1.2.git.tar.gz and have the git specific functionality be provided by a /usr/bin/git-dpkg binary (with standardised arguments) from the git-dpkg package? That would let you smoothly deal with repository changes and implementing new interfaces, and also let us limit the allowable formats for the archive reasonably simply. You could drop the Source-Hooks: line, and just have dpkg-source know to associate *.git.tar.gz with /usr/lib/dpkg/source/git, and trust the package will provide it. Not sure if this buys anything that using perl modules for the vcses can't do, really. How do you envision this helping deal with repository format changes? Bonus points: rather than debian/rules clean, create a diff, build, have dpkg do debian/rules clean, commit any uncommitted changes with the commit message being the changes from the changelog, create a .git.tgz, build for git-source-format packages. I have a feeling that any auto-commit stuff should be controlled by an option. I'm *sure* that it would annoy some developers. No strong feelings about whether it should default on or off, though least surprise suggests *off*. -- see shy jo
Re: [PATCH] proposed v3 source format using .git.tar.gz
Joey Hess wrote: Maybe providing a feature on packages.debian.org (or similar) to download sources in simple, non-VC, tarball format would make this a complete non-issue though? pristine-tar could be used for this, it would just need source packages to put the delta somewhere standardised (under debian/), and would need some standardised way to get to the upstream source branch in git. BTW, if that were standardised, the other option would be for dpkg-source -x to regenerate the pristine upstream tarball. -- see shy jo
Re: [PATCH] proposed v3 source format using .git.tar.gz
Joey Hess wrote: Bonus points: rather than debian/rules clean, create a diff, build, have dpkg do debian/rules clean, commit any uncommitted changes with the commit message being the changes from the changelog, create a .git.tgz, build for git-source-format packages. I have a feeling that any auto-commit stuff should be controlled by an option. I'm *sure* that it would annoy some developers. No strong feelings about whether it should default on or off, though least surprise suggests *off*. One problem with auto-committing is tags. Developers will probably want to tag their release before doing the final release build, and if dpkg-source then found and auto-committed a further change, the tag wouldn't accurately match the release. -- see shy jo
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sat, Oct 06, 2007 at 05:27:04PM +1000, Anthony Towns wrote: On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote: This means you can't build the package by hand with standard unix tools -- at the very least you need git installed, and if other VC systems are to be supported, you need them too. Changes in repository formats will presumably result in versioned dependencies too. This is slightly worse than the case for existing patch management tools in that most of those can be dealt with by hand; though cdbs and to a lesser extent debhelper can't be quite as easily replicated I guess. A similar problem arises with Format: 2.0 packages as well if the user hasn't bzip2 (unlikely) or lzma (likely) installed and tries to unpack a source package built with them. Gruesse, -- Frank Lichtenheld [EMAIL PROTECTED] www: http://www.djpig.de/
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sat, Oct 06, 2007 at 11:19:43AM -0400, Joey Hess wrote: Anthony Towns wrote: Is a .gitdiff.tar.gz possible, so the archive doesn't need to have the full git repo replaced by each upload? ie, something like Files: foo_1.0-1.git.tar.gz foo_1.0-2.gitdiff.tar.gz so that a small patch only adds a small file to the archive rather than replacing a large one? I think it's possible, the gitdiff might use git packs against a prior repo. That would be a nice enhancement to what I have done. I think there is a mechanism in git to disallow replacing old pack files (i.e. forcing it to create additional ones with only new objects); however, I haven't used that myself yet. On a general note: I think we definitely need the better tarball compression support _before_ adding huge amounts of history to the archive... Gruesse, -- Frank Lichtenheld [EMAIL PROTECTED] www: http://www.djpig.de/
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote: I've been working on making dpkg-source support a new source package format based upon git. The idea is that a source package has only a .dsc and a .git.tar.gz, which is just a git repo. So, I can't stand git's user interface. I generally try to avoid making a huge issue of this since it seems to be massively political on places like Planet at the moment, there seems to be a certain amount of confusion of people's personal opinions with that of their employers going on, and in any case I normally find that revision control flamewars have negative utility. (I don't think it's terribly relevant to this discussion why I prefer not to use git, and I don't want to sidetrack the thread with that; I just wanted to present an existence case of somebody who doesn't want to switch to .git.tar.gz and yet doesn't want to stay with .orig.tar.gz and .diff.gz forever.) Still, this work looks pretty cool, and I'd like to be able to make use of it despite avoiding git whenever I can. I noticed that you'd helpfully structured your changes such that it would be possible to plug in a different revision control system, so I wrote a module to support bzr. The patch is attached to this e-mail, and I'd appreciate comments; if this work is merged into dpkg I'd be very happy if my addition were merged too. There are probably some improvements to be made, but it was really utterly trivial; I was impressed that I didn't have to touch anything else beyond plugging in a new module. Ironically, of course, I did use git to create it. :-) While working on this I was thinking about general issues with the format. It seems to me that it's suboptimal not to ship a working tree. I know you sort of address this in the wiki FAQ, and I realise that there are space advantages to only shipping the VCS data. However, I'd like to try to persuade you otherwise if I can. 
My concerns are: * Users will need to have the VCS installed in order to inspect the source. It's true that this is no worse than dbs or dpatch or whatever, and in fact it's better because dpkg-source will take care of the unpacking step automatically. Still, I do think it is a downside; we do still ship /usr/share/doc/debian/source-unpack.txt, and people do unpack Debian source packages on other systems from time to time and inspect them (I certainly do the same in the other direction with source RPMs, and curse their complexity). Plus, if the VCS fails to reconstitute a working tree for some unforeseen reason (maybe you have a broken installation of it, or maybe there was some version skew, or something else), then you're rather screwed. Tarballs are nice and simple and, assuming they were transferred accurately, hardly ever break in ways that make it impossible for you to extract the files. * Buildds will need to have the VCS installed in their base system. Possibly a minor concern since sbuild does the unpack in the base rather than in the chroot, but it's there nevertheless. Every derivative distribution that runs its own buildds will need to take care of this too. * Some source packages want to ship non-VCS-managed files. It's very common for source packages to include autogenerated objects like configure, Makefile.in, etc. Whether to check these into a VCS is a somewhat religious matter (as acknowledged by the gettext info documentation, for instance), and personally I lean towards checking them in (with a few exceptions) just because it makes it easier to see when they change and keep an eye out for oddities, but I know that a lot of developers prefer to keep these outside their VCS. Shipping a working tree would make it easier to handle cases like this. There are two obvious modifications to Joey's proposal that would allow shipping a working tree. The first is just to include the working tree in the .$VCS.tar.gz object. 
This has the advantage of being trivial to implement on top of the current code: the git module would need to do a 'git checkout' after copying the .git, and the bzr module just wouldn't call 'bzr remove-tree'. The second possibility seems to me to be more flexible, though, and probably not all that hard to implement: build both a .tar.gz (containing the working tree) and a .$VCS.tar.gz, and teach 'dpkg-source -x' to unpack the tree given at least one of these. This would allow various interesting possibilities such as: * Buildds could just fetch the .tar.gz; they have no need of the VCS data. Users who just want to inspect the current version of the source and not change it might want to do this too, using (say) 'apt-get source --no-vcs package'. * Developers on slow connections could say 'apt-get source --vcs-only package' to fetch just the .$VCS.tar.gz, with the documented caveat that it would be just like checking the source out of a VCS in
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sat, Oct 06, 2007 at 11:17:58PM +0200, Frank Lichtenheld wrote: On Sat, Oct 06, 2007 at 05:27:04PM +1000, Anthony Towns wrote: This means you can't build the package by hand with standard unix tools -- at the very least you need git installed, and if other VC systems are to be supported, you need them too. Changes in repository formats will presumably result in versioned dependencies too. This is slightly worse than the case for existing patch management tools in that most of those can be dealt with by hand; though cdbs and to a lesser extent debhelper can't be quite as easily replicated I guess. A similar problem arises with Format: 2.0 packages as well if the user hasn't bzip2 (unlikely) or lzma (likely) installed and tries to unpack a source package built with them. Perhaps 'apt-get source' et al could notice this class of situation and offer to install the necessary unpacking tools for you. It'd have to rely on sudo or similar as 'apt-get source' is typically run as non-root, but it seems like a useful enhancement even so. -- Colin Watson [EMAIL PROTECTED]
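Noticing "this class of situation" amounts to mapping the files listed in a .dsc to the tools needed to unpack them and checking which are absent. A rough Python sketch follows; it is not apt code, the name `missing_tools` and the suffix table are assumptions made for illustration, and the actual offer to install packages is left out.

```python
import shutil

# Hypothetical mapping from file-name suffix to the program needed to
# unpack it. A .git.tar.gz needs both gzip and git.
TOOLS_BY_SUFFIX = {
    ".gz": "gzip",
    ".bz2": "bzip2",
    ".lzma": "lzma",
    ".git.tar.gz": "git",
}


def missing_tools(filenames, which=shutil.which):
    """Return the sorted list of unpacking tools that are not in PATH.

    `which` is injectable so the check can be exercised without
    depending on what is installed locally.
    """
    needed = set()
    for filename in filenames:
        for suffix, tool in TOOLS_BY_SUFFIX.items():
            if filename.endswith(suffix):
                needed.add(tool)
    return sorted(tool for tool in needed if which(tool) is None)
```

A front end could print the result as a warning before attempting the unpack, and only then decide whether to ask for root to install anything.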
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sat, Oct 06, 2007 at 10:37:48PM +0000, Colin Watson wrote:
> On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
>> I've been working on making dpkg-source support a new source package format based upon git. The idea is that a source package has only a .dsc and a .git.tar.gz, which is just a git repo.
> [...]
> Still, this work looks pretty cool, and I'd like to be able to make use of it despite avoiding git whenever I can. I noticed that you'd helpfully structured your changes such that it would be possible to plug in a different revision control system, so I wrote a module to support bzr. The patch is attached to this e-mail, and I'd appreciate comments; if this work is merged into dpkg I'd be very happy if my addition were merged too. There are probably some improvements to be made, but it was really utterly trivial; I was impressed that I didn't have to touch anything else beyond plugging in a new module. Ironically, of course, I did use git to create it. :-)

I guess if we use Joey's idea at all we will not be able to avoid shipping such a module for each distributed VCS, and I didn't get the impression that Joey thought otherwise. So I find your mail strangely defensive :)

The code itself looks good AFAICT.

> While working on this I was thinking about general issues with the format. It seems to me that it's suboptimal not to ship a working tree. I know you sort of address this in the wiki FAQ, and I realise that there are space advantages to only shipping the VCS data. However, I'd like to try to persuade you otherwise if I can. My concerns are:

Shipping the worktree essentially means defining this new format as an optional add-on, since you ship all the data you ship now plus some VCS metadata. So all packages will have to be bigger than they are now (aside from using other compression methods than gzip, and after really building some packages today with my dpkg-source -C patch I have to say I'm impressed how much space we might be able to save - with high CPU costs, though). This is not really an argument for either side, just wanted to make this effect clear.

> * Users will need to have the VCS installed in order to inspect the source. [...]
> * Buildds will need to have the VCS installed in their base system. [...]
> * Some source packages want to ship non-VCS-managed files. [...]

Is the last one really such a big problem in Debian? I know that many upstream VCS don't contain autogenerated files but most .orig.tar.gz's already contain them today, so I would have guessed people either only have their debian/ in their Debian VCS or all upstream files from the .orig.tar.gz.

> There are two obvious modifications to Joey's proposal that would allow shipping a working tree. The first is just to include the working tree in the .$VCS.tar.gz object. This has the advantage of being trivial to implement on top of the current code: the git module would need to do a 'git checkout' after copying the .git, and the bzr module just wouldn't call 'bzr remove-tree'.

This would be a bad idea IMHO, and like a regression: instead of shipping a .orig.tar+diff we now ship one, monolithic (bigger) tarball? Sounds suboptimal. I'm pretty sure I don't want to see this one implemented in dpkg-dev.

> The second possibility seems to me to be more flexible, though, and probably not all that hard to implement: build both a .tar.gz (containing the working tree) and a .$VCS.tar.gz, and teach 'dpkg-source -x' to unpack the tree given at least one of these. This would allow various interesting possibilities such as:

Since you're essentially demoting the new format to an add-on, why not just make it really one and just ship a real Format: 1.0 package (i.e. orig-tar+diff or native-tar) instead of this half-half-working-tree-tarball.

> [...]
> These seem to me to be non-trivial advantages that outweigh the space costs of shipping around the working tree. I'd be willing to have a go at implementing this once I've had a bit more sleep. Does any of this make sense?

I guess there are two aspects to Joey's proposal:

1) Make the source package more useful by including VCS metadata like history
2) Make it easier to include arbitrary changes to the upstream sources by using more advanced tools than diff/patch, i.e. a DVCS

By concentrating on the first point and making it optional you either have to sacrifice point 2 by reusing the old source package (orig+diff) or give people who choose not to download the vcs data a worse experience by making it harder for them to find the actual diff (working tree tar).

On second thought you can reduce the regression by adding a pristine-gz delta to the working tree so that you can split the working tree tarball back into an orig+diff.

On third thought who says you have to fall back to Format 1.0 for the non-VCS data? You could also fall back to Format 2.0 which would make preserving advantage 2 easier.

So, no idea if my ramblings made any sense,
Re: [PATCH] proposed v3 source format using .git.tar.gz
Frank Lichtenheld wrote:
> I think there is a mechanism in git to disallow replacing old pack files (i.e. forcing to create additional ones with only new objects), however, I haven't used that myself, yet.

The packs in the diff package would be basically the same packs that git-send-pack generates when git is pushing objects to a remote repository. Where the remote repo would be the contents of foo_1.0-1.git.gz, and the local repo would be foo-1.0-2. Intercept those packs in transit (how?), and then you can take the 1.0-1 repo and later apply them to it to regenerate the 1.0-2 repo.

> On a general note: I think we definitely could need the better tarball compression support _before_ adding huge amount of history into the archive...

This would mostly be an optimisation for upload size, total archive size is only affected if foo 1.0-1 is in testing and 1.0-2 in unstable.

It's actually much more significant to both upload and total archive size that all 61mb of dpkg's .git not be put into its .git.tar.gz. Thus the shallow clones with only a few hundred repos or so.

--
see shy jo
Re: [PATCH] proposed v3 source format using .git.tar.gz
Colin Watson wrote:
> So, I can't stand git's user interface. I generally try to avoid making a huge issue of this since it seems to be massively political on places like Planet at the moment, there seems to be a certain amount of confusion of people's personal opinions with that of their employers going on, and in any case I normally find that revision control flamewars have negative utility. (I don't think it's terribly relevant to this discussion why I prefer not to use git, and I don't want to sidetrack the thread with that; I just wanted to present an existence case of somebody who doesn't want to switch to .git.tar.gz and yet doesn't want to stay with .orig.tar.gz and .diff.gz forever.)

(So, FWIW, I'm not sold on git. Not sold at all yet. But it was a good choice for this implementation for several reasons.)

> Still, this work looks pretty cool, and I'd like to be able to make use of it despite avoiding git whenever I can. I noticed that you'd helpfully structured your changes such that it would be possible to plug in a different revision control system, so I wrote a module to support bzr.

Nice. The FAQ has some questions aimed at adding other revision control systems, could you try to answer those in the context of bzr? In particular, is the data that would be shipped in the source package the same data that bzr normally reads from untrusted sources, thus ensuring that using it this way is equally (in)secure as using bzr to pull data over the network? (Note that this wasn't 100% true for git and I have had to put in several workarounds.) And is the data format stable and/or one that bzr has a history of supporting old versions of in a way that ensures backwards compatibility?

Also, will the bzr repos always contain the full history, or is there an equivalent to git shallow clones? How big do they tend to be?

> It's true that this is no worse than dbs or dpatch or whatever, and in fact it's better because dpkg-source will take care of the unpacking step automatically. Still, I do think it is a downside; we do still ship /usr/share/doc/debian/source-unpack.txt

BTW, source-unpack.txt fails for both packages containing debian/subdirs/ and of course for wig-n-pen..

> * Buildds will need to have the VCS installed in their base system.

This seems easily solved by recommends (installed by default).

> * Some source packages want to ship non-VCS-managed files. It's very common for source packages to include autogenerated objects like configure, Makefile.in, etc. Whether to check these into a VCS is a somewhat religious matter (as acknowledged by the gettext info documentation, for instance), and personally I lean towards checking them in (with a few exceptions) just because it makes it easier to see when they change and keep an eye out for oddities, but I know that a lot of developers prefer to keep these outside their VCS. Shipping a working tree would make it easier to handle cases like this.

Hmm, I hadn't considered that this might be a problem. I don't know if I'd want to write the code to do this, but shipping a partial working tree consisting of just those files would be enough to solve this.

> * Space-constrained mirrors could conceivably exclude the VCS data if they had to, though we probably wouldn't encourage this.
>
> These seem to me to be non-trivial advantages that outweigh the space costs of shipping around the working tree.

The space constraints seem pretty hard to me. Specifically, I don't want to piss the ftpmasters off and get vcs source packages banned from the archive.. The only saving grace really seems to be that shipping both vcs and upstream tar will only double the size of the archive once most everything uses the new format, and the archive will have probably doubled in size several times over due to other factors before then.

I've eyeballed the code; it looks ok, though it is so close to code I've been looking at all week that I may be missing trees for the forest. :-)

--
see shy jo
Re: [PATCH] proposed v3 source format using .git.tar.gz
Frank Lichtenheld wrote:
> I guess if we use Joey's idea at all we will not be able to avoid shipping such a module for each distributed VCS, and I didn't get the impression that Joey thought otherwise.

I do think otherwise. If the distributed (or other) VCS does not meet our criteria for security and backwards compatibility, then we should not ship it. And yes, it'll be up to the dpkg maintainers to enforce those criteria if you crack open the floodgates..

> Is the last one really such a big problem in Debian? I know that many upstream VCS don't contain autogenerated files but most .orig.tar.gz's already contain them today, so I would have guessed people either only have their debian/ in their Debian VCS or all upstream files from the .orig.tar.gz.

So would I, and most of the tools like git-buildpackage seem to assume it too and not try to support this case AFAICS. Colin's probably right that it's an issue religious wars can be fought over, but if they're being fought in the context of keeping package source in revision control it's happening quietly..

--
see shy jo
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sat, Oct 06, 2007 at 11:19:43AM -0400, Joey Hess wrote:
> Anthony Towns wrote:
>> Changes in repository formats will presumably result in versioned dependencies too.
> I don't think that dpkg should add vcs formats that we don't have a good expectation of remaining supported by newer versions of the tools going forward (so svn repos are out).

It's more that newer versions of the tools will create more optimised repo formats, that older versions don't support -- bzr has done this between etch and lenny, eg.

My inclination would be to have dpkg support it, but have it generate a REJECT at upload time if we don't want to support the new format (yet).

> If the format changes in a non-backwards compatible way, we could have source packages built on unstable that cannot be extracted on stable, which I also think is suboptimal, but hard to completely avoid.

Well, that's true of any Version: 3 format already anyway.

>> Once the unpack is done, I don't see any reason why you can't do an NMU in the traditional way, so presuming dpkg-source -x or apt-get source handles the unpack automatically, I don't think it necessarily imposes any new requirements on NMUers.
> Basically, you have to know how to git commit your changes before building the NMU, and that's all. As a bonus, it's rather easier to generate NMU patchsets. :-)

Well, there's two options:

 - dpkg-source knows it's meant to be a git package, and can either warn you you have uncommitted changes (and tell you what to do) or just auto commit them for you
 - dpkg-source doesn't know what sort of package it's meant to be and just builds a v1 source package

Both of which sound pretty trivial for an NMUer to deal with...

>> Maybe providing a feature on packages.debian.org (or similar) to download sources in simple, non-VC, tarball format would make this a complete non-issue though?
> pristine-tar could be used for this, it would just need source packages to put the delta somewhere standardised (under debian/), and would need some standardised way to get to the upstream source branch in git.

So the logic there would be:

    if there's an upstream tag, then
        generate an .orig.tgz
        if there's a pristine-tar info, hax0r it to be pristine
        generate a .diff.gz
        if the .diff failed goto bailout
        generate a .dsc containing the orig and diff
        publish all three
    else: (bailout:)
        generate a .tar.gz
        generate a .dsc containing the tar
        publish both

>> Would it make sense to have the source format look more like:
>>
>>     Format: 3.0
>>     Source: dpkg
>>     ...
>>     Source-Depends: git-dpkg (>= 3.14159)
>>     Source-Hooks: /usr/bin/git-dpkg
>>     ...
>>     Files:
>>      ... foo_1.2.git.tar.gz
>>
>> You could drop the Source-Hooks: line, and just have dpkg-source know to associate *.git.tar.gz with /usr/lib/dpkg/source/git, and trust the package will provide it.
> Not sure if this buys anything that using perl modules for the vcses can't do, really.

It doesn't buy anything extra, so forget the Source-Hooks: and just consider it to be a different package providing the VCS-specific perl module. That buys you:

 - no changes to dpkg to support new source formats
 - easy for other distros to support more or fewer VCS formats
 - version info to deal with new repo formats
 - explicit dependency info that can be checked at upload time to block source formats we don't want to support

> How do you envision this helping deal with repository format changes?

Repo formats that bzr in etch can unpack could be denoted by

    Source-Depends: dpkg-bzr (>= 0.11)

while repo formats that require bzr from lenny or later could be denoted by:

    Source-Depends: dpkg-bzr (>= 0.18)

(Or you could have a versioning scheme that matches the repo format directly, rather than the program being used. Or you could use virtual packages and say dpkg-bzr-v3 and have that be Provided: by some package/s, etc)

It'd be straightforward to make a policy decision to only ACCEPT uploads with given Source-Depends: lines, eg ones that can be satisfied using packages from stable, while letting third party repos experiment with new repo formats without needing to use a different dpkg than Debian does.

Cheers,
aj
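The upload-time policy check aj outlines can be sketched as follows. Source-Depends is only a proposal in this thread, so everything here is hypothetical: the function names, the policy table, and the assumption that the relation is `>=` (the archived text shows only `=`). The version comparison is a deliberately simplified dotted-number comparison, not the full Debian version ordering.

```perl
use strict;
use warnings;

# Parse one Source-Depends entry such as "dpkg-bzr (>= 0.11)" into
# (package, relation, version). Relation and version are undef when
# no version restriction is given.
sub parse_source_depends {
    my ($entry) = @_;
    $entry =~ /^\s*([a-z0-9][a-z0-9+.-]*)\s*(?:\(\s*(>=|=)\s*([^\s)]+)\s*\))?\s*$/
        or die "cannot parse Source-Depends entry '$entry'\n";
    return ($1, $2, $3);
}

# Simplified comparison of dotted numeric versions (NOT the full Debian
# version ordering): returns -1, 0 or 1.
sub compare_versions {
    my ($left, $right) = @_;
    my @l = split /\./, $left;
    my @r = split /\./, $right;
    while (@l || @r) {
        my $cmp = (shift(@l) // 0) <=> (shift(@r) // 0);
        return $cmp if $cmp;
    }
    return 0;
}

# Decide whether an upload is acceptable: the requested helper must be
# known, and the version it asks for must not be newer than the one
# the archive is willing to support (for example, what stable ships).
sub upload_acceptable {
    my ($entry, $supported) = @_;
    my ($pkg, $rel, $version) = parse_source_depends($entry);
    return 0 unless exists $supported->{$pkg};
    return 1 unless defined $version;
    return compare_versions($version, $supported->{$pkg}) <= 0;
}

my %stable = ('dpkg-bzr' => '0.11');   # hypothetical archive policy
for my $entry ('dpkg-bzr (>= 0.11)', 'dpkg-bzr (>= 0.18)') {
    printf "%-20s %s\n", $entry,
        upload_acceptable($entry, \%stable) ? 'ACCEPT' : 'REJECT';
}
```

A third-party archive would only need a different policy table to experiment with newer repository formats, which is the point aj makes about not needing a different dpkg.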
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Sat, Oct 06, 2007 at 10:37:48PM +0000, Colin Watson wrote:
> The second possibility seems to me to be more flexible, though, and probably not all that hard to implement: build both a .tar.gz (containing the working tree) and a .$VCS.tar.gz, and teach 'dpkg-source -x' to unpack the tree given at least one of these. This would allow various interesting possibilities such as:

Would this be better in any way than having a web interface that provides an autogenerated version-1 source package? Presume it's a url like:

    http://v1source.qa.debian.org/i/ifupdown/ifupdown_0.6.8.dsc

> * Buildds could just fetch the .tar.gz; they have no need of the VCS data. Users who just want to inspect the current version of the source and not change it might want to do this too, using (say) 'apt-get source --no-vcs package'.

    dget -x http://v1source.qa.debian.org/i/ifupdown/ifupdown_0.6.8.dsc

> * Developers on slow connections could say 'apt-get source --vcs-only package' to fetch just the .$VCS.tar.gz, with the documented caveat that it would be just like checking the source out of a VCS in that you might have to recreate some autogenerated files.

That happens automatically.

> * Space-constrained mirrors could conceivably exclude the VCS data if they had to, though we probably wouldn't encourage this.

Mirrors wouldn't mirror the autogenerated stuff, so not an issue.

> * Derivative distributions who are slow to upgrade their dpkg-source could still interoperate to some degree.

They'd need to pull sources from the autogenerated url; though they'd still probably have Build-Depends: issues if they're not updating packages generally.

> * Tools like mc, vim's tar plugin, or http://www.mirrorservice.org/sites/ftp.debian.org/debian/ could still be used straightforwardly and without modifications to look inside source packages on mirrors.

Again, you'd have to go to the autogenerating url rather than a mirror.

Cheers,
aj
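For illustration, a client could derive such a URL from a package name and version as below. The host is the one aj asks us to presume; that the service would mirror the archive's pool layout (one-letter prefix, or "lib" plus a letter) is an assumption, and both function names are invented.

```perl
use strict;
use warnings;

# Directory prefix in the style of the Debian archive pool: the first
# letter of the source package name, or "lib" plus one letter for
# library packages. That the hypothetical service would follow the
# pool convention is an assumption.
sub pool_prefix {
    my ($source) = @_;
    return $source =~ /^(lib.)/ ? $1 : substr($source, 0, 1);
}

# Build the .dsc URL for a source package on the presumed service.
sub v1source_url {
    my ($source, $version) = @_;
    (my $noepoch = $version) =~ s/^\d+://;   # epochs are not part of file names
    return sprintf 'http://v1source.qa.debian.org/%s/%s/%s_%s.dsc',
        pool_prefix($source), $source, $source, $noepoch;
}

print v1source_url('ifupdown', '0.6.8'), "\n";
```

The resulting string is what would be handed to dget -x in aj's example.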
[PATCH] proposed v3 source format using .git.tar.gz
I've been working on making dpkg-source support a new source package format based upon git. The idea is that a source package has only a .dsc and a .git.tar.gz, which is just a git repo.

I've blogged[1] about some of what led me to this idea, and I've also written a short FAQ[2]. Suggest reading both to understand where I'm coming from with this.

[1] http://kitenet.net/~joey/blog/entry/an_evolutionary_change_to_the_Debian_source_package_format/
[2] http://wiki.debian.org/GitSrc

My implementation adds a new 3.0 version source format. A 3.0 format debian source package can consist of any files allowed by formats 1 and 2, but may also contain .$VCS.tar.gz files.

To build a version 3 source package, a new field is needed in debian/control:

    Format: 3.0 (git)

The bit in parens specifies that it should use the git backend, which is currently the only one available. That backend is in the Dpkg::Source::VCS::git perl module.

I have a sourcev3 branch with my changes at git://kitenet.net/dpkg, and have also attached a diff to this mail. I feel that this is ready for review and hopefully merging into dpkg now. Looking forward to your comments.

A sample dpkg source package built using this is at http://kitenet.net/~joey/tmp/git-demo/. This demo package includes only the last 200 commits to the dpkg git repo, so it's more than 1 mb *smaller* than dpkg's normal .tar.gz!

--
see shy jo

diff --git a/debian/dpkg-dev.install b/debian/dpkg-dev.install
index 49e3835..ee65dbf 100644
--- a/debian/dpkg-dev.install
+++ b/debian/dpkg-dev.install
@@ -56,3 +56,4 @@ usr/share/man/*/dpkg-shlibdeps.1
 usr/share/man/*/*/dpkg-source.1
 usr/share/man/*/dpkg-source.1
 usr/share/perl5/Dpkg/BuildOptions.pm
+usr/share/perl5/Dpkg/Source
diff --git a/man/dpkg-source.1 b/man/dpkg-source.1
index 9bf9ff3..14c17c3 100644
--- a/man/dpkg-source.1
+++ b/man/dpkg-source.1
@@ -55,6 +55,10 @@ will look for the original source tarfile
 or the original source directory
 .IB directory .orig
 depending on the \fB\-sX\fP arguments.
+
+
+If the source package is being built as a version 3 source package using
+a VCS, no upstream tarball or original source directory is needed.
 .TP
 .BR \-h , \-\-help
 Show the usage message and exit.
@@ -109,7 +113,9 @@ This option negates a previously set
 .BR \-i [\fIregexp\fP]
 You may specify a perl regular expression to match files you want
 filtered out of the list of files for the diff. (This list is
-generated by a find command.) \fB\-i\fR by itself enables the option,
+generated by a find command.) (If the source package is being built as a
+version 3 source package using a VCS, this is instead used to
+ignore uncommitted files.) \fB\-i\fR by itself enables the option,
 with a default that will filter out control files and directories of the
 most common revision control systems, backup and swap files and Libtool
 build output directories. There can only be one active regexp, of multiple
@@ -162,6 +168,9 @@
 will not overwrite existing tarfiles or directories. If this is desired then
 .BR \-sA , \-sP , \-sK , \-sU and \-sR
 should be used instead.
+.PP
+If the source package is being built as a version 3 source package using
+a VCS, these options do not make sense, and will be ignored.
 .TP
 .BR \-sk
 Specifies to expect the original source as a tarfile, by default
diff --git a/scripts/Dpkg/Source/VCS/git.pm b/scripts/Dpkg/Source/VCS/git.pm
new file mode 100644
index 0000000..cac7d05
--- /dev/null
+++ b/scripts/Dpkg/Source/VCS/git.pm
@@ -0,0 +1,226 @@
+#!/usr/bin/perl
+#
+# git support for dpkg-source
+#
+# Copyright © 2007 Joey Hess [EMAIL PROTECTED].
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+package Dpkg::Source::VCS::git;
+
+use strict;
+use warnings;
+use Cwd;
+use File::Find;
+use Dpkg;
+use Dpkg::Gettext;
+
+push (@INC, $dpkglibdir);
+require 'controllib.pl';
+
+# Remove variables from the environment that might cause git to do
+# something unexpected.
+delete $ENV{GIT_DIR};
+delete $ENV{GIT_INDEX_FILE};
+delete $ENV{GIT_OBJECT_DIRECTORY};
+delete $ENV{GIT_ALTERNATE_OBJECT_DIRECTORIES};
+delete $ENV{GIT_WORK_TREE};
+
+sub sanity_check {
+    my $srcdir=shift;
+
+    if (! -s "$srcdir/.git") {
+        main::error(sprintf(_g("%s is not the top directory of a git repository (%s/.git not present), but Format git was specified"), $srcdir, $srcdir));
+    }
+
Re: [PATCH] proposed v3 source format using .git.tar.gz
Joey Hess [EMAIL PROTECTED] writes:
> My implementation adds a new 3.0 version source format. A 3.0 format debian source package can consist of any files allowed by formats 1 and 2, but may also contain .$VCS.tar.gz files.
>
> To build a version 3 source package, a new field is needed in debian/control:
>
>     Format: 3.0 (git)
>
> The bit in parens specifies that it should use the git backend, which is currently the only one available. That backend is in the Dpkg::Source::VCS::git perl module.

It's a little disturbing to have content in parentheses be significant in a format based on RFC 822, although we have broken this rule elsewhere (most notably in dependency fields, of course).

I think this is a great idea, although I can't comment on the code implementation.

--
Russ Allbery ([EMAIL PROTECTED]) http://www.eyrie.org/~eagle/
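To make the significance of the parenthesised part concrete, here is a minimal sketch of how a Format value could be split into version and backend. It is not the code from the patch; the function names and the exact syntax accepted (a lowercase alphanumeric backend name, allowed only for 3.x) are assumptions. Only the Dpkg::Source::VCS::git module name comes from Joey's mail.

```perl
use strict;
use warnings;

# Split a Format value such as "3.0 (git)" into its version and the
# optional backend named in parentheses.
sub parse_format {
    my ($value) = @_;
    $value =~ /^\s*(\d+\.\d+)\s*(?:\(\s*([a-z0-9]+)\s*\))?\s*$/
        or die "unrecognised Format value '$value'\n";
    my ($version, $backend) = ($1, $2);
    die "Format $version does not take a backend\n"
        if defined $backend && $version !~ /^3\./;
    return ($version, $backend);
}

# Name of the Perl module that would implement a backend, following
# the Dpkg::Source::VCS::git naming used in the patch.
sub backend_module {
    my ($backend) = @_;
    return "Dpkg::Source::VCS::$backend";
}

my ($version, $backend) = parse_format('3.0 (git)');
print "version $version, backend module ", backend_module($backend), "\n";
```

Unlike an RFC 822 comment, the parenthesised word here selects behaviour, which is exactly the oddity Russ points out.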
Re: [PATCH] proposed v3 source format using .git.tar.gz
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
> I have a sourcev3 branch with my changes at git://kitenet.net/dpkg, and have also attached a diff to this mail. I feel that this is ready for review and hopefully merging into dpkg now. Looking forward to your comments.

A little code review follows.

> +# You should have received a copy of the GNU General Public License
> +# along with this program; if not, write to the Free Software
> +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

old FSF address (not really important, but while we're at it ;)

> +sub sanity_check {
> +    my $srcdir=shift;
> +
> +    if (! -s "$srcdir/.git") {
> +        main::error(sprintf(_g("%s is not the top directory of a git repository (%s/.git not present), but Format git was specified"), $srcdir, $srcdir));

you probably mean -e or -d here? -s on a directory is kinda strange.
printing $srcdir twice might bloat the error message.

> +    }
> +    if (-s "$srcdir/.gitmodules") {
> +        main::error(sprintf(_g("git repository %s uses submodules. This is not yet supported."), $srcdir));
> +    }
> +
> +    # Symlinks from .git to outside could cause unpack failures, or
> +    # point to files they shouldn't, so check for and don't allow.
> +    if (-l "$srcdir/.git") {
> +        main::error(sprintf(_g("%s is a symlink"), "$srcdir/.git"));
> +    }
> +    my $abs_srcdir=Cwd::abs_path($srcdir);
> +    find(sub {
> +        if (-l $_) {
> +            if (Cwd::abs_path(readlink($_)) !~ /^\Q$abs_srcdir\E(\/|$)/) {
> +                main::error(sprintf(_g("%s is a symlink to outside %s"), $File::Find::name, $srcdir));
> +            }
> +        }
> +    }, "$srcdir/.git");

Maybe it would be easier to just disallow symlinks completely? Or are there important use cases for that?

> +}
> +
> +# Called before a tarball is created, to prepare the tar directory.
> +sub prep_tar {
> +    my $srcdir=shift;
> +    my $tardir=shift;
> +
> +    sanity_check($srcdir);
> +
> +    if (! -e "$srcdir/.git") {
> +        main::error(sprintf(_g("%s is not a git repository, but Format git was specified"), $srcdir));
> +    }
> +    if (-e "$srcdir/.gitmodules") {
> +        main::error(sprintf(_g("git repository %s uses submodules. This is not yet supported."), $srcdir));
> +    }

Duplicated code from sanity_check

> +
> +    # Check for uncommitted files.
> +    open(GIT_STATUS, "LANG=C cd $srcdir && git-status |") ||
> +        main::subprocerr("cd $srcdir && git-status");

you do a lot of cd $srcdir. Maybe you should just chdir() in the parent process? This would also take care of funny things in $srcdir like whitespaces...

> +    my $clean=0;
> +    my $status="";
> +    while (<GIT_STATUS>) {
> +        if (/^\Qnothing to commit (working directory clean)\E$/) {
> +            $clean=1;
> +        }
> +        else {
> +            $status.="git-status: $_";
> +        }
> +    }
> +    close GIT_STATUS;
> +    # git-status exits 1 if there are uncommitted changes or if
> +    # the repo is clean, and 0 if there are uncommitted changes
> +    # listed in the index.
> +    if ($? && $? >> 8 != 1) {
> +        main::subprocerr("cd $srcdir && git status");
> +    }
> +    if (! $clean) {
> +        # To support dpkg-buildpackage -i, get a list of files

dpkg-source -i would be the proper attribution here. dpkg-buildpackage implements -i only as a pass-through option.

> +        # eqivilant to the ones git-status finds, and remove any

is that an English word?

> +        # ignored files from it.
> +        my @ignores="--exclude-per-directory=.gitignore";
> +        my $core_excludesfile=`cd $srcdir && git-config --get core.excludesfile`;
> +        chomp $core_excludesfile;
> +        if (length $core_excludesfile && -e "$srcdir/$core_excludesfile") {
> +            push @ignores, "--exclude-from='$core_excludesfile'";
> +        }
> +        if (-e "$srcdir/.git/info/exclude") {
> +            push @ignores, "--exclude-from=.git/info/exclude";
> +        }
> +        open(GIT_LS_FILES, "cd $srcdir && git-ls-files -m -d -o @ignores |") ||
> +            main::subprocerr("cd $srcdir && git-ls-files");

If you get rid of the cd you could use the '-|', @array form of open here which would be preferable imho.

This is essentially running git-status again without the output beautification... Can't we avoid doing the work twice?

Also I would prefer using long options where available. It's not like anyone has to type them more than once ;)

> +        my @files;
> +        while (<GIT_LS_FILES>) {
> +            chomp;
> +            if (! length $main::diff_ignore_regexp ||
> +                ! m/$main::diff_ignore_regexp/o) {
> +                push @files, $_;
> +            }
> +        }
> +        close(GIT_LS_FILES) ||
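Frank's two suggestions in this review, chdir() in the parent process and the '-|' list form of open, combine naturally. The helper below is a sketch and not code from the patch: its name is invented, and it treats any non-zero exit status as an error, so a caller like the git-status check (which expects exit status 1) would need to relax that.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Run @cmd with $dir as working directory and return its output lines.
# The list form of open() starts the program directly, without a shell,
# so spaces or metacharacters in $dir or in the arguments are harmless.
sub read_command_in_dir {
    my ($dir, @cmd) = @_;
    my $olddir = getcwd();
    chdir $dir or die "cannot chdir to $dir: $!\n";
    my @lines;
    my $ok = eval {
        open(my $fh, '-|', @cmd) or die "cannot run $cmd[0]: $!\n";
        @lines = <$fh>;
        close($fh) or die "$cmd[0] failed (status $?)\n";
        1;
    };
    my $err = $@;
    # Go back even on failure; the caller expects to stay where it was.
    chdir $olddir or die "cannot chdir back to $olddir: $!\n";
    die $err unless $ok;
    chomp @lines;
    return @lines;
}

# Example use with git's long options (requires git and a repository):
# my @changed = read_command_in_dir($srcdir, 'git', 'ls-files',
#     '--modified', '--deleted', '--others',
#     '--exclude-per-directory=.gitignore');
```

Because each argument is passed as its own list element, an exclude file name no longer needs the embedded single quotes that the shell-string version uses.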
Re: [PATCH] proposed v3 source format using .git.tar.gz
One thing I forgot:

On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
> @@ -825,14 +881,17 @@ if ($opmode eq 'build') {
>      if ($native) {
>          warning(_g("multiple tarfiles in native package")) if @tarfiles > 1;
>          warning(_g("native package with .orig.tar"))
> -            unless $seen{'.tar'} or $seen{"-$revision.tar"};
> +            unless $seen{'.tar'} or $seen{"-$revision.tar"} or %vcsfiles;
>      } else {
> -        warning(_g("no upstream tarfile in Files field")) unless $seen{'.orig.tar'};
> +        warning(_g("no upstream tarfile in Files field")) unless $seen{'.orig.tar'} or %vcsfiles;

This should probably error out. Aren't v3 packages always native in the sense tested here?

>          if ($dscformat =~ /^1\./) {
>              warning(sprintf(_g("multiple upstream tarballs in %s format dsc"), $dscformat)) if @tarfiles > 1;
>              warning(sprintf(_g("debian.tar in %s format dsc"), $dscformat)) if $debianfile;
>          }
>      }
> +if (%vcsfiles && $dscformat !~ /^3\./) {
> +    warning(sprintf(_g("rc.tar file in %s format dsc"), $dscformat));
> +}
>  $newdirectory = $sourcepackage.'-'.$baseversion unless defined($newdirectory);
>  $expectprefix = $newdirectory;

Gruesse,
--
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/
Re: [PATCH] proposed v3 source format using .git.tar.gz
Thanks a lot for the code review. Any comments on the big picture or design?

Frank Lichtenheld wrote:
> On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
>> I have a sourcev3 branch with my changes at git://kitenet.net/dpkg, and have also attached a diff to this mail. I feel that this is ready for review and hopefully merging into dpkg now. Looking forward to your comments.
>
> A little code review follows.
>
>> +# You should have received a copy of the GNU General Public License
>> +# along with this program; if not, write to the Free Software
>> +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
>
> old FSF address (not really important, but while we're at it ;)

Copied from elsewhere in dpkg source. :-)

>> +sub sanity_check {
>> +    my $srcdir=shift;
>> +
>> +    if (! -s "$srcdir/.git") {
>> +        main::error(sprintf(_g("%s is not the top directory of a git repository (%s/.git not present), but Format git was specified"), $srcdir, $srcdir));
>
> you probably mean -e or -d here? -s on a directory is kinda strange.
> printing $srcdir twice might bloat the error message.

Yes, I meant -d, the -s snuck in from the other test. ACK on the duplication.

>> +    }
>> +    if (-s "$srcdir/.gitmodules") {
>> +        main::error(sprintf(_g("git repository %s uses submodules. This is not yet supported."), $srcdir));
>> +    }
>> +
>> +    # Symlinks from .git to outside could cause unpack failures, or
>> +    # point to files they shouldn't, so check for and don't allow.
>> +    if (-l "$srcdir/.git") {
>> +        main::error(sprintf(_g("%s is a symlink"), "$srcdir/.git"));
>> +    }
>> +    my $abs_srcdir=Cwd::abs_path($srcdir);
>> +    find(sub {
>> +        if (-l $_) {
>> +            if (Cwd::abs_path(readlink($_)) !~ /^\Q$abs_srcdir\E(\/|$)/) {
>> +                main::error(sprintf(_g("%s is a symlink to outside %s"), $File::Find::name, $srcdir));
>> +            }
>> +        }
>> +    }, "$srcdir/.git");
>
> Maybe it would be easier to just disallow symlinks completely? Or are there important use cases for that?

I've tried to not make dpkg have to know too much about git internals. (As you can see I've not been 100% successful, but have kept it to about the level someone with a week's knowledge of git would be comfortable with.) So while I don't see any symlinks in my git repos, if git decides to use symlinks, I don't want dpkg to have to be updated. (I think git did historically use symlinks in the repo). There are probably semi-valid reasons to manually add symlinks inside a .git directory today, too.

>> +}
>> +
>> +# Called before a tarball is created, to prepare the tar directory.
>> +sub prep_tar {
>> +    my $srcdir=shift;
>> +    my $tardir=shift;
>> +
>> +    sanity_check($srcdir);
>> +
>> +    if (! -e "$srcdir/.git") {
>> +        main::error(sprintf(_g("%s is not a git repository, but Format git was specified"), $srcdir));
>> +    }
>> +    if (-e "$srcdir/.gitmodules") {
>> +        main::error(sprintf(_g("git repository %s uses submodules. This is not yet supported."), $srcdir));
>> +    }
>
> Duplicated code from sanity_check

Doh!

>> +
>> +    # Check for uncommitted files.
>> +    open(GIT_STATUS, "LANG=C cd $srcdir && git-status |") ||
>> +        main::subprocerr("cd $srcdir && git-status");
>
> you do a lot of cd $srcdir. Maybe you should just chdir() in the parent process?

I could make it do that, I suppose it would be safe as long as I cd back (dpkg-source in general assumes it's in the parent dir of the source tree).

> This would also take care of funny things in $srcdir like whitespaces...
> If you get rid of the cd you could use the '-|', @array form of open here which would be preferable imho.

Wow, you've taught me something new, I only knew about the much more clumsy manual fork and open(-|) approach. I'll do this, but it will take a little while.

>> +    my $clean=0;
>> +    my $status="";
>> +    while (<GIT_STATUS>) {
>> +        if (/^\Qnothing to commit (working directory clean)\E$/) {
>> +            $clean=1;
>> +        }
>> +        else {
>> +            $status.="git-status: $_";
>> +        }
>> +    }
>> +    close GIT_STATUS;
>> +    # git-status exits 1 if there are uncommitted changes or if
>> +    # the repo is clean, and 0 if there are uncommitted changes
>> +    # listed in the index.
>> +    if ($? && $? >> 8 != 1) {
>> +        main::subprocerr("cd $srcdir && git status");
>> +    }
>> +    if (! $clean) {
>> +        # To support dpkg-buildpackage -i, get a list of files
>
> dpkg-source -i would be the proper attribution here. dpkg-buildpackage implements -i only as a pass-through option.

True.

>> +        # eqivilant to the ones git-status finds, and remove any
>
> is that an English word?

Even better, a common typo of one. :-)

>> +        # ignored files from it.
>> +        my @ignores="--exclude-per-directory=.gitignore";
>> +        my $core_excludesfile=`cd $srcdir && git-config --get core.excludesfile`;
>> +        chomp
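The two points settled in this exchange, testing the directory with -d and keeping the "symlinks must not point outside" rule instead of banning symlinks, can be illustrated with a standalone sketch. It is not the patch's code: the function name is invented, it resolves each link with abs_path on the link itself rather than on readlink's result, and it chooses to report dangling links as well.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);
use File::Find;

# Return the symlinks found below $dir whose targets resolve to a
# location outside $root (which defaults to $dir). Dangling links are
# reported too, since they cannot be shown to stay inside.
sub escaping_symlinks {
    my ($dir, $root) = @_;
    $root = $dir unless defined $root;
    die "$dir is not a directory\n" unless -d $dir;   # -d, not -s
    my $abs_root = abs_path($root);
    my @bad;
    find({
        no_chdir => 1,
        wanted   => sub {
            my $link = $File::Find::name;
            return unless -l $link;
            # -e follows the link, so it is false for dangling links.
            my $target = -e $link ? abs_path($link) : undef;
            push @bad, $link
                unless defined $target
                    && $target =~ m{^\Q$abs_root\E(?:/|\z)};
        },
    }, $dir);
    return @bad;
}

# Example: scan .git but allow targets anywhere inside the source tree.
# my @bad = escaping_symlinks("$srcdir/.git", $srcdir);
# die "symlinks pointing outside $srcdir: @bad\n" if @bad;
```

This keeps dpkg ignorant of whether git itself uses symlinks, which is the property Joey wants to preserve.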
Re: [PATCH] proposed v3 source format using .git.tar.gz
Frank Lichtenheld wrote:
> One thing I forgot:
>
> On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
>> @@ -825,14 +881,17 @@ if ($opmode eq 'build') {
>>      if ($native) {
>>          warning(_g("multiple tarfiles in native package")) if @tarfiles > 1;
>>          warning(_g("native package with .orig.tar"))
>> -            unless $seen{'.tar'} or $seen{"-$revision.tar"};
>> +            unless $seen{'.tar'} or $seen{"-$revision.tar"} or %vcsfiles;
>>      } else {
>> -        warning(_g("no upstream tarfile in Files field")) unless $seen{'.orig.tar'};
>> +        warning(_g("no upstream tarfile in Files field")) unless $seen{'.orig.tar'} or %vcsfiles;
>
> This should probably error out. Aren't v3 packages always native in the sense tested here?

Not necessarily. I wanted to leave the option open to use wig-n-pen to construct mixed source packages that maybe use vcs for debian/ and pristine source for the rest + a diff.gz, or something like that. I think the code will basically handle unpacking such a mongrel, although there are no tools to create one.

--
see shy jo
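A mixed package of the kind Joey has in mind would list several kinds of component in its .dsc. The sketch below shows one way to sort such a file list. It is only an illustration: the function names and suffix rules are invented (and limited to gzip and to the git and bzr backends mentioned in the thread), and it is not how the quoted dpkg-source hunk fills %seen and %vcsfiles.

```perl
use strict;
use warnings;

# Sort the file names listed in a .dsc into the kinds of component the
# thread talks about. The suffix rules are a simplification and the
# function name is hypothetical.
sub classify_dsc_files {
    my ($source, @files) = @_;
    my %kind = (vcs => [], orig => [], diff => [], tar => [], other => []);
    for my $file (@files) {
        my $type =
              $file =~ /^\Q$source\E_.+\.(?:git|bzr)\.tar\.gz$/ ? 'vcs'
            : $file =~ /^\Q$source\E_.+\.orig\.tar\.gz$/        ? 'orig'
            : $file =~ /^\Q$source\E_.+\.diff\.gz$/             ? 'diff'
            : $file =~ /^\Q$source\E_.+\.tar\.gz$/              ? 'tar'
            :                                                     'other';
        push @{ $kind{$type} }, $file;
    }
    return \%kind;
}

# In the spirit of the quoted hunk: a VCS tarball, a native tarball or
# an upstream tarball each count as providing the source, so the
# "no upstream tarfile" warning would not apply.
sub has_full_source {
    my ($kind) = @_;
    return @{ $kind->{vcs} } || @{ $kind->{tar} } || @{ $kind->{orig} } ? 1 : 0;
}

my $kind = classify_dsc_files('foo',
    'foo_1.0.orig.tar.gz', 'foo_1.0-1.diff.gz', 'foo_1.0-1.git.tar.gz');
print "$_: @{ $kind->{$_} }\n" for grep { @{ $kind->{$_} } } sort keys %$kind;
```

With a classification like this, warning versus erroring out becomes a policy decision on the combination of kinds present, which is the question Frank raises.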