Re: [PATCH] proposed v3 source format using .git.tar.gz

2008-02-11 Thread Otavio Salvador
Joey Hess [EMAIL PROTECTED] writes:

...
 However, stashing away uncommitted changes and not including them in the
 build violates least surprise. I'd expect to see them either committed
 automatically, or the current error forcing me to resolve them before
 building. The advantage to auto-committing, of course, is that you don't
 have to know how to use git (or debcommit) to build a package that uses it.

Erroring out looks like the most robust thing to do.

Otherwise people may often stop committing their changes properly
themselves.

...
 4) aj suggested in this thread to add a Source-Depends field which could
 be used to specify the dependencies needed to unpack the package. I
 guess that could prove useful, but I really would like to avoid that
 all packages need to specify it (even though that might be solvable with
 substvars defined by the plugin). OTOH if dpkg uses an internal
 mechanism to map format to dependencies it would be more difficult for
 other programs like apt to get to this information. Or is this all
 over-engineering and the plugin should check its pre-requisites itself
 and note the dependencies in the error message like the current code
 does?

 One approach would be for dpkg to build a dpkg-dev-git package that
 includes the git format (and depends on git-core), and so on,
 then Format: 3.0 (foo) could be converted to dpkg-dev-foo.

Couldn't dpkg add the needed packages automatically as
build-depends? That looks more logical to me.

-- 
O T A V I O    S A L V A D O R
-
 E-mail: [EMAIL PROTECTED]  UIN: 5906116
 GNU/Linux User: 239058 GPG ID: 49A5F855
 Home Page: http://otavio.ossystems.com.br
-
Microsoft sells you Windows ... Linux gives
 you the whole house.


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Re: [PATCH] proposed v3 source format using .git.tar.gz

2008-02-10 Thread Frank Lichtenheld
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
 I have a sourcev3 branch with my changes at git://kitenet.net/dpkg,
 and have also attached a diff to this mail. I feel that this is ready
 for review and hopefully merging into dpkg now. Looking forward to your
 comments.

I've now added this branch to the official dpkg repository on alioth
with the intention to work on it. I've at least fixed it up so that
it works with the current code base.

After thinking a bit about this proposal I have the following
suggestions for changes that I would like to put up for discussion:

1) I don't really like the current behaviour when there are uncommitted
changes in the package directory. I would suggest as default behaviour
creating a commit containing these changes. This would eliminate the
need for people having to commit changes if they don't really care.

The most elegant solution would probably be to create the commit, clone it
and then do a git reset HEAD^ in the package directory. Don't know if
that is robust enough, though.

Prompting the user for the commit message would probably be best but
would break if people try to run the program non-interactively.

2) Independently of the default behaviour on pack, we should definitely add
a command-line option for the user to choose between the three
possibilities: 1) error out, 2) create a commit, 3) create a commit
interactively.

3) About the plugin interface: I was considering whether it would be
better to move the tar generation into the plugin itself. This would
allow other plugins more flexibility (e.g. generating more than one
file). My masterplan includes making source formats 1.0 and 2.0 plugins
internally ;)

This would of course require moving the tar generation and compression
code into a module that can then be used by the plugins.

4) aj suggested in this thread to add a Source-Depends field which could
be used to specify the dependencies needed to unpack the package. I
guess that could prove useful, but I really would like to avoid that
all packages need to specify it (even though that might be solvable with
substvars defined by the plugin). OTOH if dpkg uses an internal
mechanism to map format to dependencies it would be more difficult for
other programs like apt to get to this information. Or is this all
over-engineering and the plugin should check its pre-requisites itself
and note the dependencies in the error message like the current code
does?
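
The three behaviours from points 1 and 2 (error out, commit automatically, commit interactively) can be sketched as a small decision function. This is only an illustration in Python, not dpkg code: the names DirtyPolicy and plan_for_dirty_tree and the canned commit message are hypothetical, and the function merely returns the git commands a tool might run rather than running them.

```python
from enum import Enum


class DirtyPolicy(Enum):
    """Hypothetical values for the proposed command-line option."""
    ERROR = "error"              # refuse to build
    COMMIT = "commit"            # commit automatically with a canned message
    INTERACTIVE = "interactive"  # commit and let the user write the message


def plan_for_dirty_tree(porcelain_output, policy, has_terminal=True):
    """Return the git commands to run before packing the source.

    porcelain_output is the text printed by `git status --porcelain`;
    an empty string means there are no uncommitted changes.
    """
    if not porcelain_output.strip():
        return []  # clean tree, nothing to do
    if policy is DirtyPolicy.ERROR:
        raise RuntimeError("uncommitted changes in package directory")
    if policy is DirtyPolicy.INTERACTIVE and not has_terminal:
        # This is the non-interactive breakage mentioned in point 1.
        raise RuntimeError("cannot prompt for a commit message without a terminal")
    commands = [["git", "add", "-A"]]
    if policy is DirtyPolicy.INTERACTIVE:
        commands.append(["git", "commit"])  # git opens the user's editor
    else:
        commands.append(["git", "commit", "-m", "Automatic commit by source build"])
    return commands
```

Returning the plan instead of executing it keeps the policy decision separate from the git invocation, which makes the default easy to change later.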

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2008-02-10 Thread Joey Hess
Frank Lichtenheld wrote:
 I've now added this branch to the official dpkg repository on alioth
 with the intention to work on it. I've at least fixed it up so that
 it works with the current code base.

Excellent. I had kept it merged to master, but haven't checked that it's
not bit-rotted lately.

 After thinking a bit about this proposal I have the following
 suggestions for changes that I would like to put up for discussion:
 
 1) I don't really like the current behaviour when there are uncommitted
 changes in the package directory. I would suggest as default behaviour
 creating a commit containing these changes. This would eliminate the
 need for people having to commit changes if they don't really care.
 
 The most elegant solution would probably be to create the commit, clone it
 and then do a git reset HEAD^ in the package directory. Don't know if
 that is robust enough, though.

Sounds like git stash?

However, stashing away uncommitted changes and not including them in the
build violates least surprise. I'd expect to see them either committed
automatically, or the current error forcing me to resolve them before
building. The advantage to auto-committing, of course, is that you don't
have to know how to use git (or debcommit) to build a package that uses it.

 Prompting the user for the commit message would probably be best but
 would break if people try to run the program non-interactively.

I don't think it's a good idea to prompt for a commit message.

 2) Independently of the default behaviour on pack, we should definitely add
 a command-line option for the user to choose between the three
 possibilities: 1) error out, 2) create a commit, 3) create a commit
 interactively.

Not sure what you mean here?

 3) About the plugin interface: I was considering whether it would be
 better to move the tar generation into the plugin itself. This would
 allow other plugins more flexibility (e.g. generating more than one
 file). My masterplan includes making source formats 1.0 and 2.0 plugins
 internally ;)
 
 This would of course require moving the tar generation and compression
 code into a module that can then be used by the plugins.

That would of course be fine. I didn't want to touch doing that in my
branch for obvious reasons. :-)
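
As a rough illustration of the split discussed in point 3, here is a Python sketch (dpkg itself is Perl) of a shared tar helper used by two format plugins, one of which produces more than one file. Everything in it is an assumption made for illustration: make_tarball, GitFormat, OrigPlusDebianFormat, the build() signature and the output file names are hypothetical and not part of dpkg or of the patch.

```python
import io
import tarfile


def make_tarball(files, compression="gz"):
    """Shared helper: pack a {name: bytes} mapping into a compressed tar."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:" + compression) as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


class GitFormat:
    """Hypothetical plugin that emits a single tarball."""

    def build(self, package, version, files):
        return {"%s_%s.git.tar.gz" % (package, version): make_tarball(files)}


class OrigPlusDebianFormat:
    """Hypothetical plugin that emits two tarballs from the same tree."""

    def build(self, package, version, files):
        debian = {n: d for n, d in files.items() if n.startswith("debian/")}
        upstream = {n: d for n, d in files.items() if n not in debian}
        return {
            "%s_%s.orig.tar.gz" % (package, version): make_tarball(upstream),
            "%s_%s.debian.tar.gz" % (package, version): make_tarball(debian),
        }
```

Because each plugin returns a mapping of output names to contents, the caller does not need to know how many files a given format generates.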

 4) aj suggested in this thread to add a Source-Depends field which could
 be used to specify the dependencies needed to unpack the package. I
 guess that could prove useful, but I really would like to avoid that
 all packages need to specify it (even though that might be solvable with
 substvars defined by the plugin). OTOH if dpkg uses an internal
 mechanism to map format to dependencies it would be more difficult for
 other programs like apt to get to this information. Or is this all
 over-engineering and the plugin should check its pre-requisites itself
 and note the dependencies in the error message like the current code
 does?

One approach would be for dpkg to build a dpkg-dev-git package that
includes the git format (and depends on git-core), and so on,
then Format: 3.0 (foo) could be converted to dpkg-dev-foo.
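
The conversion from a Format: value to a helper package name could be as simple as the following Python sketch. The dpkg-dev-foo naming comes from the suggestion above; the function name helper_package_for_format and the exact syntax accepted by the regular expression are assumptions made for illustration.

```python
import re

# Assumed syntax: "<major>.<minor>" optionally followed by "(<variant>)".
FORMAT_RE = re.compile(r"^(\d+\.\d+)(?:\s+\(([a-z0-9][a-z0-9+.-]*)\))?$")


def helper_package_for_format(format_value):
    """Map a Format: field value to the helper package needed to unpack it.

    Returns None when the format names no variant, so that nothing beyond
    dpkg-dev itself would be required.
    """
    match = FORMAT_RE.match(format_value.strip())
    if match is None:
        raise ValueError("unrecognised Format value: %r" % format_value)
    _version, variant = match.groups()
    if variant is None:
        return None
    return "dpkg-dev-" + variant
```

Other tools such as apt could apply the same mapping without asking dpkg, which is one way to address the concern about an internal format-to-dependency table.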

-- 
see shy jo


signature.asc
Description: Digital signature


Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-17 Thread Frank Lichtenheld
On Tue, Oct 16, 2007 at 01:42:06PM -0400, Joey Hess wrote:
 Phillip Susi wrote:
  Joey Hess wrote:
  A sample dpkg source package built using this is at
  http://kitenet.net/~joey/tmp/git-demo/. This demo package includes only
  the last 200 commits to the dpkg git repo, so it's more than 1 MB *smaller*
  than dpkg's normal .tar.gz!
 
  What was removed from the source tree when importing it into git to save 
  this space?
 
 Like I said, I included only the last 200 commits in the git repo.

Note that he was talking about the size of the working tree, not the git
repository.

The distribution tarball contains more than a checkout from dpkg's git
repository, though, since it also contains the files copied and/or
generated by autoreconf. However, this shouldn't really make a huge difference
in size.

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-17 Thread Frank Lichtenheld
On Wed, Oct 17, 2007 at 05:24:10PM -0400, Phillip Susi wrote:
 Exactly... it seemed to be an 8 MB difference though, which would 
 account for why the git repo was smaller; it started with 8 MB less 
 files.  My point is that git doesn't magically make the same set of 
 files plus their history smaller than just the original set of files.
 
 When I poked around with du a bit it looks like the missing space in the 
 git repo is mostly those .po files.  Are these auto generated?  If so, 
 why are they included in the source package?

You are most likely speaking about the .gmo files, not the .po files.

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-16 Thread Ian Jackson
Manoj Srivastava writes (Re: [PATCH] proposed v3 source format using 
.git.tar.gz):
 On Mon, 15 Oct 2007 17:55:13 +0100, Ian Jackson [EMAIL PROTECTED] said: 
  Manoj Srivastava writes (Re: [PATCH] proposed v3 source format using
  .git.tar.gz):
  Well, this is tricky. I am not sure how the NMU'er communicates with
  the developer; I assume it is by sending in a diff. If so, this works
  with an arch checked out dir, and unmodified dpkg.
 
  Ideally the NMUer would simply upload and would not need to send a
  diff to the BTS.
 
  The maintainer would fetch the source from the archive and would be
  able to commit the NMUer's changes and then merge etc. appropriately.
 
 This works better for the distributed VCS's with the model that
  every checkout contains a copy of the whole repository. With a
  distributed model where every checkout does not pull in a copy of the
  repo, this means the NMU'er would have to have write access to the repo
  (unlikely), or create their own public repo with tagged version of the
  software, or send in a diff.

I was talking about the case where the NMUer is RCS-naive.

They download the source, edit it, test it, and upload it, all using
the standard tools (apt-get source, dpkg-source,
dpkg-buildpackage etc.).

Obviously this means that the NMUer's download, and their
corresponding upload, have to contain a working tree.  By this I mean
it has to contain, or imply in a way that the tools can construct,
both a complete set of the actual checked-out source code, and also an
indication of what the version was that was checked out (the
information that CVS puts in the CVS/Entries file) so that it can be
merged properly later.

Ian.





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-16 Thread Joey Hess
Phillip Susi wrote:
 Joey Hess wrote:
 A sample dpkg source package built using this is at
 http://kitenet.net/~joey/tmp/git-demo/. This demo package includes only
 the last 200 commits to the dpkg git repo, so it's more than 1 MB *smaller*
 than dpkg's normal .tar.gz!

 What was removed from the source tree when importing it into git to save 
 this space?

Like I said, I included only the last 200 commits in the git repo.

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-15 Thread Ian Jackson
Manoj Srivastava writes (Re: [PATCH] proposed v3 source format using 
.git.tar.gz):
 Well, this is tricky. I am not sure how the NMU'er communicates
  with the developer; I assume it is by sending in a diff. If so, this
  works with an arch checked out dir, and unmodified dpkg.

Ideally the NMUer would simply upload and would not need to send a
diff to the BTS.

The maintainer would fetch the source from the archive and would be
able to commit the NMUer's changes and then merge etc. appropriately.

Ian.





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-11 Thread Anthony Towns
On Tue, Oct 09, 2007 at 06:58:19PM +0100, Ian Jackson wrote:
 [...] Goals I would suggest:

 * Abolish dpatch (and similar excrescences) and specifically to get
   back to the point where a Debian source package can be unpacked to
   the point of seeing the source code without having to execute any of
   it.

Really, that's probably the most valuable part of this, even if not
the most interesting -- having a sane way to unpack source packages to
the *actual* working tree makes it much more sane to do analysis of the
source, hack on it, whatever.

And something that works for a pure tarball of a .git directory all the
way to unpacked .c files seems like it should certainly be general enough
to achieve that. That seems (to me) like it means:

- keep the perl module structure Joey's created and expect to
  use it with other ways of dealing with patches internally to a
  source package (quilt, bzr, darcs, whatever)

- finalise the remaining tweaks: drop the bracketed (git) from
  the Format: field and handle it some other way? add a
  Source-Depends: field?

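 * Make it possible (once more) for NMUers to make a change to a
   package without having to learn and interact with a revision control
   system, even if the maintainers are using one.  Ie, make it possible
   to acquire the source, inspect it, edit it, build it, test it, and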
   upload it, using only tools which either do not depend on the RCS or
   which entirely hide it, without disrupting or being disrupted by the
   revision control system.

It seems... remarkable that making the source package format more
dependent on the revision control system would make NMUers and others
more able to ignore it.

The remaining big question seems to be whether to have Debian source
packages include the working tree directly so people don't need git to
get at it; but that seems to me something that can be decided by policy
mechanisms outside dpkg.

So, afaics, the dpkg maintainers should:

- add Source-Depends:   (I'm biassed :)
- upload dpkg with modular support to unstable
- upload git/bzr support as part of either dpkg or the git/bzr
  packages, with appropriate autogenerated Source-Depends: 

and ftpmaster should start accepting git/bzr source packages to
experimental so we can get some practical experience with the format, and
decide whether to have .git.tgz or .git+.orig+.xdelta .tgz's or whatever to
unstable.

I'd expect we'd either wait for lenny to release, or an updated dpkg with
Format:3.0 support to be in an etch point release before accepting such
packages in unstable either way, but better to get started sooner, afaics.

Cheers,
aj





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-10 Thread Phillip Susi

Ian Jackson wrote:
 Manoj Srivastava writes (Re: [PATCH] proposed v3 source format using
 .git.tar.gz):
  What exactly is the goal of this dpkg addition?

 This is a sensible question to ask.  Goals I would suggest:


I find myself wondering the same thing.  It seems to me that one of the 
main functions that the debian source format fulfills is essentially 
that of a version control system.  It allows multiple versions to 
coexist in the archive, provides a change log to track the history, has 
tools to examine changes across revisions in detail ( debdiff ), and so 
on.  While less refined than VCS like git, svn, et al, the debian source 
format does manage to provide the core functions of a VCS.


Therefore, I ask, why would you pack one VCS ( git ) inside another ( 
deb src )?



* Enable all people who work with a Debian source package to do so
  with the benefits of the distributed revision control system in use,
  which includes smart merging, and so forth;

* Specifically, to enable the above for NMUers in such a way that
  a minimum of additional work is needed by the maintainer to merge
  changes.

* Abolish dpatch (and similar excrescences) and specifically to get
  back to the point where a Debian source package can be unpacked to
  the point of seeing the source code without having to execute any of
  it.

* Make life easier for derived distributions by making it possible for
  them to merge from us, and us from them, using all of the usual
  features of the RCSs in use.

* Make it possible (once more) for NMUers to make a change to a
  package without having to learn and interact with a revision control
  system, even if the maintainers are using one.  Ie, make it possible
  to acquire the source, inspect it, edit it, build it, test it, and
  upload it, using only tools which either do not depend on the RCS or
  which entirely hide it, without disrupting or being disrupted by the
  revision control system.

* When an RCS-agnostic NMUer has done their work, still give the
  benefit of the RCS to the maintainer (and others) when merging the
  NMUer's work.


This is a nice set of goals, and if we are ok with leaving behind the 
current source package format to achieve this, then it seems to me that 
using git ( or possibly another VCS ) is a good way to do this, but if 
you are going to use git, then _really_ use it.  Convert the archive 
over into a bunch of git repositories - one for each package, and be 
done with it.  Why go into it half assed by packaging git inside the old 
format?






Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-10 Thread Adeodato Simó
* Phillip Susi [Wed, 10 Oct 2007 14:25:46 -0400]:

 Why go into it half assed by packaging git inside the old format?

Because otherwise the change won't happen (TM).

-- 
Adeodato Simó dato at net.com.org.es
Debian Developer  adeodato at debian.org
 
Guy on cell: Yeah, I mean she's not easy to talk to, because, you know,
she'll be like, What did you do this weekend? and I'll say, Nothing,
but really I was fucking some other girl.
-- http://www.overheardinnewyork.com/archives/003179.html





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-10 Thread Phillip Susi

Adeodato Simó wrote:
 * Phillip Susi [Wed, 10 Oct 2007 14:25:46 -0400]:
  Why go into it half assed by packaging git inside the old format?

 Because otherwise the change won't happen (TM).



Why is that a bad thing?  What good does it do to have the git repo 
packed inside the source archive?  How is that any better than just 
using git yourself and leaving the archives alone?







Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-10 Thread Joey Hess
Phillip Susi wrote:
 Why is that a bad thing?  What good does it do to have the git repo packed 
 inside the source archive?

http://kitenet.net/~joey/blog/entry/an_evolutionary_change_to_the_Debian_source_package_format/

-- 
see shy jo, over and over, and out




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-10 Thread Raphael Hertzog
Hi,

On Wed, 10 Oct 2007, Phillip Susi wrote:
 Adeodato Simó wrote:
 * Phillip Susi [Wed, 10 Oct 2007 14:25:46 -0400]:
 Why go into it half assed by packaging git inside the old format?
 Because otherwise the change won't happen (TM).

 Why is that a bad thing?  What good does it do to have the git repo packed 
 inside the source archive?  How is that any better than just using git 
 yourself and leaving the archives alone?

Because Debian is all about cooperation and making the git repository
available is an essential step in the process. We currently use
alioth.debian.org for that purpose but it's not related to our standard
packaging process and the logic to go further is either the idea of Joey
(upload git repository as source) or someone in ftpmaster that implements
a direct connection between a git repository and incoming, so that we can
upload packages through a git repository (and thus have a real canonical
git repository for a given package).

We can rarely afford to design from scratch and must take into account
various parameters... such as the fact that nobody has done the second
variant yet, while Joey did the first one.

Cheers,
-- 
Raphaël Hertzog

Premier livre français sur Debian GNU/Linux :
http://www.ouaza.com/livre/admin-debian/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-10 Thread Phillip Susi

Raphael Hertzog wrote:

Because Debian is all about cooperation and making the git repository
available is an essential step in the process. We currently use
alioth.debian.org for that purpose but it's not related to our standard
packaging process and the logic to go further is either the idea of Joey
(upload git repository as source) or someone in ftpmaster that implements
a direct connection between a git repository and incoming, so that we can
upload packages through a git repository (and thus have a real canonical
git repository for a given package).


Connecting the git repository to the ftp archive is a good idea, but 
that is not what this thread is about.  This thread is about packaging 
the git repository directly into the source archive, and I do not see 
any benefit to that.  Why not keep the existing source ftp archive as 
is, but connect it with the git repo so that a change in the release 
branch of the git repo automatically generates the new source archive, 
rather than tar up the whole repo and make that the archive?







Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-09 Thread Manoj Srivastava
On Tue, 9 Oct 2007 14:17:17 +1000, Anthony Towns [EMAIL PROTECTED] said: 

 So that leaves:

 I still think that shipping a full working dir, with no dpkg changes,
 seem to be the way to go, along with a tla grab file, which I think I
 should consider putting into the package itself (If I can work around
 the chicken and egg issue of adding a grab file changes the source
 revision which means the grab file should change which means a new
 revision is needed )

 If you're just distributing a snapshot, rather than a full repository
 as Joey's basically proposing, why can't your grab file be
 autogenerated? ie,

   1. hack on the source, merge changes, blahblah, finish, tag
   2. do a checkout from version control
   3. autogenerate anything necessary
   4. create source package
   5. build
   6. upload

 If you're using pristine-tar to create a pristine .orig.tgz from your
 repo (rather than keeping one around), that needs to be autogenerated
 at step 3 too, afaics. Worst case you could check the autogenerated
 files into a parallel repository and use a config or something,
 afaics.

I can (and do) autogenerate the grab file -- and I guess I can
 add it to the source package after I check things out of the version
 control. I guess I was quibbling over having stuff in the source
 package that was not  in my version control and not generated by dpkg
 and friends -- but even I can see it is a pretty weak quibble.

Anyway, thanks for the clarifications: I'll just re-start
 shipping a full working dir in the source tree, along with a grab file
 for registration; the overhead is pretty minimal compared to that of
 the full repo that git ships; and if people can deal with .git dirs,
 they can deal with {arch} and .arch-id dirs as well.

Which concludes my involvement in this thread.

manoj
-- 
He flung himself on his horse and rode madly off in all directions.
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-09 Thread Ian Jackson
Firstly I'd just like to say that I think this is a fantastic
direction to be heading in.  I look forward very much to the demise of
dpatch :-).

I do however very much share Colin's view about the desirability of
preserving the .orig.tar.gz's, the ability to unpack a Debian source
package with non-Debian tools, and the ability to unpack a source
package without needing to install a suitably recent one of fourteen
possible revision control systems :-).

On Sun, Oct 07, 2007 at 10:18:17PM +1000, Anthony Towns wrote:
 (I have a strong adverse reaction to duplicated information, so shipping
 the working tree in .git format and .orig.tar.gz format irks me,
 particularly if it's required)

Like Colin, I can quite understand this point of view.  I'd like to
make a completely crazy suggestion.

How about we ship the .orig.tar.gz, plus an rsync batched update (with
a suitably early rsync version) which turns the unpacked source into
working tree plus revision history ?

Ian.





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-09 Thread Ian Jackson
Manoj Srivastava writes (Re: [PATCH] proposed v3 source format using 
.git.tar.gz):
 What exactly is the goal of this dpkg addition?

This is a sensible question to ask.  Goals I would suggest:

* Enable all people who work with a Debian source package to do so
  with the benefits of the distributed revision control system in use,
  which includes smart merging, and so forth;

* Specifically, to enable the above for NMUers in such a way that
  a minimum of additional work is needed by the maintainer to merge
  changes.

* Abolish dpatch (and similar excrescences) and specifically to get
  back to the point where a Debian source package can be unpacked to
  the point of seeing the source code without having to execute any of
  it.

* Make life easier for derived distributions by making it possible for
  them to merge from us, and us from them, using all of the usual
  features of the RCSs in use.

* Make it possible (once more) for NMUers to make a change to a
  package without having to learn and interact with a revision control
  system, even if the maintainers are using one.  Ie, make it possible
  to acquire the source, inspect it, edit it, build it, test it, and
  upload it, using only tools which either do not depend on the RCS or
  which entirely hide it, without disrupting or being disrupted by the
  revision control system.

* When an RCS-agnostic NMUer has done their work, still give the
  benefit of the RCS to the maintainer (and others) when merging the
  NMUer's work.

Ian.





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-09 Thread Joey Hess
Ian Jackson wrote:
 How about we ship the .orig.tar.gz, plus an rsync batched update (with
 a suitably early rsync version) which turns the unpacked source into
 working tree plus revision history ?

I'm afraid that due to consisting of many small gzipped components,
.git is not amenable to being efficiently binary deltaed, so you'll
still end up with approximately double the data. This is probably true of
many revision control backends, though not all .. you might be able to
do it with CVS.
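
The effect can be illustrated with a small Python experiment. It is only a rough sketch under stated assumptions: it does not use rsync, xdelta or git at all, the test data is synthetic, and delta_cost is a hypothetical proxy that measures how many extra compressed bytes the new version costs once the compressor has already seen the old one.

```python
import hashlib
import zlib


def make_text(changed_lines=()):
    """Build 100 lines of synthetic text; selected lines get new content."""
    lines = []
    for i in range(100):
        seed = ("changed %d" if i in changed_lines else "line %d") % i
        digest = hashlib.sha256(seed.encode()).hexdigest()
        lines.append("line %d: %s\n" % (i, digest))
    return "".join(lines).encode()


def delta_cost(old, new):
    """Crude stand-in for a binary delta: the extra compressed bytes needed
    for `new` when `old` is already available to the compressor."""
    return len(zlib.compress(old + new, 9)) - len(zlib.compress(old, 9))


old = make_text()
new = make_text(changed_lines=set(range(0, 100, 10)))  # edit every 10th line

# Delta between the plain files versus delta between their compressed forms.
raw_cost = delta_cost(old, new)
packed_cost = delta_cost(zlib.compress(old, 9), zlib.compress(new, 9))

print("delta cost, uncompressed inputs:   ", raw_cost)
print("delta cost, zlib-compressed inputs:", packed_cost)
```

Small edits to the plain text leave most of it shareable, whereas the same edits shift and re-encode the compressed byte stream, so far less of it can be reused.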

It might be possible to start with the pristine source, check it into
git, and apply a set of git packs that merges the resulting repository
forward to match the maintainer's git repository. However, I think this
could only work if the maintainer's git repository began with importing
that same pristine source[1]. Which means throwing away your git repo for
each new upstream version and starting afresh, which doesn't seem very
practical.

-- 
see shy jo

[1] git's sha1sums are AIUI based on the entire history of the repo,
so you can't go back and change history
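
The point of the footnote can be shown with a deliberately simplified model in Python. This is not git's real object format: commit_id and history_ids are hypothetical helpers that only mimic the property that each commit's id covers its parent's id, so the id of the newest commit depends on everything before it.

```python
import hashlib


def commit_id(parent_id, tree_snapshot):
    """Simplified stand-in for a commit hash: covers parent id and content."""
    payload = b"parent " + parent_id.encode() + b"\n" + tree_snapshot
    return hashlib.sha1(payload).hexdigest()


def history_ids(snapshots):
    """Return the chain of commit ids for a linear history of snapshots."""
    ids = []
    parent = ""
    for snapshot in snapshots:
        parent = commit_id(parent, snapshot)
        ids.append(parent)
    return ids


# Same later change, but the two histories start from different imports.
maintainer = history_ids([b"upstream 1.0 as imported by the maintainer",
                          b"debian changes"])
rebuilt = history_ids([b"pristine 1.0 tarball contents",
                       b"debian changes"])

print("ids match:", maintainer[-1] == rebuilt[-1])
```

In this model the two histories only converge on the same ids if they start from the same first snapshot, which is why the maintainer's repository would have to begin with the identical pristine import.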




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-09 Thread Joey Hess
FWIW, I listed my goals and reasons for working on this in the blog post
linked to in the head of this thread.

I feel that I should bow out of this thread here. I've presented an
idea, a working implementation, and addressed the issues with it to the
best of my ability. Far too many times in this project I've seen a good
idea be indefinitely delayed or killed when everyone piles on and
nitpicks it to death. This idea is in danger of that happening.

If the dpkg maintainers decide to add support for this format to dpkg,
I'll be happy to work with them to make any further fixes needed to my
patch. (My git repo has a couple more fixes in it BTW.)

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-09 Thread Manoj Srivastava
On Tue, 9 Oct 2007 16:42:38 -0400, Joey Hess [EMAIL PROTECTED] said: 

 FWIW, I listed my goals and reasons for working on this in the blog
 post linked to in the head of this thread.

 I feel that I should bow out of this thread here. I've presented an
 idea, a working implementation, and addressed the issues with it to
 the best of my ability. Far too many times in this project I've seen a
 good idea be indefinitely delayed or killed when everyone piles on and
 nitpicks it to death. This idea is in danger of that happening.

I do apologize if my quest for understanding your proposal
 sounded like nitpicking; that was not my intent. I truly did not
 understand what I needed to do while using arch (and it turns out no
 changes are really required in dpkg for arch).

manoj
 feeling obtuse
-- 
Suicide is the sincerest form of self-criticism. Donald Kaul
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-09 Thread Manoj Srivastava
On Tue, 9 Oct 2007 18:58:19 +0100, Ian Jackson
[EMAIL PROTECTED] said:  

I am going to comment on this with my "I use arch" hat on.

 Manoj Srivastava writes (Re: [PATCH] proposed v3 source format using
 .git.tar.gz):
 What exactly is the goal of this dpkg addition?

 This is a sensible question to ask.  Goals I would suggest:

Thanks for clarifying.

 * Enable all people who work with a Debian source package to do so
   with the benefits of the distributed revision control system in use,
   which includes smart merging, and so forth;

This, of course, means you have to have the distributed SCM
 system installed and configured, and perhaps a bit of configuration
 work done. 

Shipping an arch working dir, with {arch} and .arch-ids, allows
 people to see the log history, and, after they have registered the
 repository this was checked from, to do diffs and so on.  Commits won't
 be possible unless they have commit access to the distributed repo; but
 they can tag/branch to their local repo, and ask the developer to pull
 from there.

This requires no dpkg change.

 * Specifically, to enable the above for NMUers in such a way that a
   minimum of additional work is needed by the maintainer to merge
   changes.

Sure. Tag the checked out tree to a repo you have commit rights
 to, ask developers to pull from there.

 * Abolish dpatch (and similar excrescences) and specifically to get
   back to the point where a Debian source package can be unpacked to
   the point of seeing the source code without having to execute any of
   it.

All for it.

 * Make life easier for derived distributions by making it possible for
   them to merge from us, and us from them, using all of the usual
   features of the RCSs in use.

ok

 * Make it possible (once more) for NMUers to make a change to a
   package without having to learn and interact with a revision control
   system, even if the maintainers are using one.  Ie, make it possible
   to acquire the source, inspect it, edit it, build it, test it, and
   upload it, using only tools which either do not depend on the RCS or
   which entirely hide it, without disrupting or being disrupted by the
   revision control system.

Hmm, OK. Well, as long as people ignore the extra directories,
 shipping an arch checked out dir will allow people to work with plain
 old make, etc, with no changes to dpkg.

 * When an RCS-agnostic NMUer has done their work, still give the
   benefit of the RCS to the maintainer (and others) when merging the
   NMUer's work.

Well, this is tricky. I am not sure how the NMU'er communicates
 with the developer; I assume it is by sending in a diff. If so, this
 works with an arch checked out dir, and unmodified dpkg.

So, in conclusion, I can happily say that no change in dpkg is
 needed to help arch-using developers accomplish these goals; they
 need only stop stripping out the {arch} and .arch-ids directories to
 accomplish all of these.

Silencing Lintian would be a good start.

manoj
-- 
If I am elected, the concrete barriers around the WHITE HOUSE will be
replaced by tasteful foam replicas of ANN MARGARET!
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-08 Thread Frank Lichtenheld
On Sun, Oct 07, 2007 at 10:59:18PM -0500, Manoj Srivastava wrote:
 On Mon, 8 Oct 2007 02:55:37 +0200, Frank Lichtenheld [EMAIL PROTECTED] 
 said: 
  Let's not exaggerate. At least for git the repository will usually be
  smaller or only a little larger than the working directory. It will
  probably compress worse though.
 How is this magic done? If I have several dozen feature
  branches, all feeding back and forth, and have made lots and lots of
  changes in my sources, how does git preserve all this information
  without a commensurate increase in size?  This makes the information
  theory geek in me very very skeptical.

By already using compression in the repository and by aggressively
storing data as delta against earlier versions (both for binary and
textual data).
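
A toy illustration of the point, in Python. This is not git's packfile
format (git uses explicit delta encoding plus zlib); it only shows, with
made-up file content, that a second revision which is nearly identical to
the first costs very little extra once both are stored in one compressed
stream:

```python
import zlib

# Two revisions of a made-up file that differ in a single line.
rev1 = "".join(f"line {i}: some source text\n" for i in range(200)).encode()
rev2 = rev1.replace(b"line 100: some source text", b"line 100: a changed line")

# Compressing each revision on its own pays the full price twice.
separately = len(zlib.compress(rev1)) + len(zlib.compress(rev2))

# Compressing them as one stream lets the second revision be encoded
# mostly as back-references to the first.
together = len(zlib.compress(rev1 + rev2))

print(f"one revision compressed:   {len(zlib.compress(rev1))} bytes")
print(f"two revisions, separately: {separately} bytes")
print(f"two revisions, one stream: {together} bytes")
```

How much this helps for a real repository depends on how similar the
revisions are and on how the repository has been packed.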

 Or are you talking about typical usage, and is that why people
  go around making shallow copies to cut down on the size of the
  shipped repo?

Shallow copies are not a very typical thing to do, IME.

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-08 Thread Goswin von Brederlow
Joey Hess [EMAIL PROTECTED] writes:

 Frank Lichtenheld wrote:
 This should probably error out. Aren't v3 packages always native in the
 sense tested here?

 Not necessarily. I wanted to leave the option open to use wig-n-pen to
 construct mixed source packages that maybe use vcs for debian/ and
 pristine source for the rest + a diff.gz, or something like that.

 I think the code will basically handle unpacking such a mongrel,
 although there are no tools to create one.

 -- 
 see shy jo

Shouldn't we allow any number of files of any kind in the dsc, with
dpkg-source unpacking/applying each of them in turn? For example you
could have:

Files:
  yyy foobar.orig.tar.bz2
  yyy images.tar
  yyy debian.git.tar.gz
  yyy security.diff.gz

dpkg-source would unpack the orig.tar.bz2 first, then add the
images.tar, merge the debian.git and last apply the security patch.

Dpkg-source should record the files it used to construct a source dir
in debian/something so that subsequent source builds can recreate the
procedure. When building source the last entry should be modified
where possible or a new diff.gz added otherwise. Meaning dpkg should
unpack foobar.orig.tar.bz2, images.tar and debian.git.tar.gz and then
create a new security.diff.gz in this case.
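
A minimal sketch of that ordering, in Python. The suffix table and the
function names plan_unpack and files_to_rebuild are hypothetical and not
part of dpkg-source; the code only plans the steps and touches no files:

```python
# Hypothetical mapping from file suffix to the action dpkg-source would
# take. Checked in order, so the more specific suffixes come first.
ACTIONS = [
    (".orig.tar.bz2", "unpack"),
    (".orig.tar.gz", "unpack"),
    (".git.tar.gz", "merge-vcs"),
    (".svn.tar.gz", "merge-vcs"),
    (".diff.gz", "apply-patch"),
    (".tar", "unpack-over"),
]


def plan_unpack(files):
    """Return (action, filename) steps in the order listed in the dsc."""
    steps = []
    for name in files:
        for suffix, action in ACTIONS:
            if name.endswith(suffix):
                steps.append((action, name))
                break
        else:
            raise ValueError(f"don't know how to handle {name!r}")
    return steps


def files_to_rebuild(files):
    """On a source build, reuse every file except a trailing diff.gz,
    which is regenerated. Returns (kept, regenerated); regenerated is
    None when there is no trailing diff.gz to replace."""
    if files and files[-1].endswith(".diff.gz"):
        return list(files[:-1]), files[-1]
    return list(files), None


dsc_files = [
    "foobar.orig.tar.bz2",
    "images.tar",
    "debian.git.tar.gz",
    "security.diff.gz",
]
for action, name in plan_unpack(dsc_files):
    print(f"{action:12} {name}")
```

Keeping the order in the dsc itself means no separate ordering rules are
needed; the file list is the recipe.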

Tools like svn-buildpackage could create a new debian.svn.tar.gz file
before building source and dpkg could skip adding an empty diff.gz to
the end of the dsc in such a case. For many projects you would then
end up with:

Files:
  yyy foobar.orig.tar.gz
  yyy debian.svn.tar.gz

(or whatever VCS is used).

MfG
Goswin





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-08 Thread Mark Brown
On Mon, Oct 08, 2007 at 12:59:52PM +0200, Frank Lichtenheld wrote:
 On Sun, Oct 07, 2007 at 10:59:18PM -0500, Manoj Srivastava wrote:

  How is this magic done? If I have several dozen feature
   branches, all feeding back and forth, and have made lots and lots of
   changes in my sources, how does git preserve all this information
   without a commensurate increase in size?  This makes the information
   theory geek in me very very skeptical.

 By already using compression in the repository and by aggressively
 storing data as delta against earlier versions (both for binary and
 textual data).

For reference, a current clone I have of Linus' linux-2.6 repository
with full history and working tree is 489M of which 194M is .git.

-- 
You grabbed my hand and we fell into it, like a daydream - or a fever.




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-08 Thread Manoj Srivastava
On Mon, 8 Oct 2007 12:59:52 +0200, Frank Lichtenheld [EMAIL PROTECTED] said: 

 On Sun, Oct 07, 2007 at 10:59:18PM -0500, Manoj Srivastava wrote:
 On Mon, 8 Oct 2007 02:55:37 +0200, Frank Lichtenheld
 [EMAIL PROTECTED] said:
  Let's not exaggerate. At least for git the repository will usually be
  smaller or only a little larger than the working directory. It will
  probably compress worse though.
 How is this magic done? If I have several dozen feature branches, all
 feeding back and forth, and have made lots and lots of changes in my
 sources, how does git preserve all this information without a
 commensurate increase in size?  This makes the information theory
 geek in me very very skeptical.

 By already using compression in the repository and by aggressively
 storing data as delta against earlier versions (both for binary and
 textual data).

Well, arch does this in the repo: base versions and cacherevs
 are tar.gz files, and then it stores deltas from the most recent base
 version or cached revisions (I generally cache every 20th revision).

In any case, I think the kinds of actions taken by Joey's and
 Colin's patches are probably not things that we'll have to do to
 support shipping an arch working directory in the source package; if we
 have {arch} and .arch-ids dirs in the source, the end user has access
 to the distributed version control system as soon as they register the
 archive location mentioned in the control file entry.

I am not sure how the pristine-tar bit fits into the picture
 yet. 

manoj
-- 
Eighty percent of married men cheat in America.  The rest cheat in
Europe. Jackie Mason
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-08 Thread Anthony Towns
On Mon, Oct 08, 2007 at 09:16:52AM -0500, Manoj Srivastava wrote:
 In any case, I think the kinds of actions taken by Joey's and
  Colin's patches are probably not things that we'll have to do to
  support shipping an arch working directory in the source package; if we
  have {arch} and .arch-ids dirs in the source, the end user has access
  to the distributed version control system; 

Joey's thing lets you do a clean tarball that only contains the git
(or bzr, or darcs) information, and recreates the working directory by
a checkout. 

For CVS the equivalent would be shipping the CVSROOT, for rcs the
equivalent would be shipping only the ,v files. If you don't have git,
you can't do *anything* with a .git.tar.gz source package. If you unpack
it by hand, all you get is the .git directory -- no debian/control,
no debian/rules, nothing.

You could do something similar with darcs/git/bzr atm simply by shipping
the .git, _darcs or .bzr directories as part of your source package --
that's discouraged atm because it's duplicate information that bloats the
source package, but it's entirely possible -- some ifupdown uploads have
included the _darcs directory, eg.

Ultimately, it turns the source package into a snapshot of not just the
current codebase, but the history as well -- or in the case of a shallow
tree, the recent history.

What's the point of that?

There may not be any -- if you're just packaging something that's
completely straightforward, just adding a pointer to the official
repository is probably the most sensible thing to do anyway; whether
that be a subversion url or a tla grab file, or something else, and
you can already do that.

Where it starts becoming relevant (afaics) is when there's a
Debian-specific patch history (either due to it being a native package,
complicated packaging, or significant patches against upstream) and
we want the archive, as the primary way we distribute the source, to
include a real change history rather than a simple snapshot.

You can do that to some extent via all the dpatch tools, but they're
kludgy in various ways; having the source format allow for an actual
repository from a real VCS solves that in a really powerful way.

  I am not sure how the pristine-tar bit fits into the picture
  yet. 

I don't think it really does; though it makes it possible to confirm
that the point in the repo that claims to match some upstream release,
really does match the official tarball of that release from upstream,
which might have some use.

pristine-tar seems mostly useful for generating a v1 source package
purely from a remote repository; this allows you to turn a repository
_into_ a (v3) source package.

Cheers,
aj





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-08 Thread Manoj Srivastava
On Tue, 9 Oct 2007 01:10:00 +1000, Anthony Towns [EMAIL PROTECTED] said: 

 On Mon, Oct 08, 2007 at 09:16:52AM -0500, Manoj Srivastava wrote:
 In any case, I think the kinds of actions taken by Joey's and Colin's
 patches are probably not things that we'll have to do to support
 shipping an arch working directory in the source package; if we have
 {arch} and .arch-ids dirs in the source, the end user has access to
 the distributed version control system;

 Joey's thing lets you do a clean tarball that only contains the git
 (or bzr, or darcs) information, and recreates the working directory by
 a checkout.

Well, an additional factor is that git/bzr/darcs contain all
 the data required in the .git/.bzr/_darcs directories to recreate all
 the sources, and do the diffs, etc, which is not the case with arch --
 arch does not follow the model where every checkout is a repo; so the
 checked-out dirs do not have all the info (you refer to the repo for the
 rest).  Unless you use {arch}/++pristine trees, which I have not used
 in years.

[Snip bunches of git/bzr/darcs material]

 What's the point of that?

 There may not be any -- if you're just packaging something that's
 completely straightforward, just adding a pointer to the official
 repository is probably the most sensible thing to do anyway; whether
 that be a subversion url or a tla grab file, or something else, and
 you can already do that.

Right. I am not sure what I package is always trivial, though.

 Where it starts becoming relevant (afaics) is when there's a
 Debian-specific patch history (either due to it being a native
 package, complicated packaging, or significant patches against
 upstream) and we want the archive, as the primary way we distribute
 the source, to include a real change history rather than a simple
 snapshot.

This seems to fit my use case; I often have large feature
 branches that only sporadically get merged back upstream.

The question is, how do I do this if I use arch as a version
 control system? I can, of course, start shipping a cacherev + patches,
 but that can be large; and might not mean much unless I also ship all
 the feature branches and upstream branch at the same time; which can
 blow up badly: see the ps for details.

If we just look at lenny, and I want to provide people with full
 details of all changes that have been made in various feature branches
 and upstream and debian packaging for lenny (etch is somewhat larger),
 I get:
--8<---------------cut here---------------start------------->8---
3.0M    fvwm--autotools--2.5.18/
368K    fvwm--autotools--2.5.21/
88K     fvwm--autotools--2.5.23/
3.0M    fvwm--debian--2.5.18/
356K    fvwm--debian--2.5.21/
5.3M    fvwm--debian--2.5.23/
3.1M    fvwm--devo--2.5.18/
392K    fvwm--devo--2.5.21/
1.7M    fvwm--devo--2.5.23/
3.0M    fvwm--terminal-emulator--2.5.18/
360K    fvwm--terminal-emulator--2.5.21/
1.5M    fvwm--terminal-emulator--2.5.23/
2.9M    fvwm--upstream--2.5.18/
344K    fvwm--upstream--2.5.21/
1.5M    fvwm--upstream--2.5.23/
600K    debian-dir--fvwm--0.1/
27M     total
--8<---------------cut here---------------end--------------->8---

What I ship currently:
--8<---------------cut here---------------start------------->8---
 132 /usr/local/src/arch/done/fvwm_2.5.23-2.diff.gz
   8 /usr/local/src/arch/done/fvwm_2.5.23-2.dsc
3244 /usr/local/src/arch/done/fvwm_2.5.23.orig.tar.gz
3.3M total.
--8<---------------cut here---------------end--------------->8---

This is almost an order of magnitude increase in size, which I
 find hard to justify.
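
The arithmetic behind that estimate, as a short Python check. It assumes
the unlabelled sizes in the second listing are in KiB (the usual du
default), which is consistent with the 3.3M total quoted there:

```python
# Sizes of what is shipped today, assumed to be in KiB:
# diff.gz + dsc + orig.tar.gz
shipped_kib = 132 + 8 + 3244
shipped_mib = shipped_kib / 1024

# Total for all fvwm branches in the lenny archive, as listed above.
branches_mib = 27

ratio = branches_mib / shipped_mib
print(f"shipped today:     {shipped_mib:.1f}M")
print(f"with all branches: {branches_mib}M, about {ratio:.1f}x larger")
```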

I still think that shipping a full working dir, with no dpkg
 changes, seems to be the way to go, along with a tla grab file, which I
 think I should consider putting into the package itself (if I can work
 around the chicken-and-egg issue: adding a grab file changes the
 source revision, which means the grab file should change, which means a
 new revision is needed)


 I am not sure how the pristine-tar bit fits into the picture yet.

 I don't think it really does; though it makes it possible to confirm
 that the point in the repo that claims to match some upstream release,
 really does match the official tarball of that release from upstream,
 which might have some use.

 pristine-tar seems mostly useful for generating a v1 source package
 purely from a remote repository; this allows you to turn a repository
 _into_ a (v3) source package.

Thanks for the clarification.

manoj
ps: This is from my lenny archive

1.8M    angband--autotools--3.0/
1.8M    angband--debian--3.0/
1.8M    angband--devo--3.0/
1000K   angband-doc--devel--3.0/
1.7M    angband--upstream--3.0/
292K    c2man--configure--2.0/
292K    c2man--devo--2.0/
296K    c2man--manpage-fix--2.0/
248K    c2man--upstream--2.0/
952K    calc--debian--2.0/
956K    calc--devo--2.0/
904K    calc--upstream--2.0/
148K    checkpolicy--devo--1.32/
128K    checkpolicy--devo--1.34/
176K

Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-08 Thread Anthony Towns
On Mon, Oct 08, 2007 at 03:59:05PM -0500, Manoj Srivastava wrote:
  Where it starts becoming relevant (afaics) is when there's a
  Debian-specific patch history (either due to it being a native
  package, complicated packaging, or significant patches against
  upstream) and we want the archive, as the primary way we distribute
  the source, to include a real change history rather than a simple
  snapshot.
  This seems to fit my use case; I often have large feature
  branches that only sporadically get merged back upstream.

Right, but the caveat is important too -- we have to _also_ want the archive
to include the real change history. Maybe when things get complicated enough
that there are often large branches that sporadically get merged back, that
part's no longer worth the hassle:

 This is almost an order of magnitude increase in size, which I
  find hard to justify.

As far as cases where there are enough changes to make a repo interesting,
but not so many that shipping a repo as the standard source becomes
huge and clunky, it's possible that arch just isn't a useful tool for
the job -- repo registration alone would be pretty annoying, and it's
not like there aren't plenty of other VCS options for that case anyway.
Subversion (or SVK) isn't an option either, afaics, eg, and I doubt CVS
or RCS would work well either.

So that leaves:

 I still think that shipping a full working dir, with no dpkg
  changes, seems to be the way to go, along with a tla grab file, which I
  think I should consider putting into the package itself (if I can work
  around the chicken-and-egg issue: adding a grab file changes the
  source revision, which means the grab file should change, which means a
  new revision is needed)

If you're just distributing a snapshot, rather than a full repository as
Joey's basically proposing, why can't your grab file be autogenerated? ie,

1. hack on the source, merge changes, blahblah, finish, tag
2. do a checkout from version control
3. autogenerate anything necessary
4. create source package
5. build
6. upload

If you're using pristine-tar to create a pristine .orig.tgz from your repo
(rather than keeping one around), that needs to be autogenerated at step
3 too, afaics. Worst case you could check the autogenerated files into
a parallel repository and use a config or something, afaics.

Cheers,
aj





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Anthony Towns
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
 I've been working on making dpkg-source support a new source package format
 based upon git. 

Oh, one question that comes to mind: how does this affect checking for
non-free stuff in past revisions? If 3.1-4 had some non-free files that
get reimplemented for 3.2-1, do we (a) expect the maintainer to do a
no-history upload for 3.2-1; (b) check that this happens somehow; (c) not
worry about it as long as it's only in the history; (d) something else?

Verifying that not just the current tree is DFSG-free, but all the history
is too seems potentially difficult.

Cheers,
aj





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Colin Watson
On Sun, Oct 07, 2007 at 02:56:47PM +1000, Anthony Towns wrote:
 On Sat, Oct 06, 2007 at 10:37:48PM +, Colin Watson wrote:
  The second possibility seems to me to be more flexible, though, and
  probably not all that hard to implement: build both a .tar.gz
  (containing the working tree) and a .$VCS.tar.gz, and teach 'dpkg-source
  -x' to unpack the tree given at least one of these. This would allow
  various interesting possibilities such as:
 
 Would this be better in any way than having a web interface that provides
 an autogenerated version-1 source package? Presume it's a url like:
 
   http://v1source.qa.debian.org/i/ifupdown/ifupdown_0.6.8.dsc

Autogenerated source packages won't (presumably, certainly not without
some special arrangements) be mirrored on useful services like
www.mirrorservice.org that let you peek inside tarballs without opening
them, and seem difficult for people to mirror locally in general since
it would put a lot of stress on v1source.qa.debian.org which I expect
would be a lot less beefy than the regular Debian mirror network. I'm
quite attached to being able to peek inside source packages quickly by
sshing over to the local mirror I keep at home which grabs everything
overnight so that I don't have to wait for it to download; particularly
so for large source packages.

* Derivative distributions who are slow to upgrade their dpkg-source
  could still interoperate to some degree.
 
 They'd need to pull sources from the autogenerated url; though they'd
 still probably have Build-Depends: issues if they're not updating
 packages generally.

Oh, I was referring more to the buildd base system and archive
maintenance code too; dak needs to be updated in order to accept format
3.0 source packages, for instance.

Cheers,

-- 
Colin Watson   [EMAIL PROTECTED]





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Florian Weimer
* Joey Hess:

 I have a sourcev3 branch with my changes at git://kitenet.net/dpkg,
 and have also attached a diff to this mail. I feel that this is ready
 for review and hopefully merging into dpkg now. Looking forward to your
 comments.

What about empty directories?

I really think you need to work off a clone (or a cleaned-up cp -al'ed
copy).  For instance, you do not necessarily want to upload the reflog, or
unreachable objects.  The GIT configuration stored inside .git is
probably uninteresting, too.

But it's still a nice idea, I think.





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Colin Watson
On Sat, Oct 06, 2007 at 10:09:22PM -0400, Joey Hess wrote:
 Colin Watson wrote:

 (So, FWIW, I'm not sold on git. Not sold at all yet. But it was a good
 choice for this implementation for several reasons.)

(I don't think bzr is perfect either, of course; the lack of shallow
branches (see below) is one flaw that's very relevant to this
application. If there were a distributed VCS that were clearly better
than the others in every respect, we'd probably all know about it ...)

  Still, this work looks pretty cool, and I'd like to be able to make use
  of it despite avoiding git whenever I can. I noticed that you'd
  helpfully structured your changes such that it would be possible to plug
  in a different revision control system, so I wrote a module to support
  bzr.
 
 Nice. The FAQ has some questions aimed at adding other revision control
 systems, could you try to answer those in the context of bzr? In
 particular, is the data that would be shipped in the source package the
 same data that bzr normally reads from untrusted sources, thus ensuring
 that using it this way is equally (in)secure as using bzr to pull data
 over the network? (Note that this wasn't 100% true for git and I have
 had to put in several workarounds.)

I believe so; bzr has hooks but AFAICS they're only exposed to plugins
(i.e. code that goes in /usr or in ~/.bazaar/plugins) rather than being
something that lives in the .bzr directory. I don't know of anything
executable in .bzr. I intentionally used 'bzr branch' to create the data
that will be shipped, which is the same command used to branch from a
network repository, so I believe that if there is a security flaw in
this implementation then it would also be a security flaw in bzr itself.

The only things I really needed to tweak were to remove a couple of bits
of metadata which aren't useful in this context: branch-name ended up
with blah.bzr.tar.gz.tmp or something like that in it, and it'll be
detected from the unpacked directory name if it doesn't exist; and
parent is just the directory 'bzr branch' branched from.

 And is the data format stable and/or one that bzr has a history of
 supporting old versions of in a way that ensures backwards
 compatibility?

The data format has changed a few times, but so far bzr has an excellent
history of continuing to support old versions. Some data formats (dating
from 0.8 or so) are marked as unsupported and you have to use 'bzr
upgrade' before doing anything else. Everything else at worst nags you
to run 'bzr upgrade'.

I think they may have dropped support for some very old formats that
basically only some early bzr developers used.

 Also, will the bzr repos always contain the full history, or is there
 an equivalent to git shallow clones? How big do they tend to be?

I don't have as comfortable an answer here. There's no equivalent to git
shallow clones yet (only a design, http://bazaar-vcs.org/HistoryHorizon;
so this will probably get fixed one day). At present the .bzr tends if
anything to be a little bigger than the source.

I think due to historical performance issues people tend not to be using
bzr much on very large trees yet, so I'm hoping this won't be an issue
for a while; whereas the git backend has the immediate prospect of
linux-2.6.git.tar.gz. ;-)

* Some source packages want to ship non-VCS-managed files.
  
  It's very common for source packages to include autogenerated
  objects like configure, Makefile.in, etc. Whether to check these
  into a VCS is a somewhat religious matter (as acknowledged by the
  gettext info documentation, for instance), and personally I lean
  towards checking them in (with a few exceptions) just because it
  makes it easier to see when they change and keep an eye out for
  oddities, but I know that a lot of developers prefer to keep these
  outside their VCS. Shipping a working tree would make it easier to
  handle cases like this.
 
 Hmm, I hadn't considered that this might be a problem.
 
 I don't know if I'd want to write the code to do this, but shipping a
 partial working tree consisting of just those files would be enough to
 solve this.

That ought to be relatively straightforward; just list all the files
that the VCS knows about and unlink them. It seemed untidy though. Maybe
put them in a separate directory (.bzr-extra-files or something) which
is copied over after unpack, and make it a dpkg-source -b option rather
than the default behaviour?
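
A minimal sketch of the "everything the VCS does not know about" step, in
Python. The function name untracked_files is hypothetical, and the list of
tracked paths is assumed to come from whatever listing command the VCS
provides; the sketch computes the complement rather than unlinking
anything:

```python
import os


def untracked_files(tree_root, tracked, vcs_dir=".bzr"):
    """Return the sorted relative paths under tree_root that are not in
    the tracked list, skipping the VCS control directory itself."""
    tracked = set(tracked)
    extras = []
    for dirpath, dirnames, filenames in os.walk(tree_root):
        # Prune the control directory so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != vcs_dir]
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), tree_root)
            if rel not in tracked:
                extras.append(rel)
    return sorted(extras)
```

The resulting list is what would go into a separate tarball or an
extra-files directory; computing it this way leaves the working tree
untouched, which sidesteps the untidiness of unlinking files.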

FWIW, I was thinking much more of native packages here; non-native
packages already tend to just import the upstream tarball which usually
contains generated files, which is probably why this hasn't been a
problem for things like git-buildpackage. If nothing else, there are
several native packages in the d-i tree alone that don't have configure
et al in Subversion.

Alternatively, if people don't agree with me that we should ship the
working tree by default, maybe it could be an option for the few
packages 

Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Frank Lichtenheld
On Sun, Oct 07, 2007 at 11:55:49AM +, Colin Watson wrote:
 Of course, a number of packages accidentally ship .svn directories and
 so on anyway, though I suppose there's a difference between officially
 blessed by dpkg and warned against by lintian ...

That has to be the understatement of the year ;)

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Anthony Towns
On Sun, Oct 07, 2007 at 08:45:08AM +, Colin Watson wrote:
 I'm
 quite attached to being able to peek inside source packages quickly by
 sshing over to the local mirror I keep at home which grabs everything
 overnight so that I don't have to wait for it to download; particularly
 so for large source packages.

How is that better than running apt-get source against your local
mirror, though?

Alternatively, is it really a problem to have your local mirror
autogenerate v1 source packages in the same way v1source.qa.d.o presumably
would?

(I have a strong adverse reaction to duplicated information, so shipping
the working tree in .git format and .orig.tar.gz format irks me,
particularly if it's required)

 * Derivative distributions who are slow to upgrade their dpkg-source
   could still interoperate to some degree.
  They'd need to pull sources from the autogenerated url; though they'd
  still probably have Build-Depends: issues if they're not updating
  packages generally.
 Oh, I was referring more to the buildd base system and archive
 maintenance code too; dak needs to be updated in order to accept format
 3.0 source packages, for instance.

Well, you'd need an entirely new .dsc to use a v3 source package with
an un-updated dak (or launchpad or whatever), that didn't contain the
.git.tar.gz (or whatever) elements at all, so I don't personally see a
lot of difference between just generating a new .dsc and generating a
new .dsc and .tar.gz.

(It might be just me, but I'm getting the feeling that implementing
WigPen via this v3 format is probably easier than implementing it via
the v2 format...)

I might be off my rocker, but I'm not seeing any reason why we couldn't
allow uploads of v3 format packages to experimental while blocking them
for unstable etc, presuming dpkg somewhere supported them.

Cheers,
aj





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Frank Lichtenheld
On Sun, Oct 07, 2007 at 10:18:17PM +1000, Anthony Towns wrote:
 On Sun, Oct 07, 2007 at 08:45:08AM +, Colin Watson wrote:
  I'm
  quite attached to being able to peek inside source packages quickly by
  sshing over to the local mirror I keep at home which grabs everything
  overnight so that I don't have to wait for it to download; particularly
  so for large source packages.
 
 How is that better than running apt-get source against your local
 mirror, though?
 
 Alternatively, is it really a problem to have your local mirror
 autogenerate v1 source packages in the same way v1source.qa.d.o presumably
 would?

Of course, one possibility is to go in the opposite direction: having a v3
source repository that will automatically create v1 (or even v2) packages
and upload them to the main archive.

[...]
 (It might be just me, but I'm getting the feeling that implementing
 WigPen via this v3 format is probably easier than implementing it via
 the v2 format...)

Could you please explain what the difference between WigPen and the v2
format is? I've treated them as identical so far.

 I might be off my rocker, but I'm not seeing any reason why we couldn't
 allow uploads of v3 format packages to experimental while blocking them
 for unstable etc, presuming dpkg somewhere supported them.

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Joey Hess
Anthony Towns wrote:
   Maybe providing a feature on packages.debian.org (or similar) to download
   sources in simple, non-VC, tarball format would make this a complete
   non-issue though?
  pristine-tar could be used for this, it would just need source packages
  to put the delta somewhere standardised (under debian/), and would need 
  some standardised way to get to the upstream source branch in git.
 
 So the logic there would be:
 
   if there's an upstream tag, then
   generate an .orig.tgz
   if there's a pristine-tar info,
   hax0r it to be pristine
   generate a .diff.gz
   if the .diff failed goto bailout
   generate a .dsc containing the orig and diff

It's not generally possible to generate a .diff.gz that expresses all
the changes that might be in a git repository.
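
The quoted logic can be sketched as a small decision function. This is
only an illustration in Python, not code from the patch: RepoState and
plan_v1_conversion are hypothetical names, and treating a missing
upstream tag as a bailout is an assumption, since the quoted outline
does not say what happens in that case. The diff_expressible flag stands
in for the case where the changes cannot be expressed as a .diff.gz.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RepoState:
    """Hypothetical summary of a VCS-based source package."""
    has_upstream_tag: bool
    has_pristine_tar_info: bool
    diff_expressible: bool  # False if the changes cannot go into a .diff.gz


def plan_v1_conversion(repo: RepoState) -> Optional[List[str]]:
    """Return the steps to build a v1 package, or None to bail out."""
    if not repo.has_upstream_tag:
        # Assumption: without an upstream tag there is no .orig.tar.gz base.
        return None
    steps = ["generate .orig.tar.gz from upstream tag"]
    if repo.has_pristine_tar_info:
        steps.append("apply pristine-tar delta to .orig.tar.gz")
    if not repo.diff_expressible:
        # The "if the .diff failed goto bailout" branch.
        return None
    steps.append("generate .diff.gz")
    steps.append("generate .dsc listing orig and diff")
    return steps
```

A caller would fall back to shipping only the VCS-based package whenever
the function returns None.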

 Repo formats that bzr in etch can unpack could be denoted by
 
   Source-Depends: dpkg-bzr (= 0.11)
 
 while repo formats that require bzr from lenny or later could be
 denoted by:
 
   Source-Depends: dpkg-bzr (= 0.18)

I was thinking about Source-Depends too, the main problem seems to be
that it would need to be supported in apt-get source too. I wonder if
we could just use build-depends.
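
Whichever field ends up carrying this information, a tool such as
apt-get source would have to parse it. A minimal Python sketch follows,
assuming the field uses the same "package (relation version)" syntax as
other dependency fields. Source-Depends itself is only a proposal in
this thread, parse_depends and Dependency are hypothetical names, and
alternatives ("a | b") are deliberately not handled.

```python
import re
from typing import List, NamedTuple, Optional


class Dependency(NamedTuple):
    package: str
    relation: Optional[str]  # one of <<, <=, =, >=, >> or None
    version: Optional[str]


# One comma-separated item: a package name, optionally followed by
# "(relation version)".
_DEP_RE = re.compile(
    r"^\s*(?P<pkg>[a-z0-9][a-z0-9+.-]+)"
    r"(?:\s*\(\s*(?P<rel><<|<=|=|>=|>>)\s*(?P<ver>[^)\s]+)\s*\))?\s*$"
)


def parse_depends(value: str) -> List[Dependency]:
    """Split a dependency field value into structured entries."""
    deps = []
    for item in value.split(","):
        match = _DEP_RE.match(item)
        if match is None:
            raise ValueError(f"cannot parse dependency: {item!r}")
        deps.append(
            Dependency(match.group("pkg"), match.group("rel"), match.group("ver"))
        )
    return deps
```

Comparing the parsed version against an installed one would still need
real Debian version comparison, which this sketch leaves out.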

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Joey Hess
Anthony Towns wrote:
 Oh, one question that comes to mind: how does this affect checking for
 non-free stuff in past revisions? If 3.1-4 had some non-free files that
 get reimplemented for 3.2-1, do we (a) expect the maintainer to do a
 no-history upload for 3.2-1; (b) check that this happens somehow; (c) not
 worry about it as long as it's only in the history; (d) something else?
 
 Verifying that not just the current tree is DFSG-free, but all the history
 is too seems potentially difficult.

Yes, the faq discusses this problem. This is why shallow repos are IMHO
important and non-shallow repos should only be uploaded with caution.

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Joey Hess
Florian Weimer wrote:
 What about empty directories?

Do you mean empty directories under .git or empty directories stored
*in* git (can't be done; use a .gitignore in the directory)?

 I really think you need to work off a clone (or a cleaned-up cp -al'ed
 copy).  For instance, you do not necessarily want to upload the reflog, or
 unreachable objects.  The GIT configuration stored inside .git is
 probably uninteresting, too.

I think if you read my code you'll see that I've dealt with these
problems (Frank pointed out the reflog issue earlier in this thread),
and of course it *does* build from a cleaned, cp'd copy, and run git-gc,
and sanitise the .git/config, and...

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Joey Hess
Colin Watson wrote:
 FWIW, I was thinking much more of native packages here; non-native
 packages already tend to just import the upstream tarball which usually
 contains generated files, which is probably why this hasn't been a
 problem for things like git-buildpackage. If nothing else, there are
 several native packages in the d-i tree alone that don't have configure
 et al in Subversion.

Or these files could be checked into a copy of the repo that is used to
build the source package, and not checked into the main vcs. This is
not unlike those same packages in d-i shipping the generated files in
their .diff.gz, if you look at diff as just another vcs..

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Colin Watson
On Sun, Oct 07, 2007 at 10:05:08AM -0400, Joey Hess wrote:
 Colin Watson wrote:
  FWIW, I was thinking much more of native packages here; non-native
  packages already tend to just import the upstream tarball which usually
  contains generated files, which is probably why this hasn't been a
  problem for things like git-buildpackage. If nothing else, there are
  several native packages in the d-i tree alone that don't have configure
  et al in Subversion.
 
 Or these files could be checked into a copy of the repo that is used to
 build the source package, and not checked into the main vcs. This is
 not unlike those same packages in d-i shipping the generated files in
 their .diff.gz, if you look at diff as just another vcs..

This is true.

-- 
Colin Watson   [EMAIL PROTECTED]





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Colin Watson
On Sun, Oct 07, 2007 at 10:18:17PM +1000, Anthony Towns wrote:
 On Sun, Oct 07, 2007 at 08:45:08AM +, Colin Watson wrote:
  I'm quite attached to being able to peek inside source packages
  quickly by sshing over to the local mirror I keep at home which
  grabs everything overnight so that I don't have to wait for it to
  download; particularly so for large source packages.
 
 How is that better than running apt-get source against your local
 mirror, though?

Faster for some cases involving huge packages where I don't want to
transfer the whole thing over wireless. Doesn't require complex apt
configuration to point to the right package if what I want isn't the
current version in the release I'm running. Etc.

 Alternatively, is it really a problem to have your local mirror
 autogenerate v1 source packages in the same way v1source.qa.d.o presumably
 would?

I suppose that would be possible (if the code were properly packaged,
integrated into debmirror, etc.), though it sounds like a big chunk of
resources on my rather underpowered mirror server. (Yes, that's my
problem, but I'm sure I'm not the only one.) I also can't see general
mirrors like mirrorservice.org doing this kind of highly distro-specific
thing, so we'd still lose handy "look at a single file within this
package on the web" tools unless we reimplemented them on debian.org
systems. Those sorts of things are very useful for big source packages.

 (I have a strong adverse reaction to duplicated information, so shipping
 the working tree in .git format and .orig.tar.gz format irks me,
 particularly if it's required)

I do understand this reaction though ...

  Oh, I was referring more to the buildd base system and archive
  maintenance code too; dak needs to be updated in order to accept format
  3.0 source packages, for instance.
 
 Well, you'd need an entirely new .dsc to use a v3 source package with
 an un-updated dak (or launchpad or whatever), that didn't contain the
 .git.tar.gz (or whatever) elements at all, so I don't personally see a
 lot of difference between just generating a new .dsc and generating a
 new .dsc and .tar.gz.

True; I was thinking that a quick hack to permit v3 while still
basically just unpacking .tar.gz and .diff.gz would be easier than full
support for a derivative distribution that wasn't paying a whole lot of
attention, but maybe it doesn't make that much difference.

-- 
Colin Watson   [EMAIL PROTECTED]





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Manoj Srivastava
On Fri, 5 Oct 2007 19:16:13 -0400, Joey Hess [EMAIL PROTECTED] said: 

 I've been working on making dpkg-source support a new source package
 format based upon git. The idea is that a source package has only a
 .dsc and a .git.tar.gz, which is just a git repo.

 My implementation adds a new 3.0 version source format. A 3.0 format
 debian source package can consist of any files allowed by formats 1
 and 2, but may also contain .$VCS.tar.gz files. To build a version 3
 source package, a new field is needed in debian/control:

I do not yet grok git, so could someone tell me what this means
 in terms of, say, CVS or arch? What does a $CVS.tar.gz file contain when
 we are using CVS?

manoj
-- 
We don't like their sound.  Groups of guitars are on the way out. Decca
Recording Company, turning down the Beatles, 1962
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Frank Lichtenheld
On Sun, Oct 07, 2007 at 10:05:32AM -0500, Manoj Srivastava wrote:
 On Fri, 5 Oct 2007 19:16:13 -0400, Joey Hess [EMAIL PROTECTED] said: 
  My implementation adds a new 3.0 version source format. A 3.0 format
  debian source package can consist of any files allowed by formats 1
  and 2, but may also contain .$VCS.tar.gz files. To build a version 3
  source package, a new field is needed in debian/control:
 
 I do not yet grok git, so could someone tell me what this means
  in terms of, say, CVS or arch? What does a $CVS.tar.gz file contain when
  we are using CVS?

For CVS it would need to contain the repository (i.e. all the RCS
files), for arch I don't know enough about it to say.

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Raphael Hertzog
On Sun, 07 Oct 2007, Frank Lichtenheld wrote:
  (It might be just me, but I'm getting the feeling that implementing
  WigPen via this v3 format is probably easier than implementing it via
  the v2 format...)
 
 Could you please explain what the difference between WigPen and v2
 format is? I've treated them as identical so far.

I don't know either. But I'd like to jump in and say a few words.

I like Joey's idea and I'd also like to improve our source packages.
I think we need to step back a bit and maybe try to come up with a more
generic design encompassing wigpen and Joey's work.

But it's not as easy as it seems because we have many different
requirements as shown by Colin and others. And furthermore, the
data flow is considerably different when we integrate VCS in the picture.

I'm not even sure that we should really call v3 a 'source package'.

The goals of wigpen were IIRC:
1/ support of other compression mechanisms
2/ support of multiple tarballs (glibc case)
3/ automatic support of debian/patches


(1) should be a no-brainer

(2) is not clear: what would multiple tarballs mean with a VCS repository?

(3) patches are auto-applied at source extraction time.

In a VCS, what does it mean? In Joey's work, all Debian changesets
are in the master branch which is auto-extracted if I understand
correctly (I haven't read the code, only the discussion here).

What about cases where multiple branches are stored? (One for upstream,
one for Debian)


Also, it seems important to keep the possibility to always generate a
plain source package from any VCS based source package. But we might
need some information to be able to do that properly. Exactly like we need
new information if we ever want to support generation of v2 source
packages. Is there some ground to create something common for those
two use cases? 


(Sorry, everything is still a bit blurry in my mind and while I was
preparing myself to maybe hack on wigpen as my next dpkg related project, 
this discussion took me by surprise :-))

Cheers,
-- 
Raphaël Hertzog

Premier livre français sur Debian GNU/Linux :
http://www.ouaza.com/livre/admin-debian/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Frank Lichtenheld
On Sun, Oct 07, 2007 at 05:25:00PM +0200, Raphael Hertzog wrote:
 (Sorry, everything is still a bit blurry in my mind and while I was
 preparing myself to maybe hack on wigpen as my next dpkg related project, 
 this discussion took me by surprise :-))

Btw, if someone has too much free time and doesn't mind writing
documentation, a deb-source.5 (or dsc.5) manpage similar to what we have
for binary packages in deb.5 would be great stuff. Especially if it would
document both V1 and V2.

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Colin Watson
On Sun, Oct 07, 2007 at 10:05:32AM -0500, Manoj Srivastava wrote:
 On Fri, 5 Oct 2007 19:16:13 -0400, Joey Hess [EMAIL PROTECTED] said: 
  I've been working on making dpkg-source support a new source package
  format based upon git. The idea is that a source package has only a
  .dsc and a .git.tar.gz, which is just a git repo.
 
  My implementation adds a new 3.0 version source format. A 3.0 format
  debian source package can consist of any files allowed by formats 1
  and 2, but may also contain .$VCS.tar.gz files. To build a version 3
  source package, a new field is needed in debian/control:
 
 I do not yet grok git, so could someone tell me what this means
  in terms of, say, CVS or arch? What does a $CVS.tar.gz file contain when
  we are using CVS?

I think this only really makes sense for distributed revision control
systems. For arch, the .arch.tar.gz would contain the {arch} directory,
perhaps with a few adjustments similar to those being made in the git
and bzr modules.

-- 
Colin Watson   [EMAIL PROTECTED]





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Manoj Srivastava
Hi,

OK, commenting on this with my "I use arch" hat on. If I
 understand correctly, we are proposing shipping a working directory in
 the .deb; and not shipping an orig.tar.gz nor a diff.gz file. I like
 the idea; and I think I can support nested arch packages (submodules in
 .git speak), based on the examples I have seen of joey's patch and
 Colin's for bzr -- I just need some more information about what exactly
 some of these git commands do.

sub prep_tar:
  make sure we have an ./{arch} directory.
  Look for nested submodules:
   $tree_root=$($TLA tree-root);
   @nested=`$TLA inventory -t --nested $tree_root`;
** Why are we checking for uncommitted files here? I would think that
   people would have done an export to actually build packages **

   
   for each tree_root and nested; do
 run $TLA CHANGES
 map { $list{${NESTED_PATH}/$_} = 1; } join ,, `$TLA inventory -s`
   done
   For all files in exclude list, go and set values in %list to 0 (or
   delete the key)
 
** I have no idea what the prune and shallow copy commands do, or the
   arch equivalent **
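
The file-list part of prep_tar above could look roughly like this in
Python. It is a sketch of the outline only: build_file_list is a
hypothetical name, and the inventories are passed in as plain data
rather than read from `$TLA inventory`, so nothing here depends on how
tla formats its output.

```python
from typing import Dict, Iterable


def build_file_list(
    inventories: Dict[str, Iterable[str]], excluded: Iterable[str]
) -> Dict[str, int]:
    """Map every inventoried file to 1, then drop the excluded ones.

    `inventories` maps a nested tree path ("" for the tree root) to the
    file names reported for that tree.
    """
    files: Dict[str, int] = {}
    for nested_path, names in inventories.items():
        for name in names:
            # Prefix files from nested trees with the path of that tree.
            path = f"{nested_path}/{name}" if nested_path else name
            files[path] = 1
    for path in excluded:
        # "delete the key" variant of the exclude step.
        files.pop(path, None)
    return files
```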

sub post_unpack_tar
  make sure we have an ./{arch} directory.
  Look for nested submodules:
   $tree_root=$($TLA tree-root);
   @nested=`$TLA inventory -t --nested $tree_root`;
** arch hooks are per user, not per repo -- iirc **
** what does git-config do? or bzr checkout? **


Actually, at this point I am beginning to question my
 understanding of the proposal.  If we are shipping a working tree, what
 is this step doing?

Is this an svn update equivalent?

manoj
-- 
If a computer can't directly address all the RAM you can use, it's just
a toy. anonymous comp.sys.amiga posting, non-sequitur
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Manoj Srivastava
On Sun, 7 Oct 2007 09:54:39 -0400, Joey Hess [EMAIL PROTECTED] said: 

 Anthony Towns wrote:
 Oh, one question that comes to mind: how does this affect checking
 for non-free stuff in past revisions? If 3.1-4 had some non-free
 files that get reimplemented for 3.2-1, do we (a) expect the
 maintainer to do a no-history upload for 3.2-1; (b) check that this
 happens somehow; (c) not worry about it as long as it's only in the
 history; (d) something else?
 
 Verifying that not just the current tree is DFSG-free, but all the
 history is too seems potentially difficult.

 Yes, the faq discusses this problem. This is why shallow repos are
 IMHO important and non-shallow repos should only be uploaded with
 caution.

What does this mean in non-git context?

manoj
-- 
Don't get even -- get odd!
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Manoj Srivastava
On Sun, 7 Oct 2007 15:44:47 +, Colin Watson [EMAIL PROTECTED] said: 

 On Sun, Oct 07, 2007 at 10:05:32AM -0500, Manoj Srivastava wrote:
 On Fri, 5 Oct 2007 19:16:13 -0400, Joey Hess [EMAIL PROTECTED] said:
  I've been working on making dpkg-source support a new source
  package format based upon git. The idea is that a source package
  has only a .dsc and a .git.tar.gz, which is just a git repo.
 
  My implementation adds a new 3.0 version source format. A 3.0
  format debian source package can consist of any files allowed by
  formats 1 and 2, but may also contain .$VCS.tar.gz files. To build
  a version 3 source package, a new field is needed in
  debian/control:
 
 I do not yet grok git, so could someone tell me what this means in
 terms of, say, CVS or arch? What does a $CVS.tar.gz file contain when
 we are using CVS?

 I think this only really makes sense for distributed revision control
 systems. For arch, the .arch.tar.gz would contain the {arch}
 directory, perhaps with a few adjustments similar to those being made
 in the git and bzr modules.

Hmm. If I have just the ./{arch} directory, and none of the
 files, then arch thinks the files have just been deleted; and you can't
 just check out stuff, since the tree is up to date.  Ah. Baz undo
 restores all the files, cool.

The problem here is that the repository in question _has_ to be
 registered by the user running this; so all the users would have to
 register the arch repository in question before unpacking the source
 tarball in order to tell baz/tla how to get access to the repo. Is this
 going to be an issue?

I would prefer to ship a grab file for arch instead of
 the {arch} directory, since the latter really buys us nothing over the
 grab file (since we are requiring the distributed source dir and
 network access to unpack source packages).

Consider this grab file:
--8---cut here---start-8---
Archive-Name: [EMAIL PROTECTED]
Archive-Location: http://arch.debian.org/arch/private/srivasta
Target-Revision: packages--debian--1.0
Target-Directory: manoj-packages
Target-Config: configs/ucf/debian/ucf-3.003
--8---cut here---end---8---

tla register-archive --present-ok $values-of-Archive-Location-field
tla grab path/to/the/grab-file
cd $value-of-field-Target-Directory/package-name/*

(room for standardization here)
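
As a sketch of how an unpacker might drive this, the grab file can be
read as simple "Key: value" lines and turned into the two tla calls
shown above. This is Python for illustration only: parse_grab_file and
unpack_commands are hypothetical names, the commands are built but not
executed, and the archive name in the usage note is a placeholder.

```python
from typing import Dict, List


def parse_grab_file(text: str) -> Dict[str, str]:
    """Parse 'Key: value' lines; blank lines are skipped."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so URLs in values stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'Key: value' line: {line!r}")
        fields[key.strip()] = value.strip()
    return fields


def unpack_commands(fields: Dict[str, str], grab_path: str) -> List[List[str]]:
    """Build the tla invocations from the message above as argument lists."""
    return [
        ["tla", "register-archive", "--present-ok", fields["Archive-Location"]],
        ["tla", "grab", grab_path],
    ]
```

For example, parse_grab_file("Archive-Name: user@example.org--archive\n
Archive-Location: http://arch.debian.org/arch/private/srivasta") yields
a dict whose Archive-Location value is passed to register-archive.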

manoj

--8---cut here---start-8---
__ baz status
* looking for [EMAIL PROTECTED]/ucf--devel--3.0--patch-1 to compare with
* comparing to [EMAIL PROTECTED]/ucf--devel--3.0--patch-1

D   .arch-ids
D   examples
D   examples/.arch-ids
D   t
D   t/.arch-ids
D   .arch-ids/COPYING.id
D   .arch-ids/ChangeLog.id
D   .arch-ids/Makefile.id
D   .arch-ids/lcf.1.id
D   .arch-ids/lcf.id
D   .arch-ids/ucf.1.id
D   .arch-ids/ucf.conf.5.id
D   .arch-ids/ucf.conf.id
D   .arch-ids/ucf.id
D   .arch-ids/ucfq.1.id
D   .arch-ids/ucfq.id
D   .arch-ids/ucfr.1.id
D   .arch-ids/ucfr.id
D   COPYING
D   ChangeLog
D   Makefile
D   examples/.arch-ids/=id
D   examples/.arch-ids/ChangeLog.id
D   examples/.arch-ids/postinst.id
D   examples/.arch-ids/postrm.id
D   examples/ChangeLog
D   examples/postinst
D   examples/postrm
D   lcf
D   lcf.1
D   t/.arch-ids/=id
D   ucf
D   ucf.1
D   ucf.conf
D   ucf.conf.5
D   ucfq
D   ucfq.1
D   ucfr
D   ucfr.1
__ baz update
* tree is already up to date
--8---cut here---end---8---

-- 
Time is money and money can't buy you love and I love your outfit
T.H.U.N.D.E.R. #1
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Clint Adams
On Sun, Oct 07, 2007 at 10:52:45AM -0500, Manoj Srivastava wrote:
 What does this mean in non-git context?

I think truncating the patch-log history is unimportant for Arch,
but any ++pristine-trees should definitely be nuked prior to packing.





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Clint Adams
On Sun, Oct 07, 2007 at 11:10:41AM -0500, Manoj Srivastava wrote:
 Hmm. If I have just the ./{arch} directory, and none of the
  files, then arch thinks the files have just been deleted; and you can't
  just check out stuff, since the tree is up to date.  Ah. Baz undo
  restores all the files, cool.

I presume you could ship all the normal files in one tarball,
the .arch-ids and {arch} directories in another, and the debian/
directory in a third.

That would give the NMUer a full working tree to run $TLA diff
in.  Only shipping a grab file would burden the end user with
a need for http access and no guarantee that the source will
be available.

 The problem here is that the repository in question _has_ to be
  registered by the user running this; so all the users would have to
  register the arch repository in question before unpacking the source
  tarball in order to tell baz/tla how to get access to the repo. Is this
  going to be an issue?

It shouldn't be too difficult to add an --autoregister switch to tla
grab, though I don't know how safe it'd be.





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Frank Lichtenheld
On Sun, Oct 07, 2007 at 10:49:48AM -0500, Manoj Srivastava wrote:
 OK, commenting on this with my "I use arch" hat on. If I
  understand correctly, we are proposing shipping a working directory in
  the .deb; and not shipping an orig.tar.gz nor a diff.gz file. I like

You probably mean source package here and not .deb. Also the original
proposal just means shipping the repository data, since most DVCS can
easily create a working directory from that.

  the idea; and I think I can support nested arch packages (submodules in
  .git speak), based on the examples I have seen of joey's patch and
  Colin's for bzr -- I just need some more information about what exactly
  some of these git commands do.
 
 sub prep_tar:
   make sure we have an ./{arch} directory.
   Look for nested submodules:
$tree_root=$($TLA tree-root);
@nested=`$TLA inventory -t --nested $tree_root`;
 ** Why are we checking for uncommitted files here? I would think that
people would have done an export to actually build packages **

The whole idea of the proposal is to NOT create an export.

for each tree_root and nested; do
  run $TLA CHANGES
  map { $list{${NESTED_PATH}/$_} = 1; } join ,, `$TLA inventory -s`
done
For all files in exclude list, go and set values in %list to 0 (or
delete the key)
  
 ** I have no idea what the prune and shallow copy commands do, or the
arch equivalent **

git gc --prune deletes old data that isn't needed anymore. This is
needed since all other git commands never change or overwrite data
(file data, this is obviously not true for meta data), they only
add some.
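
A toy model may help readers who do not know git. The Python sketch
below is not how git is implemented; ObjectStore is a hypothetical
class that only shows the idea of a store where writes never overwrite
anything and a separate gc step drops whatever is no longer referenced.

```python
import hashlib
from typing import Dict, Iterable


class ObjectStore:
    """Content-addressed store: adding data never changes existing objects."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def add(self, data: bytes) -> str:
        """Store data under the hash of its content and return the key."""
        key = hashlib.sha1(data).hexdigest()
        self.objects.setdefault(key, data)
        return key

    def gc(self, reachable: Iterable[str]) -> int:
        """Delete every object not listed as reachable; return the count."""
        keep = set(reachable)
        stale = [key for key in self.objects if key not in keep]
        for key in stale:
            del self.objects[key]
        return len(stale)
```

Storing a second version of a file adds a second object, so the store
only grows until gc is run with the set of objects still in use.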

 sub post_unpack_tar
   make sure we have an ./{arch} directory.
   Look for nested submodules:
$tree_root=$($TLA tree-root);
@nested=`$TLA inventory -t --nested $tree_root`;
 ** arch hooks are per user, not per repo -- iirc **
 ** what does git-config do? or bzr checkout? **

git-config is just a CLI interface to the .git/config file.
Since we only ship the repository we need to create the working
tree. This is what git/bzr checkout do.

 Actually, at this point I am beginning to question my
  understanding of the proposal.  If we are shipping a working tree, what
  is this step doing?
 
 Is this an svn update equivalent?

No, that would be git fetch/pull (and probably something similarly named
in bzr)

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Manoj Srivastava
On Sun, 7 Oct 2007 20:33:58 +0200, Frank Lichtenheld [EMAIL PROTECTED] said: 

 On Sun, Oct 07, 2007 at 10:49:48AM -0500, Manoj Srivastava wrote:
 OK, commenting on this with my "I use arch" hat on. If I understand
 correctly, we are proposing shipping a working directory in the .deb;
 and not shipping an orig.tar.gz nor a diff.gz file. I like

 You probably mean source package here and not .deb. Also the original
 proposal just means shipping the repository data, since most DVCS can
 easily create a working directory from that.

Hmm. The repository data, as far as I can tell, means the name
 of the archive, and the location.  Do you really mean we are not
 shipping any, say, foo.c file in the sources, just a location where you
 can get the foo.c file from, at a particular version?

 The whole idea of the proposal is to NOT create an export.

If we are not creating an export, and we are only shipping the
 repository data,  how come there  needs to be a check for uncommitted
 files? If the changes are uncommitted, that means the repo does not
 know about it; and if we only ship  the repository data, we are not
 shipping stuff not in the repo.

What am I missing?

 for each tree_root and nested; do run $TLA CHANGES map {
 $list{${NESTED_PATH}/$_} = 1; } join ,, `$TLA inventory -s` done
 For all files in exclude list, go and set values in %list to 0 (or
 delete the key)
 
 ** I have no idea what the prune and shallow copy commands do, or the
 arch equivalent **

 git gc --prune deletes old data that isn't needed anymore. This is
 needed since all other git commands never change or overwrite data
 (file data, this is obviously not true for meta data), they only add
 some.

I am unsure what this means in terms of arch.

 sub post_unpack_tar make sure we have an ./{arch} directory.Look for
 nested submodules: $tree_root=$($TLA tree-root); @nested=`$TLA
 inventory -t --nested $tree_root`;
 ** arch hooks are per user, not per repo -- iirc **
 ** what does git-config do? or bzr checkout? **

 git-config is just a CLI interface to the .git/config file.  Since we
 only ship the repository we need to create the working tree. This is
 what git/bzr checkout do.

Well, I do not see how this is done in arch. If you are not
 shipping the working tree, all I can see shipping for arch is the URI
 of the repo. I am pretty sure this is not what you mean, since then any
 arch based source would be three lines or so, and would need network
 access to unpack the source tree.

 Actually, at this point I am beginning to question my understanding
 of the proposal.  If we are shipping a working tree, what is this
 step doing?
 
 Is this an svn update equivalent?

 No, that would be git fetch/pull (and probably something similarly named
 in bzr)

I don't think I know what this means when you are using arch.

manoj

-- 
Earn cash in your spare time -- blackmail your friends.
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Manoj Srivastava
On Sun, 7 Oct 2007 12:24:46 -0400, Clint Adams [EMAIL PROTECTED] said: 

 On Sun, Oct 07, 2007 at 11:10:41AM -0500, Manoj Srivastava wrote:
 Hmm. If I have just the ./{arch} directory, and none of the files,
 then arch thinks the files have just been deleted; and you can't just
 check out stuff, since the tree is up to date.  Ah. Baz undo restores
 all the files, cool.

 I presume you could ship all the normal files in one tarball, the
 .arch-ids and {arch} directories in another, and the debian/ directory
 in a third.

Err, and why am I doing this? Why am I not shipping my working
 directory as a tarball, complete instead of breaking it up
 (apparently arbitrarily) into three parts?

 That would give the NMUer a full working tree to run $TLA diff in.
 Only shipping a grab file would burden the end user with a need for
 http access and no guarantee that the source will be available.

How is git reconstituting the files if there is no network
 access?  Are they shipping all the bits needed to get a full working
 dir without any network access?

 The problem here is that the repository in question _has_ to be
 registered by the user running this; so all the users would have to
 register the arch repository in question before unpacking the source
 tarball in order to tell baz/tla how to get access to the repo. Is
 this going to be an issue?

 It shouldn't be too difficult to add an --autoregister switch to tla
 grab, though I don't know how safe it'd be.

caveat emptor, I think, given that some repository access seems
 to be required for unpacking a version 3 source package.  This is not
 something I would do in an un-constrained environment.

manoj
-- 
It is impossible to make anything foolproof, because fools are so
ingenious.
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Manoj Srivastava
On Sun, 7 Oct 2007 12:14:39 -0400, Clint Adams [EMAIL PROTECTED] said: 

 On Sun, Oct 07, 2007 at 10:52:45AM -0500, Manoj Srivastava wrote:
 What does this mean in non-git context?

 I think truncating the patch-log history is unimportant for Arch, but
 any ++pristine-trees should definitely be nuked prior to packing.

OK, that's fair.  I use revision libs, so I never have pristine
 trees in my working dir anyway.

manoj
-- 
Linux is obsolete (Andrew Tanenbaum)
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Russ Allbery
Manoj Srivastava [EMAIL PROTECTED] writes:

 How is git reconstituting the files if there is no network
  access?  Are they shipping all the bits needed to get a full working
  dir without any network access?

As I understand it, yes, that's the basic idea.

-- 
Russ Allbery ([EMAIL PROTECTED])   http://www.eyrie.org/~eagle/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Clint Adams
On Sun, Oct 07, 2007 at 02:19:36PM -0500, Manoj Srivastava wrote:
 Err, and why am I doing this? Why am I not shipping my working
  directory as a tarball, complete instead of breaking it up
  (apparently arbitrarily) into three parts?

As opposed to an .orig.tar.gz and all the debian/, {arch}/, and
.arch-ids/ components in the .diff.gz ?

 How is git reconstituting the files if there is no network
  access?  Are they shipping all the bits needed to get a full working
  dir without any network access?

Yes.  The .git/ (or .bzr/) directory contains the entire (or abridged
in the case of these shallow clones) history so you can check out any
of the covered revisions.

This would be akin to you including a cachedrev of an arbitrary version
followed by all the subsequent patches.tar.gz files, except that I
believe git et al. are meant to be more space-efficient.





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Frank Lichtenheld
On Sun, Oct 07, 2007 at 02:16:12PM -0500, Manoj Srivastava wrote:
 On Sun, 7 Oct 2007 20:33:58 +0200, Frank Lichtenheld [EMAIL PROTECTED] 
 said: 
  On Sun, Oct 07, 2007 at 10:49:48AM -0500, Manoj Srivastava wrote:
  You probably mean source package here and not .deb. Also the original
  proposal just means shipping the repository data, since most DVCS can
  easily create a working directory from that.
 
 Hmm. The repository data, as far as I can tell, means the name
  of the archive, and the location.  Do you really mean we are not
  shipping any, say, foo.c file in the sources, just a location where you
  can get the foo.c file from, at a particular version?

bzr and git always ship the complete repository with each working
directory. This is why they are called distributed. Arch seems to be some
weird thing in between truly central and truly distributed VCS.

  The whole idea of the proposal is to NOT create an export.
 
 If we are not creating an export, and we are only shipping the
  repository data,  how come there  needs to be a check for uncommitted
  files? If the changes are uncommitted, that means the repo does not
  know about it; and if we only ship  the repository data, we are not
  shipping stuff not in the repo.
 
 What am I missing?

They might be uncommitted because the maintainer forgot to commit them.
The only question is whether we should abort, commit the changes, or
ignore the changes. There is no technical problem with any of these
options.
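
To make the three options concrete, here is a rough sketch in Python. It is
an illustration only, not code from the patch: the function name
handle_uncommitted and the policy strings "abort", "commit" and "ignore" are
invented for this example, and the list of uncommitted files is assumed to
come from whatever status command the VCS provides.

```python
def handle_uncommitted(policy, uncommitted):
    """Decide what a hypothetical builder does with uncommitted files.

    policy      -- "abort", "commit" or "ignore" (names invented for this sketch)
    uncommitted -- list of paths the VCS reports as changed but not committed

    Returns the list of files that would be committed before packing.
    """
    if not uncommitted:
        # Clean tree: nothing to decide, whatever the policy is.
        return []
    if policy == "abort":
        # Refuse to build until the maintainer resolves the changes.
        raise RuntimeError("uncommitted changes: " + ", ".join(uncommitted))
    if policy == "commit":
        # Everything outstanding gets committed before packing.
        return list(uncommitted)
    if policy == "ignore":
        # Pack the repository as it is; the changes are left out.
        return []
    raise ValueError("unknown policy: " + policy)
```

Whichever default is chosen, only the "commit" branch ever writes to the
repository, which is the case a third party running the build may not be
able to perform.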

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Manoj Srivastava
Hi,
On Sun, 7 Oct 2007 15:49:55 -0400, Clint Adams [EMAIL PROTECTED] said: 

 On Sun, Oct 07, 2007 at 02:19:36PM -0500, Manoj Srivastava wrote:
On Sun, 7 Oct 2007 12:24:46 -0400, Clint Adams [EMAIL PROTECTED] said: 
 I presume you could ship all the normal files in one tarball, the
 .arch-ids and {arch} directories in another, and the debian/ directory
 in a third.

 Err, and why am I doing this? Why am I not shipping my working
 directory as a tarball, complete instead of breaking it up
 (apparently arbitrarily) into three parts?

 As opposed to an .orig.tar.gz and all the debian/, {arch}/, and
 .arch-ids/ components in the .diff.gz ?

Umm, I was asking about why the normal and the arch-ids and
 {arch} directories are being separated, and the ./debian dir as well.

The idea of the wig & pen format was so that we no longer used diff as
 a version control system, or were able to use more than one tarball
 for the source.

How is this working in this proposal? I do not ship the
 orig.tar.gz file, but I ship an orig.arch.tar.gz file with the
 upstream branch?

Then I mostly duplicate this by shipping a working dir, and
 also somehow ship a delta that recreates the orig.tar.gz file from
 the upstream branch I am shipping?

 How is git reconstituting the files if there is no network access?
 Are they shipping all the bits needed to get a full working dir
 without any network access?

 Yes.  The .git/ (or .bzr/) directory contains the entire (or abridged,
 in the case of these shallow clones) history, so you can check out
 any of the covered revisions.

A history as in RCS-like history, with patches, as opposed to
 the patch-log that is what the {arch} directories contain? 

 This would be akin to you including a cachedrev of an arbitrary
 version followed by all the subsequent patches.tar.gz files, except
 that I believe git et al. are meant to be more space-efficient.

wow.

gulp.

OK, so for arch I suppose I just ship a working dir, period, and
 people need network access to  get the older versions, unless people
 want terabytes of the archive in every source version.

manoj

-- 
Mind your own business, Mr. Spock.  I'm sick of your halfbreed
interference.
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Manoj Srivastava
On Sun, 7 Oct 2007 22:04:21 +0200, Frank Lichtenheld [EMAIL PROTECTED] said: 

 On Sun, Oct 07, 2007 at 02:16:12PM -0500, Manoj Srivastava wrote:
 On Sun, 7 Oct 2007 20:33:58 +0200, Frank Lichtenheld
 [EMAIL PROTECTED] said:
  On Sun, Oct 07, 2007 at 10:49:48AM -0500, Manoj Srivastava wrote:
  You probably mean source package here and not .deb. Also the
  original proposal just means shipping the repository data, since
  most DVCS can easily create a working directory from that.
 
 Hmm. The repository data, as far as I can tell, means the name of the
 archive, and the location.  Do you really mean we are not shipping
 any, say, foo.c file in the sources, just a location where you can get
 the foo.c file from, at a particular version?

 bzr and git always ship the complete repository with each working
 directory. This is why they are called distributed. Arch seems to be
 some weird thing in between truly central and truly distributed VCS.

I am not sure I see this.  Arch repositories are distributed,
 and you can pull, branch,  and tag off any repository out there in the
 meta-verse.  But every directory also has a semi permanent URI; and
 checking out a branch locally does not end up with you downloading the
 terabytes of stuff in the repo out there.

This might be because you can have more than one project in a
 repo; my repo contains CVS emacs, unicode emacs, as well as most of the
 SELinux packages, etc, and I mirror partially to arch.d.o. I would hate
 to see all of emacs in the local dir of people who just want to check
 out devotee.

So arch does have a different mechanism of doing distributed
 repositories; but the repositories are distributed in the sense that I
 control one repo, but branches in my repo are children of other
 repositories, and can be merged and tagged back and forth.

  The whole idea of the proposal is to NOT create an export.
 
 If we are not creating an export, and we are only shipping the
 repository data, how come there needs to be a check for uncommitted
 files? If the changes are uncommitted, that means the repo does not
 know about it; and if we only ship the repository data, we are not
 shipping stuff not in the repo.
 
 What am I missing?

 They might be uncommitted because the maintainer forgot to commit
 them.  The only question is whether we should abort, commit the
 changes, or ignore the changes. There is no technical problem with
 any of these options.

Well, as a developer, I would rather that someone else running
 dpkg-source on a package not try to commit to my repo, since it shall
 fail.

Assuming we consider trying to support arch-like distributed
 version control systems in the new dpkg; it might well be that the
 current approach is too focussed on git/bzr type version control to
 work well with arch.

manoj
-- 
DEATH: The penultimate commercial transaction finalized by
probate. Bernard Rosenberg
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Frank Lichtenheld
On Sun, Oct 07, 2007 at 06:24:15PM -0500, Manoj Srivastava wrote:
 On Sun, 7 Oct 2007 22:04:21 +0200, Frank Lichtenheld [EMAIL PROTECTED] 
 said: 
  bzr and git always ship the complete repository with each working
  directory. This is why they are called distributed. Arch seems to be
  some weird thing in between truly central and truly distributed VCS.
 
 I am not sure I see this.  Arch repositories are distributed,
  and you can pull, branch,  and tag off any repository out there in the
  meta-verse.  But every directory also has a semi permanent URI; and
  checking out a branch locally does not end up with you downloading the
  terabytes of stuff in the repo out there.

Let's not exaggerate. At least for git, the repository will usually be
smaller or only a little larger than the working directory. It will
probably compress worse, though.

 This might be because you can have more than one project in a
  repo; my repo contains CVS emacs, unicode emacs, as well as most of the
  SELinux packages, etc, and I mirror partially to arch.d.o. I would hate
  to see all of emacs in the local dir of people who just want to check
  out devotee.
 
 So arch does have a different mechanism of doing distributed
  repositories; but the repositories are distributed in the sense that I
  control one repo, but branches in my repo are children of other
  repositories, and can be merged and tagged back and forth.

Out of interest, which of the following actions would need remote
access?

log view (including diffs between revisions)
annotation/blame view
creating a new commit/revision/tag
reverting a dirty working tree to a clean one

For git/bzr, the answer is usually "no" to all of these. If you have
a shallow copy in git, the answers to the first two become
"yes", since you will need to convert it to a full copy first.

[...]
 Assuming we consider trying to support arch-like distributed
  version control systems in the new dpkg; it might well be that the
  current approach is too focussed on git/bzr type version control to
  work well with arch.

It most probably is.

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Manoj Srivastava
On Mon, 8 Oct 2007 02:55:37 +0200, Frank Lichtenheld [EMAIL PROTECTED] said: 

 On Sun, Oct 07, 2007 at 06:24:15PM -0500, Manoj Srivastava wrote:
 On Sun, 7 Oct 2007 22:04:21 +0200, Frank Lichtenheld
 [EMAIL PROTECTED] said:
  bzr and git always ship the complete repository with each working
  directory. This is why they are called distributed. Arch seems to
  be some weird thing in between truly central and truly distributed
  VCS.
 
 I am not sure I see this.  Arch repositories are distributed, and you
 can pull, branch, and tag off any repository out there in the
 meta-verse.  But every directory also has a semi permanent URI; and
 checking out a branch locally does not end up with you downloading
 the terabytes of stuff in the repo out there.

 Let's not exaggerate. At least for git, the repository will usually be
 smaller or only a little larger than the working directory. It will
 probably compress worse, though.


How is this magic done? If I have several dozen feature
 branches, all feeding back and forth, and have made lots and lots of
 changes in my sources, how does git preserve all this information
 without a commensurate increase in size?  This makes the information
 theory geek in me very very skeptical.

Or are you talking about typical usage, and is that why people
 go around making shallow copies to cut down on the size of the
 shipped repo?

 This might be because you can have more than one project in a repo;
 my repo contains CVS emacs, unicode emacs, as well as most of the
 SELinux packages, etc, and I mirror partially to arch.d.o. I would
 hate to see all of emacs in the local dir of people who just want to
 check out devotee.
 
 So arch does have a different mechanism of doing distributed
 repositories; but the repositories are distributed in the sense that
 I control one repo, but branches in my repo are children of other
 repositories, and can be merged and tagged back and forth.

 Out of interest, which of the following actions would need remote
 access?

 log view (including diffs between revisions)

The ./{arch} directory does contain logs. Diffs between
 revisions requires access to the repository (or the local cache
 library, if that contains the revision we want to diff with or from)

 annotation/blame view

Same thing; you need access to the repo since the code for the
 other revisions is not in the checked out directory.

 creating a new commit/revision/tag

Committing it would require access to the repo.

 reverting a dirty working tree to a clean one

I think you are talking about reverting local changes to the
 latest revision from the repository.  Well, that needs access to the
 repo or a local cache.

 For git/bzr, the answer is usually "no" to all of these. If you have a
 shallow copy in git, the answers to the first two become "yes", since
 you will need to convert it to a full copy first.

For arch, the answer is yes to all these cases.

 [...]
 Assuming we consider trying to support arch-like distributed version
 control systems in the new dpkg; it might well be that the current
 approach is too focussed on git/bzr type version control to work well
 with arch.

 It most probably is.

As far as I can tell, most of the things being done for git are
 not required if I ship a working directory for arch ({arch} and
 .arch-ids); and the only other thing required would be to also ship what
 lives in the grab file in the control file; so people can know where to
 register the  archive location from to get access to the other
 information.

If people wanted to provide changes, all that is needed is for
 them to tag the developer's branch, hack, and ask the developers to pull
 from their branch (people have done that for ucf and devotee in the
 past).

What exactly is the goal of this dpkg addition? With arch, I can
 ship a full working copy; and as long as people have the repository
 registered, they have full access to older revisions and feature
 branches and all. 

Would shipping the full working dir get by the requirement of
 shipping the diff.gz?  If so, we can support arch with no changes to
 dpkg whatsoever.

manoj
-- 
You never hesitate to tackle the most difficult problems.
Manoj Srivastava [EMAIL PROTECTED] http://www.golden-gryphon.com/
1024D/BF24424C print 4966 F272 D093 B493 410B  924B 21BA DABB BF24 424C





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-07 Thread Anthony Towns
On Sun, Oct 07, 2007 at 09:45:20AM -0400, Joey Hess wrote:
 Anthony Towns wrote:
  So the logic there would be:
  if there's an upstream tag, then
  generate an .orig.tgz
  if there's a pristine-tar info,
  hax0r it to be pristine
  generate a .diff.gz
  if the .diff failed goto bailout
  generate a .dsc containing the orig and diff
 It's not generally possible to generate a .diff.gz that expresses all
 the changes that might be in a git repository.

Right, but it is possible to detect that, and bailout to generating a
.tar.gz, no?
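
The fallback can be sketched roughly as follows. This is only an
illustration in Python, not code from dpkg-source: the function name
choose_source_files, its two boolean parameters, and the file-name strings
are all made up for the example, and detecting whether a diff can express
the changes is assumed to be done elsewhere.

```python
def choose_source_files(has_upstream_tag, diff_expressible):
    """Pick the files a hypothetical builder would list in the .dsc.

    has_upstream_tag -- an upstream tag exists, so an .orig tarball can be made
    diff_expressible -- every change since that tag can be written as a diff

    Both inputs are assumptions of this sketch; working them out is the
    hard part and is not shown here.
    """
    if has_upstream_tag and diff_expressible:
        # Traditional layout: upstream tarball plus Debian diff.
        return ["orig.tar.gz", "diff.gz"]
    # Bail out: ship the whole repository as a single tarball instead.
    return ["git.tar.gz"]
```

For example, a repository with an upstream tag but a change that a plain
diff cannot represent would end up on the bail-out branch and ship only
the repository tarball.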

  Repo formats that bzr in etch can unpack could be denoted by
  Source-Depends: dpkg-bzr (= 0.11)
 I was thinking about Source-Depends too, the main problem seems to be
 that it would need to be supported in apt-get source too. I wonder if
 we could just use build-depends.

apt-get source support could just be a warning: "This package cannot be
unpacked without [...] installed." Using Build-Depends: would make it
pretty complicated to know which bits were needed for unpacking, if that's
all you're trying to do.

Cheers,
aj




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Here's an updated patch, full diff from head again, with:

- use git-config --null
- git-config --filename only needs a full path if not run from a git WC
- import the VCS module so it can check if the VCS is available
- fix all commands that spawn a subshell
- delete the reflog

-- 
see shy jo
diff --git a/debian/dpkg-dev.install b/debian/dpkg-dev.install
index 49e3835..ee65dbf 100644
--- a/debian/dpkg-dev.install
+++ b/debian/dpkg-dev.install
@@ -56,3 +56,4 @@ usr/share/man/*/dpkg-shlibdeps.1
 usr/share/man/*/*/dpkg-source.1
 usr/share/man/*/dpkg-source.1
 usr/share/perl5/Dpkg/BuildOptions.pm
+usr/share/perl5/Dpkg/Source
diff --git a/man/dpkg-source.1 b/man/dpkg-source.1
index 9bf9ff3..14c17c3 100644
--- a/man/dpkg-source.1
+++ b/man/dpkg-source.1
@@ -55,6 +55,10 @@ will look for the original source tarfile
 or the original source directory
 .IB directory .orig
 depending on the \fB\-sX\fP arguments.
+
+
+If the source package is being built as a version 3 source package using
+a VCS, no upstream tarball or original source directory is needed.
 .TP
 .BR \-h ,  \-\-help
 Show the usage message and exit.
@@ -109,7 +113,9 @@ This option negates a previously set
 .BR \-i [\fIregexp\fP]
 You may specify a perl regular expression to match files you want
 filtered out of the list of files for the diff. (This list is
-generated by a find command.) \fB\-i\fR by itself enables the option,
+generated by a find command.) (If the source package is being built as a
+version 3 source package using a VCS, this is instead used to
+ignore uncommitted files.) \fB\-i\fR by itself enables the option,
 with a default that will filter out control files and directories of the
 most common revision control systems, backup and swap files and Libtool
 build output directories. There can only be one active regexp, of multiple
@@ -162,6 +168,9 @@ will not overwrite existing tarfiles or directories. If this is
 desired then
 .BR \-sA ,  \-sP ,  \-sK ,  \-sU  and  \-sR
 should be used instead.
+.PP
+If the source package is being built as a version 3 source package using
+a VCS, these options do not make sense, and will be ignored.
 .TP
 .BR \-sk
 Specifies to expect the original source as a tarfile, by default
diff --git a/scripts/Dpkg/Source/VCS/git.pm b/scripts/Dpkg/Source/VCS/git.pm
new file mode 100644
index 000..431fab3
--- /dev/null
+++ b/scripts/Dpkg/Source/VCS/git.pm
@@ -0,0 +1,257 @@
+#!/usr/bin/perl
+#
+# git support for dpkg-source
+#
+# Copyright © 2007 Joey Hess [EMAIL PROTECTED].
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+package Dpkg::Source::VCS::git;
+
+use strict;
+use warnings;
+use Cwd;
+use File::Find;
+use Dpkg;
+use Dpkg::Gettext;
+
+push (@INC, $dpkglibdir);
+require 'controllib.pl';
+
+# Remove variables from the environment that might cause git to do
+# something unexpected.
+delete $ENV{GIT_DIR};
+delete $ENV{GIT_INDEX_FILE};
+delete $ENV{GIT_OBJECT_DIRECTORY};
+delete $ENV{GIT_ALTERNATE_OBJECT_DIRECTORIES};
+delete $ENV{GIT_WORK_TREE};
+
+sub import {
+	foreach my $dir (split(/:/, $ENV{PATH})) {
+		if (-x "$dir/git") {
+			return 1;
+		}
+	}
+	main::error(sprintf(_g("This source package can only be unpacked using git, which is not in the PATH.")));
+}
+
+sub sanity_check {
+	my $srcdir=shift;
+
+	if (! -d "$srcdir/.git") {
+		main::error(sprintf(_g("source directory is not the top directory of a git repository (%s/.git not present), but Format git was specified"), $srcdir));
+	}
+	if (-s "$srcdir/.gitmodules") {
+		main::error(sprintf(_g("git repository %s uses submodules. This is not yet supported."), $srcdir));
+	}
+
+	# Symlinks from .git to outside could cause unpack failures, or
+	# point to files they shouldn't, so check for and don't allow.
+	if (-l "$srcdir/.git") {
+		main::error(sprintf(_g("%s is a symlink"), "$srcdir/.git"));
+	}
+	my $abs_srcdir=Cwd::abs_path($srcdir);
+	find(sub {
+		if (-l $_) {
+			if (Cwd::abs_path(readlink($_)) !~ /^\Q$abs_srcdir\E(\/|$)/) {
+				main::error(sprintf(_g("%s is a symlink to outside %s"), $File::Find::name, $srcdir));
+			}
+		}
+	}, "$srcdir/.git");
+
+	return 1;
+}
+
+# Returns a hash of arrays of git config values.
+sub read_git_config {
+	my $file=shift;
+
+	my %ret;
+	open(GIT_CONFIG, '-|', "git-config", "--file", $file, "--null", "-l") ||
+		main::subprocerr("git-config");
+	my ($key, $value);
+	while (<GIT_CONFIG>) {
+		

Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Russ Allbery wrote:
 It's a little disturbing to have content in parentheses be significant in
 a format based on RFC 822, although we have broken this rule elsewhere
 (most notably in dependency fields, of course).

If it helps, the (git) comment is only used in debian/control, it's
not put in the dsc files.

I'd be just as happy to use [git] or even Foo: git.

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Anthony Towns
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
 I've been working on making dpkg-source support a new source package format
 based upon git. The idea is that a source package has only a .dsc and a
 .git.tar.gz, which is just a git repo.

Is a .gitdiff.tar.gz possible, so the archive doesn't need to have the
full git repo replaced by each upload? ie, something like

Files:
  foo_1.0-1.git.tar.gz
  foo_1.0-2.gitdiff.tar.gz

so that a small patch only adds a small file to the archive rather than
replacing a large one?

This means you can't build the package by hand with standard unix tools
-- at the very least you need git installed, and if other VC systems
are to be supported, you need them too. Changes in repository formats
will presumably result in versioned dependencies too.

This is slightly worse than the case for existing patch management tools
in that most of those can be dealt with by hand; though cdbs and to a
lesser extent debhelper can't be quite as easily replicated I guess.

Once the unpack is done, I don't see any reason why you can't do an NMU
in the traditional way, so presuming dpkg-source -x or apt-get source
handles the unpack automatically, I don't think it necessarily imposes
any new requirements on NMUers.

Maybe providing a feature on packages.debian.org (or similar) to download
sources in simple, non-VC, tarball format would make this a complete
non-issue though?

Would it make sense to have the source format look more like:

Format: 3.0
Source: dpkg
...
Source-Depends: git-dpkg (= 3.14159)
Source-Hooks: /usr/bin/git-dpkg
...
Files:
 ... foo_1.2.git.tar.gz

and have the git specific functionality be provided by a /usr/bin/git-dpkg
binary (with standardised arguments) from the git-dpkg package? That
would let you smoothly deal with repository changes and implementing new
interfaces, and also let us limit the allowable formats for the archive
reasonably simply.

You could drop the Source-Hooks: line, and just have dpkg-source know
to associate *.git.tar.gz with /usr/lib/dpkg/source/git, and trust the
package will provide it.
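
As a rough illustration of that association, here is a small Python sketch.
It is not taken from dpkg-source: the names HELPER_DIR, SUFFIX_TO_VCS and
helper_for are invented here, and the ".bzr.tar.gz" entry is a hypothetical
second format added only to show how the table would grow.

```python
import posixpath

# Directory named in the message above; the table itself is an assumption.
HELPER_DIR = "/usr/lib/dpkg/source"
SUFFIX_TO_VCS = {
    ".git.tar.gz": "git",
    ".bzr.tar.gz": "bzr",  # hypothetical extra format, for illustration
}


def helper_for(filename):
    """Return the helper path for a source file, or None if no suffix matches."""
    for suffix, vcs in SUFFIX_TO_VCS.items():
        if filename.endswith(suffix):
            return posixpath.join(HELPER_DIR, vcs)
    # Not a VCS-format file (for example an .orig.tar.gz): no helper needed.
    return None
```

With this table, foo_1.2.git.tar.gz maps to /usr/lib/dpkg/source/git, and a
file with no known VCS suffix maps to no helper at all, so the lookup alone
says which formats the archive is prepared to accept.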

Bonus points: rather than debian/rules clean, create a diff, build,
have dpkg do debian/rules clean, commit any uncommitted changes with the
commit message being the changes from the changelog, create a .git.tgz,
build for git-source-format packages.

Cheers,
aj





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Anthony Towns wrote:
 Is a .gitdiff.tar.gz possible, so the archive doesn't need to have the
 full git repo replaced by each upload? ie, something like
 
   Files:
 foo_1.0-1.git.tar.gz
 foo_1.0-2.gitdiff.tar.gz
 
 so that a small patch only adds a small file to the archive rather than
 replacing a large one?

I think it's possible, the gitdiff might use git packs against a prior
repo. That would be a nice enhancement to what I have done.

 This means you can't build the package by hand with standard unix tools
 -- at the very least you need git installed, and if other VC systems
 are to be supported, you need them too.

Yes, as I mention in the faq I think this is an acceptable tradeoff to
get away from having to use diff.

 Changes in repository formats will presumably result in versioned
 dependencies too.

I don't think that dpkg should add vcs formats that we don't have a good
expectation of remaining supported by newer versions of the tools going
forward (so svn repos are out). There's a bit of discussion of this in
the faq. I think that git has a pretty good track record and has
incentive to keep compatibility support since this format is
used over the wire by git (eg, with http urls).

If the format changes in a non-backwards compatible way, we could have
source packages built on unstable that cannot be extracted on stable,
which I also think is suboptimal, but hard to completely avoid.

 This is slightly worse than the case for existing patch management tools
 in that most of those can be dealt with by hand; though cdbs and to a
 lesser extent debhelper can't be quite as easily replicated I guess.

Neither could packages using quilt before it was available in
stable or dbs before it was.

 Once the unpack is done, I don't see any reason why you can't do an NMU
 in the traditional way, so presuming dpkg-source -x or apt-get source
 handles the unpack automatically, I don't think it necessarily imposes
 any new requirements on NMUers.

Basically, you have to know how to git commit your changes before building
the NMU, and that's all. As a bonus, it's rather easier to generate NMU
patchsets. :-)

 Maybe providing a feature on packages.debian.org (or similar) to download
 sources in simple, non-VC, tarball format would make this a complete
 non-issue though?

pristine-tar could be used for this, it would just need source packages
to put the delta somewhere standardised (under debian/), and would need
some standardised way to get to the upstream source branch in git.

 Would it make sense to have the source format look more like:
 
   Format: 3.0
   Source: dpkg
   ...
   Source-Depends: git-dpkg (= 3.14159)
   Source-Hooks: /usr/bin/git-dpkg
   ...
   Files:
... foo_1.2.git.tar.gz
 
 and have the git specific functionality be provided by a /usr/bin/git-dpkg
 binary (with standardised arguments) from the git-dpkg package? That
 would let you smoothly deal with repository changes and implementing new
 interfaces, and also let us limit the allowable formats for the archive
 reasonably simply.
 
 You could drop the Source-Hooks: line, and just have dpkg-source know
 to associate *.git.tar.gz with /usr/lib/dpkg/source/git, and trust the
 package will provide it.
 
Not sure if this buys anything that using perl modules for the vcses
can't do, really. How do you envision this helping deal with repository
format changes?

 Bonus points: rather than debian/rules clean, create a diff, build,
 have dpkg do debian/rules clean, commit any uncommitted changes with the
 commit message being the changes from the changelog, create a .git.tgz,
 build for git-source-format packages.

I have a feeling that any auto-commit stuff should be controlled by an
option. I'm *sure* that it would annoy some developers. No strong
feelings about whether it should default on or off, though least surprise
suggests *off*.

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Joey Hess wrote:
  Maybe providing a feature on packages.debian.org (or similar) to download
  sources in simple, non-VC, tarball format would make this a complete
  non-issue though?
 
 pristine-tar could be used for this, it would just need source packages
 to put the delta somewhere standardised (under debian/), and would need
 some standardised way to get to the upstream source branch in git.

BTW, if that were standardised, the other option would be for
dpkg-source -x to regenerate the pristine upstream tarball.

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Joey Hess wrote:
  Bonus points: rather than debian/rules clean, create a diff, build,
  have dpkg do debian/rules clean, commit any uncommitted changes with the
  commit message being the changes from the changelog, create a .git.tgz,
  build for git-source-format packages.
 
 I have a feeling that any auto-commit stuff should be controlled by an
 option. I'm *sure* that it would annoy some developers. No strong
 feelings about whether it should default on or off, though least surprise
 suggests *off*.

One problem with auto-committing is tags. Developers will probably
want to tag their release before doing the final release build, and
if dpkg-source then found and auto-committed a further change, the tag
wouldn't accurately match the release.

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Frank Lichtenheld
On Sat, Oct 06, 2007 at 05:27:04PM +1000, Anthony Towns wrote:
 On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
 This means you can't build the package by hand with standard unix tools
 -- at the very least you need git installed, and if other VC systems
 are to be supported, you need them too. Changes in repository formats
 will presumably result in versioned dependencies too.
 
 This is slightly worse than the case for existing patch management tools
 in that most of those can be dealt with by hand; though cdbs and to a
 lesser extent debhelper can't be quite as easily replicated I guess.

A similar problem arises with Format: 2.0 packages as well if the user
hasn't bzip2 (unlikely) or lzma (likely) installed and tries to unpack
a source package built with them.

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Frank Lichtenheld
On Sat, Oct 06, 2007 at 11:19:43AM -0400, Joey Hess wrote:
 Anthony Towns wrote:
  Is a .gitdiff.tar.gz possible, so the archive doesn't need to have the
  full git repo replaced by each upload? ie, something like
  
  Files:
    foo_1.0-1.git.tar.gz
    foo_1.0-2.gitdiff.tar.gz
  
  so that a small patch only adds a small file to the archive rather than
  replacing a large one?
 
 I think it's possible, the gitdiff might use git packs against a prior
 repo. That would be a nice enhancement to what I have done.

I think there is a mechanism in git to disallow replacing old pack
files (i.e. forcing to create additional ones with only new objects),
however, I haven't used that myself, yet.

On a general note: I think we could definitely use the better tarball
compression support _before_ adding huge amounts of history into the
archive...

Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Colin Watson
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
 I've been working on making dpkg-source support a new source package
 format based upon git. The idea is that a source package has only a
 .dsc and a .git.tar.gz, which is just a git repo.

So, I can't stand git's user interface. I generally try to avoid making
a huge issue of this since it seems to be massively political on places
like Planet at the moment, there seems to be a certain amount of
confusion of people's personal opinions with that of their employers
going on, and in any case I normally find that revision control
flamewars have negative utility. (I don't think it's terribly relevant
to this discussion why I prefer not to use git, and I don't want to
sidetrack the thread with that; I just wanted to present an existence
case of somebody who doesn't want to switch to .git.tar.gz and yet
doesn't want to stay with .orig.tar.gz and .diff.gz forever.)

Still, this work looks pretty cool, and I'd like to be able to make use
of it despite avoiding git whenever I can. I noticed that you'd
helpfully structured your changes such that it would be possible to plug
in a different revision control system, so I wrote a module to support
bzr. The patch is attached to this e-mail, and I'd appreciate comments;
if this work is merged into dpkg I'd be very happy if my addition were
merged too. There are probably some improvements to be made, but it was
really utterly trivial; I was impressed that I didn't have to touch
anything else beyond plugging in a new module. Ironically, of course, I
did use git to create it. :-)


While working on this I was thinking about general issues with the
format. It seems to me that it's suboptimal not to ship a working tree.
I know you sort of address this in the wiki FAQ, and I realise that
there are space advantages to only shipping the VCS data. However, I'd
like to try to persuade you otherwise if I can. My concerns are:

  * Users will need to have the VCS installed in order to inspect the
source.

It's true that this is no worse than dbs or dpatch or whatever, and
in fact it's better because dpkg-source will take care of the
unpacking step automatically. Still, I do think it is a downside; we
do still ship /usr/share/doc/debian/source-unpack.txt, and people do
unpack Debian source packages on other systems from time to time and
inspect them (I certainly do the same in the other direction with
source RPMs, and curse their complexity). Plus, if the VCS fails to
reconstitute a working tree for some unforeseen reason (maybe you
have a broken installation of it, or maybe there was some version
skew, or something else), then you're rather screwed. Tarballs are
nice and simple and, assuming they were transferred accurately,
hardly ever break in ways that make it impossible for you to extract
the files.

  * Buildds will need to have the VCS installed in their base system.

Possibly a minor concern since sbuild does the unpack in the base
rather than in the chroot, but it's there nevertheless. Every
derivative distribution that runs its own buildds will need to take
care of this too.

  * Some source packages want to ship non-VCS-managed files.

It's very common for source packages to include autogenerated
objects like configure, Makefile.in, etc. Whether to check these
into a VCS is a somewhat religious matter (as acknowledged by the
gettext info documentation, for instance), and personally I lean
towards checking them in (with a few exceptions) just because it
makes it easier to see when they change and keep an eye out for
oddities, but I know that a lot of developers prefer to keep these
outside their VCS. Shipping a working tree would make it easier to
handle cases like this.

There are two obvious modifications to Joey's proposal that would allow
shipping a working tree. The first is just to include the working tree
in the .$VCS.tar.gz object. This has the advantage of being trivial to
implement on top of the current code: the git module would need to do a
'git checkout' after copying the .git, and the bzr module just wouldn't
call 'bzr remove-tree'.

The second possibility seems to me to be more flexible, though, and
probably not all that hard to implement: build both a .tar.gz
(containing the working tree) and a .$VCS.tar.gz, and teach 'dpkg-source
-x' to unpack the tree given at least one of these. This would allow
various interesting possibilities such as:

  * Buildds could just fetch the .tar.gz; they have no need of the VCS
data. Users who just want to inspect the current version of the
source and not change it might want to do this too, using (say)
'apt-get source --no-vcs package'.

  * Developers on slow connections could say 'apt-get source --vcs-only
package' to fetch just the .$VCS.tar.gz, with the documented caveat
that it would be just like checking the source out of a VCS in that
you might have to recreate some autogenerated files.

[...]

Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Colin Watson
On Sat, Oct 06, 2007 at 11:17:58PM +0200, Frank Lichtenheld wrote:
 On Sat, Oct 06, 2007 at 05:27:04PM +1000, Anthony Towns wrote:
  This means you can't build the package by hand with standard unix tools
  -- at the very least you need git installed, and if other VC systems
  are to be supported, you need them too. Changes in repository formats
  will presumably result in versioned dependencies too.
  
  This is slightly worse than the case for existing patch management tools
  in that most of those can be dealt with by hand; though cdbs and to a
  lesser extent debhelper can't be quite as easily replicated I guess.
 
 A similar problem arises with Format: 2.0 packages as well if the user
 hasn't bzip2 (unlikely) or lzma (likely) installed and tries to unpack
 a source package built with them.

Perhaps 'apt-get source' et al could notice this class of situation and
offer to install the necessary unpacking tools for you. It'd have to
rely on sudo or similar as 'apt-get source' is typically run as
non-root, but it seems like a useful enhancement even so.
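
A minimal sketch of the kind of check suggested above, written in Python purely for illustration. Nothing here is an existing apt or dpkg interface: the suffix-to-tool table and the function name missing_unpack_tools are assumptions made up for this example.

```python
import shutil

# Hypothetical table: which external tools a source-package file needs
# in order to be unpacked. Not taken from apt or dpkg.
UNPACK_TOOLS = {
    ".git.tar.gz": ["git", "gzip"],
    ".tar.gz": ["gzip"],
    ".tar.bz2": ["bzip2"],
    ".tar.lzma": ["lzma"],
}


def missing_unpack_tools(filenames, which=shutil.which):
    """Return the sorted names of required tools that are not on PATH."""
    needed = set()
    # Try the longest suffix first so ".git.tar.gz" wins over ".tar.gz".
    suffixes = sorted(UNPACK_TOOLS, key=len, reverse=True)
    for name in filenames:
        for suffix in suffixes:
            if name.endswith(suffix):
                needed.update(UNPACK_TOOLS[suffix])
                break
    return sorted(tool for tool in needed if which(tool) is None)
```

A front end could print the returned list and offer to install the matching packages; how that would obtain root (sudo or similar) is left open, as in the mail.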

-- 
Colin Watson   [EMAIL PROTECTED]





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Frank Lichtenheld
On Sat, Oct 06, 2007 at 10:37:48PM +0000, Colin Watson wrote:
 On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
  I've been working on making dpkg-source support a new source package
  format based upon git. The idea is that a source package has only a
  .dsc and a .git.tar.gz, which is just a git repo.
[...]
 Still, this work looks pretty cool, and I'd like to be able to make use
 of it despite avoiding git whenever I can. I noticed that you'd
 helpfully structured your changes such that it would be possible to plug
 in a different revision control system, so I wrote a module to support
 bzr. The patch is attached to this e-mail, and I'd appreciate comments;
 if this work is merged into dpkg I'd be very happy if my addition were
 merged too. There are probably some improvements to be made, but it was
 really utterly trivial; I was impressed that I didn't have to touch
 anything else beyond plugging in a new module. Ironically, of course, I
 did use git to create it. :-)

I guess if we use Joey's idea at all we will not be able to avoid
shipping such a module for each distributed VCS, and I didn't get
the impression that Joey thought otherwise. So I find your mail
strangely defensive :)

The code itself looks good AFAICT.

 While working on this I was thinking about general issues with the
 format. It seems to me that it's suboptimal not to ship a working tree.
 I know you sort of address this in the wiki FAQ, and I realise that
 there are space advantages to only shipping the VCS data. However, I'd
 like to try to persuade you otherwise if I can. My concerns are:

Shipping the worktree essentially means defining this new format as
an optional add-on, since you ship all the data you ship now plus some
VCS metadata. So all packages will have to be bigger than they
are now (aside from using other compression methods than gzip, and
after really building some packages today with my dpkg-source -C patch
I have to say I'm impressed how much space we might be able to save -
with high CPU costs, though). This is not really an argument for either
side, just wanted to make this effect clear.

   * Users will need to have the VCS installed in order to inspect the
 source.
[...]
   * Buildds will need to have the VCS installed in their base system.
[...]
   * Some source packages want to ship non-VCS-managed files.
[...]

Is the last one really such a big problem in Debian? I know that many upstream
VCS don't contain autogenerated files but most .orig.tar.gz's already
contain them today, so I would have guessed people either only have
their debian/ in their Debian VCS or all upstream files from the
.orig.tar.gz.

 There are two obvious modifications to Joey's proposal that would allow
 shipping a working tree. The first is just to include the working tree
 in the .$VCS.tar.gz object. This has the advantage of being trivial to
 implement on top of the current code: the git module would need to do a
 'git checkout' after copying the .git, and the bzr module just wouldn't
 call 'bzr remove-tree'.

This would be a bad idea IMHO, and like a regression: instead of
shipping a .orig.tar+diff we now ship one, monolithic (bigger) tarball?
Sounds suboptimal. I'm pretty sure I don't want to see this one
implemented in dpkg-dev.

 The second possibility seems to me to be more flexible, though, and
 probably not all that hard to implement: build both a .tar.gz
 (containing the working tree) and a .$VCS.tar.gz, and teach 'dpkg-source
 -x' to unpack the tree given at least one of these. This would allow
 various interesting possibilities such as:

Since you're essentially demoting the new format to an add-on, why not
go all the way and ship a real Format: 1.0 package (i.e. orig-tar+diff
or native-tar) instead of this half-and-half working-tree tarball?

[...]
 These seem to me to be non-trivial advantages that outweigh the space
 costs of shipping around the working tree. I'd be willing to have a go
 at implementing this once I've had a bit more sleep.
 
 Does any of this make sense?

I guess there are two aspects to Joey's proposal:

1) Make the source package more useful by including VCS metadata like
   history

2) Make it easier to include arbitrary changes to the upstream sources
   by using more advanced tools than diff/patch, i.e. a DVCS

By concentrating on the first point and making it optional you either have
to sacrifice point 2 by reusing the old source package (orig+diff) or give
people who choose not to download the vcs data a worse experience by
making it harder for them to find the actual diff (working tree tar).

On second thought, you can reduce the regression by adding a pristine-gz
delta to the working tree so that you can split the working tree tarball
back into an orig+diff.

On third thought who says you have to fall back to Format 1.0 for the
non-VCS data? You could also fall back to Format 2.0 which would make
preserving advantage 2 easier.

So, no idea if my ramblings made any sense, 

Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Frank Lichtenheld wrote:
 I think there is a mechanism in git to disallow replacing old pack
 files (i.e. forcing to create additional ones with only new objects),
 however, I haven't used that myself, yet.

The packs in the diff package would be basically the same packs that
git-send-pack generates when git is pushing objects to a remote
repository. Where the remote repo would be the contents of
foo_1.0-1.git.gz, and the local repo would be foo-1.0-2. Intercept
those packs in transit (how?), and then you can take the 1.0-1 repo
and later apply them to it to regenerate the 1.0-2 repo.

 On a general note: I think we could definitely use better tarball
 compression support _before_ adding a huge amount of history into the
 archive...

This would mostly be an optimisation for upload size, total archive size
is only affected if foo 1.0-1 is in testing and 1.0-2 in unstable.

It's actually much more significant to both upload and total archive
size that all 61mb of dpkg's .git not be put into its .git.tar.gz. Thus
the shallow clones with only a few hundred repos or so.

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Colin Watson wrote:
 So, I can't stand git's user interface. I generally try to avoid making
 a huge issue of this since it seems to be massively political on places
 like Planet at the moment, there seems to be a certain amount of
 confusion of people's personal opinions with that of their employers
 going on, and in any case I normally find that revision control
 flamewars have negative utility. (I don't think it's terribly relevant
 to this discussion why I prefer not to use git, and I don't want to
 sidetrack the thread with that; I just wanted to present an existence
 case of somebody who doesn't want to switch to .git.tar.gz and yet
 doesn't want to stay with .orig.tar.gz and .diff.gz forever.)

(So, FWIW, I'm not sold on git. Not sold at all yet. But it was a good
choice for this implementation for several reasons.)

 Still, this work looks pretty cool, and I'd like to be able to make use
 of it despite avoiding git whenever I can. I noticed that you'd
 helpfully structured your changes such that it would be possible to plug
 in a different revision control system, so I wrote a module to support
 bzr.

Nice. The FAQ has some questions aimed at adding other revision control
systems, could you try to answer those in the context of bzr? In
particular, is the data that would be shipped in the source package the
same data that bzr normally reads from untrusted sources, thus ensuring
that using it this way is equally (in)secure as using bzr to pull data
over the network? (Note that this wasn't 100% true for git and I have
had to put in several workarounds.) And is the data format stable and/or
one that bzr has a history of supporting old versions of in a way that
ensures backwards compatibility?

Also, will the bzr repos always contain the full history, or is there
an equivalent to git shallow clones? How big do they tend to be?

 It's true that this is no worse than dbs or dpatch or whatever, and
 in fact it's better because dpkg-source will take care of the
 unpacking step automatically. Still, I do think it is a downside; we
 do still ship /usr/share/doc/debian/source-unpack.txt

BTW, source-unpack.txt fails for both packages containing
debian/subdirs/ and of course for wig-n-pen..

   * Buildds will need to have the VCS installed in their base system.

This seems easily solved by recommends (installed by default).

   * Some source packages want to ship non-VCS-managed files.
 
 It's very common for source packages to include autogenerated
 objects like configure, Makefile.in, etc. Whether to check these
 into a VCS is a somewhat religious matter (as acknowledged by the
 gettext info documentation, for instance), and personally I lean
 towards checking them in (with a few exceptions) just because it
 makes it easier to see when they change and keep an eye out for
 oddities, but I know that a lot of developers prefer to keep these
 outside their VCS. Shipping a working tree would make it easier to
 handle cases like this.

Hmm, I hadn't considered that this might be a problem.

I don't know if I'd want to write the code to do this, but shipping a
partial working tree consisting of just those files would be enough to
solve this.

   * Space-constrained mirrors could conceivably exclude the VCS data if
 they had to, though we probably wouldn't encourage this.
 
 These seem to me to be non-trivial advantages that outweigh the space
 costs of shipping around the working tree.

The space constraints seem pretty hard to me. Specifically, I don't want
to piss the ftpmasters off and get vcs source packages banned from the
archive.. The only saving grace really seems to be that shipping both
vcs and upstream tar will only double the size of the archive once most
everything uses the new format, and the archive will have probably
doubled in size several times over due to other factors before then.


I've eyeballed the code; it looks OK, though it is so close to code I've
been looking at all week that I may be missing trees for the forest. :-)

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Frank Lichtenheld wrote:
 I guess if we use Joey's idea at all we will not be able to avoid
 shipping such a module for each distributed VCS, and I didn't get
 the impression that Joey thought otherwise.

I do think otherwise. If the distributed (or other) VCS does not meet
our criteria for security and backwards compatibility, then we should
not ship it.

And yes, it'll be up to the dpkg maintainers to enforce those criteria
if you crack open the floodgates..

 Is the last one really such a big problem in Debian? I know that many upstream
 VCS don't contain autogenerated files but most .orig.tar.gz's already
 contain them today, so I would have guessed people either only have
 their debian/ in their Debian VCS or all upstream files from the
 .orig.tar.gz.

So would I, and most of the tools like git-buildpackage seem to assume
it too and not try to support this case AFAICS. Colin's probably right
that it's an issue religious wars can be fought over, but if they're
being fought in the context of keeping package source in revision
control it's happening quietly..

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Anthony Towns
On Sat, Oct 06, 2007 at 11:19:43AM -0400, Joey Hess wrote:
 Anthony Towns wrote:
  Changes in repository formats will presumably result in versioned
  dependencies too.
 I don't think that dpkg should add vcs formats that we don't have a good
 expectation of remaining supported by newer versions of the tools going
 forward (so svn repos are out). 

It's more that newer versions of the tools will create more optimised
repo formats, that older versions don't support -- bzr has done this
between etch and lenny, eg.

My inclination would be to have dpkg support it, but have it generate
a REJECT at upload time if we don't want to support the new format (yet).

 If the format changes in a non-backwards compatible way, we could have
 source packages built on unstable that cannot be extracted on stable,
 which I also think is suboptimal, but hard to completely avoid.

Well, that's true of any Version: 3 format already anyway.

  Once the unpack is done, I don't see any reason why you can't do an NMU
  in the traditional way, so presuming dpkg-source -x or apt-get source
  handles the unpack automatically, I don't think it necessarily imposes
  any new requirements on NMUers.
 Basically, you have to know how to git commit your changes before building
 the NMU, and that's all. As a bonus, it's rather easier to generate NMU
 patchsets. :-)

Well, there's two options:

- dpkg-source knows it's meant to be a git package, and
  can either warn you you have uncommitted changes (and tell
  you what to do) or just auto commit them for you

- dpkg-source doesn't know what sort of package it's meant to be
  and just builds a v1 source package

Both of which sound pretty trivial for an NMUer to deal with...

  Maybe providing a feature on packages.debian.org (or similar) to download
  sources in simple, non-VC, tarball format would make this a complete
  non-issue though?
 pristine-tar could be used for this, it would just need source packages
 to put the delta somewhere standardised (under debian/), and would need
 some standardised way to get to the upstream source branch in git.

So the logic there would be:

	if there's an upstream tag, then
		generate an .orig.tgz
		if there's a pristine-tar info,
			hax0r it to be pristine
		generate a .diff.gz
		if the .diff failed goto bailout
		generate a .dsc containing the orig and diff
		publish all three
	else:
	(bailout:)
		generate a .tar.gz
		generate a .dsc containing the tar
		publish both
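
The same decision logic as a small Python sketch. This is only an illustration of the control flow in the outline above: the function name plan_v1_export, its parameters and the artifact labels are invented for the example, and the real work (running pristine-tar, building the diff) is reduced to a callback.

```python
def plan_v1_export(has_upstream_tag, has_pristine_tar_delta, make_diff):
    """Decide which files a v1-style export would publish.

    make_diff is a callable returning True if a .diff.gz against the
    upstream tarball could be generated, False otherwise.
    """
    if has_upstream_tag:
        # With a pristine-tar delta the regenerated tarball can be made
        # byte-identical to the upstream one.
        orig = "pristine .orig.tar.gz" if has_pristine_tar_delta else ".orig.tar.gz"
        if make_diff():
            return [orig, ".diff.gz", ".dsc"]
    # Bailout path: no upstream tag, or the diff could not be produced.
    return [".tar.gz", ".dsc"]
```

Modelling the "goto bailout" as a fall-through keeps both failure cases (no upstream tag, failed diff) on one code path, which matches the single bailout label in the outline.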

  Would it make sense to have the source format look more like:
  Format: 3.0
  Source: dpkg
  ...
  Source-Depends: git-dpkg (= 3.14159)
  Source-Hooks: /usr/bin/git-dpkg
  ...
  Files:
   ... foo_1.2.git.tar.gz
  You could drop the Source-Hooks: line, and just have dpkg-source know
  to associate *.git.tar.gz with /usr/lib/dpkg/source/git, and trust the
  package will provide it.
 Not sure if this buys anything that using perl modules for the vcses
 can't do, really. 

It doesn't buy anything extra, so forget the Source-Hooks: and just
consider it to be a different package providing the VCS-specific perl
module.

That buys you:
- no changes to dpkg to support new source formats
- easy for other distros to support more or fewer VCS formats
- version info to deal with new repo formats
- explicit dependency info that can be checked at upload time
  to block source formats we don't want to support

 How do you envision this helping deal with repository
 format changes?

Repo formats that bzr in etch can unpack could be denoted by

Source-Depends: dpkg-bzr (= 0.11)

while repo formats that require bzr from lenny or later could be
denoted by:

Source-Depends: dpkg-bzr (= 0.18)

(Or you could have a versioning scheme that matches the repo format
directly, rather than the program being used. Or you could use virtual
packages and say dpkg-bzr-v3 and have that be Provided: by some package/s,
etc)

It'd be straightforward to make a policy decision to only ACCEPT uploads
with given Source-Depends: lines, eg ones that can be satisfied using
packages from stable, while letting third party repos experiment with
new repo formats without needing to use a different dpkg than Debian does.
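
A rough sketch of such an upload-time policy check, in Python for illustration only. Source-Depends is a proposed field, not an existing one, so everything here is an assumption: the function name, the small subset of relation syntax it accepts ("=" and ">=" only), and the naive dotted-number comparison, which is much simpler than real Debian version ordering.

```python
import re

# Accepts "name", "name (= 1.2)" or "name (>= 1.2)"; a deliberately
# small subset of dependency syntax, chosen for this example.
DEP_RE = re.compile(
    r"^\s*([a-z0-9][a-z0-9+.-]*)\s*(?:\(\s*(>=|=)\s*([0-9.]+)\s*\))?\s*$"
)


def version_key(version):
    """Naive ordering for purely numeric dotted versions such as '0.18'."""
    return tuple(int(part) for part in version.split("."))


def satisfiable_in(source_depends, available):
    """True if one Source-Depends entry can be met from `available`.

    `available` maps package name to the version shipped in the suite
    the archive wants to support (for example stable).
    """
    match = DEP_RE.match(source_depends)
    if match is None:
        raise ValueError("unsupported Source-Depends entry: %r" % source_depends)
    name, op, wanted = match.groups()
    if name not in available:
        return False
    if op is None:
        return True
    have, want = version_key(available[name]), version_key(wanted)
    return have >= want if op == ">=" else have == want
```

An archive-side hook could REJECT an upload when this returns False for the suite it cares about, while third-party repositories simply skip the check.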

Cheers,
aj





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Anthony Towns
On Sat, Oct 06, 2007 at 10:37:48PM +0000, Colin Watson wrote:
 The second possibility seems to me to be more flexible, though, and
 probably not all that hard to implement: build both a .tar.gz
 (containing the working tree) and a .$VCS.tar.gz, and teach 'dpkg-source
 -x' to unpack the tree given at least one of these. This would allow
 various interesting possibilities such as:

Would this be better in any way than having a web interface that provides
an autogenerated version-1 source package? Presume it's a url like:

http://v1source.qa.debian.org/i/ifupdown/ifupdown_0.6.8.dsc

   * Buildds could just fetch the .tar.gz; they have no need of the VCS
 data. Users who just want to inspect the current version of the
 source and not change it might want to do this too, using (say)
 'apt-get source --no-vcs package'.

dget -x http://v1source.qa.debian.org/i/ifupdown/ifupdown_0.6.8.dsc

   * Developers on slow connections could say 'apt-get source --vcs-only
 package' to fetch just the .$VCS.tar.gz, with the documented caveat
 that it would be just like checking the source out of a VCS in that
 you might have to recreate some autogenerated files.

That happens automatically.

   * Space-constrained mirrors could conceivably exclude the VCS data if
 they had to, though we probably wouldn't encourage this.

Mirrors wouldn't mirror the autogenerated stuff, so not an issue.

   * Derivative distributions who are slow to upgrade their dpkg-source
 could still interoperate to some degree.

They'd need to pull sources from the autogenerated url; though they'd
still probably have Build-Depends: issues if they're not updating
packages generally.

   * Tools like mc, vim's tar plugin, or
 http://www.mirrorservice.org/sites/ftp.debian.org/debian/ could
 still be used straightforwardly and without modifications to look
 inside source packages on mirrors.

Again, you'd have to go to the autogenerating url rather than a mirror.

Cheers,
aj





[PATCH] proposed v3 source format using .git.tar.gz

2007-10-05 Thread Joey Hess
I've been working on making dpkg-source support a new source package format
based upon git. The idea is that a source package has only a .dsc and a
.git.tar.gz, which is just a git repo.

I've blogged[1] about some of what led me to this idea, and I've also written
a short FAQ[2]. Suggest reading both to understand where I'm coming from with
this.

[1] http://kitenet.net/~joey/blog/entry/an_evolutionary_change_to_the_Debian_source_package_format/
[2] http://wiki.debian.org/GitSrc

My implementation adds a new 3.0 version source format. A 3.0 format debian
source package can consist of any files allowed by formats 1 and 2, but
may also contain .$VCS.tar.gz files. To build a version 3 source package,
a new field is needed in debian/control:

Format: 3.0 (git)

The bit in parens specifies that it should use the git backend, which
is currently the only one available. That backend is in the
Dpkg::Source::VCS::git perl module.
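
To make the field syntax concrete, here is a small Python sketch of how a value such as "3.0 (git)" could be split into a version and a backend name. It is not the parsing code from the patch (which is Perl); the function name parse_format and the exact pattern are assumptions for illustration.

```python
import re

# "MAJOR.MINOR" optionally followed by a backend name in parentheses.
FORMAT_RE = re.compile(r"^(\d+)\.(\d+)(?:\s+\(([a-z0-9]+)\))?$")


def parse_format(value):
    """Split a Format field value into (major, minor, backend or None)."""
    match = FORMAT_RE.match(value.strip())
    if match is None:
        raise ValueError("unrecognised Format value: %r" % value)
    major, minor, backend = match.groups()
    return int(major), int(minor), backend
```

With this, parse_format("3.0 (git)") yields the backend name "git", which a dispatcher could map to a per-VCS module, while a plain "1.0" yields no backend at all.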

I have a sourcev3 branch with my changes at git://kitenet.net/dpkg,
and have also attached a diff to this mail. I feel that this is ready
for review and hopefully merging into dpkg now. Looking forward to your
comments.

A sample dpkg source package built using this is at
http://kitenet.net/~joey/tmp/git-demo/. This demo package includes only
the last 200 commits to the dpkg git repo, so it's more than 1 mb *smaller*
than dpkg's normal .tar.gz!

-- 
see shy jo
diff --git a/debian/dpkg-dev.install b/debian/dpkg-dev.install
index 49e3835..ee65dbf 100644
--- a/debian/dpkg-dev.install
+++ b/debian/dpkg-dev.install
@@ -56,3 +56,4 @@ usr/share/man/*/dpkg-shlibdeps.1
 usr/share/man/*/*/dpkg-source.1
 usr/share/man/*/dpkg-source.1
 usr/share/perl5/Dpkg/BuildOptions.pm
+usr/share/perl5/Dpkg/Source
diff --git a/man/dpkg-source.1 b/man/dpkg-source.1
index 9bf9ff3..14c17c3 100644
--- a/man/dpkg-source.1
+++ b/man/dpkg-source.1
@@ -55,6 +55,10 @@ will look for the original source tarfile
 or the original source directory
 .IB directory .orig
 depending on the \fB\-sX\fP arguments.
+
+
+If the source package is being built as a version 3 source package using
+a VCS, no upstream tarball or original source directory is needed.
 .TP
 .BR \-h ,  \-\-help
 Show the usage message and exit.
@@ -109,7 +113,9 @@ This option negates a previously set
 .BR \-i [\fIregexp\fP]
 You may specify a perl regular expression to match files you want
 filtered out of the list of files for the diff. (This list is
-generated by a find command.) \fB\-i\fR by itself enables the option,
+generated by a find command.) (If the source package is being built as a
+version 3 source package using a VCS, this is instead used to
+ignore uncommitted files.) \fB\-i\fR by itself enables the option,
 with a default that will filter out control files and directories of the
 most common revision control systems, backup and swap files and Libtool
 build output directories. There can only be one active regexp, of multiple
@@ -162,6 +168,9 @@ will not overwrite existing tarfiles or directories. If this is
 desired then
 .BR \-sA ,  \-sP ,  \-sK ,  \-sU  and  \-sR
 should be used instead.
+.PP
+If the source package is being built as a version 3 source package using
+a VCS, these options do not make sense, and will be ignored.
 .TP
 .BR \-sk
 Specifies to expect the original source as a tarfile, by default
diff --git a/scripts/Dpkg/Source/VCS/git.pm b/scripts/Dpkg/Source/VCS/git.pm
new file mode 100644
index 0000000..cac7d05
--- /dev/null
+++ b/scripts/Dpkg/Source/VCS/git.pm
@@ -0,0 +1,226 @@
+#!/usr/bin/perl
+#
+# git support for dpkg-source
+#
+# Copyright © 2007 Joey Hess [EMAIL PROTECTED].
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
+package Dpkg::Source::VCS::git;
+
+use strict;
+use warnings;
+use Cwd;
+use File::Find;
+use Dpkg;
+use Dpkg::Gettext;
+
+push (@INC, $dpkglibdir);
+require 'controllib.pl';
+
+# Remove variables from the environment that might cause git to do
+# something unexpected.
+delete $ENV{GIT_DIR};
+delete $ENV{GIT_INDEX_FILE};
+delete $ENV{GIT_OBJECT_DIRECTORY};
+delete $ENV{GIT_ALTERNATE_OBJECT_DIRECTORIES};
+delete $ENV{GIT_WORK_TREE};
+
+sub sanity_check {
+	my $srcdir=shift;
+
+	if (! -s "$srcdir/.git") {
+		main::error(sprintf(_g("%s is not the top directory of a git repository (%s/.git not present), but Format git was specified"), $srcdir, $srcdir));
+	}
+	

Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-05 Thread Russ Allbery
Joey Hess [EMAIL PROTECTED] writes:

 My implementation adds a new 3.0 version source format. A 3.0 format
 debian source package can consist of any files allowed by formats 1 and
 2, but may also contain .$VCS.tar.gz files. To build a version 3 source
 package, a new field is needed in debian/control:

 Format: 3.0 (git)

 The bit in parens specifies that it should use the git backend, which is
 currently the only one available. That backend is in the
 Dpkg::Source::VCS::git perl module.

It's a little disturbing to have content in parentheses be significant in
a format based on RFC 822, although we have broken this rule elsewhere
(most notably in dependency fields, of course).

I think this is a great idea, although I can't comment on the code
implementation.

-- 
Russ Allbery ([EMAIL PROTECTED])   http://www.eyrie.org/~eagle/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-05 Thread Frank Lichtenheld
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
 I have a sourcev3 branch with my changes at git://kitenet.net/dpkg,
 and have also attached a diff to this mail. I feel that this is ready
 for review and hopefully merging into dpkg now. Looking forward to your
 comments.

A little code review follows.

 +# You should have received a copy of the GNU General Public License
 +# along with this program; if not, write to the Free Software
 +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA

old FSF address (not really important, but while we're at it ;)

 +sub sanity_check {
 + my $srcdir=shift;
 +
 + if (! -s "$srcdir/.git") {
 + main::error(sprintf(_g("%s is not the top directory of a git repository (%s/.git not present), but Format git was specified"), $srcdir, $srcdir));

you probably mean -e or -d here? -s on a directory is kinda strange.
printing $srcdir twice might bloat the error message.

 + }
 + if (-s "$srcdir/.gitmodules") {
 + main::error(sprintf(_g("git repository %s uses submodules. This is not yet supported."), $srcdir));
 + }
 +
 + # Symlinks from .git to outside could cause unpack failures, or
 + # point to files they shouldn't, so check for and don't allow.
 + if (-l "$srcdir/.git") {
 + main::error(sprintf(_g("%s is a symlink"), "$srcdir/.git"));
 + }
 + my $abs_srcdir=Cwd::abs_path($srcdir);
 + find(sub {
 + if (-l $_) {
 + if (Cwd::abs_path(readlink($_)) !~ /^\Q$abs_srcdir\E(\/|$)/) {
 + main::error(sprintf(_g("%s is a symlink to outside %s"), $File::Find::name, $srcdir));
 + }
 + }
 + }, "$srcdir/.git");

Maybe it would be easier to just disallow symlinks completely? Or are
there important use cases for that?
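
As a point of comparison, here is a rough sketch of the containment check the patch is aiming for, written so that it collects offending links instead of erroring out. The helper name symlinks_escaping is hypothetical; this is not the patch's code, only an illustration under the assumption that resolving the link itself with Cwd::abs_path is acceptable.

```perl
use strict;
use warnings;
use Cwd ();
use File::Find ();

# Hypothetical helper (not from the patch): list symlinks under $root
# whose resolved target lies outside $root.
sub symlinks_escaping {
    my ($root) = @_;
    my $abs_root = Cwd::abs_path($root);
    my @escaping;
    File::Find::find(sub {
        return unless -l $_;
        # find() has changed into the link's directory, so a relative
        # link target resolves against the right place.
        my $target = Cwd::abs_path($_);
        # An unresolvable target is treated as escaping, to be safe.
        push @escaping, $File::Find::name
            unless defined $target
                && $target =~ /^\Q$abs_root\E(?:\/|\z)/;
    }, $root);
    return @escaping;
}
```

Disallowing symlinks completely would reduce the callback to the -l test alone.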

 +}
 +
 +# Called before a tarball is created, to prepare the tar directory.
 +sub prep_tar {
 + my $srcdir=shift;
 + my $tardir=shift;
 + 
 + sanity_check($srcdir);
 +
 + if (! -e "$srcdir/.git") {
 + main::error(sprintf(_g("%s is not a git repository, but Format git was specified"), $srcdir));
 + }
 + if (-e "$srcdir/.gitmodules") {
 + main::error(sprintf(_g("git repository %s uses submodules. This is not yet supported."), $srcdir));
 + }

Duplicated code from sanity_check

 +
 + # Check for uncommitted files.
 + open(GIT_STATUS, "LANG=C cd $srcdir && git-status |") ||
 + main::subprocerr("cd $srcdir && git-status");

you do a lot of cd $srcdir. Maybe you should just chdir() in the parent
process? This would also take care of funny things in $srcdir like
whitespaces...

 + my $clean=0;
 + my $status="";
 + while (<GIT_STATUS>) {
 + if (/^\Qnothing to commit (working directory clean)\E$/) {
 + $clean=1;
 + }
 + else {
 + $status.="git-status: $_";
 + }
 + }
 + close GIT_STATUS;
 + # git-status exits 1 if there are uncommitted changes or if
 + # the repo is clean, and 0 if there are uncommitted changes
 + # listed in the index.
 + if ($? && $? >> 8 != 1) {
 + main::subprocerr("cd $srcdir && git status");
 + }
 + if (! $clean) {
 + # To support dpkg-buildpackage -i, get a list of files

dpkg-source -i would be the proper attribution here. dpkg-buildpackage
implements -i only as a pass-through option.

 + # eqivilant to the ones git-status finds, and remove any

is that an English word?

 + # ignored files from it.
 + my @ignores="--exclude-per-directory=.gitignore";
 + my $core_excludesfile=`cd $srcdir && git-config --get core.excludesfile`;
 + chomp $core_excludesfile;
 + if (length $core_excludesfile && -e "$srcdir/$core_excludesfile") {
 + push @ignores, "--exclude-from='$core_excludesfile'";
 + }
 + if (-e "$srcdir/.git/info/exclude") {
 + push @ignores, "--exclude-from=.git/info/exclude";
 + }
 + open(GIT_LS_FILES, "cd $srcdir && git-ls-files -m -d -o @ignores |") ||
 + main::subprocerr("cd $srcdir && git-ls-files");

If you get rid of the cd you could use the '-|', @array form of open
here which would be preferable imho.
This is essentially running git-status again without the output
beautification... Can't we avoid doing the work twice?

Also I would prefer using long options where available. It's not like
anyone has to type them more than once ;)
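
To make the two suggestions above concrete (chdir() in the parent, and the list form of a '-|' open), here is a minimal sketch. The helper name run_in_dir is hypothetical and not part of dpkg; the sketch assumes the caller wants to return to the original directory afterwards.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);

# Hypothetical helper (not from the patch): run @cmd inside $dir and
# return its output lines plus its exit code. With more than one
# element in @cmd, no shell is involved, so whitespace in $dir or in
# the arguments needs no quoting.
sub run_in_dir {
    my ($dir, @cmd) = @_;
    my $old = getcwd();
    chdir($dir) or die "chdir $dir: $!";
    # The child forked by open() inherits the current directory.
    my $pid = open(my $fh, '-|', @cmd);
    my $err = $!;
    chdir($old) or die "chdir $old: $!";
    die "cannot run $cmd[0]: $err" unless defined $pid;
    chomp(my @lines = <$fh>);
    close($fh);    # sets $? to the child's wait status
    return (\@lines, $? >> 8);
}
```

A call might look like run_in_dir($srcdir, 'git', 'ls-files', '--modified', '--deleted', '--others', @ignores), which also follows the preference for long options.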

 + my @files;
 + while (<GIT_LS_FILES>) {
 + chomp;
 + if (! length $main::diff_ignore_regexp ||
 + ! m/$main::diff_ignore_regexp/o) {
 + push @files, $_;
 + }
 + }
 + close(GIT_LS_FILES) || 

Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-05 Thread Frank Lichtenheld
One thing I forgot:

On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
 @@ -825,14 +881,17 @@ if ($opmode eq 'build') {
  if ($native) {
   warning(_g("multiple tarfiles in native package")) if @tarfiles > 1;
   warning(_g("native package with .orig.tar"))
 - unless $seen{'.tar'} or $seen{"-$revision.tar"};
 + unless $seen{'.tar'} or $seen{"-$revision.tar"} or %vcsfiles;
  } else {
 - warning(_g("no upstream tarfile in Files field")) unless $seen{'.orig.tar'};
 + warning(_g("no upstream tarfile in Files field")) unless $seen{'.orig.tar'} or %vcsfiles;

This should probably error out. Aren't v3 packages always native in the
sense tested here?

   if ($dscformat =~ /^1\./) {
   warning(sprintf(_g("multiple upstream tarballs in %s format dsc"), $dscformat)) if @tarfiles > 1;
   warning(sprintf(_g("debian.tar in %s format dsc"), $dscformat)) if $debianfile;
   }
  }
 +if (%vcsfiles && $dscformat !~ /^3\./) {
 + warning(sprintf(_g("rc.tar file in %s format dsc"), $dscformat));
 +}
  
  $newdirectory = $sourcepackage.'-'.$baseversion unless defined($newdirectory);
  $expectprefix = $newdirectory;


Gruesse,
-- 
Frank Lichtenheld [EMAIL PROTECTED]
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-05 Thread Joey Hess
Thanks a lot for the code review. Any comments on the big picture or design?

Frank Lichtenheld wrote:
 On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
  I have a sourcev3 branch with my changes at git://kitenet.net/dpkg,
  and have also attached a diff to this mail. I feel that this is ready
  for review and hopefully merging into dpkg now. Looking forward to your
  comments.
 
 A little code review follows.
 
  +# You should have received a copy of the GNU General Public License
  +# along with this program; if not, write to the Free Software
  +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
 
 old FSF address (not really important, but while we're at it ;)

Copied from elsewhere in dpkg source. :-)

  +sub sanity_check {
  +   my $srcdir=shift;
  +
  +   if (! -s "$srcdir/.git") {
  +   main::error(sprintf(_g("%s is not the top directory of a git repository (%s/.git not present), but Format git was specified"), $srcdir, $srcdir));
 
 you probably mean -e or -d here? -s on a directory is kinda strange.
 printing $srcdir twice might bloat the error message.

Yes, I meant -d, the -s snuck in from the other test.

ACK on the duplication.

  +   }
  +   if (-s "$srcdir/.gitmodules") {
  +   main::error(sprintf(_g("git repository %s uses submodules. This is not yet supported."), $srcdir));
  +   }
  +
  +   # Symlinks from .git to outside could cause unpack failures, or
  +   # point to files they shouldn't, so check for and don't allow.
  +   if (-l "$srcdir/.git") {
  +   main::error(sprintf(_g("%s is a symlink"), "$srcdir/.git"));
  +   }
  +   my $abs_srcdir=Cwd::abs_path($srcdir);
  +   find(sub {
  +   if (-l $_) {
  +   if (Cwd::abs_path(readlink($_)) !~ /^\Q$abs_srcdir\E(\/|$)/) {
  +   main::error(sprintf(_g("%s is a symlink to outside %s"), $File::Find::name, $srcdir));
  +   }
  +   }
  +   }, "$srcdir/.git");
 
  Maybe it would be easier to just disallow symlinks completely? Or are
 there important use cases for that?

I've tried to not make dpkg have to know too much about git internals.
(As you can see I've not been 100% successful, but have kept it to about
the level someone with a week's knowledge of git would be comfortable
with.) So while I don't see any symlinks in my git repos, if git decides
to use symlinks, I don't want dpkg to have to be updated. (I think git
did historically use symlinks in the repo).

There are probably semi-valid reasons to manually add symlinks inside a .git
directory today, too.

  +}
  +
  +# Called before a tarball is created, to prepare the tar directory.
  +sub prep_tar {
  +   my $srcdir=shift;
  +   my $tardir=shift;
  +   
  +   sanity_check($srcdir);
  +
  +   if (! -e "$srcdir/.git") {
  +   main::error(sprintf(_g("%s is not a git repository, but Format git was specified"), $srcdir));
  +   }
  +   if (-e "$srcdir/.gitmodules") {
  +   main::error(sprintf(_g("git repository %s uses submodules. This is not yet supported."), $srcdir));
  +   }
 
 Duplicated code from sanity_check

Doh!

  +
  +   # Check for uncommitted files.
  +   open(GIT_STATUS, "LANG=C cd $srcdir && git-status |") ||
  +   main::subprocerr("cd $srcdir && git-status");
 
 you do a lot of cd $srcdir. Maybe you should just chdir() in the parent
 process?

I could make it do that, I suppose it would be safe as long as I cd back
(dpkg-source in general assumes it's in the parent dir of the source
tree).

 This would also take care of funny things in $srcdir like
 whitespaces...

 If you get rid of the cd you could use the '-|', @array form of open
 here which would be preferable imho.

Wow, you've taught me something new, I only knew about the much more
clumsy manual fork and open("-|") approach. I'll do this, but it will
take a little while.

  +   my $clean=0;
  +   my $status="";
  +   while (<GIT_STATUS>) {
  +   if (/^\Qnothing to commit (working directory clean)\E$/) {
  +   $clean=1;
  +   }
  +   else {
  +   $status.="git-status: $_";
  +   }
  +   }
  +   close GIT_STATUS;
  +   # git-status exits 1 if there are uncommitted changes or if
  +   # the repo is clean, and 0 if there are uncommitted changes
  +   # listed in the index.
  +   if ($? && $? >> 8 != 1) {
  +   main::subprocerr("cd $srcdir && git status");
  +   }
  +   if (! $clean) {
  +   # To support dpkg-buildpackage -i, get a list of files
 
 dpkg-source -i would be the proper attribution here. dpkg-buildpackage
 implements -i only as a pass-through option.

True.

  +   # eqivilant to the ones git-status finds, and remove any
 
 is that an English word?

Even better, a common typo of one. :-)

  +   # ignored files from it.
  +   my @ignores="--exclude-per-directory=.gitignore";
  +   my $core_excludesfile=`cd $srcdir && git-config --get core.excludesfile`;
  +   chomp 

Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-05 Thread Joey Hess
Frank Lichtenheld wrote:
 One thing I forgot:
 
 On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
  @@ -825,14 +881,17 @@ if ($opmode eq 'build') {
   if ($native) {
  warning(_g("multiple tarfiles in native package")) if @tarfiles > 1;
  warning(_g("native package with .orig.tar"))
  -   unless $seen{'.tar'} or $seen{"-$revision.tar"};
  +   unless $seen{'.tar'} or $seen{"-$revision.tar"} or %vcsfiles;
   } else {
  -   warning(_g("no upstream tarfile in Files field")) unless $seen{'.orig.tar'};
  +   warning(_g("no upstream tarfile in Files field")) unless $seen{'.orig.tar'} or %vcsfiles;
 
 This should probably error out. Aren't v3 packages always native in the
 sense tested here?

Not necessarily. I wanted to leave the option open to use wig-n-pen to
construct mixed source packages that maybe use vcs for debian/ and
pristine source for the rest + a diff.gz, or something like that.

I think the code will basically handle unpacking such a mongrel,
although there are no tools to create one.

-- 
see shy jo

