Hi Ian,

I've got a few questions/comments about git-based publishing imports and history for patches-applied imports that I was hoping to bounce off you and other VCS folks. I apologize for the very long e-mail to follow!
For context for everyone: I (along with Robie Basak and others) have been developing an importer that takes the publishing history of a source package and imports the source package contents into a git tree, with a tag for each publish and a branch for each 'series'. That glosses over a lot of detail, of course, but is probably sufficient. I think this could be of use to both Ubuntu and Debian. If you're not interested in this, though, please feel free to disregard :)

On to the questions/comments:

1) The upstream tarballs of some source packages (bouncycastle and php7.0 are the ones I can think of off the top of my head) contain .gitattributes files, which change the behavior of git itself when checking out a branch. This is not by definition a problem, except that to get to a fully patches-applied state, I believe you must check out an appropriate commit (meaning you're adjusting the working directory's contents) -- and the result may then differ from what is shipped in the upstream tarball. I have seen this with eol adjustments and, much more annoyingly (because git knows it is doing it, while vi/emacs do not), with the special ident handling.

For now, we add the following to our git repository's .git/info/attributes at checkout time (using our wrapper to `git clone`):

    * -ident
    * -text
    * -eol

In other words, the underlying issue is that upstream uses git as well, and their git 'configuration' (not necessarily just .gitconfig) interferes with the behavior of any git that has the upstream contents in its working tree.

Does the above seem reasonable? Afaict, there is no way for me to really enforce the above in the repository itself without patching the upstream source.

2) How do we determine if a source package is 1.0 vs. 3.0?
I am currently using `dpkg-source --print-format`, but have found one source package (util-linux 2.13~rc3-5) where dpkg-source emits:

    syntax error in /tmp/tmp3y515osf/util-linux-2.13~rc3/debian/control at line 14: duplicate field Depends found

and thus we error out.

3) Imagine the following graph in the git repository:

      A           D
    ->o.....f....>o->
      ^ .       . ^
      |  c     e  |
      o   .   .   o
      ^    . .    ^
      |     B     |
      o     o     o
      ^     ^     ^
      |     |     |
    ->o---->o---->o->
      a     b     d

Each o is a commit in the repository.

a, b, d are patches-unapplied imports of publishes for a given release, which are on a fast-forwarding, patches-unapplied branch.

A is the corresponding patches-applied import of a (with each o reflecting one patch application from the source package).

D is the corresponding patches-applied import of d (with each o reflecting one patch application from the source package).

c, e and f are for demonstration purposes and do not necessarily exist (except as discussed below).

Ideally, we'd have a fast-forwarding branch for the patches-applied imports as well.

Let's assert that there is a problem with obtaining the patches-applied version of b. This can occur for (at least) the following reasons:

i) As in 2), we might not be able to determine the source package format (an implementation detail, to some degree), and so are unable to correctly derive whether there is a patches-applied state that is distinct from b.

ii) Some patches may fail to apply with a trivial `quilt push`. This occurs with, at least, a historical publish of samba.

In theory, there are other reasons/cases where this might happen, and the importer needs to never fail (so it is of some use to run automatically :)

The questions I have relate to what to do when we encounter this situation, and they divide into two parts:

i) Do we want to 'tag' this failed-to-apply patches-applied import in the repository (currently, every successful patches-applied import is tagged as 'applied/<dep14 of the published version>')?
This is important for semantics for end-users (and for the importer itself).

- We currently assert that tagged objects in the tree correspond to the source package as published in Launchpad/archive. Tagging this failed-to-apply state as such would violate that assertion.

- We also rely (implementation detail) on being able to find the 'nearest' publishes by tag name, which sort of leads into the next issue...

ii) What should happen to the branch?

- If the branch is left at A, then (even if only momentarily), upon finishing the import at B, the branch does not reflect the latest state in Launchpad relative to the importer's progress.

- If the branch is left at B, then it is no longer fast-forwarding, as there is no connection between A and B.

- If the branch is left at B, and we add the connection c to make it fast-forwarding, we violate a different semantic we assert about parenting relationships in our repository: namely, that a commit contains everything in all of its parents. Equally, it doesn't really make sense to put c in, as B does not represent a fully patches-applied import, while A does.

Let's presume that this failure is not persistent and that D can be imported successfully; we again have to make decisions about parenting. I think it only makes sense for one of c or f to exist, based upon what we decide is the right policy above.

In sum, I think we have one of two/three options:

1) When we encounter a failure to derive a patches-applied import of a publish, whatever the reason, we do nothing with that treeish. It will not be present in the history of any branch, or even be tagged.

1a) Slightly less extreme: we will not place it in the history, but we will tag it as 'broken/applied/<dep14 of version>' or so?

2) We will always treat patches-applied as 'best-effort', so that, for instance, if we fail to apply all the patches for a given publish, we will simply tag the last successful application as the patches-applied import.
The parenting relationships for the applied branches will not have the same meaning as those for the unapplied branches, but will always exist.

There might be other options than these, but it's what I've come up with so far. Any comments, suggestions and/or feedback would be greatly appreciated!

Thanks!
-Nish

-- 
Nishanth Aravamudan
Ubuntu Server
Canonical Ltd

_______________________________________________
vcs-pkg-discuss mailing list
vcs-pkg-discuss@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/vcs-pkg-discuss
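Addendum on question 2): one possible fallback that avoids parsing debian/control at all might be to read debian/source/format straight from the unpacked tree. The following is only a minimal sketch under stated assumptions: `guess_source_format` is a hypothetical helper name (not an existing importer or dpkg function), the tree is assumed to be unpacked on disk, and the default of 1.0 for a missing file follows the format selection order documented in dpkg-source(1).

```python
from pathlib import Path


def guess_source_format(tree):
    """Return the declared source format of an unpacked source tree.

    Hypothetical helper: it only looks at debian/source/format and never
    parses debian/control, so a malformed control file cannot break it.
    """
    format_file = Path(tree) / "debian" / "source" / "format"
    try:
        declared = format_file.read_text(encoding="utf-8").strip()
    except (FileNotFoundError, NotADirectoryError):
        # No format file present: dpkg-source(1) documents 1.0 as the default.
        return "1.0"
    # Assumption: an empty format file is treated like a missing one.
    return declared or "1.0"
```

A caller could then treat any result other than "3.0 (quilt)" as having no separate patches-applied state, although whether that matches what the importer needs for older 1.0 packages that ship their own patch systems is an open question.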