Hello! While waiting for boarding at the Atlanta airport, I finally fleshed out a plan for how to begin populating and using our private Savannah Hurd-specific glibc repository / fork / whatever you name it. Can Roland, Samuel, and any other interested parties subscribe to that scheme?

Some abbreviations:

  * R(libc) -- Savannah Hurd-specific glibc repository,
    <http://git.savannah.gnu.org/cgit/hurd/glibc.git/>

  * D(libc) -- Debian libc package; this is these days based on eglibc,
    not glibc

  * P(D(x)) -- patches applied to D(x), compared to (upstream) x

  * P(y,D(x)) -- y-specific patches of P(D(x))

Objective, problems, some solutions:

  * R(libc) is to be based on glibc, not eglibc. eglibc shouldn't contain
    a lot of differences in the Hurd-specific parts, but yet, there may be
    differences.

  * R(libc) has to be usable for my own libc work, as well as other
    people's similar needs of course: (a) provide a stable basis for
    building cross toolchains, (b) general libc development.

  * R(libc) needs to automatically generate the P(hurd,D(libc)) files.
    There is no reason we should continue to maintain P(hurd,D(libc)) in
    parallel, manually. Preferably, one patch file per topic should be
    generated, not a huge combined one. Usually, these generated patches
    should be equivalent to the (a) subset of R(libc), but:

      * There will be patches in R(libc) that are not wanted for
        P(hurd,D(libc)), because they're already being handled outside of
        P(hurd,D(libc)) in P(D(libc)), as they're needed for other
        non-Linux architectures (k*BSD), too, for example. There must be
        a way to exclude these.

      * There will be patches whose content is indeed wanted for
        P(hurd,D(libc)), but needs to be (slightly) frobbed before being
        usable in Debian's eglibc context. That can be taken care of by
        additional patches in P(hurd,D(libc)); these are not derived from
        R(libc), and are either prepended or appended (as appropriate) to
        the series derived from R(libc).

      * There will be patches that are really only relevant to Debian.
        We can either still keep them in R(libc) too, or these stay
        manually maintained in P(hurd,D(libc)) as they're now.

      * The versions of the upstream libc package / the version R(libc)
        is based on may be slightly different from what D(libc) is using.
        This shouldn't pose major problems, though, and can be handled as
        above, too.

  * All reasonable stuff from R(libc) should eventually be merged into the
    upstream glibc repository. Instead of staging stuff in R(libc), we
    could submit stuff upstream right from the beginning, but there are
    problems:

      * Upstream essentially only wants perfect patches. Not everything
        we have at the moment is ready for prime time. (Yet, unfinished
        patches are needed / used to base further work upon.)

      * Often, there is a huge delay between submission and acceptance
        (if at all).

  * One central repository for handling all stuff involving R(libc) is not
    sufficient, but a distributed one is needed. This needs to include
    all meta information (dependencies between different topic branches,
    for example).

  * Upstream glibc / eglibc repositories are not at all in a usable shape
    at the moment. Thus: simple topic branches which will eventually be
    merged into master are not possible; again, dependencies between topic
    branches.

  * Yet, for the usual reasons, maintaining stuff in topic branches is
    preferable to a linearized structure (simply committing stuff in the
    appropriate order in one Git branch; which would inherently resolve
    all dependencies). There are like 30 topic branches at the moment.

Possible solutions, and their problems:

  * One Git branch: not wanted, as above.

  * Many Git (topic) branches: possible, but problem of dependencies
    between topic branches. Possibly difficult to generate usable
    P(hurd,D(libc)) patches from that structure.

  * Quilt, <http://savannah.nongnu.org/projects/quilt>,
    <http://www.suse.de/~agruen/quilt.pdf>. Had a brief look at it.
    Instead of topic branches, has several (topic) patch files, which are
    automatically applied in the appropriate ordering. This is what
    D(libc) (and a lot of other packages) are doing right now. Problem:
    unwieldy for general development; doesn't integrate with Git, thus not
    easily distributable.

  * guilt (quilt on top of git),
    <http://www.kernel.org/pub/linux/kernel/people/jsipek/guilt/man/>.
    Had a brief look at it. Like Quilt; but does in a way integrate with
    Git, but not really in a distributed fashion, as I understand it.
    Might be feasible to be used, but so far didn't look at it in too much
    detail.

  * StGit (stacked Git), <http://www.procode.org/stgit/>. Had a brief
    look at it. Mostly like guilt, it seems. What's the difference
    between these two, exactly?

  * TopGit, <http://repo.or.cz/w/topgit.git>. Had a more-than-brief look
    at it. Like guilt and StGit, but patches' meta-data (dependencies) is
    a first-class Git citizen. This seems to be exactly what we want to
    use.

<http://repo.or.cz/w/topgit.git?a=blob;f=README> begins like this, but the
whole file is well worth reading:

| TopGit - A different patch queue manager
|
|
| DESCRIPTION
| -----------
|
| TopGit aims to make handling of large amount of interdependent topic
| branches easier. In fact, it is designed especially for the case
| when you maintain a queue of third-party patches on top of another
| (perhaps Git-controlled) project and want to easily organize, maintain
| and submit them - TopGit achieves that by keeping a separate topic
| branch for each patch and providing few tools to maintain the branches.
|
|
| RATIONALE
| ---------
|
| Why not use something like StGIT or Guilt or rebase -i for maintaining
| your patch queue? The advantage of these tools is their simplicity;
| they work with patch _series_ and defer to the reflog facility for
| version control of patches (reordering of patches is not
| version-controlled at all). But there are several disadvantages -
| for one, these tools (especially StGIT) do not actually fit well
| with plain Git at all: it is basically impossible to take advantage
| of the index effectively when using StGIT. But more importantly,
| these tools horribly fail in the face of distributed environment.
|
| TopGit has been designed around three main tenets:
|
| (i) TopGit is as thin layer on top of Git as possible.
| You still maintain your index and commit using Git, TopGit will
| only automate few indispensable tasks.
|
| (ii) TopGit is anxious about _keeping_ your history. It will
| never rewrite your history and all metadata is also tracked by Git,
| smoothly and non-obnoxiously. It is good to have a _single_ point
| when the history is cleaned up, and that is at the point of inclusion
| in the upstream project; locally, you can see how your patch has evolved
| and easily return to older versions.
|
| (iii) TopGit is specifically designed to work in distributed
| environment. You can have several instances of TopGit-aware repositories
| and smoothly keep them all up-to-date and transfer your changes between
| them.
|
| As mentioned above, the main intended use-case for TopGit is tracking
| third-party patches, where each patch is effectively a single topic
| branch. In order to flexibly accommodate even complex scenarios when
| you track many patches where many are independent but some depend
| on others, TopGit ignores the ancient Quilt heritage of patch series
| and instead allows the patches to freely form graphs (DAGs just like
| Git history itself, only "one level higher"). For now, you have
| to manually specify which patches does the current one depend
| on, but TopGit might help you with that in the future in a darcs-like
| fashion.
|
| [...]

Any comments before I have a go at implementing this scheme?

Regards,
 Thomas
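
P.S. To make the P(hurd,D(libc)) generation a bit more concrete, here is a rough sketch of the ordering logic only: linearize a DAG of topic branches so that every branch comes after the branches it depends on, drop the excluded ones, and put the Debian-only patches before or after the derived series. This is not how TopGit does it and nothing is implemented yet; the function name, the branch names, and the ".diff" file naming are all made up for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def patch_series(deps, exclude=(), prepend=(), append=()):
    """Return an ordered list of patch file names (a quilt-style series).

    deps    -- maps each topic branch to the branches it depends on
    exclude -- branches already handled elsewhere (e.g. in P(D(libc)))
    prepend -- hand-maintained patches to apply before the derived ones
    append  -- hand-maintained patches to apply after the derived ones

    All names here are hypothetical; this only illustrates the ordering.
    """
    # static_order() yields every branch after all of its dependencies
    # and raises graphlib.CycleError if the dependencies are circular.
    order = TopologicalSorter(deps).static_order()
    derived = [
        topic.replace("/", "_") + ".diff"  # one patch file per topic
        for topic in order
        if topic not in exclude
    ]
    return [*prepend, *derived, *append]


if __name__ == "__main__":
    # Made-up topic branches: t/c depends on t/b, which depends on t/a.
    example = {"t/a": [], "t/b": ["t/a"], "t/c": ["t/b"]}
    for name in patch_series(example, exclude={"t/b"},
                             prepend=["debian-pre.diff"]):
        print(name)
```

Note that excluding a branch only makes sense if whatever it provides is already applied by other means (as with the k*BSD-shared patches), because the branches depending on it are still emitted.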