Re: apt-get wrapper for maintaining Partial Mirrors
On Sunday 21 June 2009 03:33:33 Goswin von Brederlow wrote:

[snip]

>> The Release could be signed using an rsign method with the machine(s) that manage the repository, or it could be done locally on the server using gpg-agent, or an unencrypted private key, depending on how the administrator prefers to manage it.

> The simplest implementation would be a tiny proxy applet that, when a deb file is requested, checks if the file is in the local archive. If it is, then send it. If not, then request the file from upstream and pipe it to apt (no latency) and a tempfile. When the download has finished, then reprepro --include suite deb. Doing the same for source is a little more tricky, as you need the dsc and related files as a group.

I don't understand the tempfile part. Otherwise, that's a better idea, since my idea depended on running reprepro update, then sending the appropriate debs.

>>> Optionally the apt proxy could prefetch package versions, but for me that wouldn't be a high priority. It would be nice if it fetched sources along with binaries. When I find a bug in some software while traveling I would hate to not have the source available to fix it. But then it also needs to fetch Build-depends and their depends. So that would complicate matters a lot.

>> I mentioned that part above.

>>> MfG
>>> Goswin

>> Overall, I think that reprepro does a good job of maintaining a local repository, and we shouldn't reimplement what it does. Reprepro also seems flexible enough to implement most of the backend with simple commands and options. I've never tried to implement a new apt-method before, so I think that would take a bit more research from me.

> I totally agree that reprepro as the cache/storage backend would be a great use of existing software.

This is where I'm starting the code. Since we agree that using reprepro as the backend is the best choice, regardless of how the partial mirror(s) will be managed, I decided to start making a frontend, or more appropriately a middle layer, for this.

Making this part simple enough to use with the most likely configuration, while keeping the option to be almost as flexible as reprepro is, has been quite a bit of work and thought. I have been working from the assumption that the local repository won't be a merged repository, but will be a set of partial mirrors. By this I mean that debian.org doesn't have to be merged with backports.org, but sid/debian.org may be in the same repository as lenny/debian.org (although even this could be separate, even if not recommended). What I'm saying is that I'm trying to allow either separate or merged repositories to be used where they make the most sense.

> The problem I have with it being an apt method is that the apt method runs on a different host than the reprepro. That would require ssh logins from all participating clients or something to alter the reprepro filter.

I didn't stop to think about authentication, but I agree that it adds another level of work. I took a bit of time to try and read up on how apt transport methods work, but I didn't get very far. The only two transport methods that are available now are https and debtorrent. Both of those are written in C, which I'm not very good at using.

I think that I'm just going to work on the basics of controlling reprepro and adding/merging/removing filterlists, and when I'm satisfied that's working properly it'll be easier to decide how to control/manage it. I think that it will be better to work in that direction first, since it will be needed anyway.

I have a small amount of code that I've started on. It doesn't do anything yet except create the distributions and updates files in the conf/ directory(ies). I also have a bit of code to help merge filterlists, but I don't have any code that actually creates the lists and uses them in the reprepro config. Once I figure out where to upload the code, I'll let you know.

--
Thanks:
Joseph Rawson

[signature.asc: This is a digitally signed message part.]
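To make the "create the distributions and updates files" step concrete, here is a minimal Python sketch of what such generated stanzas might look like. It is not the code mentioned in the message. The function names and all values (codename, mirror URL, list file name) are hypothetical, and the choice of `purge` as the FilterList default action is an assumption that should be checked against the reprepro manual page.

```python
def distributions_stanza(codename, architectures, components, update_name):
    """Return one stanza for conf/distributions (field values are examples)."""
    return (
        f"Codename: {codename}\n"
        f"Architectures: {' '.join(architectures)}\n"
        f"Components: {' '.join(components)}\n"
        f"Update: {update_name}\n"
    )


def updates_stanza(name, method, suite, filterlist):
    """Return one stanza for conf/updates.

    'purge' as the default action for unlisted packages is an assumption.
    """
    return (
        f"Name: {name}\n"
        f"Method: {method}\n"
        f"Suite: {suite}\n"
        f"FilterList: purge {filterlist}\n"
    )


if __name__ == "__main__":
    # Hypothetical partial mirror of lenny from one upstream.
    print(distributions_stanza("lenny", ["i386", "source"],
                               ["main", "contrib", "non-free"], "debian-lenny"))
    print(updates_stanza("debian-lenny", "http://ftp.debian.org/debian",
                         "lenny", "list-master"))
```

Keeping one updates stanza per upstream is one way to keep partial mirrors separate rather than merged, which is the layout the message argues for.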
Re: apt-get wrapper for maintaining Partial Mirrors
Joseph Rawson umebos...@gmail.com writes:

> On Sunday 21 June 2009 03:33:33 Goswin von Brederlow wrote:
>
> [snip]

>>> The Release could be signed using an rsign method with the machine(s) that manage the repository, or it could be done locally on the server using gpg-agent, or an unencrypted private key, depending on how the administrator prefers to manage it.

>> The simplest implementation would be a tiny proxy applet that, when a deb file is requested, checks if the file is in the local archive. If it is, then send it. If not, then request the file from upstream and pipe it to apt (no latency) and a tempfile. When the download has finished, then reprepro --include suite deb. Doing the same for source is a little more tricky, as you need the dsc and related files as a group.

> I don't understand the tempfile part. Otherwise, that's a better idea, since my idea depended on running reprepro update, then sending the appropriate debs.

A tempfile so that after the download the proxy can run:

    reprepro include sid foo.deb

MfG
Goswin

--
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
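The tempfile idea can be sketched in a few lines of Python. This is only an illustration of the "pipe it to apt and a tempfile" step, not a working proxy: the function names are hypothetical, the upstream and client are stand-in file objects, and the import command is only built, never run. The sketch assumes reprepro's `includedeb` command for a single .deb (its manual page describes `include` as taking a .changes file), which differs slightly from the command as written in the message.

```python
import io
import os
import tempfile


def tee_to_client_and_tempfile(upstream, client, chunk_size=64 * 1024):
    """Copy upstream to client chunk by chunk while also saving a temp copy.

    Returns the path of the temporary file holding the complete download.
    """
    fd, path = tempfile.mkstemp(suffix=".deb")
    with os.fdopen(fd, "wb") as tmp:
        while True:
            chunk = upstream.read(chunk_size)
            if not chunk:
                break
            client.write(chunk)  # the client gets data as soon as it arrives
            tmp.write(chunk)     # the copy is kept for the later import
    return path


def reprepro_command(basedir, codename, deb_path):
    """Build (but do not run) the import command; 'includedeb' is assumed."""
    return ["reprepro", "-b", basedir, "includedeb", codename, deb_path]


if __name__ == "__main__":
    upstream = io.BytesIO(b"pretend this is foo.deb")
    client = io.BytesIO()
    saved = tee_to_client_and_tempfile(upstream, client, chunk_size=8)
    print(reprepro_command("/srv/debrepos/debian", "sid", saved))
    os.remove(saved)
```

Because the client is written to before the tempfile on every chunk, the client never waits for the import; the repository only learns about the package after the download completes.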
Re: apt-get wrapper for maintaining Partial Mirrors
Joseph Rawson umebos...@gmail.com writes:

> On Saturday 20 June 2009 03:16:33 Goswin von Brederlow wrote:

>> But now you made me think about this too. So here is what I think:
>>
>> - My bandwidth at home is fast enough to fetch packages directly. No need to mirror at all.
>>
>> - I don't want to download a package multiple times (once per host) so some shared proxy would be good.

> My idea would keep that from happening, at the expense of latency. The latency would be minimal, as it would just be dependent on reprepro retrieving the package(s) and signalling the client that the package is ready. Using reprepro to add extra packages to the repository from upstream without doing a full update may not be possible, but if it were, the latency would certainly be minimal, and the bandwidth to the internet would also be minimal. I just looked at the manpage again, and this may be possible by using the --nolistsdownload option with the update/checkupdate command.

>> - Bootstrapping a chroot still benefits from local packages but a shared proxy would do there too.
>>
>> - When I'm not at home I might not have network access or only a slow one so then I need a mirror. And my parents' computer has a Linux that only I use and that needs a major update every time I visit.
>>
>> So the ideal setup would be an apt proxy that stores the packages in the normal pool structure and has a simple command to create Packages.gz, Sources.gz, Release and Release.gpg files so the cache directory can be copied onto a USB disk and used as a repository of its own.

> Getting reprepro to do this would save a lot of the hassle, but getting reprepro to act as an apt proxy is also tricky. The current cache and proxy methods in the apt-proxy and apt-cache packages don't work as well in making a good repository, as opposed to reprepro.
>
> The Release could be signed using an rsign method with the machine(s) that manage the repository, or it could be done locally on the server using gpg-agent, or an unencrypted private key, depending on how the administrator prefers to manage it.

The simplest implementation would be a tiny proxy applet that, when a deb file is requested, checks if the file is in the local archive. If it is, then send it. If not, then request the file from upstream and pipe it to apt (no latency) and a tempfile. When the download has finished, then reprepro --include suite deb. Doing the same for source is a little more tricky, as you need the dsc and related files as a group.

>> Optionally the apt proxy could prefetch package versions, but for me that wouldn't be a high priority. It would be nice if it fetched sources along with binaries. When I find a bug in some software while traveling I would hate to not have the source available to fix it. But then it also needs to fetch Build-depends and their depends. So that would complicate matters a lot.

> I mentioned that part above.

>> MfG
>> Goswin

> Overall, I think that reprepro does a good job of maintaining a local repository, and we shouldn't reimplement what it does. Reprepro also seems flexible enough to implement most of the backend with simple commands and options. I've never tried to implement a new apt-method before, so I think that would take a bit more research from me.

I totally agree that reprepro as the cache/storage backend would be a great use of existing software.

The problem I have with it being an apt method is that the apt method runs on a different host than the reprepro. That would require ssh logins from all participating clients or something to alter the reprepro filter.

MfG
Goswin
Re: apt-get wrapper for maintaining Partial Mirrors
Joseph Rawson umebos...@gmail.com writes:

> On Friday 19 June 2009 12:57:25 Goswin von Brederlow wrote:

>> Or have a proxy that adds packages that are requested.

> When I woke up this morning, I was thinking that it might be interesting to have an apt method that talks directly to reprepro. It's just a vague idea now, but I'll give it some more thought later.

Way too much latency to mirror a deb when requested, and you need to run apt-get update for it to show up. The best you can do is add the package to the filter list and then fetch it directly. Then the next night the mirror will pick it up for future updates.

But now you made me think about this too. So here is what I think:

- My bandwidth at home is fast enough to fetch packages directly. No need to mirror at all.

- I don't want to download a package multiple times (once per host) so some shared proxy would be good.

- Bootstrapping a chroot still benefits from local packages but a shared proxy would do there too.

- When I'm not at home I might not have network access or only a slow one so then I need a mirror. And my parents' computer has a Linux that only I use and that needs a major update every time I visit.

So the ideal setup would be an apt proxy that stores the packages in the normal pool structure and has a simple command to create Packages.gz, Sources.gz, Release and Release.gpg files so the cache directory can be copied onto a USB disk and used as a repository of its own.

Optionally the apt proxy could prefetch package versions, but for me that wouldn't be a high priority. It would be nice if it fetched sources along with binaries. When I find a bug in some software while traveling I would hate to not have the source available to fix it. But then it also needs to fetch Build-depends and their depends. So that would complicate matters a lot.

MfG
Goswin
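As a small illustration of "the normal pool structure", the following Python sketch computes where a proxy would store a downloaded file. It follows the usual Debian archive convention (first letter of the source package name, or `lib` plus the next letter for library sources). The function names and the file names in the example are hypothetical, and generating the index files is not shown.

```python
def pool_prefix(source_name):
    """Return the pool subdirectory prefix for a source package name."""
    if source_name.startswith("lib") and len(source_name) > 3:
        return source_name[:4]  # e.g. "libx" for libxml2
    return source_name[:1]      # e.g. "r" for reprepro


def pool_path(component, source_name, filename):
    """Return the path of a package file relative to the repository root."""
    return "/".join(["pool", component, pool_prefix(source_name),
                     source_name, filename])


if __name__ == "__main__":
    # File names below are made up for the example.
    print(pool_path("main", "reprepro", "reprepro_1.0-1_i386.deb"))
    print(pool_path("main", "libxml2", "libxml2_1.0-1_i386.deb"))
```

Note that the directory is named after the source package, so a proxy that only sees a binary file name would still need the source name from the package index.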
Re: apt-get wrapper for maintaining Partial Mirrors
On Saturday 20 June 2009 03:16:33 Goswin von Brederlow wrote:
> Joseph Rawson umebos...@gmail.com writes:
>> On Friday 19 June 2009 12:57:25 Goswin von Brederlow wrote:

>>> Or have a proxy that adds packages that are requested.

>> When I woke up this morning, I was thinking that it might be interesting to have an apt method that talks directly to reprepro. It's just a vague idea now, but I'll give it some more thought later.

> Way too much latency to mirror a deb when requested, and you need to run apt-get update for it to show up. The best you can do is add the package to the filter list and then fetch it directly. Then the next night the mirror will pick it up for future updates.

What I had in mind would eliminate a large part of the latency, and also keep from downloading the deb twice. Use a server application (I'll call it repserve for now) on the machine that hosts the reprepro repository.

apt-get update

The apt method talks to repserve, then repserve tells reprepro to run either update or checkupdate, then repserve feeds the appropriate files from the reprepro lists/ director(y/ies) back to the apt-get process on the local machine. This would probably use a bit more bandwidth (at least for the first update) since apt-get will download .pdiff files, where reprepro just grabs the whole Packages.gz files.

apt-get install, upgrade, build-dep

The apt method determines which source in its apt lists to retrieve the package from, then sends that info to repserve. Repserve looks in its repositor(y/ies) to determine where those packages are (or if they aren't yet mirrored), probably by scanning the filter lists. Repserve then tells reprepro to update in the appropriate repositories (if necessary). Then repserve signals the local client (or the local client polls repserve), and the debs are then transferred from the reprepro repos to the local client. After that, the repserve process could instruct reprepro to retrieve the sources, if it's configured to do that.

Also, it could try and determine build deps for those packages, and retrieve them and the sources, if it's configured to do that as well. With retrieving builddeps enabled, there might be a problem in having to explicitly list preferred alternatives, but this is mainly for packages that have drop-in replacements for libfoo-dev, like libgamin-dev provides libfam-dev.

This is still just a rough idea. One of the interesting things about using an idea like this is that it can still allow reprepro to be used in the normal way, so you can have a couple of machines that instruct repserve to help maintain the repository, and other machines on the network can just use reprepro directly through apache, ftp, etc.

The controlling machines would have a sources.list like:

    deb repserve://myhost/debrepos/debian lenny main contrib non-free

The repserve method on the client would send that line to the repserve server. The server would parse the line and match it to the appropriate repository from its configuration.

The other hosts would just have this in sources.list:

    deb http://myhost/debrepos/debian lenny main contrib non-free

The hosts using repserve could be the only ones with filter lists in reprepro, but it may be desired to have filter lists from the other machines, also. This would help keep packages from disappearing from the pool when they are still needed. It may also be nice to use reprepro's snapshotting each time a repserve method updates a repository, although this may require using those snapshot urls on the hosts that aren't using repserve.

> But now you made me think about this too. So here is what I think:
>
> - My bandwidth at home is fast enough to fetch packages directly. No need to mirror at all.
>
> - I don't want to download a package multiple times (once per host) so some shared proxy would be good.

My idea would keep that from happening, at the expense of latency. The latency would be minimal, as it would just be dependent on reprepro retrieving the package(s) and signalling the client that the package is ready. Using reprepro to add extra packages to the repository from upstream without doing a full update may not be possible, but if it were, the latency would certainly be minimal, and the bandwidth to the internet would also be minimal. I just looked at the manpage again, and this may be possible by using the --nolistsdownload option with the update/checkupdate command.

> - Bootstrapping a chroot still benefits from local packages but a shared proxy would do there too.
>
> - When I'm not at home I might not have network access or only a slow one so then I need a mirror. And my parents' computer has a Linux that only I use and that needs a major update every time I visit.
>
> So the ideal setup would be an apt proxy that stores the packages in the normal pool structure and has a simple command to create Packages.gz, Sources.gz, Release and Release.gpg files so the
Re: apt-get wrapper for maintaining Partial Mirrors
On Friday 19 June 2009 00:27:06 Goswin von Brederlow wrote:
> Joseph Rawson umebos...@gmail.com writes:

>> BTW, the subject of this thread is apt-get wrapper for maintaining Partial Mirrors. The solution I'm proposing is a simple tool for maintaining Partial Mirrors (which could possibly be wrapped by apt-get later). I think that just pursuing an apt-get wrapper leads to some complications that could be avoided by creating the partial mirror tool first, then looking at wrapping it later. One complication might be how to handle apt-get remove, and another might be how to handle sid libraries that disappear from the official repository, yet local machines must have them.

> Ahh, so maybe I completely misread that part.

It was my fault for not making this point clear, as I should've done.

FWIW, I would be much more interested in making a tool that would make it easier to manage local/partial debian mirrors (i.e. one that helped resolve the dependencies), rather than have an apt-get wrapper. I also think that once such a tool is made, it would make it easier to build an apt-get wrapper that works with it. I don't think that viewing the problem with an apt-get wrapper solution is the best way to approach it, but I do think that it would be valuable once the underlying problems are solved.

> Do you mean a wrapper around apt-get so that apt-get install foo on any client would automatically add foo to the list of packages being mirrored on the server?

It was the original poster who mentioned the apt-get wrapper, but I took it to mean exactly what you said above. The tool I was envisioning would take a short list of packages (a text file with package names separated by newlines, or a collection of such text files) combined with a list of apt sources and generate the partial mirror from just that information. There are still some things that should be explicitly included in those lists, such as either gamin, fam, or both, as an example.

> If so then you can configure a post invoke hook in apt that will copy the dpkg status file of the host to the server [as status.$(hostname)] and then use those on the server to generate the filter for reprepro. I think I still have a script for that somewhere but it is easy enough to rewrite.

That's good for binaries, but I don't know about the source. It wasn't long ago that I noticed a problem with reprepro not obtaining the corresponding source packages when you use a filter list taken from dpkg --get-selections. I remember that the source for jigdo wasn't in my partial mirror, because there were no binaries named jigdo, rather jigdo-file and jigdo-lite. Since there were no binaries with that name, the jigdo source was never mirrored on my partial mirror. I don't know if that behavior has been fixed now, since there is now a binary named jigdo, instead of jigdo-lite.

Also, it's more difficult for the local repository to determine the difference between the automatically selected and manually selected packages in this type of setup, since you would be sending a longer list of manually selected packages, instead of distinguishing which ones are actually selected. I guess that it doesn't matter much, as a package would only be removed from the repository once it's not listed on any of the lists. There were times when I didn't want certain packages to be removed from the repository, regardless of whether they were installed or not, so I used to run xxdiff on the packages files, so the newer ones were added.

In my way of thinking, I'm not looking to merge upstream repositories together in one repository. Besides, there are already tools, such as apt-move, that would be better for this job. Long ago, apt-move was the primary tool that I used to keep a local repository, and it worked pretty well, as long as all the machines that were using it were on the same release.

I have found that reprepro is the absolute best tool for maintaining a debian mirror. The only problem I have with it, when I want to maintain a partial mirror and I don't want a merged repository, is that I have to spread the packages lists to different places. When you start adding machines, you start adding more lists to the configuration, when it would probably be better to maintain a set of master lists that are generated from the many lists that come from the machines.

> MfG
> Goswin

--
Thanks:
Joseph Rawson
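The "master lists generated from the many lists that come from the machines" idea can be sketched in Python as a simple union. This assumes each host supplies text in the `dpkg --get-selections` format (package name, whitespace, state); the function names and host names are hypothetical.

```python
def parse_selections(text):
    """Parse dpkg --get-selections style text into {package: state}."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            result[parts[0]] = parts[1]
    return result


def merge_lists(host_lists):
    """Keep a package if at least one host still has it marked 'install'."""
    wanted = set()
    for selections in host_lists.values():
        wanted.update(name for name, state in selections.items()
                      if state == "install")
    return sorted(wanted)


def format_filterlist(packages):
    """Render the master list in the same two-column format."""
    return "".join(f"{name}\tinstall\n" for name in packages)


if __name__ == "__main__":
    hosts = {
        "desktop": parse_selections("gamin\tinstall\njigdo-file\tinstall\n"),
        "laptop": parse_selections("fam\tinstall\ngamin\tdeinstall\n"),
    }
    print(format_filterlist(merge_lists(hosts)), end="")
```

With this rule a package drops out of the master list only when no host lists it as installed, which matches the removal behaviour described in the message.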
Re: apt-get wrapper for maintaining Partial Mirrors
On Thursday 18 June 2009 03:17:13 Frank Lin PIAT wrote:
> On Tue, 2009-06-09 at 16:16 -0500, Joseph Rawson wrote:
>> On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote:

>>> I had an idea in mind whereby the task of making mirrors for personal distributions can be automated.

> <lazy-way>
> Depending on what you want to achieve, a caching proxy might be an easy solution (there are specialized ones in the archive already)
> </lazy-way>

Or possibly apt-move called as a post-invoke action of apt-get.

>>> This can be stated as: if a person wants to keep a customised set of packages for usage with the distribution, the tool should be able to develop dependencies, fetch packages, generate appropriate documentation and then create the corresponding directory structure in the target mirror! The task can be extended to include packages which are currently not under one of the standard mirrors!

> <lazy-way>
> One doesn't have to merge the repositories, one can just declare multiple sources in /etc/apt/*
> </lazy-way>

Then it becomes harder to send the package to the appropriate local repository, since they aren't merged. I would also prefer to not have to deal with a merged repository, but keep separate upstream partial mirrors, as they would probably be easier to manage.

>>> I think the tool can have immense utility in helping people automate the task of maintaining the repositories. Suggestions, positive and negative, are invited. I have not included the impl details as I would first like to evaluate the idea at a feasibility and utility level.

> If the scope of your project includes being able to bootstrap systems from the mirror, resolving dependencies is much more complex (some packages aren't resolved by dependencies. For instance, the right kernel is selected by some logic in Debian-installer). I found some interesting logic in the debian-cd package.
>
> Still, I don't consider that allowing bootstrapping is mandatory. Your project would still be extremely valuable without it. [for those 95% of the people that install from CD, as opposed to netboot].

The reason that I recommended tying germinate and reprepro together with a tool was because the original post was discussing personal distributions. To me, this implies the ability to bootstrap, and also the need to have a self-building source/binary repository.

I have just made some other responses to Goswin that should help explain my view on things a bit better.

> Regards,
> Franklin

--
Thanks:
Joseph Rawson
Re: apt-get wrapper for maintaining Partial Mirrors
On Thursday 18 June 2009 04:47:45 Goswin von Brederlow wrote:
> Frank Lin PIAT fp...@klabs.be writes:
>> On Tue, 2009-06-09 at 16:16 -0500, Joseph Rawson wrote:
>>> On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote:

>>>> This can be stated as: if a person wants to keep a customised set of packages for usage with the distribution, the tool should be able to develop dependencies, fetch packages, generate appropriate documentation and then create the corresponding directory structure in the target mirror! The task can be extended to include packages which are currently not under one of the standard mirrors!

>> <lazy-way>
>> One doesn't have to merge the repositories, one can just declare multiple sources in /etc/apt/*
>> </lazy-way>

> Let's say I want to mirror xserver-xorg from experimental. Then I would want it to include xserver-xorg-core (= xyz) also from experimental as the dependency dictates, but not include libc6 from experimental as the sid one is sufficient. A key point here would be flexibility.

This is something that I haven't considered yet. This would be one of the problems that might occur with the post invoke hook that you mentioned earlier using dpkg status.

Actually this wouldn't be much of a problem, I was confused. I was thinking you meant --get-selections, which just returns the name of the package and install/deinstall, but status also contains the version being used, and this could be matched to the appropriate repository in the sources list (so you get the libc from main instead of experimental, since the status file uses the version that's in main).

However, I don't know how to use that info with reprepro. With reprepro, I've only sent --get-selections lists to it. In fact, this is how I used to install new packages in sid, and make sure they came from the local repository first.

    #!/bin/bash
    # Collect packages that are selected for install but not yet installed.
    packages=`grep-status 'install ok not-installed' | grep '^Package:' | gawk '{print $2}'`
    #packages=`aptitude search '~N' | grep ^.i | gawk '{print $2}'`
    touch conf/list-uninstalled.tmp
    for package in $packages
    do
        # One "name<TAB><TAB>install" line per package, as in dpkg --get-selections.
        echo -e "$package\t\tinstall" >> conf/list-uninstalled.tmp
    done
    # Sort first so that duplicates end up adjacent and are dropped.
    sort -u conf/list-uninstalled.tmp > conf/list-uninstalled
    rm conf/list-uninstalled.tmp

You may be able to tell by looking at the script that I'm still in the process of getting used to aptitude, being a longtime dselect user. ;)

Anyway, I don't know much about determining (with reprepro) which upstream repository holds the version of the package that I want installed.

>>>> I think the tool can have immense utility in helping people automate the task of maintaining the repositories. Suggestions, positive and negative, are invited. I have not included the impl details as I would first like to evaluate the idea at a feasibility and utility level.

>> If the scope of your project includes being able to bootstrap systems from the mirror, resolving dependencies is much more complex (some packages aren't resolved by dependencies. For instance, the right kernel is selected by some logic in Debian-installer). I found some interesting logic in the debian-cd package.

> You would include linux-image-type in your package list. That isn't really a problem of the tool. Just of the input you need to provide. Also you would include everything udeb and everything essential/required for bootstrapping purposes.

I was also thinking along those lines, too. Same with fam/gamin and other packages that have drop-in replacements.

> Again flexibility is the key.

>> Still, I don't consider that allowing bootstrapping is mandatory. Your project would still be extremely valuable without it. [for those 95% of the people that install from CD, as opposed to netboot].
>>
>> Regards,
>> Franklin

> MfG
> Goswin
>
> PS: the essential/required packages can already easily be filtered with grep-dctrl.

--
Thanks:
Joseph Rawson
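The version matching described in this message (use the version recorded in the dpkg status file to decide which upstream a package should be mirrored from) can be sketched in Python. This is only an illustration under simplifying assumptions: each upstream index is reduced to a `{package: version}` mapping, versions are compared as exact strings, and all names and version strings are made up. How to feed the result into reprepro is left open, as it is in the message.

```python
def assign_to_repositories(installed, repo_indexes):
    """Map each installed package to the first repository offering that
    exact version, or to None when no repository has it.

    installed:    {package: version} taken from a dpkg status file
    repo_indexes: {repository name: {package: version}}, in priority order
    """
    assignment = {}
    for package, version in installed.items():
        assignment[package] = None
        for repo, index in repo_indexes.items():
            if index.get(package) == version:
                assignment[package] = repo
                break
    return assignment


if __name__ == "__main__":
    # Hypothetical versions, echoing the xserver-xorg/libc6 example.
    installed = {"xserver-xorg": "2.0-1", "libc6": "1.0-5"}
    indexes = {
        "sid": {"xserver-xorg": "1.9-3", "libc6": "1.0-5"},
        "experimental": {"xserver-xorg": "2.0-1", "libc6": "1.1-1"},
    }
    for package, repo in assign_to_repositories(installed, indexes).items():
        print(package, "->", repo)
```

A package that maps to None is one whose installed version has already disappeared upstream, which is the "sid libraries that disappear" case raised elsewhere in the thread.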
Re: apt-get wrapper for maintaining Partial Mirrors
On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote:

> would be much more interested in making a tool that would make it easier to manage local/partial debian mirrors (i.e. one that helped resolve the dependencies), rather than have an apt-get wrapper. I also think that once such a tool is made, it would make it easier to build an apt-get wrapper that works with it. I don't think that viewing the problem with an apt-get wrapper solution is the best way to approach it, but I do think that it would be valuable once the underlying problems are solved.

And reprepro does not fit the bill because?

--
Tzafrir Cohen         | tzaf...@jabber.org | VIM is
http://tzafrir.org.il |                    | a Mutt's
tzaf...@cohens.org.il |                    | best
ICQ# 16849754         |                    | friend
Re: apt-get wrapper for maintaining Partial Mirrors
On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote:
> On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote:

>> would be much more interested in making a tool that would make it easier to manage local/partial debian mirrors (i.e. one that helped resolve the dependencies), rather than have an apt-get wrapper. I also think that once such a tool is made, it would make it easier to build an apt-get wrapper that works with it. I don't think that viewing the problem with an apt-get wrapper solution is the best way to approach it, but I do think that it would be valuable once the underlying problems are solved.

> And reprepro does not fit the bill because?

It fits part of the bill, as it's an excellent tool for maintaining a repository, but it doesn't resolve dependencies (nor should it).

> --
> Tzafrir Cohen         | tzaf...@jabber.org | VIM is
> http://tzafrir.org.il |                    | a Mutt's
> tzaf...@cohens.org.il |                    | best
> ICQ# 16849754         |                    | friend

--
Thanks:
Joseph Rawson
Re: apt-get wrapper for maintaining Partial Mirrors
On Friday 19 June 2009 00:27:06 Goswin von Brederlow wrote:
> Joseph Rawson umebos...@gmail.com writes:

>> BTW, the subject of this thread is apt-get wrapper for maintaining Partial Mirrors. The solution I'm proposing is a simple tool for maintaining Partial Mirrors (which could possibly be wrapped by apt-get later). I think that just pursuing an apt-get wrapper leads to some complications that could be avoided by creating the partial mirror tool first, then looking at wrapping it later. One complication might be how to handle apt-get remove, and another might be how to handle sid libraries that disappear from the official repository, yet local machines must have them.

> Ahh, so maybe I completely misread that part.
>
> Do you mean a wrapper around apt-get so that apt-get install foo on any client would automatically add foo to the list of packages being mirrored on the server?
>
> If so then you can configure a post invoke hook in apt that will copy the dpkg status file of the host to the server [as status.$(hostname)] and then use those on the server to generate the filter for reprepro. I think I still have a script for that somewhere but it is easy enough to rewrite.

When you mentioned the word hook, I was reminded of reprepro's ability to use hooks. I started testing using a ListHook script with reprepro. I'm attaching the script so you can see the general idea. The script doesn't do anything effective, but may be helpful in understanding more of the way I'm approaching the idea. Please don't laugh too hard, I'm just playing with ideas now.

Among other possible reasons, there are two main reasons why this particular approach won't work. One reason is that the ListHook calls a script for each list independently. So, if you have a package in contrib that depends on a package in main, like many do, the dependency won't be resolved using this method.

Also, the germinator object only handles one arch at a time, so if you are mirroring multiple arches, you need to use a germinator object for each one.

One way that this problem can be countered is by running a simple server that holds the germinator object, and the script that ListHook executes would communicate with that server. Then the server would grow the seeds and create the filter lists that would be used by reprepro.

I tried this approach because I didn't see the sense in downloading the packages lists more than necessary. The way I was thinking before was to seed germinate (which would download the package lists), parse the output, create filter lists from that output, send them to reprepro, and call reprepro to update. This forces all of those package lists to be downloaded twice, which was something I tried to avoid with this short experiment.

It also seems to be somewhat difficult to plant the seeds into germinate manually. I'm sure that problem could be solved by looking through the code a bit longer.

> MfG
> Goswin

--
Thanks:
Joseph Rawson

[Attachment: testgerm (application/python)]
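The first problem described above (a hook that sees one package list at a time cannot follow a dependency from contrib into main) can be shown with a toy dependency closure in Python. This does not use germinate or reprepro; the package names and the `closure` helper are hypothetical, and real dependency fields (versions, alternatives, virtual packages) are ignored.

```python
from collections import deque


def closure(seeds, depends):
    """Return the seeds plus everything reachable through 'depends'.

    depends maps a package name to the list of packages it depends on.
    Packages missing from the mapping are silently skipped, which is
    exactly what happens when only one list is visible.
    """
    seen = set()
    queue = deque(seeds)
    while queue:
        package = queue.popleft()
        if package in seen or package not in depends:
            continue
        seen.add(package)
        queue.extend(depends[package])
    return seen


if __name__ == "__main__":
    main = {"libfoo1": ["libc6"], "libc6": []}
    contrib = {"contrib-tool": ["libfoo1"]}

    # Per-list view: the dependency in main is never seen.
    print(sorted(closure(["contrib-tool"], contrib)))

    # Combined view: the whole chain is found.
    print(sorted(closure(["contrib-tool"], {**main, **contrib})))
```

A long-running server that holds all lists at once, as proposed in the message, corresponds to the combined view.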
Re: apt-get wrapper for maintaining Partial Mirrors
* Joseph Rawson umebos...@gmail.com [090619 13:23]: On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote: On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote: would be much more interested in making a tool that would make it easier to manage local/partial debian mirrors (i.e. one that helped resolve the dependencies), rather than have an apt-get wrapper. I also think that once such a tool is made, it would make it easier to build an apt-get wrapper that works with it. I don't think that viewing the problem with an apt-get wrapper solution is the best way to approach it, but I do think that it would be valuable once the underlying problems are solved. And reprepro does not fit the bill because? It fits part of the bill, as it's an excellent tool for maintaining a repository, but it doesn't resolve dependencies (nor should it). Actually, I'm quite open to having some depedency handling in reprepro and already have written some simple prototype for a related project. The problem is that calculating a simple cover of selected packages in the dependency graph is not enough: Usually the cover is not unique but the existance of alternatives in dependencies causes multiple solutions. For an initial checkout that is no problem, as one can choose one some set by some pseudo-random selection (like packages with alphabetically lower names get the first depedency in an alternative tried first and similar things for virtual packages). The problem is that no such criterion can be stable against changes in the partially mirrored distribution. So in this cases knowing what packages upstream has and what packages are wanted is not enough but one has to take into account what packages are currently selected. And a simply covering no longer is enough but one needs a full resolver knowing which installed states can be easily brought to which other installed states. 
(and things get even more complicated if the currently mirrored packages allow multiple subsets which clients using this repository might have installed)... Hochachtungsvoll, Bernhard R. Link -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
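To make the point about non-unique covers concrete, here is a minimal sketch of a naive resolver. It is not reprepro's or apt's algorithm: it ignores versions, virtual packages and conflicts, and the names `parse_depends` and `closure` are invented for this illustration. For each group of alternatives it keeps one that is already selected, and otherwise falls back to the first one listed.

```python
import re


def parse_depends(field):
    """Split a Depends field into groups of alternatives.

    "a | b, c (>= 1)" becomes [["a", "b"], ["c"]]; version constraints are dropped.
    """
    groups = []
    for clause in field.split(","):
        alts = [re.sub(r"\(.*?\)", "", alt).strip() for alt in clause.split("|")]
        alts = [alt for alt in alts if alt]
        if alts:
            groups.append(alts)
    return groups


def closure(seeds, depends, selected=()):
    """Expand seeds into a dependency-closed set of package names.

    depends maps a package name to its raw Depends field.
    selected is the set of packages already on the mirror (assumed input).
    """
    chosen = set(selected)
    seen = set()
    todo = list(seeds)
    while todo:
        pkg = todo.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        chosen.add(pkg)
        for alts in parse_depends(depends.get(pkg, "")):
            # Prefer an alternative that is already chosen, else the first listed.
            pick = next((alt for alt in alts if alt in chosen), alts[0])
            todo.append(pick)
    return chosen
```

Fed the phpldapadmin Depends line quoted elsewhere in this thread, `closure(["phpldapadmin"], deps)` picks libapache2-mod-php5, while `closure(["phpldapadmin"], deps, selected={"php5-cgi"})` keeps php5-cgi instead. The same upstream data gives two different mirrors depending on what is currently selected.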
Re: apt-get wrapper for maintaining Partial Mirrors
Joseph Rawson umebos...@gmail.com writes:

> On Friday 19 June 2009 00:27:06 Goswin von Brederlow wrote:
>> Joseph Rawson umebos...@gmail.com writes:
>> If so then you can configure a post invoke hook in apt that will copy the dpkg status file of the host to the server [as status.$(hostname)] and then use those on the server to generate the filter for reprepro. I think I still have a script for that somewhere but it is easy enough to rewrite.
>
> That's good for binaries, but I don't know about the source. It wasn't long ago that I noticed a problem with reprepro not obtaining the corresponding source packages when you use a filter list taken from dpkg --get-selections. I remember that the source for jigdo wasn't in my partial mirror, because there were no binaries named jigdo, rather jigdo-file and jigdo-lite. Since there were no sources with that name, the jigdo source was never mirrored on my partial mirror. I don't know if that behavior has been fixed now, since there is now a binary named jigdo, instead of jigdo-lite.

My filter first converted the packages listed in the status file(s) to source package names (packages with a different name have a Source: entry) and then output those for sources.

> Also, it's more difficult for the local repository to determine the difference between the automatically selected and manually selected packages in this type of setup, since you would be sending a longer list of manually selected packages, instead of distinguishing which ones are actually selected. I guess that it doesn't matter much, as a package would only be removed from the repository once it's not listed on any of the lists. There were times when I didn't want certain packages to be removed from the repository, regardless of whether they were installed or not, so I used to run xxdiff on the packages files, so the newer ones were added.

Same problem here. Especially build-depends. There were a lot of packages I only needed inside my build chroots and only for the time of the build. So they never showed up on the mirror. Then I just resized the mirror partition and mirrored all debs.

> In my way of thinking, I'm not looking to merge upstream repositories together in one repository. Besides, there are already tools, such as apt-move that would be better for this job. Long ago, apt-move was the primary tool that I used to keep a local repository, and it worked pretty well, as long as all the machines that were using it were on the same release. I have found that reprepro is the absolute best tool for maintaining a debian mirror. The only problem I have with it is when I want to maintain a partial mirror, and I don't want a merged repository, is that I have to spread the packages lists to different places, and when you start adding machines, you start adding more lists to the configuration, when it would probably be better to maintain a set of master lists that are generated from the many lists that come from the machines.

Or have a proxy that adds packages that are requested.

MfG
Goswin
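The conversion described in this message (binary package names from the status files turned into source package names via the Source: field) could look roughly like the following. This is only a sketch, not the script mentioned above: it parses the control-file stanzas by hand instead of using python-apt, and the function names are made up for the example.

```python
def parse_stanzas(text):
    """Yield each control-file stanza as a dict mapping field name to value."""
    for block in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and key:
                # Continuation line of a multi-line field such as Description.
                fields[key] += "\n" + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if fields:
            yield fields


def source_names(status_text):
    """Return the source package names of all installed binary packages."""
    names = set()
    for stanza in parse_stanzas(status_text):
        # A dpkg status entry looks like "install ok installed".
        state = stanza.get("Status", "install ok installed").split()
        if state[-1] != "installed" or "Package" not in stanza:
            continue
        # Source may carry a version, e.g. "foo (1.0-1)"; without a Source
        # field the source package has the same name as the binary.
        source = stanza.get("Source", stanza["Package"]).split()[0]
        names.add(source)
    return names
```

With this, a status file that lists jigdo-file and jigdo-lite yields the single source name jigdo, which is what a source filter list needs.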
Re: apt-get wrapper for maintaining Partial Mirrors
On Friday 19 June 2009 07:14:08 Bernhard R. Link wrote: Actually, I'm quite open to having some depedency handling in reprepro That is interesting. I've been working on the assumption that there would never be any dependency handling in reprepro, as I didn't consider it part of it's function. and already have written some simple prototype for a related project. The problem is that calculating a simple cover of selected packages in the dependency graph is not enough: Usually the cover is not unique but the existance of alternatives in dependencies causes multiple solutions. This is a problem across the board. Even aptitude seems to have problems in automatically determining the most appropriate dependencies. Let's use this example. Suppose you already have a system with apache2 installed, but no php yet. Next you try to install phpldapadmin, using aptitude (from the command line). Aptitude will tell you that libapache-mod-php5 is broken, and proceed to present some alternatives that would resolve the dependencies. umebo...@stdinstall:~$ sudo aptitude -s install phpldapadmin Reading package lists... Done Building dependency tree Reading state information... Done Reading extended state information Initializing package states... Done Reading task descriptions... Done The following packages are BROKEN: libapache-mod-php5 The following NEW packages will be installed: php5-common{a} php5-ldap{a} phpldapadmin 0 packages upgraded, 4 newly installed, 0 to remove and 0 not upgraded. Need to get 3821kB of archives. After unpacking 11.8MB will be used. The following packages have unmet dependencies: libapache-mod-php5: Depends: libdb4.4 which is a virtual package. Depends: apache-common (= 1.3.34) which is a virtual package. Depends: php5-common (= 5.2.0-10+lenny1) but 5.2.6.dfsg.1-1+lenny3 is to be installed. 
The following actions will resolve these dependencies: Install the following packages: libapache2-mod-php5 [5.2.6.dfsg.1-1+lenny3 (stable)] Keep the following packages at their current version: libapache-mod-php5 [Not Installed] Score is 50 Accept this solution? [Y/n/q/?] n The following actions will resolve these dependencies: Install the following packages: php5-cgi [5.2.6.dfsg.1-1+lenny3 (stable)] Keep the following packages at their current version: libapache-mod-php5 [Not Installed] Score is 50 Accept this solution? [Y/n/q/?] n The following actions will resolve these dependencies: Install the following packages: libapache2-mod-php5 [5.2.6.dfsg.1-1+lenny2 (stable)] php5-common [5.2.6.dfsg.1-1+lenny2 (stable)] php5-ldap [5.2.6.dfsg.1-1+lenny2 (stable)] Keep the following packages at their current version: libapache-mod-php5 [Not Installed] Score is -30 etc, etc, etc . apt-get, on the other hand, seems to use the first dependency that's listed as an alternative. Depends: apache2 | httpd, php5-ldap, libapache2-mod-php5 | libapache-mod-php5 | php5-cgi | php5, debconf (= 0.5) | debconf-2.0 Here, since we already have apache2 on the system, libapache2-mod-php5 is chosen (I'm guessing because it's the first one listed). For an initial checkout that is no problem, as one can choose one some set by some pseudo-random selection (like packages with alphabetically lower names get the first depedency in an alternative tried first and similar things for virtual packages). I think that it should be up to the maintainer of the local mirror to explicitly list the alternatives that are preferred. I don't think that there is anyway that an automatic dependency resolver will ever be able to do this. The automatic dependency resolver can make this easier by marking those dependencies as automatically selected, alternative available or something similar. 
One of the nice things about germinate is that it has a why column in its output that tells why a package was selected (although it doesn't make it clear that it's one of many alternatives). The problem is that no such criterion can be stable against changes in the partially mirrored distribution. I'm not sure what you mean here. Are you talking about an alternative that's selected for the local mirror, but removed from the official mirror? So in this cases knowing what packages upstream has and what packages are wanted is not enough but one has to take into account what packages are currently selected. And a simply covering no longer is enough but one needs a full resolver knowing which installed states can be easily brought to which other installed states. (and things get even more complicated if the currently mirrored packages allow multiple subsets which clients using this repository might have installed)... I used to have to keep outdated libraries in my filter list when I was using a partial
Re: apt-get wrapper for maintaining Partial Mirrors
On Fri, Jun 19, 2009 at 06:23:08AM -0500, Joseph Rawson wrote: On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote: On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote: would be much more interested in making a tool that would make it easier to manage local/partial debian mirrors (i.e. one that helped resolve the dependencies), rather than have an apt-get wrapper. I also think that once such a tool is made, it would make it easier to build an apt-get wrapper that works with it. I don't think that viewing the problem with an apt-get wrapper solution is the best way to approach it, but I do think that it would be valuable once the underlying problems are solved. And reprepro does not fit the bill because? It fits part of the bill, as it's an excellent tool for maintaining a repository, but it doesn't resolve dependencies (nor should it). Just in case it might help, here's a script we used internally (at the Sarge time) to maintain a dummy repository that would help us eventually resolve an original list of packages to a complete list of packages we ask a reprepro source to update. -- Tzafrir Cohen | tzaf...@jabber.org | VIM is http://tzafrir.org.il || a Mutt's tzaf...@cohens.org.il || best ICQ# 16849754 || friend -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: apt-get wrapper for maintaining Partial Mirrors
On Fri, Jun 19, 2009 at 02:14:08PM +0200, Bernhard R. Link wrote: * Joseph Rawson umebos...@gmail.com [090619 13:23]: On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote: On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote: would be much more interested in making a tool that would make it easier to manage local/partial debian mirrors (i.e. one that helped resolve the dependencies), rather than have an apt-get wrapper. I also think that once such a tool is made, it would make it easier to build an apt-get wrapper that works with it. I don't think that viewing the problem with an apt-get wrapper solution is the best way to approach it, but I do think that it would be valuable once the underlying problems are solved. And reprepro does not fit the bill because? It fits part of the bill, as it's an excellent tool for maintaining a repository, but it doesn't resolve dependencies (nor should it). Actually, I'm quite open to having some depedency handling in reprepro and already have written some simple prototype for a related project. The problem is that calculating a simple cover of selected packages in the dependency graph is not enough: Usually the cover is not unique but the existance of alternatives in dependencies causes multiple solutions. For an initial checkout that is no problem, as one can choose one some set by some pseudo-random selection (like packages with alphabetically lower names get the first depedency in an alternative tried first and similar things for virtual packages). The problem is that no such criterion can be stable against changes in the partially mirrored distribution. While it's a good queastion, the interface I'm used to use is apt-get / aptitude. Thus the interface I had in mind is a list of packages to install (in a single installation). Using some tweaking this allows you to get exactly what you want. If you want your repository to include conflicting options, you should allow the interface to include multiple such entries. 
In our case we had multiple files. Each file was a list of packages, and each file was basically a single apt-get command. -- Tzafrir Cohen | tzaf...@jabber.org | VIM is http://tzafrir.org.il || a Mutt's tzaf...@cohens.org.il || best ICQ# 16849754 || friend
Re: apt-get wrapper for maintaining Partial Mirrors
On Friday 19 June 2009 12:57:25 Goswin von Brederlow wrote: Joseph Rawson umebos...@gmail.com writes: On Friday 19 June 2009 00:27:06 Goswin von Brederlow wrote: Joseph Rawson umebos...@gmail.com writes: If so then you can configure a post invoke hook in apt that will copy the dpkg status file of the host to the server [as status.$(hostname)] and then use those on the server to generate the filter for reprepro. I think I still have a script for that somewhere but it is easy enough to rewrite. That's good for binaries, but I don't know about the source. It wasn't long ago that I noticed a problem with reprepro not obtaining the corresponding source packages when you use a filter list taken from dpkg --get-selections. I remember that the source for jigdo wasn't in my partial mirror, because there were no binaries named jigdo, rather jigdo-file and jigdo-lite. Since there were no sources with that name, the jigdo source was never mirrored on my partial mirror. I don't know if that behavior has been fixed now, since there is now a binary named jigdo, instead of jigdo-lite. My filter first converted the packages listed in the status file(s) to source package names (packages with different name have a Source: entry) and then output those for sources. Also, it's more difficult for the local repository to determine the difference between the automatically selected and manually selected packages in this type of setup, since you would be sending a longer list of manually selected packages, instead of distinguishing which ones are actually selected. I guess that it doesn't matter much, as a package would only be removed from the repository once it's not listed on any of the lists. There were times when I didn't want certain packages to be removed from the repository, regardless of whether they were installed or not, so I used to run xxdiff on the packages files, so the newer ones were added. Same problem here. Esspecially build-depends. 
There where a lot of packages I only needed inside my build chroots and only for the time of the build. So they never showed up on the mirror. Then I just resized the mirror partition and mirrored all debs. That was my ultimate solution to the problem. I bought one of the new terabyte usb external drives and just mirrored the whole repository. I had been satisfied to just call the problem solved at that point, but this thread resparked my interest in obtaining a better solution. Before I bought the hard drive, I was seriously looking into getting germinate and reprepro working together, but once I bought the drive, I just set it all aside. Still, this external drive isn't portable, and my small portable drive is only 80G (which is more than enough for a partial mirror of source, i386, and amd64), so I do still need to solve the problem. Besides, a month after I bought the drive, I discovered that I have a monthly cap on my transfers so it would be better, all around, to stop mirroring the complete repository. In my way of thinking, I'm not looking to merge upstream repositories together in one repository. Besides, there are already tools, such as apt-move that would be better for this job. Long ago, apt-move was the primary tool that I used to keep a local repository, and it worked pretty well, as long as all the machines that were using it were on the same release. I have found that reprepro is the absolute best tool for maintaining a debian mirror. The only problem I have with it is when I want to maintain a partial mirror, and I don't want a merged repository, is that I have to spread the packages lists to different places, and when you start adding machines, you start adding more lists to the configuration, when it would probably be better to maintain a set of master lists that are generated from the many lists that come from the machines. Or have a proxy that adds packages that are requested. 
When I woke up this morning, I was thinking that it might be interesting to have an apt method that talks directly to reprepro. It's just a vague idea now, but I'll give it some more thought later. MfG Goswin -- Thanks: Joseph Rawson signature.asc Description: This is a digitally signed message part.
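The idea of a set of master lists generated from the per-machine lists can be sketched in a few lines. The assumption here is that every machine delivers the output of dpkg --get-selections and that the master list is written in the same format, which, as far as I recall from the reprepro manual, is what its FilterList option reads. The function name is invented for this example.

```python
def merge_selections(lists):
    """Union several `dpkg --get-selections` style texts into one master list.

    lists is an iterable of strings, one per machine. Packages marked
    "install" or "hold" on any machine are kept; everything else is dropped.
    """
    wanted = set()
    for text in lists:
        for line in text.splitlines():
            parts = line.split()
            # A held package is still installed, so the mirror should carry it.
            if len(parts) == 2 and parts[1] in ("install", "hold"):
                wanted.add(parts[0])
    return "".join(f"{name}\tinstall\n" for name in sorted(wanted))
```

A package then disappears from the master list only once no machine lists it any more, which matches the removal behaviour discussed earlier in the thread.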
Re: apt-get wrapper for maintaining Partial Mirrors
On Friday 19 June 2009 20:54:28 Tzafrir Cohen wrote: On Fri, Jun 19, 2009 at 06:23:08AM -0500, Joseph Rawson wrote: On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote: On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote: would be much more interested in making a tool that would make it easier to manage local/partial debian mirrors (i.e. one that helped resolve the dependencies), rather than have an apt-get wrapper. I also think that once such a tool is made, it would make it easier to build an apt-get wrapper that works with it. I don't think that viewing the problem with an apt-get wrapper solution is the best way to approach it, but I do think that it would be valuable once the underlying problems are solved. And reprepro does not fit the bill because? It fits part of the bill, as it's an excellent tool for maintaining a repository, but it doesn't resolve dependencies (nor should it). Just in case it might help, here's a script we used internally (at the Sarge time) to maintain a dummy repository that would help us eventually resolve an original list of packages to a complete list of packages we ask a reprepro source to update. Did you forget to attach it? :) -- Tzafrir Cohen | tzaf...@jabber.org | VIM is http://tzafrir.org.il || a Mutt's tzaf...@cohens.org.il || best ICQ# 16849754 || friend -- Thanks: Joseph Rawson signature.asc Description: This is a digitally signed message part.
Re: apt-get wrapper for maintaining Partial Mirrors
Actually attaching the file this time... On Sat, Jun 20, 2009 at 01:54:28AM +, Tzafrir Cohen wrote: On Fri, Jun 19, 2009 at 06:23:08AM -0500, Joseph Rawson wrote: On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote: On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote: would be much more interested in making a tool that would make it easier to manage local/partial debian mirrors (i.e. one that helped resolve the dependencies), rather than have an apt-get wrapper. I also think that once such a tool is made, it would make it easier to build an apt-get wrapper that works with it. I don't think that viewing the problem with an apt-get wrapper solution is the best way to approach it, but I do think that it would be valuable once the underlying problems are solved. And reprepro does not fit the bill because? It fits part of the bill, as it's an excellent tool for maintaining a repository, but it doesn't resolve dependencies (nor should it). Just in case it might help, here's a script we used internally (at the Sarge time) to maintain a dummy repository that would help us eventually resolve an original list of packages to a complete list of packages we ask a reprepro source to update. -- Tzafrir Cohen | tzaf...@jabber.org | VIM is http://tzafrir.org.il || a Mutt's tzaf...@cohens.org.il || best ICQ# 16849754 || friend -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? 
Contact listmas...@lists.debian.org -- Tzafrir Cohen | tzaf...@jabber.org | VIM is http://tzafrir.org.il || a Mutt's tzaf...@cohens.org.il || best ICQ# 16849754 || friend

#!/bin/bash

# using bash-specific PIPESTATUS

CMD=`basename $0`
REPREPRO=reprepro
BASE_DIR=repo
APT_BASE_DIR=${BASE_DIR}/Aptdir
APT_DIR=${APT_DIR:-${APT_BASE_DIR}/unstable}
MAIN_REPO=/home/repo
PACKAGES_LIST_FILE=packages
STATIC_DIR=$MAIN_REPO/static
STATIC_INST=$MAIN_REPO/static_inst
INSTALLER_PATH=$BASE_DIR/dists/sarge/main/installer-i386/current
CD_OVERRIDE=cd-override/cd

set -e

usage() {
    echo >&2 "apter: apt resolver wrapper"
    echo >&2 "(functionality varies by basename of \$0)"
    echo >&2 "Usage: $0 setup|generate|refresh"
}

# $1: file
# $2: condition
get_entry() {
    awk <$BASE_DIR/conf/$1 -v RS='\n\n' "/\$2\\n/ {print \$0}"
    #echo >&2 "printed updates section $2."
}

# $1: file (updates/distributions)
# $2: condition
# $3: field name
get_field() {
    get_entry $1 "$2" | grep "^$3:" | cut -d: -f2-
}

dists_list() {
    awk '/^Codename: / {print $2}' $BASE_DIR/conf/distributions
}

case $CMD in
apt-get|apt-cache|aptitude)
    exec $CMD \
        -o Dir=$PWD/$APT_DIR \
        -o Dir::State::status=$PWD/$APT_DIR/var/lib/dpkg/status \
        "$@"
    ;;
apter)
    case $1 in
    setup)
        for dist in `dists_list`
        do
            APT_DIR=$APT_BASE_DIR/$dist
            export APT_DIR
            for dir in \
                etc/apt var/lib/apt/lists/partial \
                var/lib/dpkg var/cache/apt/archives/partial
            do
                mkdir -p $APT_DIR/$dir
            done
            touch $APT_DIR/var/lib/dpkg/status
            # relevant update sources:
            update_sources=`get_field distributions "Codename: $dist" Update`
            (
                for upd in $update_sources
                do
                    get_entry updates "Name: $upd"
                    echo ''
                done
            ) | tools/updates2sources >$APT_DIR/etc/apt/sources.list
            cat <<EOF >$APT_DIR/etc/apt/preferences
# give our packages a higher priority:
Package: *
Pin: release o=Xorcom
Pin-Priority: 600
EOF
        done
        ;;
    generate)
        # setup the apt wrapper:
        $0 setup
        $0 refresh
        ;;
    refresh)
        rm -rf $BASE_DIR/{db,dists,lists,pool}
        $0 refresh-nodel
        ;;
    upgrade|refresh-nodel)
        apt_cmd=`dirname $0`/apt-get
        for file in `ls $STATIC_DIR`
        do
            rsync -a --delete $STATIC_DIR/$file
Re: apt-get wrapper for maintaining Partial Mirrors
Joseph Rawson umebos...@gmail.com writes: There is another application that will help with the dependencies. It's called germinate, and it will take a short list of packages and a list of repositories and build a bunch of different lists of packages and their dependencies. Germinate will also determine build dependencies for those packages and recursively build a list of builddeps and the builddeps' builddeps. I have thought of making an application that would get germinate and reprepro to work together to help build a decent partial mirror that had the correct set of packages, but the process was a bit time consuming. It's been a while Was it that bad? It only needs to run 4 times a day when the mirror push comes in. since I've worked on this, since my temporary solution to the problem was to buy a larger hard drive. Currently, I have a full mirror that I keep updated, and a repository of locally built packages next to it. I'm not really happy with this solution, as it uses too much disk space and I'm downloading packages that will never be used, but it's given me time to tackle more important problems. Before writing any code, I would recommend taking a look at both reprepro and germinate, as each of these applications is good at solving half of the problems you describe. I think that an ideal solution would be to write a frontend program that takes a list of packages and upstream repositories, feeds them into germinate, obtains the result from germinate, parse those results and build a reprepro configuration from that, then get reprepro to fetch the appropriate packages. Combining germinate and reprepro is the right thing to do. Or reprepro and a new filter instead of germinate. But don't rewrite reprepro. Given a little bit of care when writing the reprepro config this can be completly done as part of the filtering. There is no need for a seperate run that scanns all upstream repositories as long as you can define a partial order between them, i.e. 
contrib needs things from main but main never from contrib. That would also have the benefit that you only need to process those packages files that have changed. I would be happy to help with this, as I could use such an application, and I already have a meager bit of python code that parses the output of germinate (germinate uses a wiki-type markup in its output files). I stopped working on the code since I bought a new hard drive, since I just used the extra space to solve the problem for me, but I can bring it back to life, as I would desire to use a more correct solution. Urgs, that sucks. It should take a Packages/Sources style input and output the same format. Maybe rewriting it using libapt would be better than wrapping germinate. MfG Goswin
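The partial order mentioned above (contrib needs things from main, but main never from contrib) can be turned into a processing order with a topological sort. The sketch below uses Python's graphlib (3.9 or later). It reflects my reading of the remark: a component is handled before the components it pulls packages from, so that by the time main is filtered, the names contrib needs from it are already known. The function name and the shape of the `needs` mapping are assumptions made for this example.

```python
from graphlib import TopologicalSorter


def processing_order(needs):
    """Order components so each one comes before the components it needs.

    needs maps a component to the set of components it may pull packages
    from, e.g. {"contrib": {"main"}, "main": set()}.
    Raises graphlib.CycleError if the relation is not a partial order.
    """
    # static_order() lists prerequisites first, so reverse it to get
    # dependents first.
    return list(TopologicalSorter(needs).static_order())[::-1]
```

For the two-component case this gives contrib first and main second; a mutual dependency between two repositories shows up as an exception instead of a silently wrong order.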
Re: apt-get wrapper for maintaining Partial Mirrors
On Tue, 2009-06-09 at 16:16 -0500, Joseph Rawson wrote:
> On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote:
>> I had an idea in mind whereby the task of making mirrors for personal distributions can be automated.

<lazy-way>
Depending on what you want to achieve, a caching proxy might be an easy solution (there are specialized ones in the archive already)
</lazy-way>

>> This can be stated as: if a person wants to keep a customised set of packages for usage with the distribution, the tool should be able to develop dependencies, fetch packages, generate appropriate documentation and then create the corresponding directory structure in the target mirror! The task can be extended to include packages which are currently not under one of the standard mirrors!

<lazy-way>
One doesn't have to merge the repositories; one can just declare multiple sources in /etc/apt/*
</lazy-way>

>> I think the tool can have immense utility in helping people automate the task of maintaining the repositories. Suggestions, positive and negative are invited. I have not included the impl details as I would first like to evaluate the idea at a feasibility and utility level.

If the scope of your project includes being able to bootstrap systems from the mirror, resolving dependencies is much more complex (some packages aren't resolved by dependencies. For instance, the right kernel is selected by some logic in Debian-installer). I found some interesting logic in the debian-cd package.

Still, I don't consider that allowing bootstrapping is mandatory. Your project would still be extremely valuable without it. [for those 95% of the people that install from CD, as opposed to netboot].

Regards,

Franklin
Re: apt-get wrapper for maintaining Partial Mirrors
Frank Lin PIAT fp...@klabs.be writes:

> On Tue, 2009-06-09 at 16:16 -0500, Joseph Rawson wrote:
>> On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote:
>>> This can be stated as: if a person wants to keep a customised set of packages for usage with the distribution, the tool should be able to develop dependencies, fetch packages, generate appropriate documentation and then create the corresponding directory structure in the target mirror! The task can be extended to include packages which are currently not under one of the standard mirrors!
>
> <lazy-way>
> One doesn't have to merge the repositories; one can just declare multiple sources in /etc/apt/*
> </lazy-way>

Let's say I want to mirror xserver-xorg from experimental. Then I would want it to include xserver-xorg-core (= xyz) also from experimental as the dependency dictates but not include libc6 from experimental as the sid one is sufficient. A key point here would be flexibility.

>>> I think the tool can have immense utility in helping people automate the task of maintaining the repositories. Suggestions, positive and negative are invited. I have not included the impl details as I would first like to evaluate the idea at a feasibility and utility level.
>
> If the scope of your project includes being able to bootstrap systems from the mirror, resolving dependencies is much more complex (some packages aren't resolved by dependencies. For instance, the right kernel is selected by some logic in Debian-installer). I found some interesting logic in the debian-cd package.

You would include linux-image-type in your package list. That isn't really a problem of the tool. Just of the input you need to provide. Also you would include everything udeb and everything essential/required for bootstrapping purposes. Again flexibility is the key.

> Still, I don't consider that allowing bootstrapping is mandatory. Your project would still be extremely valuable without it. [for those 95% of the people that install from CD, as opposed to netboot].
>
> Regards,
>
> Franklin

MfG
Goswin

PS: the essential/required packages can already easily be filtered with grep-dctrl.
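For anyone who prefers to do the essential/required selection inside a script rather than with grep-dctrl, a rough Python analogue follows. It is a sketch that assumes the input is the text of a Packages file with Essential and Priority fields; it is not a replacement for grep-dctrl, and the function name is hypothetical.

```python
def base_packages(packages_text):
    """Return names of stanzas with Essential: yes or Priority: required."""
    names = []
    for stanza in packages_text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Skip continuation lines; only "Field: value" lines matter here.
            if line and not line[0].isspace() and ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip().lower()] = value.strip()
        is_base = (fields.get("essential") == "yes"
                   or fields.get("priority") == "required")
        if is_base and "package" in fields:
            names.append(fields["package"])
    return names
```

The resulting names can simply be appended to whatever package list is fed to the mirror tool, next to the explicitly chosen kernel image and the udebs.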
Re: apt-get wrapper for maintaining Partial Mirrors
On Thursday 18 June 2009 02:46:42 Goswin von Brederlow wrote: Joseph Rawson umebos...@gmail.com writes: There is another application that will help with the dependencies. It's called germinate, and it will take a short list of packages and a list of repositories and build a bunch of different lists of packages and their dependencies. Germinate will also determine build dependencies for those packages and recursively build a list of builddeps and the builddeps' builddeps. I have thought of making an application that would get germinate and reprepro to work together to help build a decent partial mirror that had the correct set of packages, but the process was a bit time consuming. It's been a while Was it that bad? It only needs to run 4 times a day when the mirror push comes in. It wasn't the running that was time consuming, but the writing of all the code to seed germinate, then try and use the results for reprepro. I'm sorry if I wasn't clear on which part was consuming time. since I've worked on this, since my temporary solution to the problem was to buy a larger hard drive. Currently, I have a full mirror that I keep updated, and a repository of locally built packages next to it. I'm not really happy with this solution, as it uses too much disk space and I'm downloading packages that will never be used, but it's given me time to tackle more important problems. Before writing any code, I would recommend taking a look at both reprepro and germinate, as each of these applications is good at solving half of the problems you describe. I think that an ideal solution would be to write a frontend program that takes a list of packages and upstream repositories, feeds them into germinate, obtains the result from germinate, parse those results and build a reprepro configuration from that, then get reprepro to fetch the appropriate packages. Combining germinate and reprepro is the right thing to do. Or reprepro and a new filter instead of germinate. 
But don't rewrite reprepro. I never intended to rewrite reprepro. It does it's job very well. It's not reprepro's job to resolve dependencies, nor should it be, as a dependency could lie in an entirely different repository. I do think that since each program has it's specific area of responsibility, that a program that glues them together would be appropriate, and help from reinventing wheels when it's not necessary. Given a little bit of care when writing the reprepro config this can be completly done as part of the filtering. There is no need for a seperate run that scanns all upstream repositories as long as you can define a partial order between them, i.e. contrib needs things from main but main never from contrib. That would also have the benefit that you only need to process those packages files that have changed. I would be happy to help with this, as I could use such an application, and I already have a meager bit of python code that parses the output of germinate (germinate uses a wiki-type markup in it's output files). I stopped working on the code since I bought a new hard drive, since I just used the extra space to solve the problem for me, but I can bring it back to life, as I would desire to use a more correct solution. Urgs, that sucks. It should take a Packages/Sources style input and output the same format. I don't like the output either, but I haven't taken much time to dig into the germinate code very much. Maybe rewriting it using libapt would be better than wrapping germinate. Germinate uses libapt. It imports apt_pkg from the python-apt package, which is a python binding to libapt, AFAIK. It might be easier to just add '/usr/lib/germinate' to the sys.path and control the Germinator object directly, bypassing the way that the package lists are output from germinate. Germinate does have an advantage in that it can recursively add the builddeps for a package list, making a list for a partial, self-building mirror. 
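As a stopgap until the Germinator object is driven directly, the output files can be parsed into a filter list. The sketch below assumes the output is a table with `|`-separated columns, the package name in the first column, a header row starting with Package, and ruled separator lines; I have not checked this against a current germinate, so treat the layout as an assumption. The function names are invented, and the filter list is written in the dpkg --get-selections style.

```python
def packages_from_germinate(text):
    """Extract package names from a `|`-separated germinate output table."""
    names = []
    for line in text.splitlines():
        if "|" not in line:
            continue  # ruled separator lines and blank lines
        first = line.split("|", 1)[0].strip()
        # Skip the header row, total rows with an empty first column,
        # and any rule drawn with dashes.
        if not first or first == "Package" or set(first) <= set("-+"):
            continue
        names.append(first)
    return names


def filter_list(names):
    """Render package names as `name<TAB>install` lines, sorted and unique."""
    return "".join(f"{name}\tinstall\n" for name in sorted(set(names)))
```

Writing `filter_list(packages_from_germinate(text))` to a file gives something reprepro can be pointed at, at the cost of the double download of the package lists discussed elsewhere in this thread.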
BTW, the subject of this thread is apt-get wrapper for maintaining Partial Mirrors. The solution I'm proposing is a simple tool for maintaining Partial Mirrors (which could possibly be wrapped by apt-get later). I think that just pursuing an apt-get wrapper leads to some complications that could be avoided by creating the partial mirror tool first, then looking at wrapping it later. One complication might be how to handle apt-get remove, and another might be how to handle sid libraries that disappear from the official repository, yet local machines must have them. MfG Goswin -- Thanks: Joseph Rawson
Re: apt-get wrapper for maintaining Partial Mirrors
Joseph Rawson umebos...@gmail.com writes: BTW, the subject of this thread is apt-get wrapper for maintaining Partial Mirrors. The solution I'm proposing is a simple tool for maintaining Partial Mirrors (which could possibly be wrapped by apt-get later). I think that just pursuing an apt-get wrapper leads to some complications that could be avoided by creating the partial mirror tool first, then looking at wrapping it later. One complication might be how to handle apt-get remove, and another might be how to handle sid libraries that disappear from the official repository, yet local machines must have them. Ahh, so maybe I completely misread that part. Do you mean a wrapper around apt-get so that apt-get install foo on any client would automatically add foo to the list of packages being mirrored on the server? If so then you can configure a post-invoke hook in apt that will copy the dpkg status file of the host to the server [as status.$(hostname)] and then use those on the server to generate the filter for reprepro. I think I still have a script for that somewhere but it is easy enough to rewrite. MfG Goswin -- To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
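The server-side half of that idea could look roughly like the sketch below: read each host's copied dpkg status file and merge the installed package names into one list, from which a reprepro filter could then be written. This is not the script referred to above. The function names and the two sample texts are hypothetical, and the parser only handles the simple case of stanzas separated by one blank line.

```python
def installed_packages(status_text):
    """Yield names of packages whose Status field ends in 'installed'."""
    for stanza in status_text.split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Continuation lines start with whitespace; not needed here.
            if line[:1].isspace() or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
        status = fields.get("Status", "").split()
        if "Package" in fields and status[-1:] == ["installed"]:
            yield fields["Package"]


def merge_installed(status_texts):
    """Return the sorted union of installed packages across all hosts."""
    names = set()
    for text in status_texts:
        names.update(installed_packages(text))
    return sorted(names)


# Hypothetical contents of status.hostA and status.hostB.
HOST_A = """\
Package: bash
Status: install ok installed
Version: 5.0

Package: oldpkg
Status: deinstall ok config-files
"""

HOST_B = """\
Package: vim
Status: install ok installed
Description: editor
 improved vi

Package: bash
Status: install ok installed
"""

if __name__ == "__main__":
    print(merge_installed([HOST_A, HOST_B]))
```

Packages that were removed but still have configuration files (like `oldpkg` in the sample) are left out, so the mirror would only track what is actually installed somewhere.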
apt-get wrapper for maintaining Partial Mirrors
Hi all, We all know that there are various distros that build around Debian. I had an idea in mind whereby the task of making mirrors for personal distributions can be automated. This can be stated as: if a person wants to keep a customised set of packages for usage with the distribution, the tool should be able to resolve dependencies, fetch packages, generate appropriate documentation and then create the corresponding directory structure in the target mirror! The task can be extended to include packages which are currently not under one of the standard mirrors! I think the tool can have immense utility in helping people automate the task of maintaining the repositories. Suggestions, positive and negative, are invited. I have not included the implementation details as I would first like to evaluate the idea at a feasibility and utility level.
Re: apt-get wrapper for maintaining Partial Mirrors
On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote: Hi all, We all know that there are various distros that build around Debian. I had an idea in mind whereby the task of making mirrors for personal distributions can be automated. This can be stated as: if a person wants to keep a customised set of packages for usage with the distribution, the tool should be able to resolve dependencies, fetch packages, generate appropriate documentation and then create the corresponding directory structure in the target mirror! The task can be extended to include packages which are currently not under one of the standard mirrors! I think the tool can have immense utility in helping people automate the task of maintaining the repositories. Suggestions, positive and negative, are invited. I have not included the implementation details as I would first like to evaluate the idea at a feasibility and utility level. I have been working on this idea myself for quite a while, but I haven't messed with the problem recently. I was using reprepro to maintain partial mirrors, but it required using the output from dpkg --get-selections from almost every machine that I needed to mirror packages for. The reprepro program is excellent for making partial mirrors, but it has a drawback in that it doesn't help resolve dependencies. This means that you can't just make a short list of packages and easily build a partial mirror that contains those packages and their dependencies; rather, you have to install a machine with those packages and use the list of packages from that machine with reprepro to get a decent mirror. There is another application that will help with the dependencies. It's called germinate, and it will take a short list of packages and a list of repositories and build a bunch of different lists of packages and their dependencies. Germinate will also determine build dependencies for those packages and recursively build a list of builddeps and the builddeps' builddeps. 
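The dependency step described above (expanding a short package list into that list plus everything it depends on) can be sketched as a plain graph traversal over a Packages-style index. This is only an illustration, not how germinate works internally. It makes loud simplifications: only the Depends field is read, the first alternative of "a | b" is always chosen, version constraints are dropped, and Provides, Pre-Depends and Recommends are ignored. All names and the sample index are hypothetical.

```python
import re
from collections import deque


def parse_packages(index_text):
    """Map package name -> list of dependency names (simplified)."""
    depends = {}
    for stanza in index_text.strip().split("\n\n"):
        name, deps = None, []
        for line in stanza.splitlines():
            if line.startswith("Package:"):
                name = line.split(":", 1)[1].strip()
            elif line.startswith("Depends:"):
                for clause in line.split(":", 1)[1].split(","):
                    # Simplification: keep the first alternative and
                    # strip version constraints such as "(>= 2.0)".
                    first = clause.split("|")[0]
                    dep = re.sub(r"\(.*?\)", "", first).strip()
                    if dep:
                        deps.append(dep)
        if name:
            depends[name] = deps
    return depends


def closure(seeds, depends):
    """Return the seeds plus everything reachable through Depends."""
    wanted, queue = set(), deque(seeds)
    while queue:
        pkg = queue.popleft()
        if pkg in wanted:
            continue
        wanted.add(pkg)
        queue.extend(depends.get(pkg, []))
    return wanted


# Hypothetical index with made-up package names.
SAMPLE_INDEX = """\
Package: editor
Depends: libui (>= 2.0), spell | spell-lite

Package: libui
Depends: libc

Package: libc

Package: spell
Depends: libc

Package: unrelated
"""

if __name__ == "__main__":
    print(sorted(closure(["editor"], parse_packages(SAMPLE_INDEX))))
```

A real tool would need the cases skipped here, which is part of why reusing germinate or libapt is attractive.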
I have thought of making an application that would get germinate and reprepro to work together to help build a decent partial mirror that had the correct set of packages, but the process was a bit time consuming. It's been a while since I've worked on this, since my temporary solution to the problem was to buy a larger hard drive. Currently, I have a full mirror that I keep updated, and a repository of locally built packages next to it. I'm not really happy with this solution, as it uses too much disk space and I'm downloading packages that will never be used, but it's given me time to tackle more important problems. Before writing any code, I would recommend taking a look at both reprepro and germinate, as each of these applications is good at solving half of the problems you describe. I think that an ideal solution would be to write a frontend program that takes a list of packages and upstream repositories, feeds them into germinate, obtains the result from germinate, parses those results, builds a reprepro configuration from them, then gets reprepro to fetch the appropriate packages. I would be happy to help with this, as I could use such an application, and I already have a meager bit of Python code that parses the output of germinate (germinate uses a wiki-type markup in its output files). I stopped working on the code since I bought a new hard drive, since I just used the extra space to solve the problem for me, but I can bring it back to life, as I would desire to use a more correct solution. -- Thanks: Joseph Rawson
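The last stage of such a frontend, turning a resolved package list into reprepro configuration, might be sketched as follows. It assumes that reprepro's conf/updates accepts Name, Method, Suite and FilterList fields, and that a filter list file uses dpkg --get-selections style lines ("name install"); check both against the reprepro manual before relying on them. The function names, the mirror URL and the file name are hypothetical.

```python
def render_update_rule(name, method, suite, filter_file):
    """Render one stanza intended for reprepro's conf/updates."""
    fields = [
        ("Name", name),
        ("Method", method),
        ("Suite", suite),
        # Assumption: "purge" as the default action is meant to exclude
        # every package that the filter list does not name.
        ("FilterList", f"purge {filter_file}"),
    ]
    return "".join(f"{key}: {value}\n" for key, value in fields)


def render_filter_list(packages):
    """Render 'name install' lines, sorted and without duplicates."""
    return "".join(f"{pkg} install\n" for pkg in sorted(set(packages)))


if __name__ == "__main__":
    # Hypothetical values; a real frontend would take these from the
    # resolved package list and the configured upstream repositories.
    print(render_update_rule(
        "upstream", "http://deb.example.org/debian", "stable", "partial.list"
    ))
    print(render_filter_list(["vim", "bash", "vim"]))
```

Keeping the rendering separate from the dependency resolution means the same output stage works whether the package list comes from germinate, from dpkg --get-selections, or from a new filter.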