Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-25 Thread Joseph Rawson
On Sunday 21 June 2009 03:33:33 Goswin von Brederlow wrote:
snip
  The Release could be signed using an rsign method with the machine(s)
  that manage the repository, or it could be done locally on the server
  using gpg-agent, or an unencrypted private key, depending on how the
  administrator prefers to manage it.

 The simplest implementation would be a tiny proxy applet that, when a
 deb file is requested, checks if the file is in the local
 archive. If it is then send it. If not then request file from
 upstream and pipe it to apt (no latency) and a tempfile. When the
 download has finished then reprepro includedeb suite deb. Doing the
 same for source is a little more tricky as you need the dsc and
 related files as a group.

I don't understand the tempfile part.  Otherwise, that's a better idea, since 
my idea depended on running reprepro update, then sending the appropriate 
debs.
  Optionally the apt proxy could prefetch package versions but for me that
  wouldn't be a high priority.
 
  Nice would be that it fetches sources along with binaries. When I find
  a bug in some software while traveling I would hate to not have the
  source available to fix it. But then it also needs to fetch
  Build-depends and their depends. So that would complicate matters a
  lot.
 
  I mentioned that part above.
 
  MfG
  Goswin
 
  Overall, I think that reprepro does a good job of maintaining a local
  repository, and we shouldn't reimplement what it does.  Reprepro also
  seems flexible enough to implement most of the backend with simple
  commands and options.  I've never tried to implement a new apt-method
  before, so I think that would take a bit more research from me.

 I totally agree that reprepro as the cache/storage backend would be
 great use of existing software.

This is where I'm starting the code.  Since we agree that reprepro is the best 
choice of backend regardless of how the partial mirror(s) will be managed, I 
decided to start on a frontend, or more appropriately a middle layer, for it.  
Keeping this part simple to use with the most likely configuration, while 
leaving it almost as flexible as reprepro itself, has taken quite a bit of 
work and thought.

I have been working from the assumption that the local repository won't be a 
merged repository, but will be a set of partial mirrors.  By this I mean 
that debian.org doesn't have to be merged with backports.org, 
but sid/debian.org may be in the same repository as lenny/debian.org 
(although even this could be separate, even if not recommended).  What I'm 
saying is that I'm trying to allow either separate or merged repositories to 
be used where they make the most sense.

 The problem I have with it being an apt method is that the apt method
 runs on a different host than the reprepro. That would require ssh
 logins from all participating clients or something to alter the
 reprepro filter.

I didn't stop to think about authentication, but I agree that it adds another 
level of work.  I took a bit of time to try and read up on how apt transport 
methods work, but I didn't get very far.  The only two transport methods that 
are available now are https and debtorrent.  Both of those are written in C, 
which I'm not very good at using.

I think that I'm just going to work on the basics of controlling reprepro, and 
adding/merging/removing filterlists, and when I'm satisfied that's working 
properly it'll be easier to decide how to control/manage it.  I think that it 
will be better to work in that direction first, since it will be needed 
anyway.

I have a small amount of code that I've started on.  So far it does nothing 
except create the distributions and updates files in the conf/ 
directory(ies).  I also have a bit of code to help merge filterlists, but 
nothing yet that actually creates the lists and uses them in the reprepro 
config.  Once I figure out where to upload the code, I'll let you know.
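
A rough sketch of what generating those two files could look like, in Python. The field names (Codename, Architectures, Components, Update, Name, Method, Suite, FilterList) are the ones reprepro documents for conf/distributions and conf/updates. The function names, the mirror URL and the filter list file name are made up for illustration and are not taken from the code mentioned above.

```python
def distribution_stanza(codename, architectures, components, update_rule):
    """Return one conf/distributions stanza as text."""
    return "\n".join([
        f"Codename: {codename}",
        f"Architectures: {' '.join(architectures)}",
        f"Components: {' '.join(components)}",
        f"Update: {update_rule}",
    ]) + "\n"


def update_stanza(name, method, suite, filterlist):
    """Return one conf/updates stanza as text.

    "FilterList: purge <file>" makes "purge" the default action, so only
    packages named in <file> are pulled from upstream.
    """
    return "\n".join([
        f"Name: {name}",
        f"Method: {method}",
        f"Suite: {suite}",
        f"FilterList: purge {filterlist}",
    ]) + "\n"


if __name__ == "__main__":
    # Hypothetical values: one partial mirror of lenny from debian.org.
    print(distribution_stanza("lenny", ["i386", "source"],
                              ["main", "contrib", "non-free"], "lenny-debian"))
    print(update_stanza("lenny-debian", "http://ftp.debian.org/debian",
                        "lenny", "list-lenny"))
```

Keeping one stanza per upstream suite matches the "set of partial mirrors" assumption: merged and separate repositories differ only in which conf/ directory the stanzas are written to.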

-- 
Thanks:
Joseph Rawson


signature.asc
Description: This is a digitally signed message part.


Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-25 Thread Goswin von Brederlow
Joseph Rawson umebos...@gmail.com writes:

 On Sunday 21 June 2009 03:33:33 Goswin von Brederlow wrote:
 snip
  The Release could be signed using an rsign method with the machine(s)
  that manage the repository, or it could be done locally on the server
  using gpg-agent, or an unencrypted private key, depending on how the
  administrator prefers to manage it.

 The simplest implementation would be a tiny proxy applet that, when a
 deb file is requested, checks if the file is in the local
 archive. If it is then send it. If not then request file from
 upstream and pipe it to apt (no latency) and a tempfile. When the
 download has finished then reprepro includedeb suite deb. Doing the
 same for source is a little more tricky as you need the dsc and
 related files as a group.

 I don't understand the tempfile part.  Otherwise, that's a better idea, since 
 my idea depended on running reprepro update, then sending the appropriate 
 debs.

The tempfile is there so that, once the download has finished, the proxy can run:
  reprepro includedeb sid foo.deb
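
The "pipe it to apt and a tempfile" step can be sketched roughly as follows, in Python. This is only an illustration under assumptions: tee_download and include_command are hypothetical names, the streams stand in for the upstream HTTP response and the client socket, and a real proxy would also need HTTP handling and error recovery. The reprepro command used for a single .deb is includedeb.

```python
import io


def tee_download(upstream, client, tempfile_obj, chunk_size=64 * 1024):
    """Copy upstream to both client and tempfile, chunk by chunk.

    The client sees each chunk as soon as it arrives, so caching adds no
    latency. Returns the number of bytes copied.
    """
    total = 0
    while True:
        chunk = upstream.read(chunk_size)
        if not chunk:
            break
        client.write(chunk)
        tempfile_obj.write(chunk)
        total += len(chunk)
    return total


def include_command(suite, deb_path):
    """Build the reprepro call to run once the download has finished."""
    return ["reprepro", "includedeb", suite, deb_path]


if __name__ == "__main__":
    # In-memory streams stand in for the network and the tempfile.
    upstream = io.BytesIO(b"fake deb contents")
    client, temp = io.BytesIO(), io.BytesIO()
    copied = tee_download(upstream, client, temp, chunk_size=4)
    print(copied, client.getvalue() == temp.getvalue())
    print(" ".join(include_command("sid", "foo.deb")))
```

The command list returned by include_command could be handed to subprocess.run once the tempfile has been closed.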

MfG
Goswin


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-21 Thread Goswin von Brederlow
Joseph Rawson umebos...@gmail.com writes:

 On Saturday 20 June 2009 03:16:33 Goswin von Brederlow wrote:
 But now you made me think about this too. So here is what I think:

 - My bandwidth at home is fast enough to fetch packages directly. No
   need to mirror at all.

 - I don't want to download a package multiple times (once per host) so
   some shared proxy would be good.

 My idea would keep that from happening, at the expense of latency.  The 
 latency would be minimal, as it would just be dependent on reprepro 
 retrieving the package(s) and signalling the client that the package is 
 ready.  Using reprepro to add extra packages to the repository from upstream 
 without doing a full update may not be possible, but if it were, the latency 
 would certainly be minimal, and the bandwidth to the internet would also be 
 minimal.  I just looked at the manpage again, and this may be possible by 
 using the --nolistsdownload option with the update/checkupdate command.


 - Bootstrapping a chroot still benefits from local packages but a
   shared proxy would do there too.

 - When I'm not at home I might not have network access or only a slow
   one so then I need a mirror. And my parents' computer has a Linux that
   only I use and that needs a major update every time I visit.

 So the ideal setup would be an apt proxy that stores the packages in
 the normal pool structure and has a simple command to create
 Packages.gz, Sources.gz, Release and Release.gpg files so the cache
 directory can be copied onto a USB disk and used as a repository of
 its own.

 Getting reprepro to do this would save a lot of the hassle, but getting 
 reprepro to act as an apt proxy is also tricky.  The current cache and proxy 
 methods in the apt-proxy and apt-cacher packages don't produce as good a 
 repository as reprepro does.

 The Release could be signed using an rsign method with the machine(s) that 
 manage the repository, or it could be done locally on the server using 
 gpg-agent, or an unencrypted private key, depending on how the administrator 
 prefers to manage it.

The simplest implementation would be a tiny proxy applet that, when a
deb file is requested, checks if the file is in the local
archive. If it is then send it. If not then request file from
upstream and pipe it to apt (no latency) and a tempfile. When the
download has finished then reprepro includedeb suite deb. Doing the
same for source is a little more tricky as you need the dsc and
related files as a group.

 Optionally the apt proxy could prefetch package versions but for me that
 wouldn't be a high priority.

 Nice would be that it fetches sources along with binaries. When I find
 a bug in some software while traveling I would hate to not have the
 source available to fix it. But then it also needs to fetch
 Build-depends and their depends. So that would complicate matters a
 lot.
 I mentioned that part above.

 MfG
 Goswin

 Overall, I think that reprepro does a good job of maintaining a local 
 repository, and we shouldn't reimplement what it does.  Reprepro also seems 
 flexible enough to implement most of the backend with simple commands and 
 options.  I've never tried to implement a new apt-method before, so I think 
 that would take a bit more research from me.

I totally agree that reprepro as the cache/storage backend would be
great use of existing software.

The problem I have with it being an apt method is that the apt method
runs on a different host than the reprepro. That would require ssh
logins from all participating clients or something to alter the
reprepro filter.

MfG
Goswin





Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-20 Thread Goswin von Brederlow
Joseph Rawson umebos...@gmail.com writes:

 On Friday 19 June 2009 12:57:25 Goswin von Brederlow wrote:
 Or have a proxy that adds packages that are requested.
 When I woke up this morning, I was thinking that it might be interesting to 
 have an apt method that talks directly to reprepro.  It's just a vague idea 
 now, but I'll give it some more thought later.

Way too much latency to mirror a deb when requested and you need to
run apt-get update for it to show up.

The best you can do is add the package to the filter list and then
fetch it directly. Then the next night the mirror will pick it up for
future updates.


But now you made me think about this too. So here is what I think:

- My bandwidth at home is fast enough to fetch packages directly. No
  need to mirror at all.

- I don't want to download a package multiple times (once per host) so
  some shared proxy would be good.

- Bootstrapping a chroot still benefits from local packages but a
  shared proxy would do there too.

- When I'm not at home I might not have network access or only a slow
  one so then I need a mirror. And my parents' computer has a Linux that
  only I use and that needs a major update every time I visit.

So the ideal setup would be an apt proxy that stores the packages in
the normal pool structure and has a simple command to create
Packages.gz, Sources.gz, Release and Release.gpg files so the cache
directory can be copied onto a USB disk and used as a repository of
its own.
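
For reference, the "normal pool structure" can be computed from the component and the source package name. The sketch below is in Python. It assumes the proxy already knows the source name for the requested file, and pool_dir and cached_deb are hypothetical helper names.

```python
import os


def pool_dir(component, source):
    """Return the pool directory for a source package.

    Debian's pool groups packages by the first letter of the source name,
    except that names starting with "lib" use their first four characters.
    """
    prefix = source[:4] if source.startswith("lib") else source[:1]
    return f"pool/{component}/{prefix}/{source}"


def cached_deb(root, component, source, filename):
    """Return the path of a cached .deb below root, or None if absent."""
    path = os.path.join(root, pool_dir(component, source), filename)
    return path if os.path.isfile(path) else None


if __name__ == "__main__":
    print(pool_dir("main", "jigdo"))
    print(pool_dir("main", "libxml2"))
```

A proxy using this layout can answer "is it in the local archive?" with a single file lookup, and the resulting directory can be copied to a USB disk unchanged.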

Optionally the apt proxy could prefetch package versions but for me that
wouldn't be a high priority.

Nice would be that it fetches sources along with binaries. When I find
a bug in some software while traveling I would hate to not have the
source available to fix it. But then it also needs to fetch
Build-depends and their depends. So that would complicate matters a
lot.

MfG
Goswin





Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-20 Thread Joseph Rawson
On Saturday 20 June 2009 03:16:33 Goswin von Brederlow wrote:
 Joseph Rawson umebos...@gmail.com writes:
  On Friday 19 June 2009 12:57:25 Goswin von Brederlow wrote:
  Or have a proxy that adds packages that are requested.
 
  When I woke up this morning, I was thinking that it might be interesting
  to have an apt method that talks directly to reprepro.  It's just a vague
  idea now, but I'll give it some more thought later.

 Way too much latency to mirror a deb when requested and you need to
 run apt-get update for it to show up.

 The best you can do is add the package to the filter list and then
 fetch it directly. Then the next night the mirror will pick it up for
 future updates.

What I had in mind would eliminate a large part of the latency, and also keep 
from downloading the deb twice.

Use a server application (I'll call it repserve for now) on the machine that 
hosts the reprepro repository.  

apt-get update
The apt method talks to repserve, then repserve tells reprepro to run either 
update or checkupdate, then repserve feeds the appropriate files from the 
reprepro lists/ director(y/ies) back to the apt-get process on the local 
machine.  This would probably use a bit more bandwidth (at least for the 
first update) since apt-get will download .pdiff files, where reprepro just 
grabs the whole Packages.gz files.

apt-get install, upgrade, build-dep
The apt method determines which source in its apt lists to retrieve the 
package from, then sends that info to repserve.  Repserve looks in its 
repositor(y/ies) to determine where those packages are (or if they aren't yet 
mirrored), probably by scanning the filter lists.  Repserve then tells 
reprepro to update in the appropriate repositories (if necessary).  Then 
repserve signals the local client (or local client polls repserve), and the 
debs are then transferred from reprepro repos to local client.  After that, 
the repserve process could instruct reprepro to retrieve the sources, if it's 
configured to do that.  Also, it could try and determine build deps for those 
packages, and retrieve them and the sources, if it's configured to do that as 
well.  With retrieving builddeps enabled, there might be a problem in having 
to explicitly list preferred alternatives, but this is mainly for packages 
that have drop-in replacements for libfoo-dev, like libgamin-dev provides 
libfam-dev.

This is still just a rough idea.  One of the interesting things about using an 
idea like this, is that it can still allow reprepro to be used in the normal 
way, so you can have a couple of machines that instruct repserve to help 
maintain the repository, and other machines on the network can just use 
reprepro directly through apache, ftp, etc.  The controlling machines would 
have a sources.list like:

deb repserve://myhost/debrepos/debian lenny main contrib non-free

The repserve method on the client would send that line to the repserve server.  
The server would parse the line and match it to the appropriate repository 
from its configuration.
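
The server-side parsing could be sketched like this, in Python. Since repserve does not exist yet, everything here is an assumption: parse_source_line and find_repository are hypothetical names, and the mapping from (path, suite) to a reprepro base directory is only one possible shape for the configuration.

```python
from urllib.parse import urlsplit


def parse_source_line(line):
    """Split a sources.list line into its parts."""
    fields = line.split()
    if len(fields) < 4 or fields[0] not in ("deb", "deb-src"):
        raise ValueError(f"not a usable sources.list line: {line!r}")
    kind, uri, suite, *components = fields
    parts = urlsplit(uri)
    return {
        "type": kind,
        "scheme": parts.scheme,
        "host": parts.netloc,
        "path": parts.path,
        "suite": suite,
        "components": components,
    }


def find_repository(parsed, repositories):
    """Look up the reprepro base directory for a parsed line, or None."""
    return repositories.get((parsed["path"], parsed["suite"]))


if __name__ == "__main__":
    # Hypothetical server configuration: (URL path, suite) -> reprepro basedir.
    repositories = {("/debrepos/debian", "lenny"): "/srv/reprepro/debian"}
    line = "deb repserve://myhost/debrepos/debian lenny main contrib non-free"
    parsed = parse_source_line(line)
    print(parsed)
    print(find_repository(parsed, repositories))
```

Because only the URL scheme differs between the two kinds of client, the same path and suite can serve both repserve:// and plain http:// hosts.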

The other hosts would just have this in sources.list:

deb http://myhost/debrepos/debian lenny main contrib non-free

The hosts using repserve could be the only ones with filter lists in reprepro, 
but it may be desired to have filter lists from the other machines, also.  
This would help keep packages from disappearing from the pool when they are 
still needed.  It may also be nice to use reprepro's snapshotting each time a 
repserve method updates a repository, although this may require using those 
snapshot urls on the hosts that aren't using repserve.



 But now you made me think about this too. So here is what I think:

 - My bandwidth at home is fast enough to fetch packages directly. No
   need to mirror at all.

 - I don't want to download a package multiple times (once per host) so
   some shared proxy would be good.

My idea would keep that from happening, at the expense of latency.  The 
latency would be minimal, as it would just be dependent on reprepro 
retrieving the package(s) and signalling the client that the package is 
ready.  Using reprepro to add extra packages to the repository from upstream 
without doing a full update may not be possible, but if it were, the latency 
would certainly be minimal, and the bandwidth to the internet would also be 
minimal.  I just looked at the manpage again, and this may be possible by 
using the --nolistsdownload option with the update/checkupdate command.


 - Bootstrapping a chroot still benefits from local packages but a
   shared proxy would do there too.

 - When I'm not at home I might not have network access or only a slow
   one so then I need a mirror. And my parents' computer has a Linux that
   only I use and that needs a major update every time I visit.

 So the ideal setup would be an apt proxy that stores the packages in
 the normal pool structure and has a simple command to create
 Packages.gz, Sources.gz, Release and Release.gpg files so the 

Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Friday 19 June 2009 00:27:06 Goswin von Brederlow wrote:
 Joseph Rawson umebos...@gmail.com writes:
  BTW, the subject of this thread is apt-get wrapper for maintaining
  Partial Mirrors.  The solution I'm proposing is a simple tool for
  maintaining Partial Mirrors (which could possibly be wrapped by apt-get
  later).
 
  I think that just pursuing an apt-get wrapper leads to some
  complications that could be avoided by creating the partial mirror tool
  first, then looking at wrapping it later.  One complication might be how
  to handle apt-get remove, and another might be how to handle sid
  libraries that disappear from the official repository, yet local machines
  must have them.

 Ahh, so maybe I completely misread that part.

It was my fault for not making this point clear, as I should've done.  FWIW, I 
would be much more interested in making a tool that would make it easier to 
manage local/partial debian mirrors (i.e. one that helped resolve the 
dependencies), rather than have an apt-get wrapper.  I also think that once 
such a tool is made, it would make it easier to build an apt-get wrapper that 
works with it.  I don't think that viewing the problem with an apt-get 
wrapper solution is the best way to approach it, but I do think that it 
would be valuable once the underlying problems are solved.

 Do you mean a wrapper around apt-get so that apt-get install foo on
 any client would automatically add foo to the list of packages being
 mirrored on the server?

It was the original poster who mentioned the apt-get wrapper, but I took it to 
mean exactly what you said above.  The tool I was envisioning would take a 
short list of packages (a text file with package names separated by newlines, 
or a collection of such text files) combined with a list of apt sources and 
generate the partial mirror from just that information.  There are still some 
things that should be explicitly included in those lists, such as either 
gamin, fam, or both, as an example.

 If so then you can configure a post invoke hook in apt that will copy
 the dpkg status file of the host to the server [as status.$(hostname)]
 and then use those on the server to generate the filter for
 reprepro. I think I still have a script for that somewhere but it is
 easy enough to rewrite.

That's good for binaries, but I don't know about the source.  Not long ago I 
noticed that reprepro does not fetch the corresponding source packages when 
the filter list is taken from dpkg --get-selections.  I remember that the 
source for jigdo wasn't in my partial mirror, because there were no binaries 
named jigdo, only jigdo-file and jigdo-lite.  Since no source package 
carries either of those names, the jigdo source was never mirrored.  I don't 
know if that behavior has changed now that there is a binary named jigdo 
instead of jigdo-lite.

Also, it's more difficult for the local repository to determine the difference 
between the automatically selected and manually selected packages in this 
type of setup, since you would be sending a longer list of manually selected 
packages, instead of distinguishing which ones are actually selected.  I 
guess that it doesn't matter much, as a package would only be removed from 
the repository once it's not listed on any of the lists.  There were times 
when I didn't want certain packages to be removed from the repository, 
regardless of whether they were installed or not, so I used to run xxdiff on 
the packages files, so the newer ones were added.

In my way of thinking, I'm not looking to merge upstream repositories together 
in one repository.  Besides, there are already tools, such as apt-move that 
would be better for this job.  Long ago, apt-move was the primary tool that I 
used to keep a local repository, and it worked pretty well, as long as all 
the machines that were using it were on the same release.

I have found that reprepro is the absolute best tool for maintaining a debian 
mirror.  The only problem I have with it comes when I want a partial mirror 
without a merged repository: I have to spread the package lists over 
different places, and every machine I add brings more lists into the 
configuration.  It would probably be better to maintain a set of master 
lists generated from the many lists that come from the machines.
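
Generating a master list from the per-machine lists could look roughly like this, in Python. The sketch assumes each machine supplies its list in dpkg --get-selections format (package name, whitespace, state), and it keeps a package as long as any machine still lists it as install. parse_selections and merge_filterlists are hypothetical names.

```python
def parse_selections(text):
    """Parse dpkg --get-selections output into {package: state}."""
    selections = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            selections[fields[0]] = fields[1]
    return selections


def merge_filterlists(lists):
    """Merge several selection lists into one sorted master filter list.

    A package stays in the master list if at least one machine has it
    marked "install"; other states are ignored.
    """
    master = set()
    for text in lists:
        for package, state in parse_selections(text).items():
            if state == "install":
                master.add(package)
    return "".join(f"{package}\t\tinstall\n" for package in sorted(master))


if __name__ == "__main__":
    host_a = "bash\t\tinstall\njigdo-file\t\tinstall\n"
    host_b = "bash\t\tinstall\nvim\t\tdeinstall\n"
    print(merge_filterlists([host_a, host_b]), end="")
```

With this rule a package only leaves the master list once no machine lists it any more, which is the removal behaviour described earlier in this message.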


 MfG
 Goswin



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Thursday 18 June 2009 03:17:13 Frank Lin PIAT wrote:
 On Tue, 2009-06-09 at 16:16 -0500, Joseph Rawson wrote:
  On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote:
   I had an idea in mind whereby the task of making mirrors for personal
   distributions can be automated.

 <lazy-way>
 Depending on what you want to achieve, a caching proxy might be an easy
 solution (there are specialized ones in the archive already)
 </lazy-way>

Or possibly apt-move called as a post-invoke action of apt-get.

   This can be stated as: if a person
   wants to keep a customised set of packages for usage with the
   distribution, the tool should be able to develop dependencies, fetch
   packages, generate appropriate documentation and then create the
   corresponding directory structure in the target mirror! The task can
   be extended to include packages which are currently not under one of
   the standard mirrors!

 <lazy-way>
 One doesn't have to merge the repositories, one can just declare multiple
 sources in /etc/apt/*
 </lazy-way>

Then it becomes harder to send the package to the appropriate local 
repository, since they aren't merged.  I would also prefer to not have to 
deal with a merged repository, but keep separate upstream partial mirrors, as 
they would probably be easier to manage.

   I think the tool can have immense utility in helping people automate
   the task of maintaining the repositories. Suggestions, positive and
   negative are invited.
  
   I have not included the impl details as I would first like to evaluate
   the idea at a feasibility and utility level.

 If the scope of your project includes being able to bootstrap systems
 from the mirror, resolving dependencies is much more complex (some
 packages aren't resolved by dependencies. For instance, the right kernel
 is selected by some logic in Debian-installer).
 I found some interesting logic in debian-cd package.

 Still, I don't consider that allowing bootstrapping is mandatory. Your
 project would still be extremely valuable without it. [for those 95% of
 the people that install from CD, as opposed to netboot].

The reason that I recommended tying germinate and reprepro together with a 
tool was because the original post was discussing personal distributions.  
To me, this implies the ability to bootstrap, and also the need to have 
a self building source/binary repository.

I have just made some other responses to Goswin that should help explain my 
view on things a bit better.

 Regards,

 Franklin



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Thursday 18 June 2009 04:47:45 Goswin von Brederlow wrote:
 Frank Lin PIAT fp...@klabs.be writes:
  On Tue, 2009-06-09 at 16:16 -0500, Joseph Rawson wrote:
  On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote:
   This can be stated as: if a person
   wants to keep a customised set of packages for usage with the
   distribution, the tool should be able to develop dependencies, fetch
   packages, generate appropriate documentation and then create the
   corresponding directory structure in the target mirror! The task can
   be extended to include packages which are currently not under one of
   the standard mirrors!
 
  <lazy-way>
  One doesn't have to merge the repositories, one can just declare multiple
  sources in /etc/apt/*
  </lazy-way>

 Let's say I want to mirror xserver-xorg from experimental. Then I would
 want it to include xserver-xorg-core (= xyz) also from experimental
 as the dependency dictates but not include libc6 from experimental as
 the sid one is sufficient.

 A key point here would be flexibility.

This is something that I hadn't considered yet.  At first I thought it would 
be a problem for the post invoke hook using dpkg status that you mentioned 
earlier, but I was confused.  I was thinking you meant --get-selections, 
which just returns the name of the package and install/deinstall.  The 
status file also contains the version in use, and that could be matched to 
the appropriate repository in the sources list (so you get the libc from 
main instead of experimental, since the status file records the version 
that's in main).
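
Pulling package names and versions out of a dpkg status file could be sketched as below, in Python. The parsing relies only on the stanza layout of the status file (blank-line separated stanzas with Package, Status and Version fields). installed_versions is a hypothetical name and the sample data is invented for the example.

```python
def installed_versions(status_text):
    """Return {package: version} for fully installed packages."""
    versions = {}
    for stanza in status_text.strip().split("\n\n"):
        fields = {}
        for line in stanza.splitlines():
            # Continuation lines (e.g. long descriptions) start with
            # whitespace and carry no new field, so skip them.
            if line[:1] in (" ", "\t") or ":" not in line:
                continue
            key, _, value = line.partition(":")
            fields[key] = value.strip()
        if "Package" in fields and fields.get("Status") == "install ok installed":
            versions[fields["Package"]] = fields.get("Version")
    return versions


# Invented sample in the format of /var/lib/dpkg/status.
SAMPLE = """\
Package: libc6
Status: install ok installed
Version: 2.9-12
Description: GNU C Library
 shared libraries

Package: oldpkg
Status: deinstall ok config-files
Version: 1.0-1
"""

if __name__ == "__main__":
    print(installed_versions(SAMPLE))
```

Matching each version against the Packages files of the configured upstream sources would then tell which repository a package should be mirrored from.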

However, I don't know how to use that info with reprepro.  With reprepro, I've 
only sent --get-selections lists to it.  In fact, this is how I used to 
install new packages in sid, and make sure they came from the local 
repository first.


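#!/bin/bash
# Build a reprepro filter list (conf/list-uninstalled) from the packages
# that are selected for installation but not yet installed.
packages=$(grep-status "install ok not-installed" | grep '^Package:' |
    gawk '{print $2}')
#packages=$(aptitude search '~N' | grep '^.i' | gawk '{print $2}')
# Start from an empty temporary file.
: > conf/list-uninstalled.tmp
for package in $packages; do
    # One "name<TAB><TAB>install" line per package, as dpkg --get-selections prints.
    echo -e "$package\t\tinstall" >> conf/list-uninstalled.tmp
done
# sort -u orders the list and drops duplicates in one step.
sort -u conf/list-uninstalled.tmp > conf/list-uninstalled
rm conf/list-uninstalled.tmp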


You may be able to tell by looking at the script that I'm still in the process 
of getting used to aptitude, being a longtime dselect user. ;)
Anyway, I don't know much about determining (with reprepro) which upstream 
repository holds the version of the package that I want installed.


   I think the tool can have immense utility in helping people automate
   the task of maintaining the repositories. Suggestions, positive and
   negative are invited.
  
   I have not included the impl details as I would first like to evaluate
   the idea at a feasibility and utility level.
 
  If the scope of your project includes being able to bootstrap systems
  from the mirror, resolving dependencies is much more complex (some
  packages aren't resolved by dependencies. For instance, the right kernel
  is selected by some logic in Debian-installer).
  I found some interesting logic in debian-cd package.

 You would include linux-image-type in your package list. That
 isn't really a problem of the tool. Just of the input you need to provide.
 Also you would include everything udeb and everything
 essential/required for bootstrapping purposes.

I was also thinking along those lines, too.  Same with fam/gamin and other 
packages that have drop-in replacements.

 Again flexibility is the key.

  Still, I don't consider that allowing bootstrapping is mandatory. Your
  project would still be extremely valuable without it. [for those 95% of
  the people that install from CD, as opposed to netboot].
 
  Regards,
 
  Franklin

 MfG
 Goswin

 PS: the essential/required packages can already easily be filtered
 with grep-dctrl.



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Tzafrir Cohen
On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote:

 would be much more interested in making a tool that would make it easier to 
 manage local/partial debian mirrors (i.e. one that helped resolve the 
 dependencies), rather than have an apt-get wrapper.  I also think that once 
 such a tool is made, it would make it easier to build an apt-get wrapper that 
 works with it.  I don't think that viewing the problem with an apt-get 
 wrapper solution is the best way to approach it, but I do think that it 
 would be valuable once the underlying problems are solved.

And reprepro does not fit the bill because?

-- 
Tzafrir Cohen | tzaf...@jabber.org | VIM is
http://tzafrir.org.il || a Mutt's
tzaf...@cohens.org.il ||  best
ICQ# 16849754 || friend





Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote:
 On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote:
  would be much more interested in making a tool that would make it easier
  to manage local/partial debian mirrors (i.e. one that helped resolve the
  dependencies), rather than have an apt-get wrapper.  I also think that
  once such a tool is made, it would make it easier to build an apt-get
  wrapper that works with it.  I don't think that viewing the problem with
  an apt-get wrapper solution is the best way to approach it, but I do
  think that it would be valuable once the underlying problems are solved.

 And reprepro does not fit the bill because?

It fits part of the bill, as it's an excellent tool for maintaining a 
repository, but it doesn't resolve dependencies (nor should it).

 --
 Tzafrir Cohen | tzaf...@jabber.org | VIM is
 http://tzafrir.org.il || a Mutt's
 tzaf...@cohens.org.il ||  best
 ICQ# 16849754 || friend



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Friday 19 June 2009 00:27:06 Goswin von Brederlow wrote:
 Joseph Rawson umebos...@gmail.com writes:
  BTW, the subject of this thread is apt-get wrapper for maintaining
  Partial Mirrors.  The solution I'm proposing is a simple tool for
  maintaining Partial Mirrors (which could possibly be wrapped by apt-get
  later).
 
  I think that just pursuing an apt-get wrapper leads to some
  complications that could be avoided by creating the partial mirror tool
  first, then looking at wrapping it later.  One complication might be how
  to handle apt-get remove, and another might be how to handle sid
  libraries that disappear from the official repository, yet local machines
  must have them.

 Ahh, so maybe I completely misread that part.

 Do you mean a wrapper around apt-get so that apt-get install foo on
 any client would automatically add foo to the list of packages being
 mirrored on the server?

 If so then you can configure a post invoke hook in apt that will copy
 the dpkg status file of the host to the server [as status.$(hostname)]
 and then use those on the server to generate the filter for
 reprepro. I think I still have a script for that somewhere but it is
 easy enough to rewrite.

When you mentioned the word hook, I was reminded of reprepro's ability to 
use hooks.  I started testing using a ListHook script with reprepro.  I'm 
attaching the script so you can see the general idea.  The script doesn't do 
anything effective, but may be helpful in understanding more of the way I'm 
approaching the idea.  Please don't laugh too hard, I'm just playing with 
ideas now.

There are at least two main reasons why this particular approach won't 
work.  First, the ListHook calls the script for each list independently.  
So, if a package in contrib depends on a package in main, as many do, the 
dependency won't be resolved using this method.  Second, the germinator 
object only handles one arch at a time, so if you are mirroring multiple 
arches, you need a germinator object for each one.  One way to counter both 
problems is to run a simple server that holds the germinator objects and 
have the script that ListHook executes communicate with that server.  The 
server would then grow the seeds and create the filter lists that would be 
used by reprepro.

I tried this approach because I didn't see the sense in downloading the 
packages lists more than necessary.  The way I was thinking before was to 
seed germinate (which would download the package lists), parse the output, 
create filter lists from that output, send them to reprepro, and call 
reprepro to update.  This forces all of those package lists to be downloaded 
twice, which was something I tried to avoid with this short experiment.
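
The "create filter lists from that output" step could look roughly like the 
sketch below.  It assumes the resolver output has already been reduced to a 
plain list of package names (the parsing of germinate's output is not shown), 
and it writes lines in the dpkg --get-selections style that a reprepro 
FilterList file accepts.  The function name and the sample package list are 
made up for illustration.

```python
def build_filter_list(package_names):
    """Return FilterList text: one 'name<TAB>install' line per unique package."""
    unique = sorted({name.strip() for name in package_names if name.strip()})
    return "".join(f"{name}\tinstall\n" for name in unique)


# Hypothetical resolver output, including a duplicate entry.
resolved = ["jigdo-file", "libc6", "jigdo-file", "reprepro"]
filter_text = build_filter_list(resolved)
print(filter_text, end="")
```

Sorting and de-duplicating keeps the generated file stable between runs, so 
it only changes when the set of wanted packages changes.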

It also seems to be somewhat difficult to plant the seeds into germinate 
manually.  I'm sure that problem could be solved by looking through the code 
a bit longer.

 MfG
 Goswin



-- 
Thanks:
Joseph Rawson


testgerm
Description: application/python




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Bernhard R. Link
* Joseph Rawson umebos...@gmail.com [090619 13:23]:
 On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote:
  On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote:
   would be much more interested in making a tool that would make it easier
   to manage local/partial debian mirrors (i.e. one that helped resolve the
   dependencies), rather than have an apt-get wrapper.  I also think that
   once such a tool is made, it would make it easier to build an apt-get
   wrapper that works with it.  I don't think that viewing the problem with
   an apt-get wrapper solution is the best way to approach it, but I do
   think that it would be valuable once the underlying problems are solved.
 
  And reprepro does not fit the bill because?
 
 It fits part of the bill, as it's an excellent tool for maintaining a
 repository, but it doesn't resolve dependencies (nor should it).

Actually, I'm quite open to having some dependency handling in reprepro
and have already written a simple prototype for a related project.
The problem is that calculating a simple cover of the selected packages in
the dependency graph is not enough:

Usually the cover is not unique, because the existence of alternatives in
dependencies allows multiple solutions. For an initial checkout that
is no problem, as one can choose some set by some pseudo-random
selection (like trying the alternative with the alphabetically lowest
package name first, and similar things for virtual packages). The problem
is that no such criterion can be stable against changes in the partially
mirrored distribution.

So in these cases knowing what packages upstream has and what packages
are wanted is not enough; one has to take into account what packages
are currently selected. And a simple covering is no longer enough;
one needs a full resolver that knows which installed states can easily be
brought to which other installed states. (And things get even more
complicated if the currently mirrored packages allow multiple subsets
which clients using this repository might have installed.)
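
A toy illustration of the stability problem, as a sketch only: the package
names are invented, and the "alphabetically lowest" rule stands in for any
stateless tie-break. It is not reprepro code.

```python
def pick_stateless(alternatives, upstream):
    """Take the alphabetically lowest alternative that upstream offers."""
    candidates = sorted(p for p in alternatives if p in upstream)
    return candidates[0] if candidates else None


def pick_stateful(alternatives, upstream, mirrored):
    """Prefer an alternative that is already mirrored, else fall back."""
    kept = [p for p in alternatives if p in upstream and p in mirrored]
    return kept[0] if kept else pick_stateless(alternatives, upstream)


alternatives = ["mta-a", "mta-b"]   # hypothetical "mta-a | mta-b" dependency
day1 = {"mta-b"}                    # upstream only ships mta-b at first
day2 = {"mta-a", "mta-b"}           # later upstream adds mta-a

first = pick_stateless(alternatives, day1)
print(first, pick_stateless(alternatives, day2),
      pick_stateful(alternatives, day2, {first}))
```

The stateless rule flips from mta-b to mta-a as soon as upstream changes,
although clients already have mta-b installed. Only the variant that looks at
the current selection keeps the mirror consistent, and a real resolver would
need far more than this.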

Hochachtungsvoll,
Bernhard R. Link


-- 
To UNSUBSCRIBE, email to debian-devel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Goswin von Brederlow
Joseph Rawson umebos...@gmail.com writes:

 On Friday 19 June 2009 00:27:06 Goswin von Brederlow wrote:
 Joseph Rawson umebos...@gmail.com writes:
 If so then you can configure a post invoke hook in apt that will copy
 the dpkg status file of the host to the server [as status.$(hostname)]
 and then use those on the server to generate the filter for
 reprepro. I think I still have a script for that somewhere but it is
 easy enough to rewrite.
 That's good for binaries, but I don't know about the source.  It wasn't long 
 ago that I noticed a problem with reprepro not obtaining the corresponding 
 source packages when you use a filter list taken 
 from  dpkg --get-selections.  I remember that the source for jigdo wasn't 
 in my partial mirror, because there were no binaries named jigdo, 
 rather jigdo-file and jigdo-lite.  Since there were no sources with that 
 name, the jigdo source was never mirrored on my partial mirror.  I don't know 
 if that behavior has been fixed now, since there is now a binary named jigdo, 
 instead of jigdo-lite.

My filter first converted the packages listed in the status file(s) to
source package names (packages with different name have a Source:
entry) and then output those for sources.
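
A minimal sketch of that conversion, assuming the usual dpkg status layout
(blank-line separated stanzas, an optional "Source:" field that may carry a
version in parentheses). The function names and the sample data are
illustrative, not the original script.

```python
def parse_stanzas(text):
    """Split control-file text into one dict of field -> value per stanza."""
    stanzas = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            # Continuation lines (leading whitespace) are not needed here.
            if line[:1] not in (" ", "\t") and ":" in line:
                key, value = line.split(":", 1)
                fields[key] = value.strip()
        if fields:
            stanzas.append(fields)
    return stanzas


def source_names(status_text):
    """Map installed binary packages to their source package names."""
    names = set()
    for stanza in parse_stanzas(status_text):
        status = stanza.get("Status", "")
        if "Package" not in stanza or (status and status.split()[-1] != "installed"):
            continue
        # "Source: foo (1.2-3)" -> "foo"; no Source field means same name.
        names.add(stanza.get("Source", stanza["Package"]).split()[0])
    return names


SAMPLE = """\
Package: jigdo-file
Status: install ok installed
Source: jigdo

Package: bash
Status: install ok installed

Package: oldpkg
Status: deinstall ok config-files
Source: oldsrc
"""
print(sorted(source_names(SAMPLE)))
```

The resulting names can then be written to the filter list used for the
source part of the mirror.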

 Also, it's more difficult for the local repository to determine the 
 difference 
 between the automatically selected and manually selected packages in this 
 type of setup, since you would be sending a longer list of manually selected 
 packages, instead of distinguishing which ones are actually selected.  I 
 guess that it doesn't matter much, as a package would only be removed from 
 the repository once it's not listed on any of the lists.  There were times 
 when I didn't want certain packages to be removed from the repository, 
 regardless of whether they were installed or not, so I used to run xxdiff on 
 the packages files, so the newer ones were added.

Same problem here. Especially build-depends. There were a lot of
packages I only needed inside my build chroots and only for the time
of the build. So they never showed up on the mirror. Then I just
resized the mirror partition and mirrored all debs.

 In my way of thinking, I'm not looking to merge upstream repositories 
 together 
 in one repository.  Besides, there are already tools, such as apt-move that 
 would be better for this job.  Long ago, apt-move was the primary tool that I 
 used to keep a local repository, and it worked pretty well, as long as all 
 the machines that were using it were on the same release.

 I have found that reprepro is the absolute best tool for maintaining a debian 
 mirror.  The only problem I have with it is when I want to maintain a partial 
 mirror, and I don't want a merged repository, is that I have to spread the 
 packages lists to different places, and when you start adding machines, you 
 start adding more lists to the configuration, when it would probably be 
 better to maintain a set of master lists that are generated from the many 
 lists that come from the machines.

Or have a proxy that adds packages that are requested.

MfG
Goswin





Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Friday 19 June 2009 07:14:08 Bernhard R. Link wrote:

 Actually, I'm quite open to having some dependency handling in reprepro
That is interesting.  I've been working on the assumption that there would 
never be any dependency handling in reprepro, as I didn't consider it part of 
its function.

 and already have written some simple prototype for a related project.
 The problem is that calculating a simple cover of selected packages in
 the dependency graph is not enough:

 Usually the cover is not unique but the existence of alternatives in
 dependencies causes multiple solutions. 
This is a problem across the board.  Even aptitude seems to have problems in 
automatically determining the most appropriate dependencies.  

Let's use this example.  Suppose you already have a system with apache2 
installed, but no php yet.  Next you try to install phpldapadmin, using 
aptitude (from the command line).  Aptitude will tell you that 
libapache-mod-php5 is broken, and proceed to present some alternatives that 
would resolve the dependencies.


umebo...@stdinstall:~$ sudo aptitude -s install  phpldapadmin
Reading package lists... Done
Building dependency tree
Reading state information... Done
Reading extended state information
Initializing package states... Done
Reading task descriptions... Done
The following packages are BROKEN:
  libapache-mod-php5
The following NEW packages will be installed:
  php5-common{a} php5-ldap{a} phpldapadmin
0 packages upgraded, 4 newly installed, 0 to remove and 0 not upgraded.
Need to get 3821kB of archives. After unpacking 11.8MB will be used.
The following packages have unmet dependencies:
  libapache-mod-php5: Depends: libdb4.4 which is a virtual package.
                      Depends: apache-common (>= 1.3.34) which is a virtual package.
                      Depends: php5-common (= 5.2.0-10+lenny1) but 5.2.6.dfsg.1-1+lenny3 is to be installed.
The following actions will resolve these dependencies:

Install the following packages:
libapache2-mod-php5 [5.2.6.dfsg.1-1+lenny3 (stable)]

Keep the following packages at their current version:
libapache-mod-php5 [Not Installed]

Score is 50

Accept this solution? [Y/n/q/?] n
The following actions will resolve these dependencies:

Install the following packages:
php5-cgi [5.2.6.dfsg.1-1+lenny3 (stable)]

Keep the following packages at their current version:
libapache-mod-php5 [Not Installed]

Score is 50

Accept this solution? [Y/n/q/?] n
The following actions will resolve these dependencies:

Install the following packages:
libapache2-mod-php5 [5.2.6.dfsg.1-1+lenny2 (stable)]
php5-common [5.2.6.dfsg.1-1+lenny2 (stable)]
php5-ldap [5.2.6.dfsg.1-1+lenny2 (stable)]

Keep the following packages at their current version:
libapache-mod-php5 [Not Installed]

Score is -30


etc, etc, etc .

apt-get, on the other hand, seems to use the first dependency that's listed as 
an alternative.

Depends: apache2 | httpd, php5-ldap, libapache2-mod-php5 | libapache-mod-php5 |
 php5-cgi | php5, debconf (>= 0.5) | debconf-2.0

Here, since we already have apache2 on the system, libapache2-mod-php5 is 
chosen (I'm guessing because it's the first one listed).
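
To make that guess concrete, here is a rough sketch of a "first alternative 
wins, unless one is already installed" rule applied to the Depends line 
above.  This is only my reading of the behaviour, not apt's actual 
algorithm; the function names are invented and version constraints are 
simply ignored.

```python
import re


def parse_depends(field):
    """Return alternative groups, e.g. [['apache2', 'httpd'], ['php5-ldap']]."""
    groups = []
    for clause in field.split(","):
        # Drop "(>= 1.0)" version and "[i386]" arch qualifiers from each name.
        names = [re.sub(r"\s*[(\[].*$", "", alt.strip()) for alt in clause.split("|")]
        names = [name for name in names if name]
        if names:
            groups.append(names)
    return groups


def choose(groups, installed):
    """Pick an installed alternative if there is one, else the first listed."""
    chosen = []
    for group in groups:
        already = [name for name in group if name in installed]
        chosen.append(already[0] if already else group[0])
    return chosen


DEPENDS = ("apache2 | httpd, php5-ldap, libapache2-mod-php5 | "
           "libapache-mod-php5 | php5-cgi | php5, debconf (>= 0.5) | debconf-2.0")
print(choose(parse_depends(DEPENDS), installed={"apache2"}))
```

With only apache2 installed this picks libapache2-mod-php5, the first entry 
in its group, which matches what I saw.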

 For an initial checkout that 
 is no problem, as one can choose some set by some pseudo-random
 selection (like packages with alphabetically lower names get the
 first dependency in an alternative tried first and similar things
 for virtual packages). 
I think that it should be up to the maintainer of the local mirror to 
explicitly list the alternatives that are preferred.  I don't think that 
there is any way that an automatic dependency resolver will ever be able to 
do this.  The automatic dependency resolver can make this easier by marking 
those dependencies as "automatically selected, alternative available" or 
something similar.  One of the nice things about germinate is that it has 
a "why" column in its output that tells why a package was selected (although 
it doesn't make it clear that it's one of many alternatives).

 The problem is that no such criterion can be 
 stable against changes in the partially mirrored distribution.
I'm not sure what you mean here.  Are you talking about an alternative that's 
selected for the local mirror, but removed from the official mirror?


 So in these cases knowing what packages upstream has and what packages
 are wanted is not enough but one has to take into account what packages
 are currently selected. And a simple covering no longer is enough but
 one needs a full resolver knowing which installed states can be easily
 brought to which other installed states. (and things get even more
 complicated if the currently mirrored packages allow multiple subsets
 which clients using this repository might have installed)...

I used to have to keep outdated libraries in my filter list when I was using a 
partial 

Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Tzafrir Cohen
On Fri, Jun 19, 2009 at 06:23:08AM -0500, Joseph Rawson wrote:
 On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote:
  On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote:
   would be much more interested in making a tool that would make it easier
   to manage local/partial debian mirrors (i.e. one that helped resolve the
   dependencies), rather than have an apt-get wrapper.  I also think that
   once such a tool is made, it would make it easier to build an apt-get
   wrapper that works with it.  I don't think that viewing the problem with
   an apt-get wrapper solution is the best way to approach it, but I do
   think that it would be valuable once the underlying problems are solved.
 
  And reprepro does not fit the bill because?
 
 It fits part of the bill, as it's an excellent tool for maintaining a 
 repository, but it doesn't resolve dependencies (nor should it).

Just in case it might help, here's a script we used internally (at the
Sarge time) to maintain a dummy repository that would help us eventually
resolve an original list of packages to a complete list of packages we
ask a reprepro source to update.

-- 
Tzafrir Cohen | tzaf...@jabber.org | VIM is
http://tzafrir.org.il || a Mutt's
tzaf...@cohens.org.il ||  best
ICQ# 16849754 || friend





Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Tzafrir Cohen
On Fri, Jun 19, 2009 at 02:14:08PM +0200, Bernhard R. Link wrote:
 * Joseph Rawson umebos...@gmail.com [090619 13:23]:
  On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote:
   On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote:
would be much more interested in making a tool that would make it easier
to manage local/partial debian mirrors (i.e. one that helped resolve the
dependencies), rather than have an apt-get wrapper.  I also think that
once such a tool is made, it would make it easier to build an apt-get
wrapper that works with it.  I don't think that viewing the problem with
an apt-get wrapper solution is the best way to approach it, but I do
think that it would be valuable once the underlying problems are solved.
  
   And reprepro does not fit the bill because?
  
  It fits part of the bill, as it's an excellent tool for maintaining a
  repository, but it doesn't resolve dependencies (nor should it).
 
 Actually, I'm quite open to having some dependency handling in reprepro
 and already have written some simple prototype for a related project.
 The problem is that calculating a simple cover of selected packages in
 the dependency graph is not enough:
 
 Usually the cover is not unique but the existence of alternatives in
 dependencies causes multiple solutions. For an initial checkout that
 is no problem, as one can choose some set by some pseudo-random
 selection (like packages with alphabetically lower names get the
 first dependency in an alternative tried first and similar things
 for virtual packages). The problem is that no such criterion can be
 stable against changes in the partially mirrored distribution.

While it's a good question, the interface I'm used to is apt-get /
aptitude. Thus the interface I had in mind is a list of packages to
install (in a single installation). With some tweaking this allows you
to get exactly what you want.

If you want your repository to include conflicting options, you should 
allow the interface to include multiple such entries. In our case we had
multiple files. Each file was a list of packages, and each file was
basically a single apt-get command.

-- 
Tzafrir Cohen | tzaf...@jabber.org | VIM is
http://tzafrir.org.il || a Mutt's
tzaf...@cohens.org.il ||  best
ICQ# 16849754 || friend





Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Friday 19 June 2009 12:57:25 Goswin von Brederlow wrote:
 Joseph Rawson umebos...@gmail.com writes:
  On Friday 19 June 2009 00:27:06 Goswin von Brederlow wrote:
  Joseph Rawson umebos...@gmail.com writes:
  If so then you can configure a post invoke hook in apt that will copy
  the dpkg status file of the host to the server [as status.$(hostname)]
  and then use those on the server to generate the filter for
  reprepro. I think I still have a script for that somewhere but it is
  easy enough to rewrite.
 
  That's good for binaries, but I don't know about the source.  It wasn't
  long ago that I noticed a problem with reprepro not obtaining the
  corresponding source packages when you use a filter list taken
  from  dpkg --get-selections.  I remember that the source for jigdo
  wasn't in my partial mirror, because there were no binaries named
  jigdo, rather jigdo-file and jigdo-lite.  Since there were no
  sources with that name, the jigdo source was never mirrored on my partial
  mirror.  I don't know if that behavior has been fixed now, since there is
  now a binary named jigdo, instead of jigdo-lite.

 My filter first converted the packages listed in the status file(s) to
 source package names (packages with different name have a Source:
 entry) and then output those for sources.

  Also, it's more difficult for the local repository to determine the
  difference between the automatically selected and manually selected
  packages in this type of setup, since you would be sending a longer list
  of manually selected packages, instead of distinguishing which ones are
  actually selected.  I guess that it doesn't matter much, as a package
  would only be removed from the repository once it's not listed on any of
  the lists.  There were times when I didn't want certain packages to be
  removed from the repository, regardless of whether they were installed or
  not, so I used to run xxdiff on the packages files, so the newer ones
  were added.

 Same problem here. Especially build-depends. There were a lot of
 packages I only needed inside my build chroots and only for the time
 of the build. So they never showed up on the mirror. Then I just
 resized the mirror partition and mirrored all debs.

That was my ultimate solution to the problem.  I bought one of the new 
terabyte usb external drives and just mirrored the whole repository.  I had 
been satisfied to just call the problem solved at that point, but this thread 
resparked my interest in obtaining a better solution.  Before I bought the 
hard drive, I was seriously looking into getting germinate and reprepro 
working together, but once I bought the drive, I just set it all aside.  
Still, this external drive isn't portable, and my small portable drive is 
only 80G (which is more than enough for a partial mirror of source, i386, and 
amd64), so I do still need to solve the problem.  Besides, a month after I 
bought the drive, I discovered that I have a monthly cap on my transfers so 
it would be better, all around, to stop mirroring the complete repository.

  In my way of thinking, I'm not looking to merge upstream repositories
  together in one repository.  Besides, there are already tools, such as
  apt-move that would be better for this job.  Long ago, apt-move was the
  primary tool that I used to keep a local repository, and it worked pretty
  well, as long as all the machines that were using it were on the same
  release.
 
  I have found that reprepro is the absolute best tool for maintaining a
  debian mirror.  The only problem I have with it is when I want to
  maintain a partial mirror, and I don't want a merged repository, is that
  I have to spread the packages lists to different places, and when you
  start adding machines, you start adding more lists to the configuration,
  when it would probably be better to maintain a set of master lists that
  are generated from the many lists that come from the machines.

 Or have a proxy that adds packages that are requested.
When I woke up this morning, I was thinking that it might be interesting to 
have an apt method that talks directly to reprepro.  It's just a vague idea 
now, but I'll give it some more thought later.


 MfG
 Goswin



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Friday 19 June 2009 20:54:28 Tzafrir Cohen wrote:
 On Fri, Jun 19, 2009 at 06:23:08AM -0500, Joseph Rawson wrote:
  On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote:
   On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote:
would be much more interested in making a tool that would make it
easier to manage local/partial debian mirrors (i.e. one that helped
resolve the dependencies), rather than have an apt-get wrapper.  I
also think that once such a tool is made, it would make it easier to
build an apt-get wrapper that works with it.  I don't think that
viewing the problem with an apt-get wrapper solution is the best
way to approach it, but I do think that it would be valuable once the
underlying problems are solved.
  
   And reprepro does not fit the bill because?
 
  It fits part of the bill, as it's an excellent tool for maintaining a
  repository, but it doesn't resolve dependencies (nor should it).

 Just in case it might help, here's a script we used internally (at the
 Sarge time) to maintain a dummy repository that would help us eventually
 resolve an original list of packages to a complete list of packages we
 ask a reprepro source to update.

Did you forget to attach it? :)

 --
 Tzafrir Cohen | tzaf...@jabber.org | VIM is
 http://tzafrir.org.il || a Mutt's
 tzaf...@cohens.org.il ||  best
 ICQ# 16849754 || friend



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Tzafrir Cohen
Actually attaching the file this time...

On Sat, Jun 20, 2009 at 01:54:28AM +, Tzafrir Cohen wrote:
 On Fri, Jun 19, 2009 at 06:23:08AM -0500, Joseph Rawson wrote:
  On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote:
   On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote:
would be much more interested in making a tool that would make it easier
to manage local/partial debian mirrors (i.e. one that helped resolve the
dependencies), rather than have an apt-get wrapper.  I also think that
once such a tool is made, it would make it easier to build an apt-get
wrapper that works with it.  I don't think that viewing the problem with
an apt-get wrapper solution is the best way to approach it, but I do
think that it would be valuable once the underlying problems are solved.
  
   And reprepro does not fit the bill because?
  
  It fits part of the bill, as it's an excellent tool for maintaining a 
  repository, but it doesn't resolve dependencies (nor should it).
 
 Just in case it might help, here's a script we used internally (at the
 Sarge time) to maintain a dummy repository that would help us eventually
 resolve an original list of packages to a complete list of packages we
 ask a reprepro source to update.
 
 -- 
 Tzafrir Cohen | tzaf...@jabber.org | VIM is
 http://tzafrir.org.il || a Mutt's
 tzaf...@cohens.org.il ||  best
 ICQ# 16849754 || friend
 
 
 

-- 
Tzafrir Cohen | tzaf...@jabber.org | VIM is
http://tzafrir.org.il || a Mutt's
tzaf...@cohens.org.il ||  best
ICQ# 16849754 || friend
#!/bin/bash

# using bash-specific PIPESTATUS

CMD=`basename $0`

REPREPRO=reprepro
BASE_DIR=repo
APT_BASE_DIR=${BASE_DIR}/Aptdir
APT_DIR=${APT_DIR:-${APT_BASE_DIR}/unstable}
MAIN_REPO=/home/repo
PACKAGES_LIST_FILE=packages
STATIC_DIR=$MAIN_REPO/static
STATIC_INST=$MAIN_REPO/static_inst
INSTALLER_PATH=$BASE_DIR/dists/sarge/main/installer-i386/current
CD_OVERRIDE=cd-override/cd

set -e

usage() {
    echo >&2 "apter: apt resolver wrapper"
    echo >&2 "(functionality varies by basename of \$0)"
    echo >&2 "Usage: $0 setup|generate|refresh"
}

# Print the stanza of a reprepro config file that contains a given line.
# $1: file
# $2: condition
get_entry() {
    awk <"$BASE_DIR/conf/$1" -v RS='\n\n' "/$2\\n/ {print \$0}"
    #echo >&2 "printed updates section $2."
}

# Print the value of one field from the matching stanza.
# $1: file (updates/distributions)
# $2: condition
# $3: field name
get_field() {
    get_entry "$1" "$2" | grep "^$3: " | cut -d: -f2-
}

# List the codenames of all configured distributions.
dists_list() {
    awk '/^Codename: / {print $2}' $BASE_DIR/conf/distributions
}

case "$CMD" in
apt-get|apt-cache|aptitude)
    # Run the real tool against the private apt directory tree.
    exec $CMD \
        -o Dir="$PWD/$APT_DIR" \
        -o Dir::State::status="$PWD/$APT_DIR/var/lib/dpkg/status" \
        "$@"
    ;;
apter)
    case "$1" in
    setup)
        for dist in `dists_list`
        do
            APT_DIR=$APT_BASE_DIR/$dist
            export APT_DIR

            for dir in \
                etc/apt var/lib/apt/lists/partial \
                var/lib/dpkg var/cache/apt/archives/partial
            do mkdir -p $APT_DIR/$dir
            done
            touch $APT_DIR/var/lib/dpkg/status
            # relevant update sources:
            update_sources=`get_field distributions "Codename: $dist" Update`
            (
                for upd in $update_sources
                do
                    get_entry updates "Name: $upd"
                    echo ''
                done
            ) | tools/updates2sources >$APT_DIR/etc/apt/sources.list
            cat <<EOF >$APT_DIR/etc/apt/preferences
# give our packages a higher priority:
Package: *
Pin: release o=Xorcom
Pin-Priority: 600
EOF
        done
        ;;
    generate)
        # setup the apt wrapper:
        $0 setup
        $0 refresh
        ;;
    refresh)
        rm -rf $BASE_DIR/{db,dists,lists,pool}
        $0 refresh-nodel
        ;;
    upgrade|refresh-nodel)
        apt_cmd=`dirname $0`/apt-get
        for file in `ls $STATIC_DIR`
        do rsync -a --delete $STATIC_DIR/$file 

Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-18 Thread Goswin von Brederlow
Joseph Rawson umebos...@gmail.com writes:

 There is another application that will help with the dependencies.  It's 
 called germinate, and it will take a short list of packages and a list of 
 repositories and build a bunch of different lists of packages and their 
 dependencies.  Germinate will also determine build dependencies for those 
 packages and recursively build a list of builddeps and the builddeps' 
 builddeps.

 I have thought of making an application that would get germinate and reprepro 
 to work together to help build a decent partial mirror that had the correct 
 set of packages, but the process was a bit time consuming.  It's been a while 
Was it that bad? It only needs to run 4 times a day when the mirror
push comes in.

 since I've worked on this, since my temporary solution to the problem was to 
 buy a larger hard drive.  Currently, I have a full mirror that I keep 
 updated, and a repository of locally built packages next to it.  I'm not 
 really happy with this solution, as it uses too much disk space and I'm 
 downloading packages that will never be used, but it's given me time to 
 tackle more important problems.

 Before writing any code, I would recommend taking a look at both reprepro and 
 germinate, as each of these applications is good at solving half of the 
 problems you describe.  I think that an ideal solution would be to write a 
 frontend program that takes a list of packages and upstream repositories, 
 feeds them into germinate, obtains the result from germinate, parse those 
 results and build a reprepro configuration from that, then get reprepro to 
 fetch the appropriate packages.

Combining germinate and reprepro is the right thing to do. Or reprepro
and a new filter instead of germinate. But don't rewrite reprepro.

Given a little bit of care when writing the reprepro config, this can
be done completely as part of the filtering. There is no need for a
separate run that scans all upstream repositories as long as you can
define a partial order between them, i.e. contrib needs things from
main but main never from contrib. That would also have the benefit
that you only need to process those Packages files that have changed.
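
A rough sketch of what I mean by exploiting the partial order, with made-up
package names and a deliberately simplified index (package name -> list of
dependency names, no versions or alternatives). Each Packages file is
visited once, most dependent component first, and anything it cannot
satisfy is handed on to the next one.

```python
def select_by_component(ordered_components, seeds):
    """ordered_components: [(name, {pkg: [deps]})], most dependent first."""
    pending = set(seeds)          # wanted, but not yet found in any component
    selected = {}
    for name, index in ordered_components:
        chosen = set()
        queue = [pkg for pkg in pending if pkg in index]
        while queue:
            pkg = queue.pop()
            if pkg in chosen:
                continue
            chosen.add(pkg)
            for dep in index[pkg]:
                if dep in index:
                    queue.append(dep)   # resolve inside this component
                else:
                    pending.add(dep)    # leave it for a later component
        pending -= chosen
        selected[name] = chosen
    return selected, pending


# Hypothetical indexes: contrib may depend on main, never the reverse.
contrib = {"contrib-app": ["fetch-tool", "base-lib"]}
main = {"fetch-tool": ["base-lib"], "base-lib": [], "unrelated": ["base-lib"]}

selected, unresolved = select_by_component(
    [("contrib", contrib), ("main", main)], ["contrib-app"])
print(sorted(selected["contrib"]), sorted(selected["main"]), sorted(unresolved))
```

Whatever is still in the second return value at the end could not be found
in any component and would need to be reported.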

 I would be happy to help with this, as I could use such an application, and I 
 already have a meager bit of python code that parses the output of germinate 
 (germinate uses a wiki-type markup in it's output files).  I stopped working 
 on the code since I bought a new hard drive, since I just used the extra 
 space to solve the problem for me, but I can bring it back to life, as I 
 would desire to use a more correct solution.

Urgs, that sucks. It should take a Packages/Sources style input and
output the same format.

Maybe rewriting it using libapt would be better than wrapping germinate.

MfG
Goswin





Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-18 Thread Frank Lin PIAT
On Tue, 2009-06-09 at 16:16 -0500, Joseph Rawson wrote:
 On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote:
  I had an idea in mind whereby the task of making mirrors for personal
  distributions can be automated.

<lazy-way>
Depending on what you want to achieve, a caching proxy might be an easy
solution (there are specialized ones in the archive already)
</lazy-way>

  This can be stated as: if a person
  wants to keep a customised set of packages for usage with the
  distribution, the tool should be able to develop dependencies, fetch
  packages, generate appropriate documentation and then create the
  corresponding directory structure in the target mirror! The task can
  be extended to include packages which are currently not under one of
  the standard mirrors!

<lazy-way>
One doesn't have to merge the repositories, one can just declare multiple
sources in /etc/apt/*
</lazy-way>

  I think the tool can have immense utility in helping people automate
  the task of maintaining the repositories. Suggestions, positive and
  negative are invited.
 
  I have not included the impl details as I would first like to evaluate
  the idea at a feasibility and utility level.

If the scope of your project includes being able to bootstrap systems
from the mirror, resolving dependencies is much more complex (some
packages aren't pulled in by dependencies. For instance, the right kernel
is selected by some logic in debian-installer).
I found some interesting logic in the debian-cd package.

Still, I don't consider that allowing bootstrapping is mandatory. Your
project would still be extremely valuable without it. [for those 95% of
the people that install from CD, as opposed to netboot].

Regards,

Franklin





Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-18 Thread Goswin von Brederlow
Frank Lin PIAT fp...@klabs.be writes:

 On Tue, 2009-06-09 at 16:16 -0500, Joseph Rawson wrote:
 On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote:
  This can be stated as: if a person
  wants to keep a customised set of packages for usage with the
  distribution, the tool should be able to develop dependencies, fetch
  packages, generate appropriate documentation and then create the
  corresponding directory structure in the target mirror! The task can
  be extended to include packages which are currently not under one of
  the standard mirrors!

 lazy-way
 One don't have to merge the repositories, one can just declare multiple
 sources in /etc/apt/*
 /lazy-way

Let's say I want to mirror xserver-xorg from experimental. Then I would
want it to include xserver-xorg-core (>= xyz) also from experimental
as the dependency dictates but not include libc6 from experimental as
the sid one is sufficient.

A key point here would be flexibility.

  I think the tool can have immense utility in helping people automate
  the task of maintaining the repositories. Suggestions, positive and
  negative are invited.
 
  I have not included the impl details as I would first like to evaluate
  the idea at a feasibility and utility level.

 If the scope of your project includes being able to bootstrap systems
 from the mirror, resolving dependency is much more complex (some
 packages aren't resolved by dependencies. For instance, the right kernel
 is select by some logic in Debian-installer).
 I found some interesting logic in debian-cd package.

You would include linux-image-type in your package list. That
isn't really a problem of the tool. Just of the input you need to provide.
Also you would include everything udeb and everything
essential/required for bootstrapping purposes.

Again flexibility is the key.

 Still, I don't consider that allowing bootstrapping is mandatory. Your
 project would still be extremely valuable without it. [for those 95% of
 the people that install from CD, as opposed to netboot].

 Regards,

 Franklin

MfG
Goswin

PS: the essential/required packages can already easily be filtered
with grep-dctrl.
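
For anyone who would rather do that filtering inside a Python frontend
than shell out to grep-dctrl, here is a rough sketch. It assumes the
usual Packages layout (stanzas separated by blank lines, "Field: value"
lines, continuation lines starting with whitespace). The function names
and the SAMPLE text are made up for illustration.

```python
def parse_stanzas(text):
    """Yield one dict of field -> value per blank-line-separated stanza."""
    for block in text.strip().split("\n\n"):
        fields = {}
        key = None
        for line in block.splitlines():
            if line[:1] in (" ", "\t") and key:
                # Continuation line: append to the previous field.
                fields[key] += "\n" + line.strip()
            elif ":" in line:
                key, _, value = line.partition(":")
                fields[key] = value.strip()
        if fields:
            yield fields


def base_packages(text):
    """Names of packages marked Essential: yes or Priority: required."""
    return sorted(
        stanza["Package"]
        for stanza in parse_stanzas(text)
        if stanza.get("Essential") == "yes"
        or stanza.get("Priority") == "required"
    )


# Made-up sample input in Packages format.
SAMPLE = """\
Package: base-files
Essential: yes
Priority: required

Package: vim
Priority: optional
Description: editor
 long description line

Package: dpkg
Priority: required
"""

base = base_packages(SAMPLE)
```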





Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-18 Thread Joseph Rawson
On Thursday 18 June 2009 02:46:42 Goswin von Brederlow wrote:
 Joseph Rawson umebos...@gmail.com writes:
  There is another application that will help with the dependencies.  It's
  called germinate, and it will take a short list of packages and a list of
  repositories and build a bunch of different lists of packages and their
  dependencies.  Germinate will also determine build dependencies for those
  packages and recursively build a list of builddeps and the builddeps'
  builddeps.
 
  I have thought of making an application that would get germinate and
  reprepro to work together to help build a decent partial mirror that had
  the correct set of packages, but the process was a bit time consuming. 
  It's been a while

 Was it that bad? It only needs to run 4 times a day when the mirror
 push comes in.

It wasn't the running that was time consuming, but the writing of all the code 
to seed germinate, then try and use the results for reprepro.  I'm sorry if I 
wasn't clear on which part was consuming time.

  since I've worked on this, since my temporary solution to the problem was
  to buy a larger hard drive.  Currently, I have a full mirror that I keep
  updated, and a repository of locally built packages next to it.  I'm not
  really happy with this solution, as it uses too much disk space and I'm
  downloading packages that will never be used, but it's given me time to
  tackle more important problems.
 
  Before writing any code, I would recommend taking a look at both reprepro
  and germinate, as each of these applications is good at solving half of
  the problems you describe.  I think that an ideal solution would be to
  write a frontend program that takes a list of packages and upstream
  repositories, feeds them into germinate, obtains the result from
  germinate, parses those results and builds a reprepro configuration from
  that, then gets reprepro to fetch the appropriate packages.

 Combining germinate and reprepro is the right thing to do. Or reprepro
 and a new filter instead of germinate. But don't rewrite reprepro.

I never intended to rewrite reprepro.  It does its job very well.  It's not 
reprepro's job to resolve dependencies, nor should it be, as a dependency 
could lie in an entirely different repository.

I do think that since each program has its specific area of responsibility, 
a program that glues them together would be appropriate, and would help avoid 
reinventing wheels when it's not necessary.


 Given a little bit of care when writing the reprepro config this can
 be completely done as part of the filtering. There is no need for a
 separate run that scans all upstream repositories as long as you can
 define a partial order between them, i.e. contrib needs things from
 main but main never from contrib. That would also have the benefit
 that you only need to process those packages files that have changed.

  I would be happy to help with this, as I could use such an application,
  and I already have a meager bit of python code that parses the output of
  germinate (germinate uses a wiki-type markup in its output files).  I
  stopped working on the code since I bought a new hard drive, since I just
  used the extra space to solve the problem for me, but I can bring it back
  to life, as I would desire to use a more correct solution.

 Urgs, that sucks. It should take a Packages/Sources style input and
 output the same format.

I don't like the output either, but I haven't taken much time to dig into the 
germinate code.

 Maybe rewriting it using libapt would be better than wrapping germinate.

Germinate uses libapt.  It imports apt_pkg from the python-apt package, which 
is a python binding to libapt, AFAIK.  It might be easier to just 
add '/usr/lib/germinate' to the sys.path and control the Germinator object 
directly, bypassing the way that the package lists are output from germinate.

Germinate does have an advantage in that it can recursively add the builddeps 
for a package list, making a list for a partial, self-building mirror.

BTW, the subject of this thread is apt-get wrapper for maintaining Partial 
Mirrors.  The solution I'm proposing is a simple tool for maintaining 
Partial Mirrors (which could possibly be wrapped by apt-get later).  

I think that just pursuing an apt-get wrapper leads to some complications 
that could be avoided by creating the partial mirror tool first, then 
looking at wrapping it later.  One complication might be how to handle 
apt-get remove, and another might be how to handle sid libraries that 
disappear from the official repository, yet local machines must have them.


 MfG
 Goswin



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-18 Thread Goswin von Brederlow
Joseph Rawson umebos...@gmail.com writes:

 BTW, the subject of this thread is apt-get wrapper for maintaining Partial 
 Mirrors.  The solution I'm proposing is a simple tool for maintaining 
 Partial Mirrors (which could possibly be wrapped by apt-get later).  

 I think that just pursuing an apt-get wrapper leads to some complications 
 that could be avoided by creating the partial mirror tool first, then 
 looking at wrapping it later.  One complication might be how to handle 
 apt-get remove, and another might be how to handle sid libraries that 
 disappear from the official repository, yet local machines must have them.

Ahh, so maybe I completely misread that part.

Do you mean a wrapper around apt-get so that apt-get install foo on
any client would automatically add foo to the list of packages being
mirrored on the server?

If so then you can configure a post invoke hook in apt that will copy
the dpkg status file of the host to the server [as status.$(hostname)]
and then use those on the server to generate the filter for
reprepro. I think I still have a script for that somewhere but it is
easy enough to rewrite.
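
Not the script mentioned above, just a rough Python sketch of what the
server side could look like. It assumes the hook (apt's
DPkg::Post-Invoke would be one place to hang it) has already copied each
client's dpkg status file to the server, and that the file contents have
been read into strings. The function names and sample data are
hypothetical.

```python
def installed_packages(status_text):
    """Return the set of packages whose dpkg state is 'installed'."""
    names = set()
    name = state = None
    # The extra "" makes sure the last stanza is flushed as well.
    for line in status_text.splitlines() + [""]:
        if line.startswith("Package:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("Status:"):
            # e.g. "install ok installed" -> ["install", "ok", "installed"]
            state = line.split(":", 1)[1].split()
        elif not line.strip():
            if name and state and state[-1] == "installed":
                names.add(name)
            name = state = None
    return names


def merge_hosts(status_by_host):
    """Union of installed packages over all hosts, sorted by name."""
    merged = set()
    for text in status_by_host.values():
        merged |= installed_packages(text)
    return sorted(merged)


# Made-up contents of status.alpha and status.beta.
STATUS = {
    "alpha": (
        "Package: vim\nStatus: install ok installed\n\n"
        "Package: nano\nStatus: deinstall ok config-files\n"
    ),
    "beta": (
        "Package: mutt\nStatus: install ok installed\n\n"
        "Package: vim\nStatus: install ok installed\n"
    ),
}

mirror_list = merge_hosts(STATUS)
```

The resulting list is what would then be turned into the filter for
reprepro.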

MfG
Goswin





apt-get wrapper for maintaining Partial Mirrors

2009-06-09 Thread sanket agarwal
Hi all,

We all know that there are various distros that build around Debian.
I had an idea in mind whereby the task of making mirrors for personal
distributions can be automated. This can be stated as: if a person
wants to keep a customised set of packages for usage with the
distribution, the tool should be able to develop dependencies, fetch
packages, generate appropriate documentation and then create the
corresponding directory structure in the target mirror! The task can
be extended to include packages which are currently not under one of
the standard mirrors!

I think the tool can have immense utility in helping people automate
the task of maintaining the repositories. Suggestions, positive and
negative are invited.

I have not included the impl details as I would first like to evaluate
the idea at a feasibility and utility level.





Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-09 Thread Joseph Rawson
On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote:
 Hi all,

 We all know that there are various distros that build around Debian.
 I had an idea in mind whereby the task of making mirrors for personal
 distributions can be automated. This can be stated as: if a person
 wants to keep a customised set of packages for usage with the
 distribution, the tool should be able to develop dependencies, fetch
 packages, generate appropriate documentation and then create the
 corresponding directory structure in the target mirror! The task can
 be extended to include packages which are currently not under one of
 the standard mirrors!

 I think the tool can have immense utility in helping people automate
 the task of maintaining the repositories. Suggestions, positive and
 negative are invited.

 I have not included the impl details as I would first like to evaluate
 the idea at a feasibility and utility level.

I have been working on this idea myself for quite a while, but I haven't 
messed with the problem recently.  I was using reprepro to maintain partial 
mirrors, but it required using the output from dpkg --get-selections from 
almost every machine that I needed to mirror packages for.  The reprepro 
program is excellent for making partial mirrors, but it has a drawback in 
that it doesn't help resolve dependencies.  This means that you can't just 
make a short list of packages and easily build a partial mirror that contains 
those packages and their dependencies, rather you have to install a machine 
with those packages and use the list of packages from that machine with 
reprepro to get a decent mirror.
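
To make that workflow concrete, here is a rough Python sketch of the
merging step. It assumes the output of dpkg --get-selections from each
machine has already been collected as text, and it writes a list in the
same "name install" layout, which is the format reprepro's FilterList
option reads. The function names and sample data are made up.

```python
def merge_selections(outputs):
    """Collect every package marked 'install' on at least one machine."""
    wanted = set()
    for text in outputs:
        for line in text.splitlines():
            parts = line.split()
            if len(parts) == 2 and parts[1] == "install":
                wanted.add(parts[0])
    return sorted(wanted)


def filter_list(packages):
    """Render package names as lines in dpkg --get-selections format."""
    return "".join(f"{name}\tinstall\n" for name in packages)


# Made-up selections from two machines.
MACHINE_A = "vim\t\t\tinstall\nnano\t\t\tdeinstall\n"
MACHINE_B = "mutt\t\t\tinstall\nvim\t\t\tinstall\n"

packages = merge_selections([MACHINE_A, MACHINE_B])
listing = filter_list(packages)
```

This still only mirrors what is already installed somewhere, which is
exactly the drawback described above.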

There is another application that will help with the dependencies.  It's 
called germinate, and it will take a short list of packages and a list of 
repositories and build a bunch of different lists of packages and their 
dependencies.  Germinate will also determine build dependencies for those 
packages and recursively build a list of builddeps and the builddeps' 
builddeps.

I have thought of making an application that would get germinate and reprepro 
to work together to help build a decent partial mirror that had the correct 
set of packages, but the process was a bit time consuming.  It's been a while 
since I've worked on this, since my temporary solution to the problem was to 
buy a larger hard drive.  Currently, I have a full mirror that I keep 
updated, and a repository of locally built packages next to it.  I'm not 
really happy with this solution, as it uses too much disk space and I'm 
downloading packages that will never be used, but it's given me time to 
tackle more important problems.

Before writing any code, I would recommend taking a look at both reprepro and 
germinate, as each of these applications is good at solving half of the 
problems you describe.  I think that an ideal solution would be to write a 
frontend program that takes a list of packages and upstream repositories, 
feeds them into germinate, obtains the result from germinate, parses those 
results and builds a reprepro configuration from that, then gets reprepro to 
fetch the appropriate packages.
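
A very rough Python sketch of that frontend, to show the shape of the
glue. The dependency_closure function is only a stand-in for germinate
(it does not call germinate at all), the DEPENDS table is made up, and
the rule name, mirror URL and file name are hypothetical. The rendered
stanza uses field names from reprepro's conf/updates file; "purge" as
the default action for FilterList is an assumption about what one would
want here.

```python
def dependency_closure(seeds, depends):
    """Stand-in for germinate: seeds plus all transitive dependencies."""
    seen = set()
    stack = list(seeds)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(depends.get(pkg, []))
    return sorted(seen)


def render_update_rule(name, method, suite, components, architectures,
                       filter_file):
    """Render one stanza for reprepro's conf/updates file."""
    lines = [
        f"Name: {name}",
        f"Method: {method}",
        f"Suite: {suite}",
        f"Components: {' '.join(components)}",
        f"Architectures: {' '.join(architectures)}",
        # Packages not named in filter_file are left out of the mirror.
        f"FilterList: purge {filter_file}",
    ]
    return "\n".join(lines) + "\n"


# Made-up dependency data; a real run would get this from germinate.
DEPENDS = {"vim": ["libc6", "vim-common"], "vim-common": [], "libc6": []}

needed = dependency_closure(["vim"], DEPENDS)
rule = render_update_rule(
    name="partial-sid",                      # hypothetical rule name
    method="http://ftp.debian.org/debian",   # example upstream mirror
    suite="sid",
    components=["main"],
    architectures=["i386"],
    filter_file="partial-sid.list",          # hypothetical file name
)
```

The names in `needed` would be written to the filter file, and reprepro
update would then do the fetching.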

I would be happy to help with this, as I could use such an application, and I 
already have a meager bit of python code that parses the output of germinate 
(germinate uses a wiki-type markup in its output files).  I stopped working 
on the code since I bought a new hard drive, since I just used the extra 
space to solve the problem for me, but I can bring it back to life, as I 
would desire to use a more correct solution.

-- 
Thanks:
Joseph Rawson

