On Wed, 15 Jun 2011 23:22:25 +0200
David Kuehling <[email protected]> wrote:

> >> That's exactly the problem: inconsistent versions in the archive or
> >> archive updates while multistrap runs.
> 
> > That sounds like a broken archive.
> 
> > I haven't implemented the versioned source call - I remain unconvinced
> > that a valid archive would cause the download of a source package of a
> > different version to the binary package.
> 
> Are you assuming a non-changing archive? 

No, I'm assuming a decent mirroring tool. There is also a need to
understand exactly what apt is doing with apt-get source. I don't think
you've got that clear.

> As soon as the archive changes
> non-atomically (with locking applied by the client) I think we're
> doomed. 

Broken mirror resulting from an inadequate mirroring tool. Not
something which either apt or multistrap can fix or avoid.

> As we're building images from debian sid, changes will be
> pretty common.

Changes in sid may have the source and Arch:all packages arrive before
the buildds have built the binary for that arch, but that is not why the
source version string appears in the Packages file (otherwise versions
wouldn't appear in stable, and they clearly do). In other words, using
the version will NOT help you in this situation.
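
To illustrate where that version string comes from, here is a minimal
sketch (not apt's or multistrap's code) that reads the Source field of
one Packages stanza. It assumes the stanza has already been parsed into
a dict; the helper name source_of and the package names are made up for
the example. The version in parentheses is only present when the source
version differs from the binary version, e.g. after a binary-only
rebuild.

```python
import re


def source_of(stanza: dict) -> tuple:
    """Return (source_name, source_version) for one Packages stanza.

    Hypothetical helper: the stanza is assumed to be a dict of
    field name -> field value.
    """
    binary_version = stanza["Version"]
    source_field = stanza.get("Source")
    if source_field is None:
        # No Source field: the source shares the binary's name and version.
        return stanza["Package"], binary_version
    # "name" or "name (version)"
    match = re.fullmatch(r"(\S+)(?:\s+\(([^)]+)\))?", source_field.strip())
    if match is None:
        raise ValueError("malformed Source field: %r" % source_field)
    name, version = match.group(1), match.group(2)
    # Without a parenthesised version, the binary version applies.
    return name, version or binary_version


# Made-up stanza: a binary rebuilt without a new source upload.
rebuilt = {"Package": "libfoo1", "Version": "1.2-3+b1", "Source": "foo (1.2-3)"}
print(source_of(rebuilt))
```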

If that's a problem, don't use unstable. (Unstable is not meant to be
used for images; that's why we have testing. It's usually only 10 days
behind, after all, AND you are assured of the matching version of the
source and binaries for all packages.) Unstable does not promise that
every architecture will always be in sync with the latest source but it
does ensure that the source is retained for as long as any one
supported arch hasn't finished installing the newer version.

> I'm not sure how many times 'multistrap' performs 'apt-get update'.

Every time the sources.list files change during the run. Once for the
bootstrap (which is all we care about here) and once for the
aptsources for the runtime system. Subsequent operations all use the
cache which apt creates from those downloaded files.

> Even if it only does it once, the source and binary package indices are
> distinct files, and they are retrieved by distinct transaction from the
> ftp/http server, so I see no way that you can guarantee consistency
> during mirror pushes.

Mirror pushes update both files simultaneously. The packages are copied
over to an incoming location, then the database is locked, the files are
copied into the pool, the indices are updated and then the original
files are deleted. 

It is the mirror push which ensures that the Packages and Sources files
are synchronised.

> The only atomicity you have is for updating a single index file via
> 'mv'.

And preparing the indices separately then 'mv' each into place is not
going to cause any detectable failure. I think you're chasing your
tail or not using a decent mirroring tool. 

>  * mirror gets updated, probably first new files put into the pool, then
>    the new indices follow, then even later, it's going to delete the
>    files no longer referenced by the indices.

No. The mirror gets updated by putting stuff safely into a local
temporary space, then the database is locked, then updated, the new
files are copied alongside the existing ones, then the prepared indices
are 'mv''d to replace the previous ones, then the old files are removed
and then the database is unlocked. It's as close to atomic as makes no
odds.
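
As a rough sketch of that ordering (not the code of any real mirroring
tool), the steps could look like the following. The function name
push_update, the db.lock file standing in for the database lock, and
the single Packages index are all assumptions made for the example.

```python
import os
from pathlib import Path


def push_update(root: Path, new_files: dict, new_index: str, obsolete: list) -> None:
    """Apply one mirror push under a lock (illustrative only)."""
    pool = root / "pool"
    index = root / "Packages"
    lock = root / "db.lock"  # stand-in for the database lock
    pool.mkdir(parents=True, exist_ok=True)
    # Take the lock; O_EXCL makes this fail if another push holds it.
    fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    try:
        # 1. Copy the new files alongside the existing ones.
        for name, data in new_files.items():
            (pool / name).write_bytes(data)
        # 2. Prepare the index separately, then rename it into place.
        staged = index.with_name(index.name + ".new")
        staged.write_text(new_index)
        os.replace(staged, index)
        # 3. Only now remove files the new index no longer references.
        for name in obsolete:
            (pool / name).unlink()
    finally:
        # Release the lock whether or not the push succeeded.
        os.close(fd)
        lock.unlink()
```

A client reading the index at any point sees either the old index with
the old files still present, or the new index with the new files
already present.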

>  * Concurrently I run 'multistrap' which runs 'apt-get update', fetching
>    dists/sid/main/binary-amd64/Packages.bz2 and
>    sid/main/source/Sources.bz2 .
>  
>  * Now I get Packages.bz2 from before the mirror update, and Sources.bz2
>    from after the mirror update. 

No you don't. You get Packages.bz2 and Sources.bz2 in sync at the same
time in the same apt-get update call. Indeed in most cases, as apt is
using parallel connections, Packages and Sources are downloaded at the
same time over multiple sockets. What happens afterwards is that apt-get
source uses that cached data to get the sources. 

If a new version has arrived in the meantime, then the old source
(the same version as the binary downloaded earlier) will still be
obtained, because apt only has cached data for that source version:
the Sources file was downloaded before the new version arrived.
In most cases, the old version will remain for at
least 10 days because it's the version currently in testing. In other
cases, the version is retained until built by all architectures. Either
way, there is nothing you can do in the apt-get source call to change
that because apt-get source only uses the Sources file which it
downloaded the last time apt-get update was run. If the mirror has
removed that file since then, there is nothing apt can do about it
except ask for apt-get update to be run again.

apt-get source does NOT go to the mirror and look up the latest source
version on the mirror. It goes to its cache of the Sources file which
it previously downloaded in parallel with the Packages file and creates
a http:// address for the .dsc and expects to be able to get that URL
(in the same way as wget would).
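
A minimal sketch of that URL construction, assuming the cached Sources
stanza has been parsed into a dict with its Directory and Files fields.
This is an illustration, not apt's implementation; dsc_url, the mirror
host and the package names are invented for the example.

```python
def dsc_url(mirror: str, stanza: dict) -> str:
    """Build the .dsc URL from a cached Sources stanza (illustrative)."""
    directory = stanza["Directory"]
    # Each Files line is "checksum size filename".
    for line in stanza["Files"].splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[2].endswith(".dsc"):
            return "%s/%s/%s" % (mirror.rstrip("/"), directory, parts[2])
    raise ValueError("no .dsc listed in stanza")


# Made-up cached stanza; nothing here is looked up on the mirror.
cached = {
    "Package": "foo",
    "Version": "1.0-1",
    "Directory": "pool/main/f/foo",
    "Files": "\n abc123 1234 foo_1.0-1.dsc\n def456 5678 foo_1.0.orig.tar.gz",
}
print(dsc_url("http://mirror.example.org/debian/", cached))
```

The URL depends only on the cached stanza, so a newer upload on the
mirror cannot change which version is requested.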

If that file has been removed since the cache was updated, there is
nothing apt (or multistrap) can do about that. However, that is only
likely to happen with packages where a new version is uploaded to
unstable without allowing the 10 days for a testing migration.
Generally, such uploads are made because the package *won't* migrate
because of an RC bug, which makes the argument that you shouldn't be
expecting to create an image using packages which could be susceptible
to such bugs in the first place.

> Sources.bz2 is going to have some
>    packages updated to newer versions, and won't correspond 100% to
>    binaries from Packages.bz2, thus violating the GPL

Not true - unless you're talking about waiting for packages to arrive
from the buildds, but that is a very small space of time normally and
if that bothers you, just don't use unstable for the kind of tasks
where you need synchronised sources. Unstable does NOT make that
promise and there is nothing apt or multistrap can do about it.

> Of course I guess the error rate will not be too high, at least not
> over a normal high-rate internet connection.  

It's nothing to do with the download speeds.

-- 


Neil Williams
=============
http://www.linux.codehelp.co.uk/
