Bug#737085: apt: Apt downloads arch all packages from wrong repo/checks wrong checksum

2014-01-31 Thread David Kalnischkies
On Thu, Jan 30, 2014 at 03:42:13PM +0100, Julian Andres Klode wrote:
 On Thu, Jan 30, 2014 at 12:27:21PM +0000, Wookey wrote:
  +++ Julian Andres Klode [2014-01-30 08:12 +0100]:
   On Thu, Jan 30, 2014 at 03:13:16AM +0000, Wookey wrote:
  The problem is that in order to debootstrap you need all the packages in
  one repo so leaving the arch all packages in ftp.uk.debian.org means you
  can't debootstrap if you only uploaded the new-arch 'any' packages to
  the 'bootstrap' repo. It's also important to test that the arch-all
  build actually works, and not just the arch-any part so doing those
  builds and testing the results can be good. 
 
 A workaround might be to reorder the sources.list entries. The order of
 those entries determines from which source a package is retrieved; I
 believe the first match takes precedence.

The first one parsed decides which size is expected, and usually this is
also the one the package is acquired from, with the notable exception of
not downloading from an unsigned archive if a signed one is available…
so, as this bootstrap archive is signed, is the key installed?
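
To make the interaction above concrete, here is a toy model of it. This is only a sketch under assumptions: the Source struct, the function names and the sizes used with it are invented for illustration and are not apt's real data structures or acquire logic. It only mirrors the two rules described above (the first entry parsed fixes the expected size; a trusted archive is preferred for the download).

```cpp
#include <string>
#include <vector>

// Hypothetical description of one sources.list entry offering the same version.
struct Source
{
   std::string name;
   unsigned long long debSize; // size of the .deb this archive actually serves
   bool trusted;               // archive is signed and its key is installed
};

// Rule 1: the first entry parsed supplies the expected size.
// The list is assumed to be non-empty and in sources.list order.
inline unsigned long long expectedSize(const std::vector<Source> &sources)
{
   return sources.front().debSize;
}

// Rule 2: prefer the first trusted archive; otherwise use the first entry.
inline const Source &downloadSource(const std::vector<Source> &sources)
{
   for (const Source &s : sources)
      if (s.trusted)
         return s;
   return sources.front();
}

// True when the file that gets downloaded cannot match the expected size.
inline bool sizeMismatch(const std::vector<Source> &sources)
{
   return downloadSource(sources).debSize != expectedSize(sources);
}
```

With an untrusted bootstrap entry listed first and a trusted mirror second, the model reports a mismatch; listing the mirror first makes both rules pick the same archive.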


  It's fine for apt to consider these packages to be functionally
  equivalent, but it does need to check the correct checksum on download.
  It seems to me that this can be fixed by either adding size/hash to the
  hash as you suggest (making them 'different packages'), or just separately
  ensuring that the checksum for the repo/file that was downloaded is
  used. Apt knows that there is more than one repo source for this
  package, but doesn't record that there might be more than one checksum?
  The fact that it can end up choosing one checksum and another source
  does seem wrong. Perhaps the code/object structure makes it hard to fix
  this this way and your fix is the only one that makes sense?
 
 It seems right to me in this case, because otherwise functional aspects
 like dependencies could differ as well. And if APT uses the dependencies
 from one source and then fetches the package from another source, but that
 one has different dependencies, installing it would produce an error.

This situation can't happen, as you have yourself outlined that Depends
will influence the CRC hash, so they would get recognized as different
versions. That said, what could happen at the moment is that a package
could differ just by its Multi-Arch field.
(Barring hash collisions, but how likely is that…)

   An alternative would be to change the cache-building algorithms to look
   at SHA hashes and/or size and create different version entries in the
   cache if they are present in both versions, but different. SHA Hashes
   would require all repositories to use the same best checksum algorithm.
  
  I think just adding size to the hash would be cheap and easy and would
  largely solve this problem. Adding the hash would cover a few extra
  cases where the size came out the same too, but if it's difficult I'd be
  happy to have this mostly-solved, as it's a situation we normally try to
  avoid anyway.
 
 Adding the size to the hash is not possible, as dpkg does not store the
 size for installed packages. This would mean that an installed package
 always has a different hash than an available package, causing APT to
 go crazy (it would try to upgrade all installed packages...).

We could compare the size of the currently parsed version with the size
of the version we compare it with at the moment, though (as long as the
current one isn't the status file one). See attached demo-patch.
Something like that (but tested, this one isn't) could be introduced
with the next abi break. It isn't bulletproof either, but a bit better.
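
Independent of the attached patch, the idea can be sketched roughly as follows. Everything here is an assumption made for illustration: VersionInfo and sameVersion are made-up names, not apt's pkgCache types, and skipping the check when either side comes from the status file is a slightly broader reading of "as long as the current one isn't the status file one".

```cpp
#include <cstdint>

// Hypothetical, simplified view of a version entry; not apt's pkgCache types.
struct VersionInfo
{
   std::uint16_t fieldHash; // CRC over Installed-Size, Depends, ...
   std::uint64_t size;      // Size: value from a Packages file, 0 if unknown
   bool fromStatusFile;     // entry was read from /var/lib/dpkg/status
};

// Decide whether a freshly parsed entry describes the same version as an
// entry that is already in the cache.
inline bool sameVersion(const VersionInfo &parsed, const VersionInfo &existing)
{
   if (parsed.fieldHash != existing.fieldHash)
      return false;
   // dpkg does not record the size of the .deb, so the size can only be
   // compared when both entries come from a Packages file.
   if (parsed.fromStatusFile || existing.fromStatusFile)
      return true;
   return parsed.size == existing.size;
}
```

Two archive entries with equal field hashes but different sizes then become separate versions, while an installed package still matches its archive counterpart, so nothing looks upgradable by accident.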

(I wonder if it would make sense to move the comparison entirely into
 such an on-demand handling rather than this generate CRC for everyone.)


Best regards

David Kalnischkies
diff --git a/apt-pkg/deb/deblistparser.cc b/apt-pkg/deb/deblistparser.cc
index 68d544e..4fe5919 100644
--- a/apt-pkg/deb/deblistparser.cc
+++ b/apt-pkg/deb/deblistparser.cc
@@ -95,44 +95,51 @@ string debListParser::Version()
    return Section.FindS("Version");
 }
 	/*}}}*/
-// ListParser::NewVersion - Fill in the version structure		/*{{{*/
-// ---------------------------------------------------------------------
-/* */
-bool debListParser::NewVersion(pkgCache::VerIterator &Ver)
+unsigned char debListParser::ParseMultiArch(bool const showErrors)	/*{{{*/
 {
-   // Parse the section
-   Ver->Section = UniqFindTagWrite("Section");
-
-   // Parse multi-arch
+   unsigned char MA;
    string const MultiArch = Section.FindS("Multi-Arch");
    if (MultiArch.empty() == true)
-      Ver->MultiArch = pkgCache::Version::None;
+      MA = pkgCache::Version::None;
    else if (MultiArch == "same") {
       // Parse multi-arch
       if (ArchitectureAll() == true)
       {
 	 /* Arch all packages can't be Multi-Arch: same */
-	 _error->Warning("Architecture: all package '%s' can't be Multi-Arch: same",
-			Section.FindS("Package").c_str());
-	 Ver->MultiArch 

Bug#737085: apt: Apt downloads arch all packages from wrong repo/checks wrong checksum

2014-01-30 Thread Wookey
+++ Julian Andres Klode [2014-01-30 08:12 +0100]:
 On Thu, Jan 30, 2014 at 03:13:16AM +0000, Wookey wrote:
  Package: apt
  Version: 0.9.15
  Severity: important
  
  In the sources I have my own bootstrap repository containing a lot of
  (unstable) packages built for arm64, plus plain Debian unstable and
  saucy repos.
  
  apt-get install arch-all-package   (that is available in all 3 repos)
  results in a size mismatch error. It seems that apt is using the
  checksum from one repo but downloading the package from another.
  
  The package used is just an example; it seems to be the same for any
  arch all package.
  
  (debian-arm64)# apt-cache policy x11proto-scrnsaver-dev
  x11proto-scrnsaver-dev:
    Installed: (none)
    Candidate: 1.2.2-1
    Version table:
       1.2.2-1 0
          500 http://people.debian.org/~wookey/bootstrap/debianrepo2/ debianstrap/main arm64 Packages
          500 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
          500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 Packages
  
 
 Right, and that's a problem, as having two different packages with the
 same version is not really supported. APT differentiates packages with the
 same version by CRC-16 hashing the fields
   Installed-Size
   Depends
   Pre-Depends
   Conflicts
   Breaks
   Replaces
 In order to handle packages where those are the same, APT would need to hash
 size or SHA hash as well, but this fails for installed packages, as this
 information is not provided in /var/lib/dpkg/status.

OK. That makes sense. I see what's going on now. 

Which of course is why we do -B builds for other architectures and
carefully ensure there is only one copy of the arch all packages.


The problem is that in order to debootstrap you need all the packages in
one repo so leaving the arch all packages in ftp.uk.debian.org means you
can't debootstrap if you only uploaded the new-arch 'any' packages to
the 'bootstrap' repo. It's also important to test that the arch-all
build actually works, and not just the arch-any part so doing those
builds and testing the results can be good. 

It's fine for apt to consider these packages to be functionally
equivalent, but it does need to check the correct checksum on download.
It seems to me that this can be fixed by either adding size/hash to the
hash as you suggest (making them 'different packages'), or just separately
ensuring that the checksum for the repo/file that was downloaded is
used. Apt knows that there is more than one repo source for this
package, but doesn't record that there might be more than one checksum?
The fact that it can end up choosing one checksum and another source
does seem wrong. Perhaps the code/object structure makes it hard to fix
this this way and your fix is the only one that makes sense?

 An alternative would be to change the cache-building algorithms to look
 at SHA hashes and/or size and create different version entries in the cache
 if they are present in both versions, but different. SHA Hashes would require
 all repositories to use the same best checksum algorithm.

I think just adding size to the hash would be cheap and easy and would
largely solve this problem. Adding the hash would cover a few extra
cases where the size came out the same too, but if it's difficult I'd be
happy to have this mostly-solved, as it's a situation we normally try to
avoid anyway.

I am clueless about the apt codebase (and C++ if it's not fairly 'C'-ey)
but am prepared to take a stab at this if you give me a clue where to
look.

thanks for the quick response.

Wookey
-- 
Principal hats:  Linaro, Emdebian, Wookware, Balloonboard, ARM
http://wookware.org/


-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#737085: apt: Apt downloads arch all packages from wrong repo/checks wrong checksum

2014-01-30 Thread Julian Andres Klode
On Thu, Jan 30, 2014 at 12:27:21PM +0000, Wookey wrote:
 +++ Julian Andres Klode [2014-01-30 08:12 +0100]:
  On Thu, Jan 30, 2014 at 03:13:16AM +0000, Wookey wrote:
   Package: apt
   Version: 0.9.15
   Severity: important
   
   In the sources I have my own bootstrap repository containing a lot of
   (unstable) packages built for arm64, plus plain Debian unstable and
   saucy repos.
   
   apt-get install arch-all-package   (that is available in all 3 repos)
   results in a size mismatch error. It seems that apt is using the
   checksum from one repo but downloading the package from another.
   
   The package used is just an example; it seems to be the same for any
   arch all package.
   
   (debian-arm64)# apt-cache policy x11proto-scrnsaver-dev
   x11proto-scrnsaver-dev:
     Installed: (none)
     Candidate: 1.2.2-1
     Version table:
        1.2.2-1 0
           500 http://people.debian.org/~wookey/bootstrap/debianrepo2/ debianstrap/main arm64 Packages
           500 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
           500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 Packages
   
  
  Right, and that's a problem, as having two different packages with the
  same version is not really supported. APT differentiates packages with the
  same version by CRC-16 hashing the fields
  Installed-Size
  Depends
  Pre-Depends
  Conflicts
  Breaks
  Replaces
  In order to handle packages where those are the same, APT would need to hash
  size or SHA hash as well, but this fails for installed packages, as this
  information is not provided in /var/lib/dpkg/status.
 
 OK. That makes sense. I see what's going on now. 
 
 Which of course is why we do -B builds for other architectures and
 carefully ensure there is only one copy of the arch all packages.
 
 
 The problem is that in order to debootstrap you need all the packages in
 one repo so leaving the arch all packages in ftp.uk.debian.org means you
 can't debootstrap if you only uploaded the new-arch 'any' packages to
 the 'bootstrap' repo. It's also important to test that the arch-all
 build actually works, and not just the arch-any part so doing those
 builds and testing the results can be good. 

A workaround might be to reorder the sources.list entries. The order of
those entries determines from which source a package is retrieved; I
believe the first match takes precedence.

 
 It's fine for apt to consider these packages to be functionally
 equivalent, but it does need to check the correct checksum on download.
 It seems to me that this can be fixed by either adding size/hash to the
 hash as you suggest (making them 'different packages'), or just separately
 ensuring that the checksum for the repo/file that was downloaded is
 used. Apt knows that there is more than one repo source for this
 package, but doesn't record that there might be more than one checksum?
 The fact that it can end up choosing one checksum and another source
 does seem wrong. Perhaps the code/object structure makes it hard to fix
 this this way and your fix is the only one that makes sense?

It seems right to me in this case, because otherwise functional aspects
like dependencies could differ as well. And if APT uses the dependencies
from one source and then fetches the package from another source, but that
one has different dependencies, installing it would produce an error.

 
  An alternative would be to change the cache-building algorithms to look
  at SHA hashes and/or size and create different version entries in the cache
  if they are present in both versions, but different. SHA Hashes would
  require all repositories to use the same best checksum algorithm.
 
 I think just adding size to the hash would be cheap and easy and would
 largely solve this problem. Adding the hash would cover a few extra
 cases where the size came out the same too, but if it's difficult I'd be
 happy to have this mostly-solved, as it's a situation we normally try to
 avoid anyway.

Adding the size to the hash is not possible, as dpkg does not store the
size for installed packages. This would mean that an installed package
always has a different hash than an available package, causing APT to
go crazy (it would try to upgrade all installed packages...).

David or Michael probably have some more ideas.

-- 
Julian Andres Klode  - Debian Developer, Ubuntu Member

See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.

Please do not top-post if possible.





Bug#737085: apt: Apt downloads arch all packages from wrong repo/checks wrong checksum

2014-01-30 Thread Wookey
+++ Julian Andres Klode [2014-01-30 15:42 +0100]:
 On Thu, Jan 30, 2014 at 12:27:21PM +0000, Wookey wrote:
  +++ Julian Andres Klode [2014-01-30 08:12 +0100]:
   On Thu, Jan 30, 2014 at 03:13:16AM +0000, Wookey wrote:
    Package: apt
    Version: 0.9.15
    Severity: important
    
    In the sources I have my own bootstrap repository containing a lot of
    (unstable) packages built for arm64, plus plain Debian unstable and
    saucy repos.
    
    apt-get install arch-all-package   (that is available in all 3 repos)
    results in a size mismatch error. It seems that apt is using the
    checksum from one repo but downloading the package from another.
    
    The package used is just an example; it seems to be the same for any
    arch all package.
    
    (debian-arm64)# apt-cache policy x11proto-scrnsaver-dev
    x11proto-scrnsaver-dev:
      Installed: (none)
      Candidate: 1.2.2-1
      Version table:
         1.2.2-1 0
            500 http://people.debian.org/~wookey/bootstrap/debianrepo2/ debianstrap/main arm64 Packages
            500 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
            500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 Packages

   
   Right, and that's a problem, as having two different packages with the
   same version is not really supported. 

  OK. That makes sense. I see what's going on now. 
  
  The problem is that in order to debootstrap you need all the packages in
  one repo so leaving the arch all packages in ftp.uk.debian.org means you
  can't debootstrap if you only uploaded the new-arch 'any' packages to
  the 'bootstrap' repo. It's also important to test that the arch-all
  build actually works, and not just the arch-any part so doing those
  builds and testing the results can be good. 
 
 A workaround might be to reorder the sources.list entries. The order of
 those entries determines from which source a package is retrieved; I
 believe the first match takes precedence.

Ha. That does indeed provide a working workaround :-)

Moving the repo that the 'all' packages are being downloaded from to the
top of the list makes it work.

so moving ftp.uk.debian.org above p.d.o/~wookey/bootstrap

# apt-cache policy x11proto-scrnsaver-dev
x11proto-scrnsaver-dev:
  Installed: 1.2.2-1
  Candidate: 1.2.2-1
  Version table:
 *** 1.2.2-1 0
        550 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
       1001 http://people.debian.org/~wookey/bootstrap/debianrepo2/ debianstrap/main arm64 Packages
        500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 Packages
        100 /var/lib/dpkg/status

means that both downloads and checksums come from ftp.uk.debian.org

I'll use it like this for a bit and see if it always works, or just
sometimes :-)

This isn't really any sort of actual 'solution', but it's a very handy
suggestion.

Wookey
-- 
Principal hats:  Linaro, Emdebian, Wookware, Balloonboard, ARM
http://wookware.org/





Bug#737085: apt: Apt downloads arch all packages from wrong repo/checks wrong checksum

2014-01-29 Thread Wookey
Package: apt
Version: 0.9.15
Severity: important

In the sources I have my own bootstrap repository containing a lot of
(unstable) packages built for arm64, plus plain Debian unstable and
saucy repos.

apt-get install arch-all-package   (that is available in all 3 repos)
results in a size mismatch error. It seems that apt is using the
checksum from one repo but downloading the package from another.

The package used is just an example; it seems to be the same for any
arch all package.

(debian-arm64)# apt-cache policy x11proto-scrnsaver-dev
x11proto-scrnsaver-dev:
  Installed: (none)
  Candidate: 1.2.2-1
  Version table:
     1.2.2-1 0
        500 http://people.debian.org/~wookey/bootstrap/debianrepo2/ debianstrap/main arm64 Packages
        500 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
        500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 Packages

#apt-get install x11proto-scrnsaver-dev
Reading package lists... Done
Building dependency tree   
Reading state information... Done
The following NEW packages will be installed:
  x11proto-scrnsaver-dev
0 upgraded, 1 newly installed, 0 to remove and 118 not upgraded.
Need to get 22.3 kB of archives.
After this operation, 106 kB of additional disk space will be used.
Get:1 http://ftp.uk.debian.org/debian/ unstable/main x11proto-scrnsaver-dev all 1.2.2-1 [22.3 kB]
Fetched 25.0 kB in 0s (1526 kB/s)
E: Failed to fetch http://ftp.uk.debian.org/debian/pool/main/x/x11proto-scrnsaver/x11proto-scrnsaver-dev_1.2.2-1_all.deb  Size mismatch

wget http://ftp.uk.debian.org/debian/pool/main/x/x11proto-scrnsaver/x11proto-scrnsaver-dev_1.2.2-1_all.deb
wget http://people.debian.org/~wookey/bootstrap/debianrepo2/pool/main/x/x11proto-scrnsaver/x11proto-scrnsaver-dev_1.2.2-1_all.deb
This is the one from ftp.uk.debian.org:
(debian-arm64)# md5sum x11proto-scrnsaver-dev_1.2.2-1_all.deb
fc8b3d0bc4c7e7aefa0177d94382adc4  x11proto-scrnsaver-dev_1.2.2-1_all.deb
This is the one from people.debian.org:
(debian-arm64)# md5sum x11proto-scrnsaver-dev_1.2.2-1_all.deb.1 
842270da2db205f3819a4dbaf4a75658  x11proto-scrnsaver-dev_1.2.2-1_all.deb.1

Looking in the Packages files, those checksums are correct:
/var/lib/apt/lists/ftp.uk.debian.org_debian_dists_unstable_main_binary-amd64_Packages
MD5sum: fc8b3d0bc4c7e7aefa0177d94382adc4
SHA1: 5660bef42accd401efc3a04056330a9e34cbaf2d
SHA256: 505bb5098c80355c4474df5c8b3677fe1fda74764a52a29f7afca8e3df0603ad

/var/lib/apt/lists/people.debian.org_%7ewookey_bootstrap_debianrepo2_dists_debianstrap_main_binary-arm64_Packages
SHA256: e00c64cd6cab5e0eef91fb18440ec78827aeeb6452f79f450fb37acaa16f7984
SHA1: 83177ab07be653b427cb3d0d94a05f47f4a49a87
MD5sum: 842270da2db205f3819a4dbaf4a75658

So there is no reason why it should be saying 'size mismatch'.
A clue may be that if we set some pinning the 'wrong' .deb gets downloaded:

# apt-cache policy x11proto-scrnsaver-dev
x11proto-scrnsaver-dev:
  Installed: (none)
  Candidate: 1.2.2-1
  Version table:
     1.2.2-1 0
       1001 http://people.debian.org/~wookey/bootstrap/debianrepo2/ debianstrap/main arm64 Packages
        550 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
        500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 Packages

# apt-get install x11proto-scrnsaver-dev
Reading package lists... Done
Building dependency tree   
Reading state information... Done
The following NEW packages will be installed:
  x11proto-scrnsaver-dev
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 22.3 kB of archives.
After this operation, 106 kB of additional disk space will be used.
Get:1 http://ftp.uk.debian.org/debian/ unstable/main x11proto-scrnsaver-dev all 1.2.2-1 [22.3 kB]
Fetched 25.0 kB in 0s (0 B/s)
E: Failed to fetch http://ftp.uk.debian.org/debian/pool/main/x/x11proto-scrnsaver/x11proto-scrnsaver-dev_1.2.2-1_all.deb  Size mismatch

Should it not choose the repo with the highest pinning?
Is it getting the MD5SUM from one source but the binary from another?

If I remove 2 of the sources so that only one is available, then
x11proto-scrnsaver-dev is downloaded and installed OK.

# apt-get install x11proto-scrnsaver-dev/debianstrap
still downloads the one from http://ftp.uk.debian.org/debian/ and still gets
the size mismatch.
Specifying a codename just affects the version selection, not where it
is downloaded from (which would be fine if it checked the right checksum :)

# apt-get install  x11proto-scrnsaver-dev/debianstrap 
Reading package lists... Done
Building dependency tree   
Reading state information... Done
Selected version '1.2.2-1' (Multiarch native-bootstrap packages:people.debian.org, Debian:unstable [all]) for 'x11proto-scrnsaver-dev'
The following NEW packages will be installed:
  x11proto-scrnsaver-dev
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 22.3 kB of archives.
After this operation, 106 kB of additional disk space will be used.
Get:1 http://ftp.uk.debian.org/debian/ 

Bug#737085: apt: Apt downloads arch all packages from wrong repo/checks wrong checksum

2014-01-29 Thread Julian Andres Klode
On Thu, Jan 30, 2014 at 03:13:16AM +0000, Wookey wrote:
 Package: apt
 Version: 0.9.15
 Severity: important
 
 In the sources I have my own bootstrap repository containing a lot of
 (unstable) packages built for arm64, plus plain Debian unstable and
 saucy repos.
 
 apt-get install arch-all-package   (that is available in all 3 repos)
 results in a size mismatch error. It seems that apt is using the
 checksum from one repo but downloading the package from another.
 
 The package used is just an example; it seems to be the same for any
 arch all package.
 
 (debian-arm64)# apt-cache policy x11proto-scrnsaver-dev
 x11proto-scrnsaver-dev:
   Installed: (none)
   Candidate: 1.2.2-1
   Version table:
      1.2.2-1 0
         500 http://people.debian.org/~wookey/bootstrap/debianrepo2/ debianstrap/main arm64 Packages
         500 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
         500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 Packages
 

Right, and that's a problem, as having two different packages with the
same version is not really supported. APT differentiates packages with the
same version by CRC-16 hashing the fields
Installed-Size
Depends
Pre-Depends
Conflicts
Breaks
Replaces
In order to handle packages where those are the same, APT would need to hash
size or SHA hash as well, but this fails for installed packages, as this
information is not provided in /var/lib/dpkg/status.
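
A minimal sketch of that kind of field hash may help to show why two rebuilds of an arch all package end up as one version. It is illustrative only: the Stanza type and versionHash are hypothetical names, and the bitwise CRC-16 below (CCITT polynomial 0x1021) is a generic stand-in that may differ from APT's own routine.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for one Packages stanza: field name -> value.
using Stanza = std::map<std::string, std::string>;

// Generic bitwise CRC-16 with the CCITT polynomial 0x1021.
// This is NOT claimed to be the exact routine APT uses.
inline std::uint16_t crc16(std::uint16_t crc, const std::string &data)
{
   for (unsigned char byte : data)
   {
      crc ^= static_cast<std::uint16_t>(byte << 8);
      for (int bit = 0; bit < 8; ++bit)
         crc = (crc & 0x8000) ? static_cast<std::uint16_t>((crc << 1) ^ 0x1021)
                              : static_cast<std::uint16_t>(crc << 1);
   }
   return crc;
}

// Hash only the fields listed above; Size and the checksum fields are
// deliberately left out, just as described.
inline std::uint16_t versionHash(const Stanza &pkg)
{
   static const char *const fields[] = {"Installed-Size", "Depends", "Pre-Depends",
                                        "Conflicts", "Breaks", "Replaces"};
   std::uint16_t crc = 0xFFFF;
   for (const char *name : fields)
   {
      auto const it = pkg.find(name);
      if (it != pkg.end())
         crc = crc16(crc, it->second);
   }
   return crc;
}
```

Two stanzas that differ only in Size or SHA256 produce the same versionHash and so look like one version, whereas a change in Depends alters the hash.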

An alternative would be to change the cache-building algorithms to look
at SHA hashes and/or size and create different version entries in the cache
if they are present in both versions, but different. SHA Hashes would require
all repositories to use the same best checksum algorithm.

-- 
Julian Andres Klode  - Debian Developer, Ubuntu Member

See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.

Please do not top-post if possible.

