Bug#504413: KeyError: problem

2008-11-04 Thread John Wright
Hi Pietro,

Please check out my branch:

[EMAIL PROTECTED]:/tmp$ git clone git://git.debian.org/git/pkg-python-debian/python-debian.git
Initialized empty Git repository in /tmp/python-debian/.git/
remote: Counting objects: 1209, done.
remote: Compressing objects: 100% (338/338), done.
remote: Total 1209 (delta 858), reused 1198 (delta 853)
Receiving objects: 100% (1209/1209), 341.36 KiB | 263 KiB/s, done.
Resolving deltas: 100% (858/858), done.
[EMAIL PROTECTED]:/tmp$ cd python-debian/
[EMAIL PROTECTED]:/tmp/python-debian$ git checkout origin/jsw/apt-pkg-without-shared-storage
Note: moving to origin/jsw/apt-pkg-without-shared-storage which isn't a local branch
If you want to create a new branch from this checkout, you may do so
(now or later) by using -b with the checkout command again. Example:
  git checkout -b new_branch_name
HEAD is now at f8c3ca7... Add a use_apt_pkg parameter to Deb822.iter_paragraphs

I have added a use_apt_pkg parameter to the iter_paragraphs method, and
changed the behavior of the shared_storage parameter.  Now, use_apt_pkg
just turns on or off the usage of the (fast) apt_pkg parser, while
shared_storage determines whether to then copy the parsed data (when
False) or to use apt_pkg's internal Tags objects directly (which is
faster than copying, but results in objects not really being consistent
across iterations, as you found).
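
To illustrate the trade-off described above, here is a minimal sketch. It
does not use python-debian or apt_pkg; the iter_paragraphs function below is
a hypothetical toy parser whose shared_storage flag only mimics the idea of
reusing one internal object versus copying the parsed data for each
paragraph.

```python
def iter_paragraphs(text, shared_storage=True):
    """Toy stand-in for a paragraph iterator (not the python-debian API)."""
    buffer = {}  # single dict reused for every paragraph when sharing
    for block in text.strip().split("\n\n"):
        buffer.clear()
        for line in block.splitlines():
            key, _, value = line.partition(":")
            buffer[key.strip()] = value.strip()
        # Sharing yields the same object each time; copying yields a snapshot.
        yield buffer if shared_storage else dict(buffer)


SAMPLE = """Package: acx100-source
Version: 20070101-3

Package: alien-arena
Version: 7.0-1
"""

shared = list(iter_paragraphs(SAMPLE, shared_storage=True))
copied = list(iter_paragraphs(SAMPLE, shared_storage=False))

# With sharing, every saved paragraph is the same object and shows only
# the data parsed last; with copying, each paragraph keeps its own data.
shared_names = [p["Package"] for p in shared]
copied_names = [p["Package"] for p in copied]
print(shared_names)
print(copied_names)
```

In this toy version the shared variant avoids one dict copy per paragraph,
which is the kind of saving weighed against consistency above.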

For now, both parameters default to True, to match previous behavior.
I'm going to run a few benchmarks to see if it actually makes sense to
default to shared_storage=False, since that's definitely the option that
would cause the fewest surprises.  I think the main speed benefit to
shared_storage previously was that it used apt_pkg's parser, not that it
didn't keep data across iterations.

Anyway, I plan to pull this into the master branch and upload it after
I've tested it a bit more (and probably written a few more unit tests).

Thanks,
-- 
John Wright [EMAIL PROTECTED]



-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]



Bug#504413: KeyError: problem

2008-11-04 Thread John Wright
tags 504413 + pending
thanks

On Tue, Nov 04, 2008 at 03:20:16PM +0100, Pietro Abate wrote:
> On Tue, Nov 04, 2008 at 02:43:22AM -0700, John Wright wrote:
> > Anyway, I plan to pull this into the master branch and upload it after
> > I've tested it a bit more (and probably written a few more unit tests).
>
> It seems that using shared_storage=False (fast parser and implicit copy)
> gives the best result. I think this should be the default, as it's
> backward compatible and iteration-safe.

I wrote some unit tests, found and fixed a nasty bug with
use_apt_pkg=True and shared_storage=False (key order wasn't being
preserved).  I also figured out how to make the gpg verification stuff
play nice with this, so I'm a bit more comfortable pushing it back up
now.

I made a couple of small optimizations that resulted in pretty
significant performance advantages over what I had before.  As you can
see, except for raw Deb822 paragraphs (with none of the
multivalued-fields magic), there is little difference between using
shared_storage=False and shared_storage=True.  Since the performance is
so similar, I've made use_apt_pkg=True, shared_storage=False the
default.  I've attached a little benchmark script and the results on my
system.

-- 
John Wright [EMAIL PROTECTED]
#!/usr/bin/python

import datetime
from debian_bundle import deb822

PACKAGES = "/var/lib/apt/lists/linuxcoe.corp.hp.com_LinuxCOE_Debian_dists_sid_main_binary-i386_Packages"

def benchmark(func, *args, **kwargs):
    # Time a single call to func and print the elapsed wall-clock time.
    t_0 = datetime.datetime.now()
    func(*args, **kwargs)
    t_f = datetime.datetime.now()
    print "Elapsed time:", t_f - t_0


def iter_through_all(cls, *args, **kwargs):
    # Consume every paragraph without keeping any of them.
    iterator = cls.iter_paragraphs(*args, **kwargs)
    for p in iterator:
        pass


if __name__ == "__main__":

    for cls in deb822.Deb822, deb822.Packages:
        print "Class:", cls

        print "use_apt_pkg=True, shared_storage=True"
        benchmark(iter_through_all, cls, open(PACKAGES),
                  use_apt_pkg=True, shared_storage=True)
        print "use_apt_pkg=True, shared_storage=False"
        benchmark(iter_through_all, cls, open(PACKAGES),
                  use_apt_pkg=True, shared_storage=False)
        print "use_apt_pkg=False, shared_storage=False"
        benchmark(iter_through_all, cls, open(PACKAGES),
                  use_apt_pkg=False, shared_storage=False)
        print
[EMAIL PROTECTED]:~/debian/python-debian/benchmark$ python benchmark.py 
Class: <class 'debian_bundle.deb822.Deb822'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:02.137641
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:06.978174
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:23.644512

Class: <class 'debian_bundle.deb822.Packages'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:12.842322
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:18.014803
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:28.468205

[EMAIL PROTECTED]:~/debian/python-debian/benchmark$ python benchmark.py 
Class: <class 'debian_bundle.deb822.Deb822'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:02.183086
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:07.134066
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:23.840483

Class: <class 'debian_bundle.deb822.Packages'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:12.793268
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:17.962679
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:28.890604

[EMAIL PROTECTED]:~/debian/python-debian/benchmark$ python benchmark.py 
Class: <class 'debian_bundle.deb822.Deb822'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:02.184487
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:06.981271
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:23.387675

Class: <class 'debian_bundle.deb822.Packages'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:13.834592
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:18.01
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:28.221635


Bug#504413: KeyError: problem

2008-11-03 Thread Pietro Abate
Package: python-debian
Version: 0.1.11
Severity: normal


This snippet of code highlights a problem that is probably related to
the fact that keys are saved at the class level and not at the instance
level (something to do with shared_storage, maybe?).
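
For readers unfamiliar with the distinction, the sketch below shows the
general Python pitfall this guess refers to: a mutable attribute defined on
the class is shared by all instances, while one created in __init__ is not.
The class names SharedKeys and OwnKeys are made up for illustration; this is
not python-debian's code, and it does not establish that this is the actual
cause of the bug.

```python
class SharedKeys:
    keys = []  # class attribute: one list shared by every instance

    def add(self, key):
        self.keys.append(key)


class OwnKeys:
    def __init__(self):
        self.keys = []  # instance attribute: a fresh list per object

    def add(self, key):
        self.keys.append(key)


a, b = SharedKeys(), SharedKeys()
a.add("Package")
b.add("Version")

c, d = OwnKeys(), OwnKeys()
c.add("Package")
d.add("Version")

print(a.keys, b.keys)  # both names refer to the same shared list
print(c.keys, d.keys)  # each object has its own list
```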

:)
p


- test.py --

import debian_bundle.deb822
import sys

paragraphs = debian_bundle.deb822.Packages.iter_paragraphs(open('/var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_contrib_binary-amd64_Packages','r'))

pkglist = {}

pkg = paragraphs.next()
k = (pkg['package'],pkg['version'])
pkglist[k] = pkg
print pkg

pkg = paragraphs.next()
k = (pkg['package'],pkg['version'])
pkglist[k] = pkg
print pkg

pkg = paragraphs.next()
k = (pkg['package'],pkg['version'])
pkglist[k] = pkg
print pkg

print pkglist

sys.exit(0)



[EMAIL PROTECTED]:~/Projects/$python test.py
Package: acx100-source
Priority: extra
Section: contrib/net
Installed-Size: 292
Maintainer: Stefano Canepa [EMAIL PROTECTED]
Architecture: all
Source: acx100
Version: 20070101-3
Depends: debhelper (= 4.0), dpatch, module-assistant
Filename: pool/contrib/a/acx100/acx100-source_20070101-3_all.deb
Size: 224900
MD5sum: f9673656b0c676c22bd64c351dd7536d
SHA1: 3cac7ad52e0215d1603605fc43d8ce3c81594990
SHA256: 6ad06bb46bdf8a7d941d5d3f7e02e976dbfe6ce3f26fb5067590bd044d44cb08
Description: ACX100/ACX111 wireless network drivers source
 This package provides the source code of the Linux drivers for wireless
 network cards using TI ACX100/ACX111 chips. This includes DWL-[G]520+
 PCI, DWL-[G]650+ CardBus, GL-2422MP mini-PCI, DWL-120+ USB, USR5410, etc.
 See http://acx100.sourceforge.net/ for information about your wireless
 device.
 .
 In order to compile the kernel modules you need the kernel sources (or
 the kernel-headers for the kernel-image packages from Debian). For
 compile instructions look into usr/share/doc/acx100-source/README.Debian
 or simply use the module-assistant utility.
 .
 Please also note that the ACX100/111 chips need a firmware to be
 operational. You can get this firmware from the Microsoft Windows or
 from homepage.
Homepage: http://lisas.de/~andi/acx100/
Tag: admin::kernel, implemented-in::c, role::source, use::driver

Package: alien-arena
Priority: extra
Section: contrib/games
Installed-Size: 1420
Maintainer: Debian Games Team [EMAIL PROTECTED]
Architecture: amd64
Version: 7.0-1
Depends: libc6 (= 2.7-1), libcurl3-gnutls (= 7.16.2-1), libgl1-mesa-glx | 
libgl1, libglu1-mesa | libglu1, libjpeg62, libsdl1.2debian (= 1.2.10-1), 
libx11-6, libxext6, libxxf86dga1, libxxf86vm1, alien-arena-data (= 7.0), 
alien-arena-data ( 7.1)
Suggests: alien-arena-browser
Filename: pool/contrib/a/alien-arena/alien-arena_7.0-1_amd64.deb
Size: 642848
MD5sum: fed7c664a13bad08f7c3368443543376
SHA1: 101a3ad9bf6da67b4c5baafb2883d57181df9bc5
SHA256: 3f3dccf309cdd66d629b6bd84a8870e059fed12ce333cb76088259af31ca4073
Description: Standalone 3D first person online deathmatch shooter
 ALIEN ARENA is a standalone 3D first person online deathmatch shooter
 crafted from the original source code of Quake II and Quake III, released
 by id Software under the GPL license. With features including 32 bit
 graphics, new particle engine and effects, light blooms, reflective water,
 hi resolution textures and skins, hi poly models, stain maps, ALIEN ARENA
 pushes the envelope of graphical beauty rivaling today's top games.
 .
 This package installs the SDL client for Alien Arena.
Homepage: http://red.planetarena.org
Tag: game::fps, implemented-in::c, interface::3d, interface::x11, 
network::client, role::program, use::gameplaying, x11::application

Package: alien-arena-browser
Priority: extra
Section: contrib/games
Installed-Size: 160
Maintainer: Debian Games Team [EMAIL PROTECTED]
Architecture: all
Source: alien-arena
Version: 7.0-1
Depends: alien-arena (= 7.0-1), ruby-gnome2, ruby
Filename: pool/contrib/a/alien-arena/alien-arena-browser_7.0-1_all.deb
Size: 37128
MD5sum: 343aab4c68a02f5fac368392500bbfe2
SHA1: a1e20abdb74a80ce2dd56953481e1d82e61fb8a4
SHA256: dc7df3afa6c25f32753be48042186d15f9048a9a218930c890f4b0141ffa0613
Description: stand alone server browser for Alien Arena
 ALIEN ARENA is a standalone 3D first person online deathmatch shooter
 crafted from the original source code of Quake II and Quake III, released
 by id Software under the GPL license. With features including 32 bit
 graphics, new particle engine and effects, light blooms, reflective water,
 hi resolution textures and skins, hi poly models, stain maps, ALIEN ARENA
 pushes the envelope of graphical beauty rivaling today's top games.
 .
 This package installs the stand alone server browser for Alien Arena which
 allows for browsing available matches without having to launch the game.
Homepage: http://red.planetarena.org
Tag: implemented-in::ruby, role::program, use::gameplaying

{('alien-arena', '7.0-1'): Traceback (most recent call last):
  File "test.py", line 24, in <module>
    print pkglist
  File 

Bug#504413: KeyError: problem

2008-11-03 Thread John Wright
Hi Pietro,

On Mon, Nov 03, 2008 at 06:10:50PM +0100, Pietro Abate wrote:
> Package: python-debian
> Version: 0.1.11
> Severity: normal
>
>
> This snippet of code highlights this problem that is probably related to
> the fact that keys are saved at the class level and not at the instance
> level (something to do with shared_storage maybe ??).

I haven't looked too deep into this, but it doesn't happen when you call
iter_paragraphs with shared_storage=False.  If you still wish to use
apt_pkg to parse the Packages file, you can work around this issue for
now by making a copy of each paragraph, e.g. instead of

  pkglist[k] = pkg

do something like

  pkglist[k] = debian_bundle.deb822.Packages(pkg)

I'll try to see if it's possible to keep track of old apt_pkg-backed
objects without explicitly making copies, but I suspect that's a price
we pay for the fast parsing...
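
The copy workaround can be sketched without python-debian installed. In
this sketch shared_iterator is a hypothetical stand-in for an iterator that
reuses one underlying object, and dict(pkg) plays the role that
debian_bundle.deb822.Packages(pkg) plays in the suggestion above.

```python
def shared_iterator(records):
    """Hypothetical iterator that reuses one dict for every record."""
    current = {}
    for record in records:
        current.clear()
        current.update(record)
        yield current


RECORDS = [
    {"package": "acx100-source", "version": "20070101-3"},
    {"package": "alien-arena", "version": "7.0-1"},
    {"package": "alien-arena-browser", "version": "7.0-1"},
]

aliased, copied = {}, {}
for pkg in shared_iterator(RECORDS):
    key = (pkg["package"], pkg["version"])
    aliased[key] = pkg        # stores a reference to the reused dict
    copied[key] = dict(pkg)   # stores an independent snapshot

first = ("acx100-source", "20070101-3")
print(aliased[first]["package"])  # reflects whatever was parsed last
print(copied[first]["package"])   # still the first package
```

The cost of the workaround is one extra copy per stored paragraph, paid
only for the paragraphs the caller actually keeps.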

Thanks for the report!
-- 
John Wright [EMAIL PROTECTED]


