Bug#504413: KeyError: problem
Hi Pietro,

Please check out my branch:

[EMAIL PROTECTED]:/tmp$ git clone git://git.debian.org/git/pkg-python-debian/python-debian.git
Initialized empty Git repository in /tmp/python-debian/.git/
remote: Counting objects: 1209, done.
remote: Compressing objects: 100% (338/338), done.
remote: Total 1209 (delta 858), reused 1198 (delta 853)
Receiving objects: 100% (1209/1209), 341.36 KiB | 263 KiB/s, done.
Resolving deltas: 100% (858/858), done.
[EMAIL PROTECTED]:/tmp$ cd python-debian/
[EMAIL PROTECTED]:/tmp/python-debian$ git checkout origin/jsw/apt-pkg-without-shared-storage
Note: moving to origin/jsw/apt-pkg-without-shared-storage which isn't a local branch
If you want to create a new branch from this checkout, you may do so
(now or later) by using -b with the checkout command again. Example:
  git checkout -b new_branch_name
HEAD is now at f8c3ca7... Add a use_apt_pkg parameter to Deb822.iter_paragraphs

I have added a use_apt_pkg parameter to the iter_paragraphs method, and
changed the behavior of the shared_storage parameter. Now, use_apt_pkg
just turns on or off the usage of the (fast) apt_pkg parser, while
shared_storage determines whether to then copy the parsed data (when
False) or to use apt_pkg's internal Tags objects directly (which is
faster than copying, but results in objects not really being consistent
across iterations, as you found).

For now, both parameters default to True, to match previous behavior.
I'm going to run a few benchmarks to see if it actually makes sense to
default to shared_storage=False, since that's definitely the option that
would cause the fewest surprises. I think the main speed benefit to
shared_storage previously was that it used apt_pkg's parser, not that it
didn't keep data across iterations.

Anyway, I plan to pull this into the master branch and upload it after
I've tested it a bit more (and probably written a few more unit tests).

Thanks,

--
John Wright [EMAIL PROTECTED]

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe.
Trouble? Contact [EMAIL PROTECTED]
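
The two shared_storage modes described in this message can be illustrated without python-debian at all. The sketch below is a toy stand-in, not the library's implementation: iter_records and its "Key: value" parsing are hypothetical, and only the parameter name shared_storage is taken from the message. It assumes the shared mode behaves like a parser that reuses one internal buffer for every record.

```python
def iter_records(blocks, shared_storage=True):
    """Yield one dict per block of "Key: value" lines.

    Hypothetical stand-in for a paragraph iterator; not python-debian code.
    """
    buffer = {}  # reused for every record, like a parser's internal state
    for block in blocks:
        buffer.clear()
        for line in block.splitlines():
            key, _, value = line.partition(": ")
            buffer[key] = value
        # shared_storage=True hands out the reused buffer itself;
        # shared_storage=False hands out an independent copy.
        yield buffer if shared_storage else dict(buffer)


BLOCKS = ["Package: a\nVersion: 1", "Package: b\nVersion: 2"]

shared = list(iter_records(BLOCKS, shared_storage=True))
copied = list(iter_records(BLOCKS, shared_storage=False))

# With shared storage every entry is the same object, holding the last record.
print(shared[0] is shared[1], shared[0]["Package"])
# With copies each entry keeps the record it was created from.
print(copied[0] is copied[1], copied[0]["Package"])
```

The copy costs one extra dict construction per record, which is the trade-off between speed and consistency across iterations that the message describes.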
Bug#504413: KeyError: problem
tags 504413 + pending
thanks

On Tue, Nov 04, 2008 at 03:20:16PM +0100, Pietro Abate wrote:
> On Tue, Nov 04, 2008 at 02:43:22AM -0700, John Wright wrote:
> > Anyway, I plan to pull this into the master branch and upload it after
> > I've tested it a bit more (and probably written a few more unit tests).
>
> It seems that using shared_storage=False (fast parser and implicit copy)
> gives the best result. I think this should be the default as it's
> backward compatible and it's iteration-safe.

I wrote some unit tests, and found and fixed a nasty bug with
use_apt_pkg=True and shared_storage=False (key order wasn't being
preserved). I also figured out how to make the gpg verification stuff
play nice with this, so I'm a bit more comfortable pushing it back up
now.

I made a couple of small optimizations that resulted in pretty
significant performance advantages over what I had before. As you can
see, except for raw Deb822 paragraphs (with none of the
multivalued-fields magic), there is little difference between using
shared_storage=False and shared_storage=True. Since the performance is
so similar, I've made use_apt_pkg=True, shared_storage=False the
default.

I've attached a little benchmark script and the results on my system.

--
John Wright [EMAIL PROTECTED]

#!/usr/bin/python

import datetime
from debian_bundle import deb822

PACKAGES = '/var/lib/apt/lists/linuxcoe.corp.hp.com_LinuxCOE_Debian_dists_sid_main_binary-i386_Packages'

def benchmark(func, *args, **kwargs):
    # Time a single call of func and print the wall-clock duration.
    t_0 = datetime.datetime.now()
    func(*args, **kwargs)
    t_f = datetime.datetime.now()
    print 'Elapsed time:', t_f - t_0

def iter_through_all(cls, *args, **kwargs):
    # Walk every paragraph without doing anything with it.
    iterator = cls.iter_paragraphs(*args, **kwargs)
    for p in iterator:
        pass

if __name__ == '__main__':
    for cls in deb822.Deb822, deb822.Packages:
        print 'Class:', cls
        print 'use_apt_pkg=True, shared_storage=True'
        benchmark(iter_through_all, cls, open(PACKAGES),
                  use_apt_pkg=True, shared_storage=True)
        print 'use_apt_pkg=True, shared_storage=False'
        benchmark(iter_through_all, cls, open(PACKAGES),
                  use_apt_pkg=True, shared_storage=False)
        print 'use_apt_pkg=False, shared_storage=False'
        benchmark(iter_through_all, cls, open(PACKAGES),
                  use_apt_pkg=False, shared_storage=False)
        print

[EMAIL PROTECTED]:~/debian/python-debian/benchmark$ python benchmark.py
Class: <class 'debian_bundle.deb822.Deb822'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:02.137641
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:06.978174
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:23.644512

Class: <class 'debian_bundle.deb822.Packages'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:12.842322
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:18.014803
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:28.468205

[EMAIL PROTECTED]:~/debian/python-debian/benchmark$ python benchmark.py
Class: <class 'debian_bundle.deb822.Deb822'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:02.183086
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:07.134066
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:23.840483

Class: <class 'debian_bundle.deb822.Packages'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:12.793268
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:17.962679
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:28.890604

[EMAIL PROTECTED]:~/debian/python-debian/benchmark$ python benchmark.py
Class: <class 'debian_bundle.deb822.Deb822'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:02.184487
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:06.981271
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:23.387675

Class: <class 'debian_bundle.deb822.Packages'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:13.834592
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:18.01
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:28.221635
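
For readers who want relative rather than absolute numbers, the first benchmark run above can be turned into ratios. This is only arithmetic on the posted figures: the FIRST_RUN table and the slowdown helper are names made up for this sketch, and the timings are copied from the first run as shown, so they describe one machine and one Packages file.

```python
from datetime import timedelta

# Elapsed times copied from the first benchmark run above.
# Keys are (use_apt_pkg, shared_storage).
FIRST_RUN = {
    "Deb822": {
        (True, True): timedelta(seconds=2, microseconds=137641),
        (True, False): timedelta(seconds=6, microseconds=978174),
        (False, False): timedelta(seconds=23, microseconds=644512),
    },
    "Packages": {
        (True, True): timedelta(seconds=12, microseconds=842322),
        (True, False): timedelta(seconds=18, microseconds=14803),
        (False, False): timedelta(seconds=28, microseconds=468205),
    },
}


def slowdown(timings, config, baseline=(True, True)):
    """Return how many times slower `config` is than `baseline`."""
    return timings[config] / timings[baseline]


for name, timings in FIRST_RUN.items():
    copy_cost = slowdown(timings, (True, False))
    pure_python = slowdown(timings, (False, False))
    print(f"{name}: copying takes {copy_cost:.2f}x and the pure Python "
          f"parser {pure_python:.2f}x the shared-storage time")
```

On these figures the copy costs roughly 1.4 times the shared-storage time for Packages and a little over 3 times for raw Deb822, while both remain well ahead of the pure Python parser.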
Bug#504413: KeyError: problem
Package: python-debian
Version: 0.1.11
Severity: normal

This snippet of code highlights this problem, which is probably related
to the fact that keys are saved at the class level and not at the
instance level (something to do with shared_storage maybe ??).

:)

p

-- test.py --
import sys
import debian_bundle.deb822

paragraphs = debian_bundle.deb822.Packages.iter_paragraphs(open('/var/lib/apt/lists/ftp.fr.debian.org_debian_dists_unstable_contrib_binary-amd64_Packages','r'))

pkglist = {}

pkg = paragraphs.next()
k = (pkg['package'],pkg['version'])
pkglist[k] = pkg
print pkg

pkg = paragraphs.next()
k = (pkg['package'],pkg['version'])
pkglist[k] = pkg
print pkg

pkg = paragraphs.next()
k = (pkg['package'],pkg['version'])
pkglist[k] = pkg
print pkg

print pkglist
sys.exit(0)

[EMAIL PROTECTED]:~/Projects/$python test.py
Package: acx100-source
Priority: extra
Section: contrib/net
Installed-Size: 292
Maintainer: Stefano Canepa [EMAIL PROTECTED]
Architecture: all
Source: acx100
Version: 20070101-3
Depends: debhelper (= 4.0), dpatch, module-assistant
Filename: pool/contrib/a/acx100/acx100-source_20070101-3_all.deb
Size: 224900
MD5sum: f9673656b0c676c22bd64c351dd7536d
SHA1: 3cac7ad52e0215d1603605fc43d8ce3c81594990
SHA256: 6ad06bb46bdf8a7d941d5d3f7e02e976dbfe6ce3f26fb5067590bd044d44cb08
Description: ACX100/ACX111 wireless network drivers source
 This package provides the source code of the Linux drivers for wireless
 network cards using TI ACX100/ACX111 chips. This includes DWL-[G]520+
 PCI, DWL-[G]650+ CardBus, GL-2422MP mini-PCI, DWL-120+ USB, USR5410,
 etc. See http://acx100.sourceforge.net/ for information about your
 wireless device.
 .
 In order to compile the kernel modules you need the kernel sources (or
 the kernel-headers for the kernel-image packages from Debian). For
 compile instructions look into
 usr/share/doc/acx100-source/README.Debian or simply use the
 module-assistant utility.
 .
 Please also note that the ACX100/111 chips need a firmware to be
 operational. You can get this firmware from the Microsoft Windows or
 from homepage.
Homepage: http://lisas.de/~andi/acx100/
Tag: admin::kernel, implemented-in::c, role::source, use::driver

Package: alien-arena
Priority: extra
Section: contrib/games
Installed-Size: 1420
Maintainer: Debian Games Team [EMAIL PROTECTED]
Architecture: amd64
Version: 7.0-1
Depends: libc6 (= 2.7-1), libcurl3-gnutls (= 7.16.2-1), libgl1-mesa-glx | libgl1, libglu1-mesa | libglu1, libjpeg62, libsdl1.2debian (= 1.2.10-1), libx11-6, libxext6, libxxf86dga1, libxxf86vm1, alien-arena-data (= 7.0), alien-arena-data ( 7.1)
Suggests: alien-arena-browser
Filename: pool/contrib/a/alien-arena/alien-arena_7.0-1_amd64.deb
Size: 642848
MD5sum: fed7c664a13bad08f7c3368443543376
SHA1: 101a3ad9bf6da67b4c5baafb2883d57181df9bc5
SHA256: 3f3dccf309cdd66d629b6bd84a8870e059fed12ce333cb76088259af31ca4073
Description: Standalone 3D first person online deathmatch shooter
 ALIEN ARENA is a standalone 3D first person online deathmatch shooter
 crafted from the original source code of Quake II and Quake III,
 released by id Software under the GPL license. With features including
 32 bit graphics, new particle engine and effects, light blooms,
 reflective water, hi resolution textures and skins, hi poly models,
 stain maps, ALIEN ARENA pushes the envelope of graphical beauty
 rivaling today's top games.
 .
 This package installs the SDL client for Alien Arena.
Homepage: http://red.planetarena.org
Tag: game::fps, implemented-in::c, interface::3d, interface::x11, network::client, role::program, use::gameplaying, x11::application

Package: alien-arena-browser
Priority: extra
Section: contrib/games
Installed-Size: 160
Maintainer: Debian Games Team [EMAIL PROTECTED]
Architecture: all
Source: alien-arena
Version: 7.0-1
Depends: alien-arena (= 7.0-1), ruby-gnome2, ruby
Filename: pool/contrib/a/alien-arena/alien-arena-browser_7.0-1_all.deb
Size: 37128
MD5sum: 343aab4c68a02f5fac368392500bbfe2
SHA1: a1e20abdb74a80ce2dd56953481e1d82e61fb8a4
SHA256: dc7df3afa6c25f32753be48042186d15f9048a9a218930c890f4b0141ffa0613
Description: stand alone server browser for Alien Arena
 ALIEN ARENA is a standalone 3D first person online deathmatch shooter
 crafted from the original source code of Quake II and Quake III,
 released by id Software under the GPL license. With features including
 32 bit graphics, new particle engine and effects, light blooms,
 reflective water, hi resolution textures and skins, hi poly models,
 stain maps, ALIEN ARENA pushes the envelope of graphical beauty
 rivaling today's top games.
 .
 This package installs the stand alone server browser for Alien Arena
 which allows for browsing available matches without having to launch
 the game.
Homepage: http://red.planetarena.org
Tag: implemented-in::ruby, role::program, use::gameplaying

{('alien-arena', '7.0-1'): Traceback (most recent call last):
  File "test.py", line 24, in <module>
    print pkglist
  File
Bug#504413: KeyError: problem
Hi Pietro,

On Mon, Nov 03, 2008 at 06:10:50PM +0100, Pietro Abate wrote:
> Package: python-debian
> Version: 0.1.11
> Severity: normal
>
> This snippet of code highlights this problem, which is probably related
> to the fact that keys are saved at the class level and not at the
> instance level (something to do with shared_storage maybe ??).

I haven't looked too deep into this, but it doesn't happen when you call
iter_paragraphs with shared_storage=False. If you still wish to use
apt_pkg to parse the Packages file, you can work around this issue for
now by making a copy of each paragraph, e.g. instead of

    pkglist[k] = pkg

do something like

    pkglist[k] = debian_bundle.deb822.Packages(pkg)

I'll try to see if it's possible to keep track of old apt_pkg-backed
objects without explicitly making copies, but I suspect that's a price
we pay for the fast parsing...

Thanks for the report!

--
John Wright [EMAIL PROTECTED]
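
The caller-side workaround suggested above can be sketched with plain dicts. Everything here is a simplified assumption rather than python-debian code: ReusingIterator is a hypothetical iterator that hands back the same object on every step, and dict(pkg) stands in for the Packages(pkg) copy. The sketch shows the aliasing that the copy avoids, not the KeyError from the original report.

```python
class ReusingIterator:
    """Hypothetical iterator that yields the same dict object every time."""

    def __init__(self, records):
        self._records = iter(records)
        self._current = {}

    def __iter__(self):
        return self

    def __next__(self):
        record = next(self._records)  # raises StopIteration when exhausted
        self._current.clear()
        self._current.update(record)
        return self._current


# Package names and versions taken from the bug report's output.
RECORDS = [
    {"package": "acx100-source", "version": "20070101-3"},
    {"package": "alien-arena", "version": "7.0-1"},
]


def index_by_name_version(paragraphs, copy):
    """Build a {(package, version): paragraph} mapping, optionally copying."""
    pkglist = {}
    for pkg in paragraphs:
        key = (pkg["package"], pkg["version"])
        pkglist[key] = dict(pkg) if copy else pkg
    return pkglist


aliased = index_by_name_version(ReusingIterator(RECORDS), copy=False)
copied = index_by_name_version(ReusingIterator(RECORDS), copy=True)

first_key = ("acx100-source", "20070101-3")
# Without the copy, the first key points at whatever was parsed last.
print(aliased[first_key]["package"])
# With the copy, the first key keeps its own data.
print(copied[first_key]["package"])
```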