tags 504413 + pending
thanks

On Tue, Nov 04, 2008 at 03:20:16PM +0100, Pietro Abate wrote:
> On Tue, Nov 04, 2008 at 02:43:22AM -0700, John Wright wrote:
> > Anyway, I plan to pull this into the master branch and upload it after
> > I've tested it a bit more (and probably written a few more unit tests).
> 
> It seems that using shared_storage = False (fast parser and implicit copy)
> gives the best result. I think this should be the default as it's backward
> compatible and it's iteration-safe.

I wrote some unit tests, and found and fixed a nasty bug with
use_apt_pkg=True and shared_storage=False (key order wasn't being
preserved).  I also figured out how to make the gpg verification stuff
play nice with this, so I'm a bit more comfortable pushing it back up
now.

I made a couple of small optimizations that resulted in pretty
significant performance advantages over what I had before.  As you can
see, except for "raw" Deb822 paragraphs (with none of the
multivalued-fields magic), the difference between shared_storage=False
and shared_storage=True is modest: about 18 seconds versus 13 seconds
for Packages on my system, compared with about 7 seconds versus 2
seconds for raw Deb822.  Since the performance is that close, I've made
use_apt_pkg=True, shared_storage=False the default.  I've attached a
little benchmark script and the results on my system.
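
To illustrate what "iteration-safe" means here, the following is a toy
sketch, not python-debian code.  It assumes that shared storage means
every paragraph is backed by one buffer that the parser overwrites on
each step, and that shared_storage=False amounts to copying that buffer
before handing it out.  The names iter_shared and iter_copied are
hypothetical.

```python
# Toy model only: these generators are hypothetical stand-ins and do
# not reflect how python-debian or apt_pkg are actually implemented.

def iter_shared(records):
    """Yield the same dict every time, overwriting it in place."""
    storage = {}
    for record in records:
        storage.clear()
        storage.update(record)
        yield storage


def iter_copied(records):
    """Yield an independent copy of the shared dict for each record."""
    for paragraph in iter_shared(records):
        yield dict(paragraph)


records = [{"Package": "foo"}, {"Package": "bar"}, {"Package": "baz"}]

# Collect all paragraphs first, then read them afterwards.
shared = [p["Package"] for p in list(iter_shared(records))]
copied = [p["Package"] for p in list(iter_copied(records))]

print("shared: %s" % shared)  # every entry reflects the last record
print("copied: %s" % copied)  # each entry keeps its own value
```

In this model, code that keeps references to earlier paragraphs only
behaves as expected with the copying iterator, which is the cost the
benchmark measures.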

-- 
John Wright <[EMAIL PROTECTED]>
#!/usr/bin/python
# Benchmark deb822 paragraph iteration under different parser settings.

import datetime
from debian_bundle import deb822

PACKAGES = "/var/lib/apt/lists/linuxcoe.corp.hp.com_LinuxCOE_Debian_dists_sid_main_binary-i386_Packages"

# (use_apt_pkg, shared_storage) combinations to time, in output order.
CONFIGS = [(True, True), (True, False), (False, False)]


def benchmark(func, *args, **kwargs):
    """Call func(*args, **kwargs) and print the wall-clock time taken."""
    t_0 = datetime.datetime.now()
    func(*args, **kwargs)
    t_f = datetime.datetime.now()
    print "Elapsed time:", t_f - t_0


def iter_through_all(cls, *args, **kwargs):
    """Parse every paragraph and discard it, so only parsing is timed."""
    for p in cls.iter_paragraphs(*args, **kwargs):
        pass


if __name__ == "__main__":

    for cls in deb822.Deb822, deb822.Packages:
        print "Class:", cls

        for use_apt_pkg, shared_storage in CONFIGS:
            print "use_apt_pkg=%s, shared_storage=%s" % (use_apt_pkg,
                                                         shared_storage)
            # Open a fresh file handle per run and close it afterwards.
            f = open(PACKAGES)
            try:
                benchmark(iter_through_all, cls, f,
                          use_apt_pkg=use_apt_pkg,
                          shared_storage=shared_storage)
            finally:
                f.close()
        print
[EMAIL PROTECTED]:~/debian/python-debian/benchmark$ python benchmark.py 
Class: <class 'debian_bundle.deb822.Deb822'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:02.137641
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:06.978174
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:23.644512

Class: <class 'debian_bundle.deb822.Packages'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:12.842322
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:18.014803
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:28.468205

[EMAIL PROTECTED]:~/debian/python-debian/benchmark$ python benchmark.py 
Class: <class 'debian_bundle.deb822.Deb822'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:02.183086
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:07.134066
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:23.840483

Class: <class 'debian_bundle.deb822.Packages'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:12.793268
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:17.962679
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:28.890604

[EMAIL PROTECTED]:~/debian/python-debian/benchmark$ python benchmark.py 
Class: <class 'debian_bundle.deb822.Deb822'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:02.184487
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:06.981271
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:23.387675

Class: <class 'debian_bundle.deb822.Packages'>
use_apt_pkg=True, shared_storage=True
Elapsed time: 0:00:13.834592
use_apt_pkg=True, shared_storage=False
Elapsed time: 0:00:18.019999
use_apt_pkg=False, shared_storage=False
Elapsed time: 0:00:28.221635
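
The three runs above can be averaged to compare the settings more
directly.  The following is a small sketch that does so; the timings
are copied from the output above, and the helper names (to_seconds,
mean_seconds) are made up for this example.

```python
# Elapsed times copied from the three benchmark runs, keyed by
# (class name, use_apt_pkg, shared_storage).
RUNS = {
    ("Deb822", True, True): ["0:00:02.137641", "0:00:02.183086", "0:00:02.184487"],
    ("Deb822", True, False): ["0:00:06.978174", "0:00:07.134066", "0:00:06.981271"],
    ("Deb822", False, False): ["0:00:23.644512", "0:00:23.840483", "0:00:23.387675"],
    ("Packages", True, True): ["0:00:12.842322", "0:00:12.793268", "0:00:13.834592"],
    ("Packages", True, False): ["0:00:18.014803", "0:00:17.962679", "0:00:18.019999"],
    ("Packages", False, False): ["0:00:28.468205", "0:00:28.890604", "0:00:28.221635"],
}


def to_seconds(text):
    """Convert an 'H:MM:SS.ffffff' string to seconds as a float."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def mean_seconds(samples):
    """Return the arithmetic mean of a list of elapsed-time strings."""
    values = [to_seconds(s) for s in samples]
    return sum(values) / len(values)


means = dict((key, mean_seconds(samples)) for key, samples in RUNS.items())

for cls in ("Deb822", "Packages"):
    shared = means[(cls, True, True)]
    copied = means[(cls, True, False)]
    pure = means[(cls, False, False)]
    print("%s: shared %.2fs, copied %.2fs, no apt_pkg %.2fs" % (cls, shared, copied, pure))
    # Ratio of copying to shared storage, both with use_apt_pkg=True.
    print("%s: copied/shared ratio %.2f" % (cls, copied / shared))
```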
