At http://bzr.arbash-meinel.com/branches/bzr/brisbane/merge_dev
------------------------------------------------------------
revno: 3807
revision-id: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
committer: John Arbash Meinel <[EMAIL PROTECTED]>
branch nick: merge_dev
timestamp: Mon 2008-12-08 22:26:46 -0600
message:
Getting rid of Inv.copy() and changing the LRU into a FIFO drops the
time down into 2m19s. (Down from 4m originally.)
Also hacked up the Inventory.add() to also be minimal.
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/inventory.py inventory.py-20050309040759-6648b84ca2005b37
bzrlib/lru_cache.py lru_cache.py-20070119165515-tlw203kuwh0id5gv-1
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
bzrlib/tests/test_lru_cache.py test_lru_cache.py-20070119165535-hph6rk4h9rzy4180-1
bzrlib/xml5.py xml5.py-20080328030717-t9guwinq8hom0ar3-1
bzrlib/xml8.py xml5.py-20050907032657-aac8f960815b66b1
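The LRU-to-FIFO change mentioned in the message above can be illustrated with a small sketch. It assumes that reading a cache's underlying dict directly skips the recency bookkeeping, so eviction ends up following insertion order. The function evict_order and its arguments are hypothetical names for illustration and are not part of bzrlib.

```python
from collections import OrderedDict


def evict_order(accesses, capacity, track_reads):
    """Replay (op, key) accesses and return the keys evicted, in order.

    track_reads=True models an LRU cache (a read refreshes the key);
    track_reads=False models reads that bypass the bookkeeping (FIFO).
    """
    cache = OrderedDict()
    evicted = []
    for op, key in accesses:
        if op == "get":
            if track_reads and key in cache:
                cache.move_to_end(key)  # a read makes the key "recent"
            continue
        cache[key] = True
        cache.move_to_end(key)
        if len(cache) > capacity:
            old_key, _ = cache.popitem(last=False)  # drop the oldest key
            evicted.append(old_key)
    return evicted


ops = [("set", "a"), ("set", "b"), ("get", "a"), ("set", "c")]
lru_evicted = evict_order(ops, capacity=2, track_reads=True)
fifo_evicted = evict_order(ops, capacity=2, track_reads=False)
print(lru_evicted, fifo_evicted)
```

With reads tracked, the read of "a" protects it and "b" is evicted; without tracking, "a" is evicted because it was inserted first. The FIFO variant does less work per read, which is presumably where the time saving comes from.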
------------------------------------------------------------
revno: 3801.1.3
revision-id: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
committer: John Arbash Meinel <[EMAIL PROTECTED]>
branch nick: debug_hacks
timestamp: Mon 2008-12-08 12:31:49 -0600
message:
Merge the XML entry cache.
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/lru_cache.py lru_cache.py-20070119165515-tlw203kuwh0id5gv-1
bzrlib/tests/test_lru_cache.py test_lru_cache.py-20070119165535-hph6rk4h9rzy4180-1
bzrlib/xml8.py xml5.py-20050907032657-aac8f960815b66b1
------------------------------------------------------------
revno: 3735.28.110
revision-id: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
committer: John Arbash Meinel <[EMAIL PROTECTED]>
branch nick: xml_cache
timestamp: Mon 2008-12-08 12:30:41 -0600
message:
If we are going to thrash the inventory entry cache, increase its
size.
modified:
bzrlib/xml8.py xml5.py-20050907032657-aac8f960815b66b1
------------------------------------------------------------
revno: 3735.28.109
revision-id: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
committer: John Arbash Meinel <[EMAIL PROTECTED]>
branch nick: xml_cache
timestamp: Mon 2008-12-08 12:30:04 -0600
message:
Merge the lru cache changes.
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/lru_cache.py lru_cache.py-20070119165515-tlw203kuwh0id5gv-1
bzrlib/tests/test_lru_cache.py test_lru_cache.py-20070119165535-hph6rk4h9rzy4180-1
------------------------------------------------------------
revno: 3735.137.1
revision-id: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
committer: John Arbash Meinel <[EMAIL PROTECTED]>
branch nick: lru_cache
timestamp: Mon 2008-12-08 12:23:00 -0600
message:
Add LRUCache.resize(), and change the init arguments for LRUCache.
The old name was a bit confusing, and caused LRUSizeCache to re-use
variables in a confusing way with LRUCache.
Also, this changes the default cleanup size to be 80% of max_size.
This should be better, as it means we get a little bit of room when
adding keys, rather than having to cleanup after every add, we can
instead do it in batches.
modified:
NEWS NEWS-20050323055033-4e00b5db738777ff
bzrlib/lru_cache.py lru_cache.py-20070119165515-tlw203kuwh0id5gv-1
bzrlib/tests/test_lru_cache.py test_lru_cache.py-20070119165535-hph6rk4h9rzy4180-1
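The batch cleanup and resize() behaviour described in this message can be sketched as follows. This is a minimal illustration built on OrderedDict, not the bzrlib implementation: the class name SketchLRUCache is hypothetical, and only the parameter names max_cache and after_cleanup_count are taken from the commit.

```python
from collections import OrderedDict


class SketchLRUCache:
    """Count-bounded LRU cache that evicts in batches (illustrative only)."""

    def __init__(self, max_cache=100, after_cleanup_count=None):
        self._cache = OrderedDict()
        self.resize(max_cache, after_cleanup_count)

    def resize(self, max_cache, after_cleanup_count=None):
        """Change the limits, then shrink to the new cleanup target."""
        self._max_cache = max_cache
        if after_cleanup_count is None:
            # Default from the commit message: 80% of the maximum.
            self._after_cleanup_count = max_cache * 8 // 10
        else:
            self._after_cleanup_count = min(after_cleanup_count, max_cache)
        self.cleanup()

    def cleanup(self):
        """Evict least recently used keys down to after_cleanup_count."""
        while len(self._cache) > self._after_cleanup_count:
            self._cache.popitem(last=False)

    def __setitem__(self, key, value):
        self._cache[key] = value
        self._cache.move_to_end(key)
        if len(self._cache) > self._max_cache:
            self.cleanup()  # one batch eviction instead of one per add

    def __getitem__(self, key):
        self._cache.move_to_end(key)  # raises KeyError if key is missing
        return self._cache[key]

    def keys(self):
        return list(self._cache)


cache = SketchLRUCache(max_cache=5, after_cleanup_count=4)
for key in range(1, 7):
    cache[key] = key + 1
keys_after_overflow = cache.keys()
cache.resize(max_cache=3, after_cleanup_count=2)
keys_after_resize = cache.keys()
default_count = SketchLRUCache(max_cache=5)._after_cleanup_count
print(keys_after_overflow, keys_after_resize, default_count)
```

Because the cleanup target sits below the maximum, a full cache gains some headroom after each eviction pass, so the next few adds do not trigger another cleanup.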
------------------------------------------------------------
revno: 3735.28.108
revision-id: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
committer: John Arbash Meinel <[EMAIL PROTECTED]>
branch nick: xml_cache
timestamp: Mon 2008-12-08 12:27:57 -0600
message:
Add an InventoryEntry cache to the xml deserializer.
modified:
bzrlib/xml8.py xml5.py-20050907032657-aac8f960815b66b1
------------------------------------------------------------
revno: 3801.1.2
revision-id: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
committer: John Arbash Meinel <[EMAIL PROTECTED]>
branch nick: debug_hacks
timestamp: Sun 2008-12-07 12:42:33 -0600
message:
Don't execute pack ops twice.
modified:
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
------------------------------------------------------------
revno: 3801.1.1
revision-id: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
committer: John Arbash Meinel <[EMAIL PROTECTED]>
branch nick: merge_dev
timestamp: Sun 2008-12-07 11:43:38 -0600
message:
Merge in the debug_hacks.
modified:
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
------------------------------------------------------------
revno: 3791.1.16
revision-id: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
committer: John Arbash Meinel <[EMAIL PROTECTED]>
branch nick: chk_map
timestamp: Tue 2008-12-02 22:32:30 -0600
message:
Hack in some other code, so we can determine how much compression we get.
This just tracks the 'old size' of all the packs that are getting
combined versus the 'new size' of the newly created pack file.
modified:
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
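The old-size versus new-size bookkeeping described in this message can be sketched with plain files. This is an assumption-laden illustration: it uses os.stat on local paths rather than a bzrlib transport, and the helpers total_size_mb and format_sizes are hypothetical names.

```python
import os
import tempfile


def total_size_mb(paths):
    """Sum on-disk sizes in MB, or return None if any size is unavailable."""
    total = 0
    for path in paths:
        try:
            total += os.stat(path).st_size
        except OSError:
            # Stop early so we never try to add to an unknown total.
            return None
    return total / (1024.0 * 1024)


def format_sizes(old_mb, new_mb):
    """Format the summary line, using -1 to stand in for an unknown size."""
    old_mb = -1 if old_mb is None else old_mb
    new_mb = -1 if new_mb is None else new_mb
    return 'completed %.3fMB => %.3fMB' % (old_mb, new_mb)


with tempfile.TemporaryDirectory() as tmp:
    old_paths = []
    for name, size in [('a.pack', 1024 * 1024), ('b.pack', 512 * 1024)]:
        path = os.path.join(tmp, name)
        with open(path, 'wb') as f:
            f.write(b'\0' * size)
        old_paths.append(path)
    old_mb = total_size_mb(old_paths)
    missing_mb = total_size_mb([os.path.join(tmp, 'missing.pack')])

message = format_sizes(old_mb, missing_mb)
print(message)
```

Returning None as soon as one size cannot be read keeps the "unknown" state from being mixed into later arithmetic.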
------------------------------------------------------------
revno: 3791.1.15
revision-id: [EMAIL PROTECTED]
parent: [EMAIL PROTECTED]
committer: John Arbash Meinel <[EMAIL PROTECTED]>
branch nick: chk_map
timestamp: Tue 2008-12-02 22:11:38 -0600
message:
Add size information to the mutter when -Dpack is used.
Also fix a bug in -Dpack when the repository doesn't support chk_bytes.
modified:
bzrlib/repofmt/pack_repo.py pack_repo.py-20070813041115-gjv5ma7ktfqwsjgn-1
=== modified file 'NEWS'
--- a/NEWS 2008-12-07 17:40:43 +0000
+++ b/NEWS 2008-12-08 18:31:49 +0000
@@ -9,6 +9,11 @@
--------------
CHANGES:
+
+ * ``LRUCache(after_cleanup_size)`` was renamed to
+ ``after_cleanup_count`` and the old name deprecated. The new name is
+ used for clarity, and to avoid confusion with
+ ``LRUSizeCache(after_cleanup_size)``. (John Arbash Meinel)
NEW FEATURES:
=== modified file 'bzrlib/inventory.py'
--- a/bzrlib/inventory.py 2008-12-07 17:40:43 +0000
+++ b/bzrlib/inventory.py 2008-12-09 04:26:46 +0000
@@ -1137,6 +1137,16 @@
parent.children[entry.name] = entry
return self._add_child(entry)
+ def _add_one_entry_no_checks(self, entry, last_parent):
+ assert last_parent is not None
+ if last_parent.file_id == entry.parent_id:
+ parent = last_parent
+ else:
+ parent = self._byid[entry.parent_id]
+ parent.children[entry.name] = entry
+ self._byid[entry.file_id] = entry
+ return parent
+
def add_path(self, relpath, kind, file_id=None, parent_id=None):
"""Add entry from a path.
=== modified file 'bzrlib/lru_cache.py'
--- a/bzrlib/lru_cache.py 2008-10-14 20:19:06 +0000
+++ b/bzrlib/lru_cache.py 2008-12-08 18:23:00 +0000
@@ -1,4 +1,4 @@
-# Copyright (C) 2006 Canonical Ltd
+# Copyright (C) 2006, 2008 Canonical Ltd
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@@ -17,25 +17,26 @@
"""A simple least-recently-used (LRU) cache."""
from collections import deque
-import gc
+
+from bzrlib import symbol_versioning
class LRUCache(object):
"""A class which manages a cache of entries, removing unused ones."""
- def __init__(self, max_cache=100, after_cleanup_size=None):
- self._max_cache = max_cache
- if after_cleanup_size is None:
- self._after_cleanup_size = self._max_cache
- else:
- self._after_cleanup_size = min(after_cleanup_size, self._max_cache)
-
- self._compact_queue_length = 4*self._max_cache
-
+ def __init__(self, max_cache=100, after_cleanup_count=None,
+ after_cleanup_size=symbol_versioning.DEPRECATED_PARAMETER):
+ if symbol_versioning.deprecated_passed(after_cleanup_size):
+ symbol_versioning.warn('LRUCache.__init__(after_cleanup_size) was'
+ ' deprecated in 1.11. Use'
+ ' after_cleanup_count instead.',
+ DeprecationWarning)
+ after_cleanup_count = after_cleanup_size
self._cache = {}
self._cleanup = {}
self._queue = deque() # Track when things are accessed
self._refcount = {} # number of entries in self._queue for each key
+ self._update_max_cache(max_cache, after_cleanup_count)
def __contains__(self, key):
return key in self._cache
@@ -89,11 +90,13 @@
"""Clear the cache until it shrinks to the requested size.
This does not completely wipe the cache, just makes sure it is under
- the after_cleanup_size.
+ the after_cleanup_count.
"""
# Make sure the cache is shrunk to the correct size
- while len(self._cache) > self._after_cleanup_size:
+ while len(self._cache) > self._after_cleanup_count:
self._remove_lru()
+ # No need to compact the queue at this point, because the code that
+ # calls this would have already triggered it based on queue length
def __setitem__(self, key, value):
"""Add a value to the cache, there will be no cleanup function."""
@@ -150,6 +153,23 @@
while self._cache:
self._remove_lru()
+ def resize(self, max_cache, after_cleanup_count=None):
+ """Change the number of entries that will be cached."""
+ self._update_max_cache(max_cache,
+ after_cleanup_count=after_cleanup_count)
+
+ def _update_max_cache(self, max_cache, after_cleanup_count=None):
+ self._max_cache = max_cache
+ if after_cleanup_count is None:
+ self._after_cleanup_count = self._max_cache * 8 / 10
+ else:
+ self._after_cleanup_count = min(after_cleanup_count, self._max_cache)
+
+ self._compact_queue_length = 4*self._max_cache
+ if len(self._queue) > self._compact_queue_length:
+ self._compact_queue()
+ self.cleanup()
+
class LRUSizeCache(LRUCache):
"""An LRUCache that removes things based on the size of the values.
@@ -175,20 +195,15 @@
The function should take the form "compute_size(value) => integer".
If not supplied, it defaults to 'len()'
"""
- # This approximates that texts are > 0.5k in size. It only really
- # effects when we clean up the queue, so we don't want it to be too
- # large.
- LRUCache.__init__(self, max_cache=int(max_size/512))
- self._max_size = max_size
- if after_cleanup_size is None:
- self._after_cleanup_size = self._max_size
- else:
- self._after_cleanup_size = min(after_cleanup_size, self._max_size)
-
self._value_size = 0
self._compute_size = compute_size
if compute_size is None:
self._compute_size = len
+ # This approximates that texts are > 0.5k in size. It only really
+ # effects when we clean up the queue, so we don't want it to be too
+ # large.
+ self._update_max_size(max_size, after_cleanup_size=after_cleanup_size)
+ LRUCache.__init__(self, max_cache=max(int(max_size/512), 1))
def add(self, key, value, cleanup=None):
"""Add a new value to the cache.
@@ -229,3 +244,16 @@
"""Remove an entry, making sure to maintain the invariants."""
val = LRUCache._remove(self, key)
self._value_size -= self._compute_size(val)
+
+ def resize(self, max_size, after_cleanup_size=None):
+ """Change the number of bytes that will be cached."""
+ self._update_max_size(max_size, after_cleanup_size=after_cleanup_size)
+ max_cache = max(int(max_size/512), 1)
+ self._update_max_cache(max_cache)
+
+ def _update_max_size(self, max_size, after_cleanup_size=None):
+ self._max_size = max_size
+ if after_cleanup_size is None:
+ self._after_cleanup_size = self._max_size * 8 / 10
+ else:
+ self._after_cleanup_size = min(after_cleanup_size, self._max_size)
=== modified file 'bzrlib/repofmt/pack_repo.py'
--- a/bzrlib/repofmt/pack_repo.py 2008-12-07 18:25:13 +0000
+++ b/bzrlib/repofmt/pack_repo.py 2008-12-09 04:26:46 +0000
@@ -418,9 +418,15 @@
'../packs/' + self.name + '.pack')
self._state = 'finished'
if 'pack' in debug.debug_flags:
+ try:
+ size = self.pack_transport.stat(self.name + '.pack').st_size
+ size /= 1024.*1024
+ except errors.TransportNotPossible:
+ size = -1
# XXX: size might be interesting?
- mutter('%s: create_pack: pack renamed into place: %s%s->%s%s t+%6.3fs',
- time.ctime(), self.upload_transport.base, self.random_name,
+ mutter('%s: create_pack: pack renamed into place (%.3fMB): %s%s->%s%s'
+ ' t+%6.3fs',
+ time.ctime(), size, self.upload_transport.base, self.random_name,
self.pack_transport, self.name,
time.time() - self.start_time)
@@ -815,10 +821,17 @@
rev_count = len(self.revision_ids)
else:
rev_count = 'all'
- mutter('%s: create_pack: creating pack from source packs: '
+ size = 0
+ for a_pack in self.packs:
+ try:
+ size += a_pack.pack_transport.stat(a_pack.name + '.pack').st_size
+ except errors.TransportNotPossible:
+ pass
+ size /= 1024.*1024
+ mutter('%s: create_pack: creating pack from source packs (%.3fMB): '
'%s%s %s revisions wanted %s t=0',
- time.ctime(), self._pack_collection._upload_transport.base, new_pack.random_name,
- plain_pack_list, rev_count)
+ time.ctime(), size, self._pack_collection._upload_transport.base,
+ new_pack.random_name, plain_pack_list, rev_count)
self._copy_revision_texts()
self._copy_inventory_texts()
self._copy_text_texts()
@@ -1376,8 +1389,17 @@
'containing %d revisions. Packing %d files into %d affecting %d'
' revisions', self, total_packs, total_revisions, num_old_packs,
num_new_packs, num_revs_affected)
- self._execute_pack_operations(pack_operations)
- mutter('Auto-packing repository %s completed', self)
+ old_size, new_size = self._execute_pack_operations(pack_operations)
+ if old_size is None:
+ old_size = -1
+ else:
+ old_size /= (1024.0*1024)
+ if new_size is None:
+ new_size = -1
+ else:
+ new_size /= (1024.0*1024)
+ mutter('Auto-packing repository %s completed %.3fMB => %.3fMB',
+ self.transport.base, old_size, new_size)
return True
def _execute_pack_operations(self, pack_operations, _packer_class=Packer):
@@ -1387,19 +1409,32 @@
:param _packer_class: The class of packer to use (default: Packer).
:return: None.
"""
+ new_size = 0
for revision_count, packs in pack_operations:
# we may have no-ops from the setup logic
if len(packs) == 0:
continue
- _packer_class(self, packs, '.autopack').pack()
+ new_pack = _packer_class(self, packs, '.autopack').pack()
+ try:
+ new_size += new_pack.pack_transport.stat(new_pack.name + '.pack').st_size
+ except errors.TransportNotPossible:
+ new_size = None
for pack in packs:
self._remove_pack_from_memory(pack)
# record the newly available packs and stop advertising the old
# packs
+ if new_size is None:
+ old_size = None
+ else:
+ old_size = 0
+ for revision_count, packs in pack_operations:
+ for a_pack in packs:
+ old_size += a_pack.pack_transport.stat(a_pack.name + '.pack').st_size
self._save_pack_names(clear_obsolete_packs=True)
# Move the old packs out of the way now they are no longer referenced.
for revision_count, packs in pack_operations:
self._obsolete_packs(packs)
+ return old_size, new_size
def lock_names(self):
"""Acquire the mutex around the pack-names index.
=== modified file 'bzrlib/tests/test_lru_cache.py'
--- a/bzrlib/tests/test_lru_cache.py 2008-10-14 20:19:06 +0000
+++ b/bzrlib/tests/test_lru_cache.py 2008-12-08 18:23:00 +0000
@@ -1,4 +1,4 @@
-# Copyright (C) 2006 Canonical Ltd
+# Copyright (C) 2006, 2008 Canonical Ltd
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@@ -38,7 +38,7 @@
def test_overflow(self):
"""Adding extra entries will pop out old ones."""
- cache = lru_cache.LRUCache(max_cache=1)
+ cache = lru_cache.LRUCache(max_cache=1, after_cleanup_count=1)
cache['foo'] = 'bar'
# With a max cache of 1, adding 'baz' should pop out 'foo'
@@ -113,7 +113,7 @@
self.assertEqual([(2, 20), (2, 25)], cleanup_called)
def test_len(self):
- cache = lru_cache.LRUCache(max_cache=10)
+ cache = lru_cache.LRUCache(max_cache=10, after_cleanup_count=10)
cache[1] = 10
cache[2] = 20
@@ -140,8 +140,8 @@
# We hit the max
self.assertEqual(10, len(cache))
- def test_cleanup_shrinks_to_after_clean_size(self):
- cache = lru_cache.LRUCache(max_cache=5, after_cleanup_size=3)
+ def test_cleanup_shrinks_to_after_clean_count(self):
+ cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=3)
cache.add(1, 10)
cache.add(2, 20)
@@ -156,15 +156,16 @@
self.assertEqual(3, len(cache))
def test_after_cleanup_larger_than_max(self):
- cache = lru_cache.LRUCache(max_cache=5, after_cleanup_size=10)
- self.assertEqual(5, cache._after_cleanup_size)
+ cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=10)
+ self.assertEqual(5, cache._after_cleanup_count)
def test_after_cleanup_none(self):
- cache = lru_cache.LRUCache(max_cache=5, after_cleanup_size=None)
- self.assertEqual(5, cache._after_cleanup_size)
+ cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=None)
+ # By default _after_cleanup_count is 80% of max_cache
+ self.assertEqual(4, cache._after_cleanup_count)
def test_cleanup(self):
- cache = lru_cache.LRUCache(max_cache=5, after_cleanup_size=2)
+ cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=2)
# Add these in order
cache.add(1, 10)
@@ -214,7 +215,7 @@
self.assertIs(obj, cache.get(3, obj))
def test_keys(self):
- cache = lru_cache.LRUCache(max_cache=5)
+ cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=5)
cache[1] = 2
cache[2] = 3
@@ -225,6 +226,52 @@
cache[6] = 7
self.assertEqual([2, 3, 4, 5, 6], sorted(cache.keys()))
+ def test_after_cleanup_size_deprecated(self):
+ obj = self.callDeprecated([
+ 'LRUCache.__init__(after_cleanup_size) was deprecated in 1.11.'
+ ' Use after_cleanup_count instead.'],
+ lru_cache.LRUCache, 50, after_cleanup_size=25)
+ self.assertEqual(obj._after_cleanup_count, 25)
+
+ def test_resize_smaller(self):
+ cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=4)
+ cache[1] = 2
+ cache[2] = 3
+ cache[3] = 4
+ cache[4] = 5
+ cache[5] = 6
+ self.assertEqual([1, 2, 3, 4, 5], sorted(cache.keys()))
+ cache[6] = 7
+ self.assertEqual([3, 4, 5, 6], sorted(cache.keys()))
+ # Now resize to something smaller, which triggers a cleanup
+ cache.resize(max_cache=3, after_cleanup_count=2)
+ self.assertEqual([5, 6], sorted(cache.keys()))
+ # Adding something will use the new size
+ cache[7] = 8
+ self.assertEqual([5, 6, 7], sorted(cache.keys()))
+ cache[8] = 9
+ self.assertEqual([7, 8], sorted(cache.keys()))
+
+ def test_resize_larger(self):
+ cache = lru_cache.LRUCache(max_cache=5, after_cleanup_count=4)
+ cache[1] = 2
+ cache[2] = 3
+ cache[3] = 4
+ cache[4] = 5
+ cache[5] = 6
+ self.assertEqual([1, 2, 3, 4, 5], sorted(cache.keys()))
+ cache[6] = 7
+ self.assertEqual([3, 4, 5, 6], sorted(cache.keys()))
+ cache.resize(max_cache=8, after_cleanup_count=6)
+ self.assertEqual([3, 4, 5, 6], sorted(cache.keys()))
+ cache[7] = 8
+ cache[8] = 9
+ cache[9] = 10
+ cache[10] = 11
+ self.assertEqual([3, 4, 5, 6, 7, 8, 9, 10], sorted(cache.keys()))
+ cache[11] = 12 # triggers cleanup back to new after_cleanup_count
+ self.assertEqual([6, 7, 8, 9, 10, 11], sorted(cache.keys()))
+
class TestLRUSizeCache(tests.TestCase):
@@ -232,7 +279,7 @@
cache = lru_cache.LRUSizeCache()
self.assertEqual(2048, cache._max_cache)
self.assertEqual(4*2048, cache._compact_queue_length)
- self.assertEqual(cache._max_size, cache._after_cleanup_size)
+ self.assertEqual(int(cache._max_size*0.8), cache._after_cleanup_size)
self.assertEqual(0, cache._value_size)
def test_add_tracks_size(self):
@@ -332,3 +379,37 @@
cache[2] = 'b'
cache[3] = 'cdef'
self.assertEqual([1, 2, 3], sorted(cache.keys()))
+
+ def test_resize_smaller(self):
+ cache = lru_cache.LRUSizeCache(max_size=10, after_cleanup_size=9)
+ cache[1] = 'abc'
+ cache[2] = 'def'
+ cache[3] = 'ghi'
+ cache[4] = 'jkl'
+ # Triggers a cleanup
+ self.assertEqual([2, 3, 4], sorted(cache.keys()))
+ # Resize should also cleanup again
+ cache.resize(max_size=6, after_cleanup_size=4)
+ self.assertEqual([4], sorted(cache.keys()))
+ # Adding should use the new max size
+ cache[5] = 'mno'
+ self.assertEqual([4, 5], sorted(cache.keys()))
+ cache[6] = 'pqr'
+ self.assertEqual([6], sorted(cache.keys()))
+
+ def test_resize_larger(self):
+ cache = lru_cache.LRUSizeCache(max_size=10, after_cleanup_size=9)
+ cache[1] = 'abc'
+ cache[2] = 'def'
+ cache[3] = 'ghi'
+ cache[4] = 'jkl'
+ # Triggers a cleanup
+ self.assertEqual([2, 3, 4], sorted(cache.keys()))
+ cache.resize(max_size=15, after_cleanup_size=12)
+ self.assertEqual([2, 3, 4], sorted(cache.keys()))
+ cache[5] = 'mno'
+ cache[6] = 'pqr'
+ self.assertEqual([2, 3, 4, 5, 6], sorted(cache.keys()))
+ cache[7] = 'stu'
+ self.assertEqual([4, 5, 6, 7], sorted(cache.keys()))
+
=== modified file 'bzrlib/xml5.py'
--- a/bzrlib/xml5.py 2008-04-24 07:22:53 +0000
+++ b/bzrlib/xml5.py 2008-12-09 04:26:46 +0000
@@ -44,13 +44,28 @@
if data_revision_id is not None:
revision_id = cache_utf8.encode(data_revision_id)
inv = inventory.Inventory(root_id, revision_id=revision_id)
+ last = (inv.root.file_id, inv.root, inv.root.children)
+ byid = inv._byid
+ unpack_entry = self._unpack_entry
for e in elt:
- ie = self._unpack_entry(e)
- if ie.parent_id is None:
- ie.parent_id = root_id
- inv.add(ie)
+ ie = unpack_entry(e)
+ parent_id = ie.parent_id
+ if parent_id is None:
+ ie.parent_id = parent_id = root_id
+ if last[0] == parent_id:
+ next = last
+ else:
+ parent = byid[parent_id]
+ next = (parent_id, parent, parent.children)
+ next[2][ie.name] = ie
+ byid[ie.file_id] = ie
+ last = next
if revision_id is not None:
inv.root.revision = revision_id
+ if len(inv) > xml8._entry_cache._max_cache:
+ new_len = len(inv) * 1.2
+ trace.note('Resizing inventory cache to %s', new_len)
+ _entry_cache.resize(new_len)
return inv
def _check_revisions(self, inv):
=== modified file 'bzrlib/xml8.py'
--- a/bzrlib/xml8.py 2008-04-24 07:22:53 +0000
+++ b/bzrlib/xml8.py 2008-12-09 04:26:46 +0000
@@ -21,7 +21,9 @@
cache_utf8,
errors,
inventory,
+ lru_cache,
revision as _mod_revision,
+ trace,
)
from bzrlib.xml_serializer import SubElement, Element, Serializer
from bzrlib.inventory import ROOT_ID, Inventory, InventoryEntry
@@ -38,6 +40,8 @@
 "<":"&lt;",
 ">":"&gt;",
}
+# A cache of InventoryEntry objects
+_entry_cache = lru_cache.LRUCache(10*1024)
def _ensure_utf8_re():
@@ -352,44 +356,65 @@
for e in elt:
ie = self._unpack_entry(e)
inv.add(ie)
+ if len(inv) > _entry_cache._max_cache:
+ new_len = len(inv) * 1.2
+ trace.note('Resizing inventory cache to %s', new_len)
+ _entry_cache.resize(new_len)
return inv
- def _unpack_entry(self, elt):
+ def _unpack_entry(self, elt, _entry_cache=_entry_cache,
+ _raw_cache=_entry_cache._cache):
+ elt_get = elt.get
+ file_id = elt_get('file_id')
+ revision = elt_get('revision')
+ # Check and see if we have already unpacked this exact entry
+ key = (file_id, revision)
+ try:
+ # Using the raw cache (basically FIFO instead of LRU) saves 30s
+ raw_ie = _raw_cache[key]
+ except KeyError:
+ pass
+ else:
+ # calling .copy() only on directories saves 15s
+ if raw_ie.kind == 'directory':
+ return raw_ie.copy()
+ return raw_ie
+
kind = elt.tag
if not InventoryEntry.versionable_kind(kind):
raise AssertionError('unsupported entry kind %s' % kind)
get_cached = _get_utf8_or_ascii
- parent_id = elt.get('parent_id')
+ file_id = get_cached(file_id)
+ if revision is not None:
+ revision = get_cached(revision)
+ parent_id = elt_get('parent_id')
if parent_id is not None:
parent_id = get_cached(parent_id)
- file_id = get_cached(elt.get('file_id'))
if kind == 'directory':
ie = inventory.InventoryDirectory(file_id,
- elt.get('name'),
+ elt_get('name'),
parent_id)
elif kind == 'file':
ie = inventory.InventoryFile(file_id,
- elt.get('name'),
+ elt_get('name'),
parent_id)
- ie.text_sha1 = elt.get('text_sha1')
- if elt.get('executable') == 'yes':
+ ie.text_sha1 = elt_get('text_sha1')
+ if elt_get('executable') == 'yes':
ie.executable = True
- v = elt.get('text_size')
+ v = elt_get('text_size')
ie.text_size = v and int(v)
elif kind == 'symlink':
ie = inventory.InventoryLink(file_id,
- elt.get('name'),
+ elt_get('name'),
parent_id)
- ie.symlink_target = elt.get('symlink_target')
+ ie.symlink_target = elt_get('symlink_target')
else:
raise errors.UnsupportedInventoryKind(kind)
- revision = elt.get('revision')
- if revision is not None:
- revision = get_cached(revision)
ie.revision = revision
+ _entry_cache[key] = ie
return ie