RE: [Zope-dev] BTreeFolder2.objectIds() - accessing _tree.keys() slow

2006-01-13 Thread sean . upton
 [EMAIL PROTECTED] wrote:
  I'm pretty sure this works.
 
 Ok, I get it now.  I misread it the first time.
 
  This returns the equivalent of running
  self.objectIds(spec=self._mt_index.keys()) on the current 
  trunk/release code, which should be identical to 
 self._tree.keys(), but much, much faster.
  I'm still somewhat ignorant as to why self._tree.keys() is so slow 
  with 100k-plus objects (waking up too many persistent objects?),
 
 I suspect the cost is in creating ghosts for all of the 
 persistent objects.
 
 No objections here--I like this patch.

Thanks Shane - glad it makes sense.  I don't have contributor rights - would
you or anyone else be willing to gateway this diff for me and commit such
changes?

Thanks,
Sean
___
Zope-Dev maillist  -  Zope-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope )


RE: [Zope-dev] BTreeFolder2.objectIds() - accessing _tree.keys() slow

2006-01-12 Thread sean . upton
 
  Hacking objectIds() as follows (diff against trunk pasted inline) - 
  getting ids off of the meta type index for all used meta types - 
  seems to make things much quicker.  Two questions:
 
 Are you sure this actually works?  _mt_index.keys() is 
 supposed to provide a list of all meta_types used in the 
 folder.  To get the object ids from it, you'd need something 
 like this:
 
 ids = []
 for d in self._mt_index.values():
     ids.extend(d.keys())
 
 The structure of _mt_index is documented in a comment:
 
  _mt_index = None  # OOBTree: { meta_type -> OIBTree: { id -> 1 } }

I'm pretty sure this works.  In each OIBTree in _mt_index, the keys are the
ids of the contained objects of that meta type.  I use
self._mt_index.keys() to list all meta_types, so that your original code,
which used to run only when a spec parameter was passed, now runs every
time: it loops through all meta_types and does a union() of set and the ids
for each meta_type.

This returns the equivalent of running
self.objectIds(spec=self._mt_index.keys()) on the current trunk/release
code, which should be identical to self._tree.keys(), but much, much faster.
I'm still somewhat ignorant as to why self._tree.keys() is so slow with
100k-plus objects (waking up too many persistent objects?), but using the
ids stored a few layers deep in the _mt_index seems a viable alternative
with the same expected return result.
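
To illustrate the equivalence claimed above, here is a minimal sketch. Plain dicts stand in for the OOBTree/OIBTree structures (an assumption, so it runs without ZODB), the ids and meta types are made up, and it assumes the index is in sync with the tree and that every contained object has an indexed meta_type:

```python
# Plain dicts stand in for the BTrees used by BTreeFolder2 (assumption:
# this keeps the sketch runnable without ZODB installed).
tree = {                      # stand-in for self._tree: id -> object
    "doc1": "a Document",
    "doc2": "a Document",
    "img1": "an Image",
}
mt_index = {                  # stand-in for self._mt_index: meta_type -> {id: 1}
    "Document": {"doc1": 1, "doc2": 1},
    "Image": {"img1": 1},
}

# Shane's loop: collect the ids stored under every meta_type.
ids_from_values = []
for d in mt_index.values():
    ids_from_values.extend(d.keys())

# The patch's approach: treat "no spec" as "all meta types" and union the ids.
ids_from_union = set()
for meta_type in mt_index.keys():
    ids_from_union |= set(mt_index[meta_type])

# Both routes yield the same ids as the keys of the tree itself.
same_as_tree = sorted(ids_from_values) == sorted(ids_from_union) == sorted(tree)
```

If an object without a meta_type were stored in the tree, the two results would differ, so the equivalence depends on that assumption holding.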

With a bit more effort put in, the diff pasted below should be more complete
and illustrate better:


Index: BTreeFolder2.py
===================================================================
--- BTreeFolder2.py (revision 41285)
+++ BTreeFolder2.py (working copy)
@@ -341,21 +341,22 @@
         # Returns a list of subobject ids of the current object.
         # If 'spec' is specified, returns objects whose meta_type
         # matches 'spec'.
-        if spec is not None:
-            if isinstance(spec, StringType):
-                spec = [spec]
-            mti = self._mt_index
-            set = None
-            for meta_type in spec:
-                ids = mti.get(meta_type, None)
-                if ids is not None:
-                    set = union(set, ids)
-            if set is None:
-                return ()
-            else:
-                return set.keys()
+
+        mti = self._mt_index
+        if spec is None:
+            spec = mti.keys() #all meta types
+
+        if isinstance(spec, StringType):
+            spec = [spec]
+        set = None
+        for meta_type in spec:
+            ids = mti.get(meta_type, None)
+            if ids is not None:
+                set = union(set, ids)
+        if set is None:
+            return ()
         else:
-            return self._tree.keys()
+            return set.keys()
 
 
     security.declareProtected(access_contents_information,
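
For readers who want to try the patched logic outside Zope, the following is a rough standalone sketch, not the committed code. The function name object_ids is hypothetical, plain dicts and sets stand in for the BTrees structures and union(), and str replaces the StringType check used in the 2006 code:

```python
def object_ids(mt_index, spec=None):
    """Return ids for the given meta types, or for all of them if spec is None."""
    if spec is None:
        spec = list(mt_index.keys())   # all meta types
    if isinstance(spec, str):          # the diff tests against StringType
        spec = [spec]
    found = None
    for meta_type in spec:
        ids = mt_index.get(meta_type)
        if ids is not None:
            # set union stands in for BTrees' union() (assumption)
            found = set(ids) if found is None else found | set(ids)
    if found is None:
        return ()
    return sorted(found)               # sorted here to give a stable order


# Hypothetical index contents, for illustration only.
mt_index = {"Document": {"doc1": 1, "doc2": 1}, "Image": {"img1": 1}}
all_ids = object_ids(mt_index)                 # no spec: every meta type
doc_ids = object_ids(mt_index, "Document")     # a single meta type as a string
missing = object_ids(mt_index, ["Folder"])     # a meta type not in the index
```

As in the diff, a spec that matches nothing returns an empty tuple rather than an empty list.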


Re: [Zope-dev] BTreeFolder2.objectIds() - accessing _tree.keys() slow

2006-01-12 Thread Shane Hathaway

[EMAIL PROTECTED] wrote:

I'm pretty sure this works.


Ok, I get it now.  I misread it the first time.


This returns the equivalent of running
self.objectIds(spec=self._mt_index.keys()) on the current trunk/release
code, which should be identical to self._tree.keys(), but much, much faster.
I'm still somewhat ignorant as to why self._tree.keys() is so slow with
100k-plus objects (waking up too many persistent objects?),


I suspect the cost is in creating ghosts for all of the persistent objects.

No objections here--I like this patch.

Shane


Re: [Zope-dev] BTreeFolder2.objectIds() - accessing _tree.keys() slow

2006-01-11 Thread Shane Hathaway

[EMAIL PROTECTED] wrote:

I have a very large BTreeFolder2 (CMFMember via BaseBTreeFolder in
Archetypes) with about 260k items in _tree.  objectIds() is painfully slow,
as is self._tree.keys().  I've casually observed that using the meta type
index to get the object ids is many orders of magnitude faster.

Hacking objectIds() as follows (diff against trunk pasted inline) - getting
ids off of the meta type index for all used meta types - seems to make
things much quicker.  Two questions:


Are you sure this actually works?  _mt_index.keys() is supposed to 
provide a list of all meta_types used in the folder.  To get the object 
ids from it, you'd need something like this:


ids = []
for d in self._mt_index.values():
  ids.extend(d.keys())

The structure of _mt_index is documented in a comment:

_mt_index = None  # OOBTree: { meta_type -> OIBTree: { id -> 1 } }

(It's strange that I chose to use an OIBTree instead of an OOTreeSet. 
Maybe I didn't know about the set support in the BTree module at the time.)
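
As a rough illustration of that remark, the sketch below compares the two layouts. Plain dicts and Python sets stand in for OIBTree and OOTreeSet (an assumption, so it runs without the BTrees package), and the sample ids are made up:

```python
# {id: 1} mappings, mirroring the OIBTree layout described in the comment.
as_mapping = {"Document": {"doc1": 1, "doc2": 1}, "Image": {"img1": 1}}

# The same information as plain sets, the role an OOTreeSet would play:
# the constant value 1 carries no information, only the keys matter.
as_sets = {mt: set(ids) for mt, ids in as_mapping.items()}

# Collecting every id is then a direct union of the per-meta_type sets.
all_ids = set().union(*as_sets.values())
```

Membership tests and unions work on the keys alone in both layouts, which is why a set type would have been enough.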


Shane