RE: [Zope-dev] BTreeFolder2.objectIds() - accessing _tree.keys() slow
> [EMAIL PROTECTED] wrote:
>> I'm pretty sure this works.
>
> Ok, I get it now. I misread it the first time.
>
>> This returns the equivalent of running
>> self.objectIds(spec=self._mt_index.keys()) on the current
>> trunk/release code, which should be identical to self._tree.keys(),
>> but much, much faster. I'm still somewhat ignorant as to why
>> self._tree.keys() is so slow with 100k-plus objects (waking up too
>> many persistent objects?),
>
> I suspect the cost is in creating ghosts for all of the persistent
> objects.
>
> No objections here--I like this patch.

Thanks Shane - glad it makes sense. I don't have contributor rights -
would you or anyone else be willing to gateway this diff for me and
commit such changes?

Thanks,
Sean

___
Zope-Dev maillist - Zope-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope )
RE: [Zope-dev] BTreeFolder2.objectIds() - accessing _tree.keys() slow
>> Hacking objectIds() as follows (diff against trunk pasted inline) -
>> getting ids off of the meta type index for all used meta types -
>> seems to make things much quicker. Two questions:
>
> Are you sure this actually works? _mt_index.keys() is supposed to
> provide a list of all meta_types used in the folder. To get the
> object ids from it, you'd need something like this:
>
>     ids = []
>     for d in self._mt_index.values():
>         ids.extend(d.keys())
>
> The structure of _mt_index is documented in a comment:
>
>     _mt_index = None  # OOBTree: { meta_type -> OIBTree: { id -> 1 } }

I'm pretty sure this works. In each OIBTree in _mt_index, the keys are
the ids of the contained objects of the respective meta type. I use
self._mt_index.keys() to list all meta_types, so that your original
code, which previously ran only when the spec parameter was passed, now
runs every time: it loops through all meta_types and does a union() of
set and ids for each meta_type.

This returns the equivalent of running
self.objectIds(spec=self._mt_index.keys()) on the current trunk/release
code, which should be identical to self._tree.keys(), but much, much
faster. I'm still somewhat ignorant as to why self._tree.keys() is so
slow with 100k-plus objects (waking up too many persistent objects?),
but using the ids stored a few layers deep in the _mt_index seems a
viable alternative with the same expected return result.

With a bit more effort put in, the diff pasted below should be more
complete and illustrate better:

Index: BTreeFolder2.py
===
--- BTreeFolder2.py (revision 41285)
+++ BTreeFolder2.py (working copy)
@@ -341,21 +341,22 @@
         # Returns a list of subobject ids of the current object.
         # If 'spec' is specified, returns objects whose meta_type
         # matches 'spec'.
-        if spec is not None:
-            if isinstance(spec, StringType):
-                spec = [spec]
-            mti = self._mt_index
-            set = None
-            for meta_type in spec:
-                ids = mti.get(meta_type, None)
-                if ids is not None:
-                    set = union(set, ids)
-            if set is None:
-                return ()
-            else:
-                return set.keys()
+
+        mti = self._mt_index
+        if spec is None:
+            spec = mti.keys() #all meta types
+
+        if isinstance(spec, StringType):
+            spec = [spec]
+        set = None
+        for meta_type in spec:
+            ids = mti.get(meta_type, None)
+            if ids is not None:
+                set = union(set, ids)
+        if set is None:
+            return ()
         else:
-            return self._tree.keys()
+            return set.keys()
 
     security.declareProtected(access_contents_information,
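For readers following the diff, here is a minimal standalone sketch of the patched control flow. It is not the BTreeFolder2 code itself: plain dicts stand in for the OOBTree/OIBTree structures, a Python set stands in for the result of union(), and the function name object_ids and the sample meta types and ids are made up for illustration.

```python
# Sketch only: dicts replace the BTrees used by BTreeFolder2, and the
# sample data below is hypothetical.

def object_ids(mt_index, spec=None):
    """Return ids found via the meta type index, as in the patched logic."""
    if spec is None:
        spec = list(mt_index.keys())  # all meta types
    if isinstance(spec, str):
        spec = [spec]
    found = None
    for meta_type in spec:
        ids = mt_index.get(meta_type)
        if ids is not None:
            # Stand-in for union(set, ids): merge the ids seen so far.
            found = set(ids) if found is None else found | set(ids)
    if found is None:
        return ()
    return sorted(found)


# { meta_type -> { id -> 1 } }, mirroring the documented layout.
mt_index = {
    "Document": {"doc-a": 1, "doc-b": 1},
    "Folder": {"folder-x": 1},
}
# Stand-in for self._tree: every contained object keyed by id.
tree = {"doc-a": object(), "doc-b": object(), "folder-x": object()}

print(object_ids(mt_index))            # ids across all meta types
print(object_ids(mt_index, "Folder"))  # ids for one meta type
```

In this sketch the no-argument call yields the same ids as the keys of the tree stand-in, provided the index is kept in sync with the tree. As in the diff, an index with no matching meta types yields an empty tuple rather than an empty key list.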
Re: [Zope-dev] BTreeFolder2.objectIds() - accessing _tree.keys() slow
[EMAIL PROTECTED] wrote:
> I'm pretty sure this works.

Ok, I get it now. I misread it the first time.

> This returns the equivalent of running
> self.objectIds(spec=self._mt_index.keys()) on the current
> trunk/release code, which should be identical to self._tree.keys(),
> but much, much faster. I'm still somewhat ignorant as to why
> self._tree.keys() is so slow with 100k-plus objects (waking up too
> many persistent objects?),

I suspect the cost is in creating ghosts for all of the persistent
objects.

No objections here--I like this patch.

Shane
Re: [Zope-dev] BTreeFolder2.objectIds() - accessing _tree.keys() slow
[EMAIL PROTECTED] wrote:
> I have a very large BTreeFolder2 (CMFMember via BaseBTreeFolder in
> Archetypes) - has about 260k items in _tree - objectIds() is
> painfully slow, as is self._tree.keys() - I've casually observed
> using the meta type index to get the object ids is many orders of
> magnitude faster.
>
> Hacking objectIds() as follows (diff against trunk pasted inline) -
> getting ids off of the meta type index for all used meta types -
> seems to make things much quicker. Two questions:

Are you sure this actually works? _mt_index.keys() is supposed to
provide a list of all meta_types used in the folder. To get the object
ids from it, you'd need something like this:

    ids = []
    for d in self._mt_index.values():
        ids.extend(d.keys())

The structure of _mt_index is documented in a comment:

    _mt_index = None  # OOBTree: { meta_type -> OIBTree: { id -> 1 } }

(It's strange that I chose to use an OIBTree instead of an OOTreeSet.
Maybe I didn't know about the set support in the BTree module at the
time.)

Shane
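The distinction drawn above, between the keys of _mt_index (meta types) and the ids nested one level down, can be sketched with ordinary dicts. This is only an illustration of the documented layout, not the real persistent structures: the helper names and the sample meta types and ids are assumptions.

```python
# Sketch only: plain dicts stand in for OOBTree / OIBTree, and the
# contents are hypothetical. Layout: { meta_type -> { id -> 1 } }.
mt_index = {
    "Document": {"doc-a": 1, "doc-b": 1},
    "Folder": {"folder-x": 1},
}


def meta_types(index):
    """Top-level keys are meta types, not object ids."""
    return sorted(index.keys())


def all_ids(index):
    """Collect ids from every inner mapping, as in the loop above."""
    ids = []
    for d in index.values():
        ids.extend(d.keys())
    return sorted(ids)


print(meta_types(mt_index))
print(all_ids(mt_index))
```

The first call lists only the meta types; the second walks the inner mappings to gather the object ids.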