Re: [Zope-CMF] Re: reindexing optimizations
Julien Anguenot wrote:
> Can you gimme a use case, within a test, where the interface we define
> in CPS would not be enough for you? In our CPSTestCase we set all
> subscribers to 'async' and then all works as if no txn subscribers
> exist during the tests.

Ah, okay, yes, I guess this is one way to do it.

>> Also, how would you go about overriding this late-indexing behaviour
>> if you really wanted to for some specific objects?
>
> Define reindexObject() and reindexObjectSecurity() on the given content
> type for instance. But clearly, we could think about a more complex
> Transaction Manager that may deal with more complex use cases such as
> this one. We didn't express this need yet.

Yep.

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk
___
Zope-CMF maillist - Zope-CMF@lists.zope.org
http://mail.zope.org/mailman/listinfo/zope-cmf

See http://collector.zope.org/CMF for bug reports and feature requests
Re: [Zope-CMF] Re: reindexing optimizations
Florent Guillaume wrote:
> That's certainly a good hack. There are several ways to do it, either
> with a thread-local variable, or in the request, or by walking the
> stack's locals to check for a __dont_index__ attribute... You'd have to
> bench, but a thread-local variable is probably the fastest. You want to
> store a set of objects whose indexing should be skipped.

Urm? Surely it's the other way round? i.e. you build a mapping of
objects that need re-indexing, so you only re-index each one once...

> As Julien said it's not very hard to implement, it's just that there
> are application changes to consider. Still, there's agreement that CMF
> should move in that direction, I can provide patches taken from the CPS
> implementation. (And it requires Zope 2.8/ZODB 3.4 of course.) Some of
> the framework should be pushed into Zope itself.

I think this is the way to go, but I do feel there should be some
override method for unit testing, and also for where you want to do a
search after doing a load of object changes.

cheers,

Chris
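A minimal sketch of the mapping Chris describes, assuming nothing about the CPS or CMF code: requests are collected per object and each object is indexed once when the queue is flushed. The class ReindexQueue, its methods and the index_func callback are hypothetical names used only for illustration; a real implementation would more likely key on the object's physical path than on id().

```python
class ReindexQueue:
    """Hypothetical helper: remembers which objects need reindexing."""

    def __init__(self):
        # id(object) -> (object, set of index names); None means "all indexes"
        self._pending = {}

    def push(self, ob, idxs=None):
        """Record a reindex request, merging it with earlier ones."""
        key = id(ob)
        if key not in self._pending:
            self._pending[key] = (ob, set(idxs) if idxs else None)
            return
        _, known = self._pending[key]
        if known is None or not idxs:
            # A full reindex subsumes any partial request.
            self._pending[key] = (ob, None)
        else:
            known.update(idxs)

    def flush(self, index_func):
        """Index each pending object exactly once, then empty the queue."""
        pending, self._pending = self._pending, {}
        for ob, idxs in pending.values():
            # An empty list stands for "all indexes", as in reindexObject().
            index_func(ob, sorted(idxs) if idxs is not None else [])
```

Two partial requests for the same object end up as one call covering both index names, and a full request at any point turns the merged call into a full reindex.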
Re: [Zope-CMF] Re: reindexing optimizations
Chris Withers wrote:
> Florent Guillaume wrote:
>> As Julien said it's not very hard to implement, it's just that there
>> are application changes to consider. Still, there's agreement that CMF
>> should move in that direction, I can provide patches taken from the
>> CPS implementation. (And it requires Zope 2.8/ZODB 3.4 of course.)
>> Some of the framework should be pushed into Zope itself.
>
> I think this is the way to go, but I do feel there should be some
> override method for unit testing, and also for where you want to do a
> search after doing a load of object changes.

Yup, you're right. We do define this already. All the subscribers can
run 'async' or 'sync'. Check the base interface for those subscribers
here:

http://svn.nuxeo.org/trac/pub/file/CPSCore/trunk/interfaces.py
http://svn.nuxeo.org/trac/pub/file/CPSCore/trunk/BaseManager.py

Cheers,

J.

--
Julien Anguenot | Nuxeo R&D (Paris, France)
CPS Platform : http://www.cps-project.org
Zope3 / ECM : http://www.z3lab.org
mail: anguenot at nuxeo.com; tel: +33 (0) 6 72 57 57 66
Re: [Zope-CMF] Re: reindexing optimizations
Julien Anguenot wrote:
> yup you're right. We do define this already. All the subscribers can
> run 'async' or 'sync'. Check the base interface for those subscribers
> there:
>
> http://svn.nuxeo.org/trac/pub/file/CPSCore/trunk/interfaces.py
> http://svn.nuxeo.org/trac/pub/file/CPSCore/trunk/BaseManager.py

Not quite the semantics I had in mind, I meant more of a flush-type
method...

cheers,

Chris
Re: [Zope-CMF] Re: reindexing optimizations
Florent Guillaume wrote:
> As Julien said it's not very hard to implement, it's just that there
> are application changes to consider. Still, there's agreement that CMF
> should move in that direction, I can provide patches taken from the CPS
> implementation. (And it requires Zope 2.8/ZODB 3.4 of course.) Some of
> the framework should be pushed into Zope itself.

Yes. I guess we should write a Zope3 proposal for this. Hopefully, it
will be more successful than on the ZODB side...

The main goal of this would be to have the same interfaces and API for
the ordering of subscribers across Zope-based systems.

Cheers,

J.
Re: [Zope-CMF] Re: reindexing optimizations
Chris Withers wrote:
> Julien Anguenot wrote:
>> yup you're right. We do define this already. All the subscribers can
>> run 'async' or 'sync'. Check the base interface for those subscribers
>> there:
>>
>> http://svn.nuxeo.org/trac/pub/file/CPSCore/trunk/interfaces.py
>> http://svn.nuxeo.org/trac/pub/file/CPSCore/trunk/BaseManager.py
>
> Not quite the semantics I had in mind, I meant more of a flush-type
> method...

If you want a flush, and if I understood what you mean ;), we use
transaction.commit() in the tests when needed.

If there's interest in having that in Zope then of course we could
discuss it and see what can be improved based on our . If not, we are
happy with this ourselves on CPS ;)

It really improves performance on large-scale projects (especially the
ones with several thousand documents indexed...).

Cheers,

J.
Re: [Zope-CMF] Re: reindexing optimizations
Julien Anguenot wrote:
> If you want a flush, following what I think I understood you mean ;),
> we use the transaction.commit() in the tests when needed.

Okay, but then how do you undo the changes made by that commit?

Also, how would you go about overriding this late-indexing behaviour if
you really wanted to for some specific objects?

cheers,

Chris
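The difference between a flush and a commit can be sketched as follows. This is only an illustration under stated assumptions: DeferredIndexer and ListCatalog are hypothetical stand-ins, not the CPS subscribers or the ZODB transaction API. The point is that flush() makes queued changes searchable without committing, so a test can still abort afterwards.

```python
class ListCatalog:
    """Stand-in catalog: remembers which objects have been indexed."""

    def __init__(self):
        self.indexed = []

    def reindex(self, ob):
        if ob not in self.indexed:
            self.indexed.append(ob)

    def search(self, ob):
        return ob in self.indexed


class DeferredIndexer:
    """Hypothetical manager that delays indexing until flush() or commit."""

    def __init__(self, catalog):
        self.catalog = catalog
        self._queue = []

    def reindex(self, ob):
        # Queue each object only once, however often it is modified.
        if ob not in self._queue:
            self._queue.append(ob)

    def flush(self):
        """Process the queue now; nothing is committed."""
        queue, self._queue = self._queue, []
        for ob in queue:
            self.catalog.reindex(ob)

    def before_commit(self):
        """What a commit-time hook would call at the transaction boundary."""
        self.flush()
```

A test that needs to search after a batch of changes would call flush() and leave the transaction open, instead of calling transaction.commit().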
Re: [Zope-CMF] Re: reindexing optimizations
Chris Withers wrote:
> Julien Anguenot wrote:
>> If you want a flush, following what I think I understood you mean ;),
>> we use the transaction.commit() in the tests when needed.
>
> Okay, but then how do you undo the changes made by that commit?

You can't with this.

Can you gimme a use case, within a test, where the interface we define
in CPS would not be enough for you? In our CPSTestCase we set all
subscribers to 'async' and then all works as if no txn subscribers
exist during the tests.

> Also, how would you go about overriding this late-indexing behaviour if
> you really wanted to for some specific objects?

Define reindexObject() and reindexObjectSecurity() on the given content
type for instance.

But clearly, we could think about a more complex Transaction Manager
that may deal with more complex use cases such as this one. We didn't
express this need yet.

Cheers,

J.
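The per-type override Julien suggests could look roughly like this. It is a sketch under assumptions, not the CPS code: DeferredContent, ImmediateContent, RecordingCatalog and process_queue are hypothetical names, the queue is a plain list, and 'allowedRolesAndUsers' is used only as an illustrative security index name.

```python
class RecordingCatalog:
    """Stand-in catalog that records what was indexed, in order."""

    def __init__(self):
        self.log = []

    def index(self, ob, idxs):
        self.log.append((ob.name, tuple(idxs)))


class DeferredContent:
    """Default behaviour: queue requests until the transaction boundary."""

    def __init__(self, name, catalog, queue):
        self.name = name
        self.catalog = catalog
        self.queue = queue

    def reindexObject(self, idxs=()):
        self.queue.append((self, tuple(idxs)))

    def reindexObjectSecurity(self):
        self.queue.append((self, ('allowedRolesAndUsers',)))


class ImmediateContent(DeferredContent):
    """Content type that opts out of late indexing by overriding both methods."""

    def reindexObject(self, idxs=()):
        self.catalog.index(self, idxs)

    def reindexObjectSecurity(self):
        self.catalog.index(self, ('allowedRolesAndUsers',))


def process_queue(queue):
    """What would run at commit time: index everything still queued."""
    while queue:
        ob, idxs = queue.pop(0)
        ob.catalog.index(ob, idxs)
```

Only objects of the overriding type reach the catalog straight away; everything else waits for process_queue().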
[Zope-CMF] Re: reindexing optimizations
Chris Withers wrote:
> Hi Alec,
>
> Alec Mitchell wrote:
>> So, Sidnei has been plugging away at the "AT reindexes things an
>> obscene number of times" issue today, and appears to have fixed many
>> of the AT triggered indexing redundancies.
>
> Where is this work being done? I'd be very interested to track it...

https://svn.plone.org/svn/archetypes/Archetypes/branches/sidnei-indexing-sanity/

Raphael
[Zope-CMF] Re: reindexing optimizations
Hi!

Florent Guillaume wrote:
> Alec Mitchell wrote:
>> So we have two full reindexes, and three metadata updates. The last
>> reindex appears to be there only to catch the change to 'portal_type'
>> in _finishConstruction. So, this final reindexObject, might safely be
>> changed to reindexObject(['portal_type', 'Type']),
>
> This was the case in my initial code, but Yuppie changed it:
> http://svn.zope.org/trunk/CMFCore/TypesTool.py?rev=35903&r1=35864&r2=35903
> I don't remember what the reason was, though I believe it was discussed
> a bit at the time on the lists.

- This change was made 2 years ago, when Zope 2.6.3 was not released
  yet. At that time a full reindex was the *only* way to update the
  metadata.

- indexObject() doesn't call notifyModified() -> addCreator(). So
  whoever modifies that code should make sure addCreator() is still
  called and indexed. See also
  http://mail.zope.org/pipermail/zope-cmf/2004-July/020818.html

Cheers,

Yuppie
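The count of "two full reindexes, and three metadata updates" can be reproduced with a toy model. CountingCatalog, construct() and the update_metadata flag are hypothetical and stand in for the real catalog; the index names are only illustrative. The model shows what the proposed final reindexObject(['portal_type', 'Type']) would save, and it ignores the caveats raised in this thread (computed metadata, addCreator()).

```python
class CountingCatalog:
    """Toy catalog that only counts the work it is asked to do."""

    def __init__(self):
        self.full_reindexes = 0
        self.metadata_updates = 0

    def reindex(self, ob, idxs=(), update_metadata=True):
        if not idxs:
            # No index names given means every index is refreshed.
            self.full_reindexes += 1
        if update_metadata:
            self.metadata_updates += 1


def construct(catalog, ob, final_idxs=(), final_metadata=True):
    """Replay the catalog calls made while an object is created."""
    # manage_afterAdd() -> indexObject(): full index plus metadata
    catalog.reindex(ob)
    # _reindexWorkflowVariables(): partial index, full metadata update
    catalog.reindex(ob, idxs=['review_state'])
    # reindexObjectSecurity(): security index only, metadata untouched
    catalog.reindex(ob, idxs=['allowedRolesAndUsers'], update_metadata=False)
    # final reindexObject() in _finishConstruction()
    catalog.reindex(ob, idxs=final_idxs, update_metadata=final_metadata)
```

With the defaults the model reports 2 full reindexes and 3 metadata updates; restricting the last call to ['portal_type', 'Type'] and skipping its metadata update leaves 1 and 2.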
[Zope-CMF] Re: reindexing optimizations
Alec Mitchell wrote:
> Howdy CMFers,
>
> So, Sidnei has been plugging away at the "AT reindexes things an
> obscene number of times" issue today, and appears to have fixed many of
> the AT triggered indexing redundancies. There are however still a few
> places in CMF where some cataloging redundancy might be avoided. One
> obvious place is during object creation, where the following happens:
>
> *) TypesTool.constructInstance() is triggered
> **) A _setObject call results in CMFCatalogAware.manage_afterAdd()
>     which triggers a full indexObject().
> *) This is shortly followed by TypesTool._finishConstruction()
> *) Which calls CMFCatalogAware.notifyWorkflowCreated()
> *) Which in turn calls WorkflowTool._reindexWorkflowVariables()
> **) Which does a CMFCatalogAware.reindexObject([idxs]) on workflow
>     specific variables (with a full metadata update)
> *) And calls CMFCatalogAware.reindexObjectSecurity() which reindexes
>    the object only on the security index, and doesn't touch metadata.
> **) TypesTool._finishConstruction() then does another
>     CMFCatalogAware.reindexObject().
>
> So we have two full reindexes, and three metadata updates. The last
> reindex appears to be there only to catch the change to 'portal_type'
> in _finishConstruction. So, this final reindexObject, might safely be
> changed to reindexObject(['portal_type', 'Type']),

This was the case in my initial code, but Yuppie changed it:
http://svn.zope.org/trunk/CMFCore/TypesTool.py?rev=35903&r1=35864&r2=35903
I don't remember what the reason was, though I believe it was discussed
a bit at the time on the lists.

> though the possibility exists that other indexed attributes added by
> 3rd parties may depend on the value of portal_type (say, I use an
> autogenerated Title which includes the Type). Additionally, almost
> immediately before this last reindexObject call, another reindexObject
> call has happened in notifyWorkflowCreated, which included a full
> catalog metadata update. As a result, updating the catalog metadata
> here is certainly redundant.
>
> Unfortunately, the CMFCatalogAware.reindexObject method provides no
> means of avoiding the duplicate metadata update, though it would be
> trivial to add and to use here.

But as you realize, there is a problem when you have metadata computed
using methods. As exemplified by portal_type / Type, just because one
attribute is modified doesn't mean only one metadata column (or index
for that matter) is changed.

> Another option suggested by Sidnei on IRC, which would avoid the
> potential issues with limiting the variables indexed in the final
> reindex, would be to let CMFCatalogAware.manage_afterAdd know
> (presumably via some state variable) that it is being invoked through
> constructInstance/invokeFactory, in which case it could safely skip the
> initial indexing and allow _finishConstruction to take care of indexing
> the object fully on its own at the end.

That's certainly a good hack. There are several ways to do it, either
with a thread-local variable, or in the request, or by walking the
stack's locals to check for a __dont_index__ attribute... You'd have to
bench, but a thread-local variable is probably the fastest. You want to
store a set of objects whose indexing should be skipped.

> In the long term we will probably be better served by delaying all
> indexing to transaction boundaries, though it will be a fair bit harder
> to implement, and may irk some developers who depend on immediate
> changes to the catalog on reindex.

As Julien said it's not very hard to implement, it's just that there are
application changes to consider. Still, there's agreement that CMF
should move in that direction, I can provide patches taken from the CPS
implementation. (And it requires Zope 2.8/ZODB 3.4 of course.) Some of
the framework should be pushed into Zope itself.

Florent

--
Florent Guillaume, Nuxeo (Paris, France)   Director of R&D
+33 1 40 33 71 59   http://nuxeo.com   [EMAIL PROTECTED]
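The thread-local variant of the hack might be sketched like this. It uses the standard threading.local and contextlib modules of current Python; the names skip_indexing() and should_index() are hypothetical, and keying on id() is an assumption made to keep the example self-contained. It is not taken from CMF or CPS.

```python
import threading
from contextlib import contextmanager

# Each thread (each request, in a threaded server) gets its own skip set.
_state = threading.local()


def _skipped():
    """Return this thread's set of object ids whose indexing is skipped."""
    if not hasattr(_state, 'skip'):
        _state.skip = set()
    return _state.skip


@contextmanager
def skip_indexing(ob):
    """Mark ob as 'do not index' for the duration of the with block."""
    skipped = _skipped()
    key = id(ob)
    skipped.add(key)
    try:
        yield
    finally:
        # Nested use for the same object is not handled in this sketch.
        skipped.discard(key)


def should_index(ob):
    """What manage_afterAdd() could check before calling indexObject()."""
    return id(ob) not in _skipped()
```

In this picture constructInstance() would wrap the _setObject call in skip_indexing(ob), so that manage_afterAdd() skips its indexObject() and the final reindexObject() in _finishConstruction() does the only full index.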