Re: [Zope-CMF] Re: reindexing optimizations

2005-11-24 Thread Chris Withers

Julien Anguenot wrote:

Can you gimme a use case, within a test, where the interface we define
in CPS would not be enough for you ? In our CPSTestCase we set all
subscribers to 'async' and then all works as if no txn subscribers
exists during the tests.


Ah, okay, yes, I guess this is one way to do it.


Also, how would you go about overriding this late-indexing behaviour if
you really wanted to for some specific objects?


Define reindexObject() and reindexObjectSecurity() on the given content
type for instance. But clearly, we could think about a more complex
Transaction Manager that may deal with more complex use cases such as
this one. We didn't express this need yet.


Yep.

cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
   - http://www.simplistix.co.uk

___
Zope-CMF maillist  -  Zope-CMF@lists.zope.org
http://mail.zope.org/mailman/listinfo/zope-cmf

See http://collector.zope.org/CMF for bug reports and feature requests


Re: [Zope-CMF] Re: reindexing optimizations

2005-11-21 Thread Chris Withers

Florent Guillaume wrote:
That's certainly a good hack. There are several ways to do it, either 
with a thread-local variable, or in the request, or by walking the 
stack's locals to check for a __dont_index__ attribute... You'd have to 
bench, but a thread-local variable is probably the fastest. You want to 
store a set of objects whose indexing should be skipped.


Urm? Surely it's the other way round? ie: you build a mapping of objects 
that need re-indexing, so you only re-index each one once...
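
A minimal sketch of such a mapping, assuming objects are identified by a
uid string and that a request without index names means "reindex
everything". The names ReindexQueue, request and drain are hypothetical
and are not taken from CMF or CPS:

```python
FULL = None  # marker value: every index must be refreshed


class ReindexQueue:
    """Collapses repeated reindex requests into one entry per object."""

    def __init__(self):
        # uid -> set of index names, or FULL for a complete reindex
        self._pending = {}

    def request(self, uid, idxs=None):
        if uid in self._pending and self._pending[uid] is FULL:
            return  # a full reindex already covers any narrower request
        if not idxs:
            self._pending[uid] = FULL
        else:
            self._pending.setdefault(uid, set()).update(idxs)

    def drain(self):
        """Return the pending work and start over with an empty mapping."""
        pending, self._pending = self._pending, {}
        return pending


queue = ReindexQueue()
queue.request("doc1", ["portal_type"])
queue.request("doc1", ["review_state"])  # merged into the doc1 entry
queue.request("doc2")                    # full reindex
queue.request("doc2", ["Title"])         # ignored, already full
work = queue.drain()
```

Whatever consumes the drained mapping then touches each object once,
however many requests were made for it.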


As Julien said it's not very hard to implement, it's just that there are 
application changes to consider. Still, there's agreement that CMF 
should move in that direction, I can provide patches taken from the CPS 
implementation. (And it requires Zope 2.8/ZODB 3.4 of course.) Some of 
the framework should be pushed into Zope itself.


I think this is the way to go, but I do feel there should be some 
override method for unit testing and also for where you want to do a 
search after doing a load of object changes.


cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
   - http://www.simplistix.co.uk


Re: [Zope-CMF] Re: reindexing optimizations

2005-11-21 Thread Julien Anguenot

Chris Withers wrote:
 Florent Guillaume wrote:
 As Julien said it's not very hard to implement, it's just that there
 are application changes to consider. Still, there's agreement that CMF
 should move in that direction, I can provide patches taken from the
 CPS implementation. (And it requires Zope 2.8/ZODB 3.4 of course.)
 Some of the framework should be pushed into Zope itself.
 
 I think this is the way to go, but I do feel there should be some
 override method for unit testing and also for where you want to do a
 search after doing a load of object changes.
 

Yup, you're right. We do define this already. All the subscribers can run
'async' or 'sync'. Check the base interface for those subscribers here:

http://svn.nuxeo.org/trac/pub/file/CPSCore/trunk/interfaces.py
http://svn.nuxeo.org/trac/pub/file/CPSCore/trunk/BaseManager.py
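
The idea of a switchable subscriber can be sketched as follows. This is
not the CPSCore interface from the files linked above; the class
DeferringManager, its `deferred` flag and the push/finish methods are
assumptions made for illustration, and the flag is deliberately not
called 'sync' or 'async' because the exact meaning of the CPS modes is
not spelled out here:

```python
class DeferringManager:
    """Toy manager that either runs work at once or holds it back."""

    def __init__(self, deferred=True):
        self.deferred = deferred
        self._queue = []

    def push(self, func, *args):
        if self.deferred:
            # Remember the call; it runs when finish() is invoked
            self._queue.append((func, args))
        else:
            func(*args)

    def finish(self):
        """Run everything that was held back, in submission order."""
        queue, self._queue = self._queue, []
        for func, args in queue:
            func(*args)


log = []
late = DeferringManager(deferred=True)
late.push(log.append, "indexed later")
before_finish = list(log)   # nothing has run yet
late.finish()

eager = DeferringManager(deferred=False)
eager.push(log.append, "indexed at once")
```

A test base class could flip the flag once in its setup so that test
code sees catalog changes without waiting for the end of the
transaction.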

Cheers,

J.

--
Julien Anguenot | Nuxeo R&D (Paris, France)
CPS Platform : http://www.cps-project.org
Zope3 / ECM   : http://www.z3lab.org
mail: anguenot at nuxeo.com; tel: +33 (0) 6 72 57 57 66


Re: [Zope-CMF] Re: reindexing optimizations

2005-11-21 Thread Chris Withers

Julien Anguenot wrote:


Yup, you're right. We do define this already. All the subscribers can run
'async' or 'sync'. Check the base interface for those subscribers here:

http://svn.nuxeo.org/trac/pub/file/CPSCore/trunk/interfaces.py
http://svn.nuxeo.org/trac/pub/file/CPSCore/trunk/BaseManager.py


Not quite the semantics I had in mind, I meant more of a flush-type 
method...
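
One way to read "flush-type" is sketched below: pending index work is
applied on demand, without ending the transaction, so that a search made
right after a batch of changes sees them. ToyCatalog, LateIndexer and
flush are hypothetical names, and indexing is reduced to a single title
value per object:

```python
class ToyCatalog:
    """Stand-in catalog with one searchable value per object."""

    def __init__(self):
        self._titles = {}  # uid -> indexed title

    def index(self, uid, title):
        self._titles[uid] = title

    def search(self, title):
        return sorted(uid for uid, t in self._titles.items() if t == title)


class LateIndexer:
    """Holds reindex requests until flush() is called."""

    def __init__(self, catalog):
        self.catalog = catalog
        self._pending = {}  # uid -> latest title, one entry per object

    def reindex(self, uid, title):
        self._pending[uid] = title

    def flush(self):
        """Apply the pending work now and report how many objects it touched."""
        pending, self._pending = self._pending, {}
        for uid, title in pending.items():
            self.catalog.index(uid, title)
        return len(pending)


catalog = ToyCatalog()
indexer = LateIndexer(catalog)
indexer.reindex("doc1", "Draft")
indexer.reindex("doc1", "Final")           # replaces the earlier request
stale = catalog.search("Final")            # catalog not updated yet
touched = indexer.flush()
fresh = catalog.search("Final")
```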


cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
   - http://www.simplistix.co.uk


Re: [Zope-CMF] Re: reindexing optimizations

2005-11-21 Thread Julien Anguenot

Florent Guillaume wrote:
 As Julien said it's not very hard to implement, it's just that there are
 application changes to consider. Still, there's agreement that CMF
 should move in that direction, I can provide patches taken from the CPS
 implementation. (And it requires Zope 2.8/ZODB 3.4 of course.) Some of
 the framework should be pushed into Zope itself.
 

Yes. I guess we should write a Zope3 proposal for this. Hopefully, it
will be more successful than on the ZODB side...

The main goal of this would be to have the same interfaces and API for
the ordering of subscribers for Zope based systems.

Cheers,

J.

--
Julien Anguenot | Nuxeo R&D (Paris, France)
CPS Platform : http://www.cps-project.org
Zope3 / ECM   : http://www.z3lab.org
mail: anguenot at nuxeo.com; tel: +33 (0) 6 72 57 57 66


Re: [Zope-CMF] Re: reindexing optimizations

2005-11-21 Thread Julien Anguenot

Chris Withers wrote:
 Julien Anguenot wrote:

 Yup, you're right. We do define this already. All the subscribers can run
 'async' or 'sync'. Check the base interface for those subscribers
 here:

 http://svn.nuxeo.org/trac/pub/file/CPSCore/trunk/interfaces.py
 http://svn.nuxeo.org/trac/pub/file/CPSCore/trunk/BaseManager.py
 
 Not quite the semantics I had in mind, I meant more of a flush-type
 method...
 

If you want a flush, following what I think I understood you mean ;), we
use the transaction.commit() in the tests when needed.

If there's interest in having that on Zope then of course we could
discuss and see what can be improved based on our .  If not, we are
happy about this ourselves on CPS ;) It really improves performance
on large-scale projects (especially the ones with several k documents
indexed...)

Cheers,

J.

--
Julien Anguenot | Nuxeo R&D (Paris, France)
CPS Platform : http://www.cps-project.org
Zope3 / ECM   : http://www.z3lab.org
mail: anguenot at nuxeo.com; tel: +33 (0) 6 72 57 57 66


Re: [Zope-CMF] Re: reindexing optimizations

2005-11-21 Thread Chris Withers

Julien Anguenot wrote:


If you want a flush, following what I think I understood you mean ;), we
use the transaction.commit() in the tests when needed.


Okay, but then how do you undo the changes made by that commit?

Also, how would you go about overriding this late-indexing behaviour if 
you really wanted to for some specific objects?


cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
   - http://www.simplistix.co.uk


Re: [Zope-CMF] Re: reindexing optimizations

2005-11-21 Thread Julien Anguenot

Chris Withers wrote:
 Julien Anguenot wrote:

 If you want a flush, following what I think I understood you mean ;), we
 use the transaction.commit() in the tests when needed.
 
 Okay, but then how do you undo the changes made by that commit?

You can't with this.

Can you gimme a use case, within a test, where the interface we define
in CPS would not be enough for you ? In our CPSTestCase we set all
subscribers to 'async' and then all works as if no txn subscribers
exists during the tests.

 
 Also, how would you go about overriding this late-indexing behaviour if
 you really wanted to for some specific objects?

Define reindexObject() and reindexObjectSecurity() on the given content
type for instance. But clearly, we could think about a more complex
Transaction Manager that may deal with more complex use cases such as
this one. We didn't express this need yet.
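
As a rough illustration of that per-type override, assuming a base class
whose two methods merely queue the request: both classes below are
hypothetical stand-ins, the method signatures are simplified, and plain
lists take the place of the catalog and of the deferred work.

```python
class LateIndexedContent:
    """Toy base class: reindex requests are queued, not executed."""

    def __init__(self, uid, catalog_log, queue):
        self.uid = uid
        self.catalog_log = catalog_log  # list standing in for the catalog
        self.queue = queue              # list standing in for deferred work

    def reindexObject(self, idxs=()):
        self.queue.append((self.uid, "reindexObject"))

    def reindexObjectSecurity(self):
        self.queue.append((self.uid, "reindexObjectSecurity"))


class ImmediatelyIndexedContent(LateIndexedContent):
    """Content type that opts out by defining both methods itself."""

    def reindexObject(self, idxs=()):
        self.catalog_log.append((self.uid, "reindexObject"))

    def reindexObjectSecurity(self):
        self.catalog_log.append((self.uid, "reindexObjectSecurity"))


catalog_log, queue = [], []
ordinary = LateIndexedContent("doc1", catalog_log, queue)
urgent = ImmediatelyIndexedContent("doc2", catalog_log, queue)
for obj in (ordinary, urgent):
    obj.reindexObject()
    obj.reindexObjectSecurity()
```

Only doc2 reaches the catalog stand-in straight away; doc1 stays in the
queue until whatever drains it runs.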

Cheers,

J.

--
Julien Anguenot | Nuxeo R&D (Paris, France)
CPS Platform : http://www.cps-project.org
Zope3 / ECM   : http://www.z3lab.org
mail: anguenot at nuxeo.com; tel: +33 (0) 6 72 57 57 66


[Zope-CMF] Re: reindexing optimizations

2005-11-21 Thread Raphael Ritz

Chris Withers wrote:

Hi Alec,

Alec Mitchell wrote:

So, Sidnei has been plugging away at the "AT reindexes things an 
obscene number of times" issue today, and appears to have fixed many 
of the AT-triggered indexing redundancies. 



Where is this work being done? I'd be very interested to track it...


https://svn.plone.org/svn/archetypes/Archetypes/branches/sidnei-indexing-sanity/

Raphael



[Zope-CMF] Re: reindexing optimizations

2005-11-20 Thread yuppie

Hi!


Florent Guillaume wrote:

Alec Mitchell wrote:
So we have two full reindexes, and three metadata updates.  The last 
reindex appears to be there only to catch the change to 'portal_type' 
in _finishConstruction.  So, this final reindexObject, might safely be 
changed to reindexObject(['portal_type', 'Type']),


This was the case in my initial code, but Yuppie changed it:
http://svn.zope.org/trunk/CMFCore/TypesTool.py?rev=35903&r1=35864&r2=35903
I don't remember what the reason was, though I believe it was discussed 
a bit at the time on the lists.


- This change was made 2 years ago, when Zope 2.6.3 was not released yet. At 
that time a full reindex was the *only* way to update the metadata.


- indexObject() doesn't call notifyModified() -> addCreator(). So whoever 
modifies that code should make sure addCreator() is still called 
and indexed.


See also http://mail.zope.org/pipermail/zope-cmf/2004-July/020818.html


Cheers,

Yuppie




[Zope-CMF] Re: reindexing optimizations

2005-11-19 Thread Florent Guillaume

Alec Mitchell wrote:

Howdy CMFers,

So, Sidnei has been plugging away at the "AT reindexes things an obscene 
number of times" issue today, and appears to have fixed many of the 
AT-triggered indexing redundancies.  There are however still a few places in 
CMF where some cataloging redundancy might be avoided.  One obvious place is 
during object creation, where the following happens:


*) TypesTool.constructInstance() is triggered
**) A _setObject call results in CMFCatalogAware.manage_afterAdd() which
    triggers a full indexObject().

*) This is shortly followed by TypesTool._finishConstruction()
*) Which calls CMFCatalogAware.notifyWorkflowCreated()
*) Which in turn calls WorkFlowTool._reindexWorkflowVariables()
**) Which does a CMFCatalogAware.reindexObject([idxs]) on
    workflow specific variables (with a full metadata update)
*) And calls CMFCatalogAware.reindexObjectSecurity() which
   reindexes the object only on the security index, and doesn't touch
   metadata.
**) TypesTool._finishConstruction() then does another
    CMFCatalogAware.reindexObject().
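
The sequence above can be replayed against a counting stub to make the
redundancy visible. This is a toy model of the description, not CMF
code: CountingCatalog and construct_instance are hypothetical, and the
index names passed in are only illustrative.

```python
class CountingCatalog:
    """Stub that tallies full reindexes and metadata updates."""

    def __init__(self):
        self.full_reindexes = 0
        self.metadata_updates = 0
        self.log = []

    def catalog_object(self, caller, idxs=None, update_metadata=True):
        if idxs is None:
            self.full_reindexes += 1      # no index list: everything is redone
        if update_metadata:
            self.metadata_updates += 1
        self.log.append(caller)


def construct_instance(catalog):
    # _setObject -> manage_afterAdd() -> indexObject(): full index
    catalog.catalog_object("manage_afterAdd")
    # _finishConstruction() -> notifyWorkflowCreated()
    #   -> _reindexWorkflowVariables(): selected indexes plus metadata
    catalog.catalog_object("_reindexWorkflowVariables", idxs=["review_state"])
    # reindexObjectSecurity(): security index only, metadata untouched
    catalog.catalog_object("reindexObjectSecurity",
                           idxs=["allowedRolesAndUsers"],
                           update_metadata=False)
    # _finishConstruction() -> reindexObject(): second full reindex
    catalog.catalog_object("_finishConstruction")


catalog = CountingCatalog()
construct_instance(catalog)
print(catalog.full_reindexes, catalog.metadata_updates)  # prints: 2 3
```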


So we have two full reindexes, and three metadata updates.  The last reindex 
appears to be there only to catch the change to 'portal_type' in 
_finishConstruction.  So, this final reindexObject, might safely be changed 
to reindexObject(['portal_type', 'Type']),


This was the case in my initial code, but Yuppie changed it:
http://svn.zope.org/trunk/CMFCore/TypesTool.py?rev=35903&r1=35864&r2=35903
I don't remember what the reason was, though I believe it was discussed 
a bit at the time on the lists.


though the possibility exists 
that other indexed attributes added by 3rd parties may depend on the value 
of portal_type (say, I use an autogenerated Title which includes the Type).  
Additionally, almost immediately before this last reindexObject call, 
another reindexObject call has happened in notifyWorkflowCreated, which 
included a full catalog metadata update.  As a result, updating the catalog 
metadata here is certainly redundant.  Unfortunately, the 
CMFCatalogAware.reindexObject method provides no means of avoiding the 
duplicate metadata update, though it would be trivial to add and to use 
here.
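
A sketch of what such an addition might look like, assuming the catalog
call itself can be told to leave metadata alone. The `update_metadata`
argument on both methods, and the classes ToyCatalog and
CatalogAwareSketch, are assumptions of this example rather than the
existing CMFCatalogAware signature; whether skipping the refresh is safe
depends on what the metadata is computed from.

```python
class ToyCatalog:
    """Stand-in catalog that only counts what it is asked to do."""

    def __init__(self):
        self.index_writes = 0
        self.metadata_writes = 0

    def catalog_object(self, uid, idxs=None, update_metadata=True):
        self.index_writes += 1
        if update_metadata:
            self.metadata_writes += 1


class CatalogAwareSketch:
    def __init__(self, uid, catalog):
        self.uid = uid
        self.catalog = catalog

    def reindexObject(self, idxs=None, update_metadata=True):
        # Hypothetical extra argument, passed straight through to the catalog
        self.catalog.catalog_object(self.uid, idxs, update_metadata)


catalog = ToyCatalog()
doc = CatalogAwareSketch("doc1", catalog)
doc.reindexObject(idxs=["review_state"])   # workflow step: refreshes metadata
doc.reindexObject(update_metadata=False)   # final step: indexes only
```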


But as you realize, there is a problem when you have metadata computed 
using methods. As exemplified by portal_type / Type, just because one 
attribute is modified doesn't mean only one metadata (or index for that 
matter) is changed.
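
A small illustration of that dependency problem, using a made-up
document whose Type and Title are both computed from portal_type. The
Doc class, the title scheme and the reindex helper are hypothetical; a
plain dict stands in for the catalog record:

```python
class Doc:
    def __init__(self, portal_type):
        self.portal_type = portal_type

    def Type(self):
        # Computed value: derived from portal_type, not stored separately
        return self.portal_type.title()

    def Title(self):
        # Hypothetical autogenerated title that embeds the Type
        return "Untitled " + self.Type()


def reindex(record, doc, idxs=None):
    """Refresh the named entries of a catalog record (all when idxs is None)."""
    for name in idxs or ("portal_type", "Type", "Title"):
        value = getattr(doc, name)
        record[name] = value() if callable(value) else value


doc = Doc("document")
record = {}
reindex(record, doc)                           # full index at creation time
doc.portal_type = "news item"                  # the type changes afterwards
reindex(record, doc, ["portal_type", "Type"])  # partial reindex

print(record["Title"])  # prints: Untitled Document
print(doc.Title())      # prints: Untitled News Item
```

The partial reindex leaves the Title entry stale even though Title was
never assigned to directly.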


Another option, suggested by Sidnei on IRC, would avoid the potential 
issues with limiting the variables indexed in the final reindex: let 
CMFCatalogAware.manage_afterAdd know (presumably via some state 
variable) that it is being invoked through constructInstance/invokeFactory, 
in which case it could safely skip the initial indexing and allow 
_finishConstruction to take care of indexing the object fully on its own at 
the end. 


That's certainly a good hack. There are several ways to do it, either 
with a thread-local variable, or in the request, or by walking the 
stack's locals to check for a __dont_index__ attribute... You'd have to 
bench, but a thread-local variable is probably the fastest. You want to 
store a set of objects whose indexing should be skipped.
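
The thread-local variant might look roughly like this. It is a sketch
only: skip_indexing, should_index and the Item class are hypothetical
names, and keying the set on id(obj) is an assumption that a real Zope
implementation would probably replace with something like the object's
path.

```python
import threading
from contextlib import contextmanager

_state = threading.local()


def _skipped():
    # Each thread lazily gets its own set of object keys
    if not hasattr(_state, "skip"):
        _state.skip = set()
    return _state.skip


@contextmanager
def skip_indexing(obj):
    """Mark obj as 'do not index' for the duration of the block."""
    skipped = _skipped()
    key = id(obj)
    already = key in skipped
    skipped.add(key)
    try:
        yield
    finally:
        if not already:  # leave an outer block's mark in place
            skipped.discard(key)


def should_index(obj):
    return id(obj) not in _skipped()


class Item:
    def __init__(self):
        self.index_calls = 0

    def manage_afterAdd(self):
        if should_index(self):
            self.index_calls += 1


item = Item()
with skip_indexing(item):   # e.g. around the add step of a factory
    item.manage_afterAdd()  # skipped
item.manage_afterAdd()      # indexed as usual
```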


In the long term we will probably be better served by delaying all 
indexing to transaction boundaries, though it will be a fair bit harder to 
implement, and may irk some developers who depend on immediate changes to 
the catalog on reindex.


As Julien said it's not very hard to implement, it's just that there are 
application changes to consider. Still, there's agreement that CMF 
should move in that direction, I can provide patches taken from the CPS 
implementation. (And it requires Zope 2.8/ZODB 3.4 of course.) Some of 
the framework should be pushed into Zope itself.
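
To make the idea of indexing at the transaction boundary concrete, here
is a self-contained toy. It does not use the ZODB transaction API or the
CPS code: ToyTransaction, add_before_commit_hook and reindex_later are
hypothetical, and a plain list stands in for the catalog.

```python
class ToyTransaction:
    """Imitates a transaction that runs hooks just before committing."""

    def __init__(self):
        self._hooks = []
        self.data = {}  # per-transaction scratch space

    def add_before_commit_hook(self, hook):
        self._hooks.append(hook)

    def commit(self):
        hooks, self._hooks = self._hooks, []
        self.data = {}
        for hook in hooks:
            hook()

    def abort(self):
        # Nothing queued in this transaction survives an abort
        self._hooks = []
        self.data = {}


def reindex_later(txn, catalog, uid):
    """Queue uid for indexing at commit time, at most once per transaction."""
    pending = txn.data.get("reindex")
    if pending is None:
        pending = txn.data["reindex"] = []
        txn.add_before_commit_hook(lambda: catalog.extend(pending))
    if uid not in pending:
        pending.append(uid)


txn = ToyTransaction()
catalog = []
reindex_later(txn, catalog, "doc1")
reindex_later(txn, catalog, "doc1")   # duplicate, collapsed
reindex_later(txn, catalog, "doc2")
before_commit = list(catalog)         # still empty
txn.commit()
```

The application change mentioned above shows up here too: code that
searches the catalog between reindex_later() and commit() sees the old
state.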


Florent

--
Florent Guillaume, Nuxeo (Paris, France)   Director of R&D
+33 1 40 33 71 59   http://nuxeo.com   [EMAIL PROTECTED]