Re: [Zope] ZCatalog Strategy
Mark Gibson wrote at 2006-3-22 19:28 -0700:
> I'm struggling to weigh the cost of getObject() vs. the cost of adding more metadata to the catalog. I'll explain my situation.
>
> I have 10,000 widgets cataloged. I do a path and date query that returns me maybe 12 of these. Then I have a choice of calling getObject().getData() on each of these, or I could add getData to the catalog metadata.
>
> Does the cost of calling getObject() for a dozen objects justify creating a new metadata field? More generally how does a large amount of metadata in the catalog affect performance of queries?

This is very difficult to say.

We have used the standard metadata set for a long time. It contains description. However, in our case the description was usually several kB in size. We found that all catalog operations (lookup, indexing) were very slow, and that operations modifying the catalog caused huge transactions.

The IOBuckets containing the metadata records were the culprit. Usually, they contain 45 metadata records. If you need one record, you in fact usually handle 45 of them. In our case, the buckets were usually several 100 kB big -- much larger than usual container objects...

On the other hand, if you have small metadata records (say, containing a few integers), then loading them may be much faster than loading all intermediate objects on the way to your final object -- especially for applications such as determining statistics data (when you are processing a lot of objects).

--
Dieter
___
Zope maillist - Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
** No cross posts or HTML encoding! **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )
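To put rough numbers on Dieter's point about bucket size, here is a minimal back-of-envelope sketch. The figure of 45 records per bucket comes from his message; the two record sizes (about 4 kB for a description-heavy record, about 50 bytes for a few integers) are assumptions chosen only for illustration, and bucket_size_kb is a hypothetical helper, not part of Zope.

```python
RECORDS_PER_BUCKET = 45  # typical figure quoted in the message above


def bucket_size_kb(record_size_kb, records_per_bucket=RECORDS_PER_BUCKET):
    """Approximate amount of data handled in order to read ONE metadata record.

    A bucket is loaded as a whole, so reading a single record costs roughly
    the size of all records stored in the same bucket.
    """
    return record_size_kb * records_per_bucket


# Assumed record sizes, for illustration only.
large = bucket_size_kb(4.0)    # record dominated by a ~4 kB description
small = bucket_size_kb(0.05)   # record holding a few integers (~50 bytes)

print(f"large records: ~{large:.0f} kB per bucket")
print(f"small records: ~{small:.2f} kB per bucket")
```

Under these assumed sizes the description-heavy bucket lands in the "several 100 kB" range reported above, while the small-record bucket stays at a few kB.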
Re: [Zope] ZCatalog Strategy
Mark Gibson wrote:
> Does the cost of calling getObject() for a dozen objects justify creating a new metadata field?

No.

> More generally how does a large amount of metadata in the catalog affect performance of queries?

Badly ;-)

> The wisdom of those more knowledgeable than me would be appreciated.

As a rule of thumb, if you need a catalog search to return more than 10-20 objects and you need to do something with an attribute of all the objects returned, then whack it in the metadata.

If you only need it for the 10-20 objects, then do getObject and get the attribute from the object.

cheers,

Chris

--
Simplistix - Content Management, Zope Python Consulting
- http://www.simplistix.co.uk
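The two options in this rule of thumb can be sketched as follows. This is only an illustration, not real catalog code: FakeWidget and FakeBrain are hypothetical stand-ins so the snippet runs without Zope. It assumes, as in ZCatalog, that a result record ("brain") offers getObject() and exposes each metadata column as a plain attribute holding the value stored at indexing time.

```python
class FakeWidget:
    """Hypothetical stand-in for a cataloged widget."""

    def __init__(self, data):
        self._data = data

    def getData(self):
        return self._data


class FakeBrain:
    """Hypothetical stand-in for a catalog result record (brain)."""

    def __init__(self, widget, metadata=None):
        self._widget = widget
        self.loads = 0  # counts how often the real object was fetched
        # Metadata columns show up as plain attributes on the brain.
        for name, value in (metadata or {}).items():
            setattr(self, name, value)

    def getObject(self):
        self.loads += 1
        return self._widget


def data_via_objects(brains):
    """Small result sets: fetch each object and ask it directly."""
    return [brain.getObject().getData() for brain in brains]


def data_via_metadata(brains):
    """Large result sets: read the value stored in the metadata column."""
    return [brain.getData for brain in brains]
```

Note that with the metadata option getData is read as an attribute, not called, and no object is loaded; the price is that the value is copied into every catalog record.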
[Zope] ZCatalog Strategy
I'm struggling to weigh the cost of getObject() vs. the cost of adding more metadata to the catalog. I'll explain my situation.

I have 10,000 widgets cataloged. I do a path and date query that returns me maybe 12 of these. Then I have a choice of calling getObject().getData() on each of these, or I could add getData to the catalog metadata.

Does the cost of calling getObject() for a dozen objects justify creating a new metadata field? More generally how does a large amount of metadata in the catalog affect performance of queries?

The wisdom of those more knowledgeable than me would be appreciated.

Mark
Re: [Zope] ZCatalog Strategy
One factor is the amount of RAM that you have. If you have enough RAM to hold the entire catalog (with indexes), then your search time is very fast.

I would err on the side of less metadata. We currently run about 1 million entries in a ZCatalog with very little metadata, and then use restrictedTraverse with the object ids returned by the ZCatalog search to get the fields we need. We typically have sub-1-second searches using this approach (but we have very fast SCSI disks in a striped RAID configuration).

With only 10k records in the catalog you could try both configurations (it is very easy to clear and then reload the catalog) and time them to see which gives you better performance.

hth

good luck!

Jonathan
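The "try both and time them" suggestion could be carried out with a small harness along these lines. It is a sketch only: time_strategy and compare are hypothetical helper names, and the callables passed in would be your own functions that run the catalog query and then read the data either from the objects or from the metadata.

```python
import time


def time_strategy(fetch, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs of fetch()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fetch()
        best = min(best, time.perf_counter() - start)
    return best


def compare(strategies, repeats=5):
    """Time each named strategy; return (name, seconds) pairs, fastest first."""
    timings = [(name, time_strategy(fn, repeats)) for name, fn in strategies.items()]
    return sorted(timings, key=lambda pair: pair[1])
```

A call might look like compare({"getObject": run_with_objects, "metadata": run_with_metadata}), where both function names are placeholders. Taking the best of several runs mostly measures the warm-cache case; if the cold case matters, as the remark about RAM suggests it may, the first run after a restart is worth recording separately.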