Re: [Zope] ZCatalog Strategy

2006-03-24 Thread Dieter Maurer
Mark Gibson wrote at 2006-3-22 19:28 -0700:
> I'm struggling to weigh the cost of getObject() vs. the cost of adding 
> more metadata to the catalog.  I'll explain my situation.
>
> I have 10,000 widgets cataloged.  I do a path and date query that 
> returns me maybe 12 of these.  Then I have a choice of calling 
> getObject().getData() on each of these, or I could add getData to the 
> catalog metadata.
>
> Does the cost of calling getObject() for a dozen objects justify 
> creating a new metadata field?
>
> More generally how does a large amount of metadata in the catalog affect 
> performance of queries?

This is very difficult to say.


  We used the standard metadata set for a long time.
  It contains description. However, in our case
  description was usually several kB in size.

  We found that all catalog operations (lookup, indexing)
  were very slow, and operations that modified the catalog caused
  huge transactions.

  The IOBuckets containing the metadata records were
  the culprit. Usually, each contains 45 metadata records.
  If you need one record, you in fact usually load all 45 of them.
  In our case, the buckets were usually several 100 kB in size -- much
  larger than usual container objects...
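
The arithmetic behind this can be sketched in a few lines. The 45 records per bucket come from the message above; the record sizes (4 kB for a record dominated by its description, about 50 bytes for a few integers) are assumptions chosen only for illustration, and pickle overhead is ignored.

```python
# Back-of-the-envelope estimate of how much data one metadata bucket carries.
RECORDS_PER_BUCKET = 45  # figure quoted in the message above


def bucket_size_kb(record_size_kb, records_per_bucket=RECORDS_PER_BUCKET):
    """Approximate size of one bucket in kB, ignoring storage overhead."""
    return record_size_kb * records_per_bucket


# Assumed record sizes (illustrative only, not measured):
large = bucket_size_kb(4.0)    # record dominated by a 4 kB description
small = bucket_size_kb(0.05)   # record holding a few integers, about 50 bytes

print(f"large records: roughly {large:.0f} kB per bucket")
print(f"small records: roughly {small:.2f} kB per bucket")
```

Under these assumptions, reading a single large record pulls in a bucket of well over 100 kB, while a bucket of small records stays at a few kB.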


On the other hand, if you have small metadata records (say, containing
a few integers), then loading them may be much faster
than loading all the intermediate objects on the way to your final
object -- especially for applications such as computing statistics (when
you are processing a lot of objects).
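
As a rough illustration of the statistics case, the sketch below summarises one small integer column straight from search results, without calling getObject() at all. The column name "size" and the SimpleNamespace stand-ins for catalog results are hypothetical; real results expose metadata columns as attributes in the same way.

```python
from statistics import mean
from types import SimpleNamespace


def column_statistics(brains, column):
    """Summarise one integer metadata column without loading any object."""
    values = [getattr(brain, column) for brain in brains]
    if not values:
        return {"count": 0, "total": 0, "mean": None}
    return {"count": len(values), "total": sum(values), "mean": mean(values)}


# Stand-in search results; "size" is a hypothetical metadata column.
brains = [SimpleNamespace(size=n) for n in (120, 80, 100)]
stats = column_statistics(brains, "size")
print(stats)
```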


-- 
Dieter
___
Zope maillist  -  Zope@zope.org
http://mail.zope.org/mailman/listinfo/zope
**   No cross posts or HTML encoding!  **
(Related lists - 
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope-dev )


Re: [Zope] ZCatalog Strategy

2006-03-23 Thread Chris Withers

Mark Gibson wrote:
> Does the cost of calling getObject() for a dozen objects justify 
> creating a new metadata field?


No.

> More generally how does a large amount of metadata in the catalog affect 
> performance of queries?


Badly ;-)


> The wisdom of those more knowledgeable than me would be appreciated.


As a rule of thumb, if you need a catalog search to return more than 
10-20 objects and you need to do something with an attribute of all the 
objects returned, then whack it in the metadata. If you only need it for 
10-20 objects or fewer, then do getObject and get the attribute from the object.
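
The two access patterns in this rule of thumb can be sketched as follows. StubWidget and StubBrain are hypothetical stand-ins so that the example runs without Zope; on a real catalog result, getObject() loads the object from the ZODB and a metadata column named getData is read as a plain attribute.

```python
class StubWidget:
    """Hypothetical stand-in for a cataloged widget."""

    def __init__(self, data):
        self._data = data

    def getData(self):
        return self._data


class StubBrain:
    """Hypothetical stand-in for a catalog search result (a "brain")."""

    def __init__(self, widget, metadata=None):
        self._widget = widget
        # Metadata columns show up as plain attributes on a result.
        for name, value in (metadata or {}).items():
            setattr(self, name, value)

    def getObject(self):
        # A real result would traverse to the object and load it here.
        return self._widget


def data_via_objects(brains):
    """Few results: wake up each object and ask it directly."""
    return [brain.getObject().getData() for brain in brains]


def data_via_metadata(brains):
    """Many results: read the value stored at indexing time."""
    return [brain.getData for brain in brains]


widgets = [StubWidget(f"payload-{i}") for i in range(12)]
brains = [StubBrain(w, {"getData": w.getData()}) for w in widgets]
print(data_via_objects(brains) == data_via_metadata(brains))
```

Note that the metadata value is a snapshot taken when the object was last indexed, so it is only as fresh as the most recent reindex.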


cheers,

Chris

--
Simplistix - Content Management, Zope & Python Consulting
   - http://www.simplistix.co.uk



[Zope] ZCatalog Strategy

2006-03-22 Thread Mark Gibson
I'm struggling to weigh the cost of getObject() vs. the cost of adding 
more metadata to the catalog.  I'll explain my situation.


I have 10,000 widgets cataloged.  I do a path and date query that 
returns me maybe 12 of these.  Then I have a choice of calling 
getObject().getData() on each of these, or I could add getData to the 
catalog metadata.


Does the cost of calling getObject() for a dozen objects justify 
creating a new metadata field?


More generally how does a large amount of metadata in the catalog affect 
performance of queries?


The wisdom of those more knowledgeable than me would be appreciated.

Mark


Re: [Zope] ZCatalog Strategy

2006-03-22 Thread Jonathan
One factor is the amount of RAM that you have.  If you have enough RAM to 
hold the entire catalog (with indexes) in memory, then your search time is 
very fast.  I would err on the side of less metadata.  We currently run 
about 1 million entries in a ZCatalog with very little metadata and then use 
restrictedTraverse with the object ids returned by the ZCatalog search to 
get the fields we need.  We typically have sub-second searches using this 
approach (but we have very fast SCSI disks in a striped RAID configuration).


With only 10k records in the catalog you could try both configurations (very 
easy to clear and then reload the catalog) and time them to see which gives 
you better performance.
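
A minimal timing harness for such a comparison might look like the sketch below. It assumes Python 3; the two fetch functions are hypothetical placeholders that would, in a real test, run the same catalog query and then read the data from metadata or via getObject().

```python
import time


def best_time(fetch, repeats=5):
    """Run fetch() several times; return the fastest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fetch()
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical placeholders. In a real comparison each function would run
# the same catalog query and then read the data one way or the other.
def fetch_from_metadata():
    return [n for n in range(12)]


def fetch_from_objects():
    return [str(n) for n in range(12)]


timings = {
    "metadata": best_time(fetch_from_metadata),
    "getObject": best_time(fetch_from_objects),
}
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds * 1000:.3f} ms")
```

Taking the fastest of several runs mostly measures the warm-cache case. If cold-start behaviour matters, it is probably worth timing the first run after a restart separately.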


hth & good luck!

Jonathan




