Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

Jim Fulton Sun, 25 Mar 2007 05:53:29 -0800


On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote:

MF> I think one of the main limitations of the current catalog (and
MF> hurry.query) is efficient support for sorting and batching the query
MF> results. The Zope 3 catalog returns all matching results, which can then
MF> be sorted and batched. This will stop being scalable for large
MF> collections. A relational database is able to do this internally, and is
MF> potentially able to use optimizations there.

What evidence do you have to support this assertion? We did a literature search on this a few years ago and found no special trick to avoid sorting costs.


I know of 2 approaches to reducing sort cost:

1. Sort your results based on the "primary key" and therefore, pick your primary key to match your sort results. In terms of the Zope catalog framework, the primary keys are the document IDs, which are traditionally chosen randomly. You can pick your primary keys based on a desired sort order instead. A variation on this theme is to use multiple sets of document ids, storing multiple sets of ids in each index. Of course, this approach doesn't help with something like relevance ranks.

2. Use an N-best algorithm. If N is the size of the batch and M is the corpus size, then this is O(M*ln(N)) rather than O(M*ln(M)), which is a significant improvement if N << M, but still quite expensive.
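
To make the second approach concrete, here is a minimal sketch of an N-best selection in plain Python. It is not catalog code: the names n_best, sort_values and matching are made up for illustration, and the sort values are random stand-ins for whatever a field index would supply. It relies only on heapq.nsmallest from the standard library, which scans all M candidates once while holding at most N of them in a heap.

```python
import heapq
import random


def n_best(doc_ids, sort_key, n):
    """Return the n document ids with the smallest sort key, in sorted order.

    heapq.nsmallest scans every candidate once and keeps a heap of at
    most n items, so the cost is roughly O(M * log(n)) for M candidates
    instead of the O(M * log(M)) of a full sort.
    """
    return heapq.nsmallest(n, doc_ids, key=sort_key)


# Hypothetical data: document id -> sort value (for example a timestamp).
random.seed(0)
sort_values = {doc_id: random.random() for doc_id in range(10000)}

# Pretend that every document matched the query.
matching = list(sort_values)

# First batch of 20 results, without sorting all 10000 matches.
first_batch = n_best(matching, sort_values.__getitem__, 20)
```

Note that every matching document is still visited once, which is why this remains expensive for a large result set; only the log factor shrinks.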

I don't think relational databases have any magic bullet to get around sorting costs. Sorting is expensive. In many ways, I think the sorting support in the catalog gave people a false sense of security.


Jim

--
Jim Fulton              mailto:[EMAIL PROTECTED]        Python Powered!
CTO                     (540) 361-1714                  http://www.python.org
Zope Corporation        http://www.zope.com             http://www.zope.org



_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev
