Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-30 Thread Jim Washington
Dieter Maurer wrote: > Jim Washington wrote at 2007-3-27 16:28 -0400: > >> ... >> Yes, I think so, at least in the implementation/algorithm I am using. >> There may be other implementations that do not need this. Note, >> however, that the canonical list does not have to be complex objects. >

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-29 Thread Dieter Maurer
Jim Washington wrote at 2007-3-27 16:28 -0400: > ... >Yes, I think so, at least in the implementation/algorithm I am using. >There may be other implementations that do not need this. Note, >however, that the canonical list does not have to be complex objects. >The canonical list is just a repres

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-29 Thread Dieter Maurer
Lennart Regebro wrote at 2007-3-28 18:25 +0200: >On 3/27/07, Dieter Maurer <[EMAIL PROTECTED]> wrote: >> However, this approach is only efficient when the sort index size >> is small compared to the result size. > >Sure. But with incremental searching, the result size is always one, right? ;-) No.

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-28 Thread Lennart Regebro
On 3/27/07, Dieter Maurer <[EMAIL PROTECTED]> wrote: However, this approach is only efficient when the sort index size is small compared to the result size. Sure. But with incremental searching, the result size is always one, right? ;-) -- Lennart Regebro: Zope and Plone consulting. http://www

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-27 Thread Jim Washington
Paul Winkler wrote: > On Tue, Mar 27, 2007 at 03:25:00PM -0400, Jim Washington wrote: > >> A factoradic index is representable as a long integer. Given that >> integer and the canonical list, you can regenerate the permutation >> represented by that integer. So, instead of caching the sorted l

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-27 Thread Paul Winkler
On Tue, Mar 27, 2007 at 03:25:00PM -0400, Jim Washington wrote: > A factoradic index is representable as a long integer. Given that > integer and the canonical list, you can regenerate the permutation > represented by that integer. So, instead of caching the sorted list > itself, you find and kee

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-27 Thread Jim Washington
Dieter Maurer wrote: > Jim Washington wrote at 2007-3-27 08:24 -0400: > >> ... >> If you see a sort order as one permutation of a list, the factoradic >> technique provides a key to that permutation. So, in theory, one would >> sort the list, and store the factoradic index for that permutation.

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-27 Thread Dieter Maurer
Jim Fulton wrote at 2007-3-26 15:55 -0400: > ... > >On Mar 26, 2007, at 3:28 PM, Dieter Maurer wrote: > >> Jim Fulton wrote at 2007-3-25 09:53 -0400: >>> >>> On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote: MF> I think one of the main limitations of the current catalog (and MF> hurry.quer

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-27 Thread Dieter Maurer
Jim Washington wrote at 2007-3-27 08:24 -0400: > ... >If you see a sort order as one permutation of a list, the factoradic >technique provides a key to that permutation. So, in theory, one would >sort the list, and store the factoradic index for that permutation. The >next time the particular sor

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-27 Thread Dieter Maurer
Lennart Regebro wrote at 2007-3-27 11:59 +0200: > ... >OK, at least this avoids the big intermediate results when searching >over several indexes. But you still have to get all of the results, >and sort them before you can return the X first. I have the impression >that Lucene somehow solves this w

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-27 Thread Jim Washington
Hi, Martijn I have a suggestion, only because I have played around with the idea a bit. Google for "python factoradics", and you will get my blog entry about factoradics. I see the problem statement as "How to obtain batching without re-sorting multiple times". If you see a sort order as one pe

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-27 Thread Martijn Faassen
Hey Jim, On 3/27/07, Jim Fulton <[EMAIL PROTECTED]> wrote: [snip] > After > that I thought we were actually going somewhere with this discussion, > but you now strengthen this impression by apparently giving up in > exasparation. That is what is making *me* slightly exasparated. :) Fine. I gave

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-27 Thread Jim Fulton
On Mar 26, 2007, at 8:45 PM, Martijn Faassen wrote: ... I think we can safely conclude that: * there is no silver bullet in all this (your point) Yes * there is probably room for improvement. Yes ... My hopes: * is that there is some low-hanging fruit in improving things. Possibly,

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-27 Thread Lennart Regebro
On 3/26/07, Dieter Maurer <[EMAIL PROTECTED]> wrote: When "IncrementalSearch" does it, it works roughly as Jim described it under 1 above (although there is no need that it works the the "primary key" directly). Right. Unless the indexes are able to return sorted (partial results), we need to

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-26 Thread Martijn Faassen
Hey Jim, On 3/26/07, Jim Fulton <[EMAIL PROTECTED]> wrote: [snip] > Theoretical algorithm scalability is one thing, and the same issues > apply to both. Practical scalability might vary widely. OK, I give up. This argument just isn't worth my time any more. I'm sorry I objected to the origina

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-26 Thread Jim Fulton
On Mar 26, 2007, at 3:28 PM, Dieter Maurer wrote: Jim Fulton wrote at 2007-3-25 09:53 -0400: On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote: MF> I think one of the main limitations of the current catalog (and MF> hurry.query) is efficient support for sorting and batching the query MF> resul

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-26 Thread Dieter Maurer
Lennart Regebro wrote at 2007-3-25 17:08 +0200: > ... [Jim] >> 1. Sort your results based on the "primary key" and therefore, pick >> your primary key to match your sort results. In terms of the Zope >> catalog framework, the primary keys are the document IDs, which are >> traditionally chosen ran

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-26 Thread Dieter Maurer
Jim Fulton wrote at 2007-3-25 09:53 -0400: > >On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote: >> MF> I think one of the main limitations of the current catalog (and >> MF> hurry.query) is efficient support for sorting and batching the >> query >> MF> results. The Zope 3 catalog returns all match

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-26 Thread Jim Fulton
On Mar 25, 2007, at 5:27 PM, Martijn Faassen wrote: Hey Jim, Jim Fulton wrote: On Mar 25, 2007, at 12:33 PM, Martijn Faassen wrote: [snip] I have the strong suspicion that modern relational databases are currently better able to scale at queries using LIMIT and ORDER BY than the Zope 3 c

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-25 Thread Jim Fulton
To say nothing of the fact that they, wisely, throw massive amounts of hardware at the problem. They can do this in part because they can amortize the cost over a massive user database. I don't think anyone using our tools is going to be in a remotely similar situation. Jim On Mar 25,

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-25 Thread Jim Fulton
On Mar 25, 2007, at 12:33 PM, Martijn Faassen wrote: Hey Jim, Jim Fulton wrote: On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote: MF> I think one of the main limitations of the current catalog (and MF> hurry.query) is efficient support for sorting and batching the query MF> results. The Zo

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-25 Thread Christian Theune
> Google does somehow also the batching hellfast. Google is a bad partner to compare with when talking about DBMS efficiency. Google allows sloppy and "wrong" results in trade of for speed. (E.g. they update their indexes distributetly and do allow different results to be returned for your search

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-25 Thread Adam Groszer
Hello Martijn, MF> I would like some system that helps me reduce some of these costs, using MF> the approaches you list, or at least some caching somewhere. I would MF> imagine a relational database for instance can employ caching of result MF> sets, so that if no writes occurred, a second LIMIT

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-25 Thread Jim Fulton
On Mar 25, 2007, at 11:08 AM, Lennart Regebro wrote: ... 2. Use an N-best algorithm. If N is the size of the batch and M is the corpus size, then this is O(M*ln(N)) rather than O(M*ln(M)) which is a significant improvement if N << M, but still quite expensive. I don't think relational database

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-25 Thread Lennart Regebro
On 3/25/07, Jim Fulton <[EMAIL PROTECTED]> wrote: On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote: > MF> I think one of the main limitations of the current catalog (and > MF> hurry.query) is efficient support for sorting and batching the > query > MF> results. The Zope 3 catalog returns all match

Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

2007-03-25 Thread Jim Fulton
On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote: MF> I think one of the main limitations of the current catalog (and MF> hurry.query) is efficient support for sorting and batching the query MF> results. The Zope 3 catalog returns all matching results, which can then MF> be sorted and batche