Dieter Maurer wrote:
> Jim Washington wrote at 2007-3-27 16:28 -0400:
>
>> ...
>> Yes, I think so, at least in the implementation/algorithm I am using.
>> There may be other implementations that do not need this. Note,
>> however, that the canonical list does not have to be complex objects.
>
Jim Washington wrote at 2007-3-27 16:28 -0400:
> ...
>Yes, I think so, at least in the implementation/algorithm I am using.
>There may be other implementations that do not need this. Note,
>however, that the canonical list does not have to be complex objects.
>The canonical list is just a repres
Lennart Regebro wrote at 2007-3-28 18:25 +0200:
>On 3/27/07, Dieter Maurer <[EMAIL PROTECTED]> wrote:
>> However, this approach is only efficient when the sort index size
>> is small compared to the result size.
>
>Sure. But with incremental searching, the result size is always one, right? ;-)
No.
On 3/27/07, Dieter Maurer <[EMAIL PROTECTED]> wrote:
However, this approach is only efficient when the sort index size
is small compared to the result size.
Sure. But with incremental searching, the result size is always one, right? ;-)
--
Lennart Regebro: Zope and Plone consulting.
http://www
Paul Winkler wrote:
> On Tue, Mar 27, 2007 at 03:25:00PM -0400, Jim Washington wrote:
>
>> A factoradic index is representable as a long integer. Given that
>> integer and the canonical list, you can regenerate the permutation
>> represented by that integer. So, instead of caching the sorted l
On Tue, Mar 27, 2007 at 03:25:00PM -0400, Jim Washington wrote:
> A factoradic index is representable as a long integer. Given that
> integer and the canonical list, you can regenerate the permutation
> represented by that integer. So, instead of caching the sorted list
> itself, you find and kee
Dieter Maurer wrote:
> Jim Washington wrote at 2007-3-27 08:24 -0400:
>
>> ...
>> If you see a sort order as one permutation of a list, the factoradic
>> technique provides a key to that permutation. So, in theory, one would
>> sort the list, and store the factoradic index for that permutation.
Jim Fulton wrote at 2007-3-26 15:55 -0400:
> ...
>
>On Mar 26, 2007, at 3:28 PM, Dieter Maurer wrote:
>
>> Jim Fulton wrote at 2007-3-25 09:53 -0400:
>>>
>>> On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote:
MF> I think one of the main limitations of the current catalog (and
MF> hurry.quer
Jim Washington wrote at 2007-3-27 08:24 -0400:
> ...
>If you see a sort order as one permutation of a list, the factoradic
>technique provides a key to that permutation. So, in theory, one would
>sort the list, and store the factoradic index for that permutation. The
>next time the particular sor
Lennart Regebro wrote at 2007-3-27 11:59 +0200:
> ...
>OK, at least this avoids the big intermediate results when searching
>over several indexes. But you still have to get all of the results,
>and sort them before you can return the X first. I have the impression
>that Lucene somehow solves this w
Hi, Martijn
I have a suggestion, only because I have played around with the idea a bit.
Google for "python factoradics", and you will get my blog entry about
factoradics.
I see the problem statement as "How to obtain batching without
re-sorting multiple times".
If you see a sort order as one pe
Hey Jim,
On 3/27/07, Jim Fulton <[EMAIL PROTECTED]> wrote:
[snip]
> After
> that I thought we were actually going somewhere with this discussion,
> but you now strengthen this impression by apparently giving up in
> exasparation. That is what is making *me* slightly exasparated. :)
Fine. I gave
On Mar 26, 2007, at 8:45 PM, Martijn Faassen wrote:
...
I think we can safely conclude that:
* there is no silver bullet in all this (your point)
Yes
* there is probably room for improvement.
Yes
...
My hopes:
* is that there is some low-hanging fruit in improving things.
Possibly,
On 3/26/07, Dieter Maurer <[EMAIL PROTECTED]> wrote:
When "IncrementalSearch" does it, it works roughly as Jim described it
under 1 above (although there is no need that it works the the "primary
key" directly).
Right.
Unless the indexes are able to return sorted (partial results), we
need to
Hey Jim,
On 3/26/07, Jim Fulton <[EMAIL PROTECTED]> wrote:
[snip]
> Theoretical algorithm scalability is one thing, and the same issues
> apply to both. Practical scalability might vary widely.
OK, I give up. This argument just isn't worth my time any more. I'm
sorry I objected to the origina
On Mar 26, 2007, at 3:28 PM, Dieter Maurer wrote:
Jim Fulton wrote at 2007-3-25 09:53 -0400:
On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote:
MF> I think one of the main limitations of the current catalog (and
MF> hurry.query) is efficient support for sorting and batching the
query
MF> resul
Lennart Regebro wrote at 2007-3-25 17:08 +0200:
> ...
[Jim]
>> 1. Sort your results based on the "primary key" and therefore, pick
>> your primary key to match your sort results. In terms of the Zope
>> catalog framework, the primary keys are the document IDs, which are
>> traditionally chosen ran
Jim Fulton wrote at 2007-3-25 09:53 -0400:
>
>On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote:
>> MF> I think one of the main limitations of the current catalog (and
>> MF> hurry.query) is efficient support for sorting and batching the
>> query
>> MF> results. The Zope 3 catalog returns all match
On Mar 25, 2007, at 5:27 PM, Martijn Faassen wrote:
Hey Jim,
Jim Fulton wrote:
On Mar 25, 2007, at 12:33 PM, Martijn Faassen wrote:
[snip]
I have the strong suspicion that modern relational databases are
currently better able to scale at queries using LIMIT and ORDER BY
than the Zope 3 c
To say nothing of the fact that they, wisely, throw massive amounts
of hardware at the problem. They can do this in part because they
can amortize the cost over a massive user database. I don't think
anyone using our tools is going to be in a remotely similar situation.
Jim
On Mar 25,
On Mar 25, 2007, at 12:33 PM, Martijn Faassen wrote:
Hey Jim,
Jim Fulton wrote:
On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote:
MF> I think one of the main limitations of the current catalog (and
MF> hurry.query) is efficient support for sorting and batching
the query
MF> results. The Zo
> Google does somehow also the batching hellfast.
Google is a bad partner to compare with when talking about DBMS
efficiency. Google allows sloppy and "wrong" results in trade of for
speed.
(E.g. they update their indexes distributetly and do allow different
results to be returned for your search
Hello Martijn,
MF> I would like some system that helps me reduce some of these costs, using
MF> the approaches you list, or at least some caching somewhere. I would
MF> imagine a relational database for instance can employ caching of result
MF> sets, so that if no writes occurred, a second LIMIT
On Mar 25, 2007, at 11:08 AM, Lennart Regebro wrote:
...
2. Use an N-best algorithm. If N is the size of the batch and M is
the corpus size, then this is O(M*ln(N)) rather than O(M*ln(M)) which
is a significant improvement if N << M, but still quite expensive.
I don't think relational database
On 3/25/07, Jim Fulton <[EMAIL PROTECTED]> wrote:
On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote:
> MF> I think one of the main limitations of the current catalog (and
> MF> hurry.query) is efficient support for sorting and batching the
> query
> MF> results. The Zope 3 catalog returns all match
On Mar 25, 2007, at 3:01 AM, Adam Groszer wrote:
MF> I think one of the main limitations of the current catalog (and
MF> hurry.query) is efficient support for sorting and batching the
query
MF> results. The Zope 3 catalog returns all matching results, which
can then
MF> be sorted and batche
26 matches
Mail list logo