I've lined up some time to work on fts, again, which means fts3.  One
thing I'd like to include would be to order doclists by some baked-in
ranking.  The idea is to sort the most important items to the front of
the list, and then you can do queries which limit the number of hits
and can thus be significantly faster for popular terms.  [Note that
"limit the number of hits" cannot currently be done at the fts layer,
but I'm thinking on that problem, too.]
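
To make the early-termination idea concrete, here is a rough Python
sketch, not fts code.  It assumes each doclist is a list of (rank, docid)
pairs stored best-first, that a document carries the same rank in every
doclist, and that lower rank values are better.  The names intersect and
top_hits are made up for illustration.

```python
from itertools import islice

# Hypothetical in-memory doclists: each entry is (rank, docid), already
# stored best-first (assumption: a lower rank value means more important).

def intersect(a, b):
    """Merge-intersect two doclists sorted by the same (rank, docid) key."""
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            yield a[i]
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1

def top_hits(a, b, limit):
    # islice stops pulling from the generator after `limit` hits, so the
    # tail of a long doclist is never visited.
    return list(islice(intersect(a, b), limit))

popular = [(r, 1000 + r) for r in range(100000)]      # a very common term
rare = [(r, 1000 + r) for r in range(0, 100000, 7)]   # a less common term
hits = top_hits(popular, rare, 3)
```

Because both lists share one ordering, the merge can stop as soon as
enough hits are found, which is where the savings for popular terms
would come from.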

Here's a proposal for how this might look:

CREATE VIRTUAL TABLE MyTable USING fts3(
 column_name,
 other_column,
 RANK rank_column_name DESCENDING,
 TOKENIZER simple
);

column_name and other_column would be just like in fts2, and would
manifest as TEXT columns.  Likewise the TOKENIZER clause.  RANK would
introduce a new column, in this case called rank_column_name, which
would be INTEGER.  The developer could then insert values into that
column which would be used by fts3 to order document lists.

ASCENDING or DESCENDING is listed after the column name to describe
whether higher or lower values are better.  I think this is
necessary to handle open-ended ranges reasonably.  If you assigned
incrementing values (such as from an autoincrement rowid) to a
DESCENDING rank, then earlier documents will sort to the beginning
(like how fts2 currently works, where things are ordered by docids).
If you assigned a timestamp to an ASCENDING rank, then later documents
will sort to the beginning.  You can somewhat map one to the other by
setting the rank to "very big constant - actual number", but I think
the syntactic sugar is nice.  I'd expect the default to be ASCENDING.
[I'm open to the notion that I have ASCENDING and DESCENDING exactly
backwards.  There's the logical sense of how importance is handled,
and the physical sense of how it is stored.]
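
As a sanity check on the semantics described above, here is a small
Python sketch.  It assumes ASCENDING puts larger values at the front and
DESCENDING puts smaller values at the front, with docid as a tie-breaker
(the tie-breaker is my assumption).  order_doclist and BIG are
hypothetical names, and the directions may well end up swapped in a real
design.

```python
BIG = 2**63 - 1  # hypothetical "very big constant" (max signed 64-bit int)

def order_doclist(docs, direction="ASCENDING"):
    """docs is a list of (docid, rank); returns docids in stored order."""
    if direction == "ASCENDING":
        # As described above: larger values sort to the front.
        key = lambda d: (-d[1], d[0])
    else:
        # DESCENDING: smaller values sort to the front.
        key = lambda d: (d[1], d[0])
    return [docid for docid, _ in sorted(docs, key=key)]

docs = [(1, 100), (2, 300), (3, 200)]
asc = order_doclist(docs, "ASCENDING")
desc = order_doclist(docs, "DESCENDING")

# Emulate ASCENDING using DESCENDING by storing "BIG - actual number".
flipped = [(docid, BIG - rank) for docid, rank in docs]
emulated = order_doclist(flipped, "DESCENDING")
```

Here emulated comes out in the same order as asc, which is the
"very big constant - actual number" trick; the keyword just saves the
developer from doing that arithmetic by hand.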

If no RANK is given at all, then things work just like they do now
(essentially ranked by rowid, DESCENDING).

Anyone out there have any thoughts which might be important to consider?

Thanks,
scott
