On Feb 4, 2005, at 12:24 PM, Simeon Koptelov wrote:
By "renumbered", it means it squeezes out holes left by deletes.  The
actual order does not change and thus does not affect a Sort.INDEXORDER
sort.


Documents are stored in the index in the order that they were indexed -
nothing changes this order.  Document id's are not permanent if deletes
occur followed by an optimize.
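To make the renumbering concrete, here is a toy model in plain Java. It is not Lucene code: the class name `DocIdCompaction` and the `deleted` flag array are made up for illustration, and the only assumption is the behaviour described above (holes are squeezed out, relative order is kept).

```java
// Toy model only, NOT Lucene's implementation. It shows how squeezing out
// deleted slots shifts later ids down while keeping the indexing order.
class DocIdCompaction {
    // deleted[i] == true means the document with old id i was deleted.
    // Returns newId[oldId], or -1 for a document that no longer exists.
    static int[] renumber(boolean[] deleted) {
        int[] newIds = new int[deleted.length];
        int next = 0; // next free id after compaction
        for (int oldId = 0; oldId < deleted.length; oldId++) {
            newIds[oldId] = deleted[oldId] ? -1 : next++;
        }
        return newIds;
    }
}
```

With six documents of which ids 1 and 4 were deleted, the survivors 0, 2, 3, 5 become 0, 1, 2, 3: the ids change, but the order does not, which is why an index-order sort is unaffected.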

Thanks for the clarification, Erik. Could you answer one more question: can I
control the assignment of document numbers during indexing?

No, you cannot control Lucene's document id scheme - it is basically "for internal use".


Maybe I should explain why I'm asking.
I'm searching for documents, but for most (almost all) of them I don't really
care about their content. I only want to know one particular numeric field of
each document (the id of the document's category).
I also need to know how many docs in each category were found, so I can't index
categories instead of docs.
The result set can be pretty big (30K) and all of it must be handled in an inner loop.
So I want to use a HitCollector and assign intervals of ids to categories of
documents. That way, there's no need to actually retrieve the document
in the inner loop.


Am I on the right track?

You should explore the use of IndexReader. Index your documents with a category id field, and use the methods on IndexReader (TermEnum) to find all unique categories.
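One way to read this suggestion, sketched in plain Java rather than against the Lucene API: build a lookup array from document number to category id once, then have the collector do nothing but an array lookup per hit. Everything here is an assumption for illustration. `CategoryCounter` is a hypothetical class, its `collect(int doc, float score)` method only mirrors the shape of a HitCollector callback, and the `categoryOfDoc` array stands in for data that would be filled by walking the category terms through IndexReader.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch only: a stand-in for a HitCollector-style callback, not Lucene's class.
class CategoryCounter {
    // categoryOfDoc[docId] = category id of that document (assumed prebuilt).
    private final int[] categoryOfDoc;
    // Number of hits seen so far per category id, sorted by category id.
    private final Map<Integer, Integer> counts = new TreeMap<>();

    CategoryCounter(int[] categoryOfDoc) {
        this.categoryOfDoc = categoryOfDoc;
    }

    // Called once per matching document. Only an array lookup happens here,
    // so no stored document has to be retrieved in the inner loop.
    void collect(int doc, float score) {
        counts.merge(categoryOfDoc[doc], 1, Integer::sum);
    }

    Map<Integer, Integer> counts() {
        return counts;
    }
}
```

Because document numbers can shift after deletes followed by an optimize, an array like this would have to be rebuilt whenever the index changes, not stored permanently.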


        Erik

