Getting the most recent doc first in the case of a tie
will _not_ "just happen". I don't think you really get the
nuance here...

You index doc1, and doc2 later. Let's
claim that doc1 gets internal Lucene doc ID of 1 and
doc2 gets an internal doc ID of 2. So far you're golden.
Let's further claim that doc1 is in a different segment than
doc2. Sometime later, as you add/update/delete docs,
segments are merged and doc1 and doc2 may or may
not be in the merged segment. At that point, doc1 can get an
internal Lucene doc ID of, say, 823 and doc2 can get an internal
doc ID of, say, 64. So their relative order has changed.
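
To make that concrete, here is a toy model in Python. It is only a
sketch: it is not Lucene code, the scores are made up, and the doc IDs
are the ones from the example above. It assumes ties are broken by
ascending internal doc ID, as Hoss described for a static index.

```python
# Toy model, not Lucene: two docs with tied scores, before and after a merge.
# The doc IDs come from the example above; the scores are invented.
before_merge = {
    "doc1": {"score": 1.0, "docid": 1},
    "doc2": {"score": 1.0, "docid": 2},
}
after_merge = {
    "doc1": {"score": 1.0, "docid": 823},
    "doc2": {"score": 1.0, "docid": 64},
}


def rank(docs):
    """Order by score descending, then internal doc ID ascending (the tie-break)."""
    return sorted(docs, key=lambda name: (-docs[name]["score"], docs[name]["docid"]))


order_before = rank(before_merge)
order_after = rank(after_merge)
print(order_before, order_after)
```

Same docs, same scores, but the tied pair swaps places once the merge
hands out new doc IDs.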

You have to have a secondary sort criterion then. And it has to be
something monotonically increasing by time that won't ever change
like internal doc IDs can. Adding a timestamp
to every doc is certainly an option. Adding your own counter
is also reasonable.
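
Roughly, the counter idea looks like this in Python. Again a toy
sketch, not Solr code: index_doc, rank_most_recent_first and the
"added" field are hypothetical names, and the score is stored with the
doc only to keep the example small.

```python
from itertools import count

# Hypothetical counter assigned at index time; it never changes afterwards,
# unlike an internal doc ID.
_next_added = count(1)


def index_doc(index, name, score):
    """Store a doc with a made-up score and its own 'added' counter value."""
    index[name] = {"score": score, "added": next(_next_added)}


def rank_most_recent_first(index):
    """Score descending; among tied scores, highest 'added' counter first."""
    return sorted(index, key=lambda n: (-index[n]["score"], -index[n]["added"]))


index = {}
index_doc(index, "doc1", 1.0)
index_doc(index, "doc2", 1.0)  # ties with doc1, but was added later
index_doc(index, "doc3", 2.0)  # higher score, so the tie-break is never consulted
ranking = rank_most_recent_first(index)
print(ranking)
```

Because the counter is stored with the doc, the order among tied docs
stays the same no matter how segments get merged.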

But this is a _secondary_ sort, so it's not even consulted if the
first sort (score) is not a tie. You can get a sense of how this would
affect your query time/CPU usage/RAM by just specifying
sort=score desc,id asc
where id is your <uniqueKey> field. This won't do what you want,
but it will simulate it without having to re-index.
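
If you want to script that test, a minimal Python sketch for building
the request URL follows. The host, port, collection name and q=*:* are
placeholders I made up; only the sort value comes from above.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Placeholder endpoint: host, port and collection name are assumptions.
SOLR_SELECT = "http://localhost:8983/solr/mycollection/select"


def build_query_url(q, sort):
    """Return a select URL with the query and sort parameters URL-encoded."""
    return SOLR_SELECT + "?" + urlencode({"q": q, "sort": sort})


# Score first, then the <uniqueKey> field as the secondary sort.
url = build_query_url("*:*", "score desc,id asc")
params = parse_qs(urlsplit(url).query)
print(url)
```

Run the same queries with and without the sort parameter and compare
timings and memory use.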

Best,
Erick

On Mon, Aug 24, 2015 at 11:54 AM, Steven White <swhite4...@gmail.com> wrote:
> Thanks Hoss.
>
> I understand the dynamic nature of doc-IDs.  All that I care about is the
> most recent docs be at the top of the hit list when there is a tie.  From
> your reply, it is not clear if that's what happens.  If not, then I have to
> sort, but this is something I want to avoid so it won't add cost to my
> queries (CPU and RAM).
>
> Can you help me answer those two questions?
>
> Steve
>
> On Mon, Aug 24, 2015 at 2:16 PM, Chris Hostetter <hossman_luc...@fucit.org>
> wrote:
>
>>
>> : A follow up question.  Is the sub-sorting on the lucene internal doc IDs
>> : ascending or descending order?  That is, do the most recently indexed doc
>>
>> you can not make any generic assumptions about the order of the internal
>> lucene doc IDs -- the secondary sort on the internal IDs is stable (and
>> FWIW: ascending) for static indexes, but as mentioned before: the *actual*
>> order of the IDs changes as the index changes -- if there is an index
>> merge, the ids can be totally different and docs can be re-arranged into a
>> diff order...
>>
>> : > However, internal Lucene Ids can change when index changes. (merges,
>> : > updates etc).
>>
>> ...
>>
>> : show up first in this set of docs that have tied score?  If not, how can I
>> : have the most recent be first?  Do I have to sort on lucene's internal doc
>>
>> add a "timestamp" or "counter" field when you index your documents that
>> means whatever you want it to mean (order added, order updated, order
>> according to some external sort criteria from some external system) and
>> then do an explicit sort on that.
>>
>>
>> -Hoss
>> http://www.lucidworks.com/
>>
