That was really clear; I just had another read through of the documentation with that explanation in mind and I can see I went off the rails.
Sorry for any confusion on my part, and thanks for the details.

Ta,
Greg

On 8 March 2014 08:36, Chris Hostetter <hossman_luc...@fucit.org> wrote:

>
> : Thank you, that all sounds great. My assumption about documents being
> : missed was something like this:
> ...
> : In that situation D would always be missed, whether the cursorMark is
> : 'C or greater' or 'greater than B' (I'm not sure which it is in
> : practice), simply because the cursorMark is the unique ID and the
> : unique ID is not your first sort mechanism.
>
> First off: nothing about your example would result in "the cursorMark is
> the unique ID" ... let's clear that misconception up right away:
>
> Using Cursors requires a deterministic sort w/o any "ties" that can result
> in ambiguity. For this reason (eliminating the ambiguity) it is necessary
> that the uniqueKey always be included in a sort -- but the cursorMark
> values that get computed are determined by *all* of the sort criteria used.
>
> So let's revisit your example, but let's make sure we are explicit about
> everything involved:
>
> * A,B,C,D are all uniqueKey values in the "id" field
> * 1,2,3.... are all time values in a "timestamp" field.
> * we're going to use a "sort=timestamp asc, id asc" param in this example
> * when we say "X(123)" we mean "Document with id 'X' which currently has
>   value '123' in the timestamp field"
>
> Let's suppose that at the start of the example, all of the docs in your
> example, in sorted order, look like this...
>
> A(1), B(3), C(14), D(32)
>
> A client uses our sort, along with cursorMark=* & rows=2. That client
> will get back A(1) and B(3) as well as some nextCursorMark value of "$%^"
> (deliberately not using any letters or numbers so as not to mislead you
> into thinking the cursorMark value is an id or a timestamp -- it's
> neither, it's an encoded binary value that has no meaning to the client
> other than as a "mark" to send back to the server)
>
> Now let's suppose that B & C are edited as you mention -- their new
> timestamp values must -- by definition -- be greater than D's existing
> timestamp value of "32" (otherwise it's not really a timestamp field). So
> let's assume now, that the total ordering of all our docs, using our sort
> is:
>
> A(1), D(32), B(56), C(57)
>
> After B & C are modified, the client makes a followup request using
> the same sort, rows=2, and cursorMark=$%^ (the nextCursorMark returned
> from the previous request). The two documents the client will get this
> time are D(32) and B(56).
>
> - "D" will never be skipped.
> - "B" will be returned twice, because its timestamp
>   value was updated after it was fetched
>
> Does that make sense?
>
> You can try this out manually if you want to see it for yourself --
> either using a "real" auto-assigned timestamp field, or just using a
> simple numeric field you set yourself when updating docs.
>
>
>
> -Hoss
> http://www.lucidworks.com/
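
For anyone who finds this thread later, the walkthrough quoted above can be reproduced with a small toy model in Python. This is only a sketch and does not use Solr: the function `fetch_page` is hypothetical, and it assumes the cursor mark behaves like "the sort values (timestamp, id) of the last document returned", which a real cursorMark encodes opaquely.

```python
# Toy model of cursor paging with sort=timestamp asc, id asc.
# NOT Solr code: fetch_page and the plain-tuple "mark" are assumptions
# made for illustration; a real cursorMark is an opaque encoded value.

def fetch_page(docs, cursor_mark, rows):
    """docs maps id -> timestamp; cursor_mark is None (like '*') or a (timestamp, id) tuple."""
    # Total ordering with no ties: timestamp first, unique id as tie-breaker.
    ordered = sorted((ts, doc_id) for doc_id, ts in docs.items())
    if cursor_mark is not None:
        # Keep only docs that sort strictly after the mark.
        ordered = [key for key in ordered if key > cursor_mark]
    page = ordered[:rows]
    next_mark = page[-1] if page else cursor_mark
    return [f"{doc_id}({ts})" for ts, doc_id in page], next_mark


docs = {"A": 1, "B": 3, "C": 14, "D": 32}

first_page, mark = fetch_page(docs, None, rows=2)

# B and C are edited, so their timestamps move past D's.
docs["B"], docs["C"] = 56, 57

second_page, mark = fetch_page(docs, mark, rows=2)
third_page, mark = fetch_page(docs, mark, rows=2)

print(first_page)   # ['A(1)', 'B(3)']
print(second_page)  # ['D(32)', 'B(56)']
print(third_page)   # ['C(57)']
```

In this model D is picked up on the second page and B shows up twice, matching the explanation: the mark depends on every sort field, not on the id alone.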