Thanks for the in-depth explanation!

A secondary sort on uuid would let me read a series of docs with
identical time across multiple batches by filtering on
time>timeOnLastReadDoc or (time=timeOnLastReadDoc and
uuid>uuidOnLastReadDoc). The (time, uuid) pair is effectively a unique
sort key that I can use to track progress.
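
To make the idea concrete, here is a rough sketch in Python. It is only an in-memory simulation, not real Solr client code: the helper names (resume_filter, read_batch, read_all) are made up for illustration, and the filter string assumes a date field called time and a string field called uuid, written in standard range-query syntax.

```python
from typing import List, Optional, Tuple

# (time, uuid). ISO-8601 UTC strings sort chronologically as plain text.
Doc = Tuple[str, str]


def resume_filter(last_time: str, last_uuid: str) -> str:
    """Build a filter matching every doc after the last one read.

    Assumed field names: time (date) and uuid (string).
    """
    return (
        f'time:{{{last_time} TO *]'
        f' OR (time:"{last_time}" AND uuid:{{{last_uuid} TO *])'
    )


def read_batch(docs: List[Doc], last: Optional[Doc], size: int) -> List[Doc]:
    """Simulate one query: sort by time then uuid, skip what was read."""
    ordered = sorted(docs)
    if last is not None:
        # Tuple comparison is exactly:
        # time > t or (time == t and uuid > u)
        ordered = [d for d in ordered if d > last]
    return ordered[:size]


def read_all(docs: List[Doc], size: int) -> List[Doc]:
    """Read everything in batches, tracking only the last doc seen."""
    seen: List[Doc] = []
    last: Optional[Doc] = None
    while True:
        batch = read_batch(docs, last, size)
        if not batch:
            return seen
        seen.extend(batch)
        last = batch[-1]
```

With a batch size smaller than the number of docs sharing one timestamp, filtering on time alone would either skip or re-read some of them; the uuid tie-breaker is what lets each batch resume mid-timestamp.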
On Sep 21, 2015 19:56, "Shawn Heisey" <apa...@elyograg.org> wrote:

> On 9/21/2015 9:01 AM, Gili Nachum wrote:
> > TimestampUpdateProcessorFactory takes place only on the leader shard, or on
> > each shard replica?
> > if on each replica then I would get different values on each replica.
> >
> > My alternative would be to perform secondary sort on a UUID to ensure order.
>
> If the update chain is configured properly, it runs on the leader, so
> all replicas get the same timestamp.
>
> Without SolrCloud, the way to create an "indexed at" time field is in
> the schema -- specify a default value of NOW on the field definition and
> don't send the field when indexing.  The old master/slave replication
> copies the actual index contents, so the indexed values in all replicas
> are the same.
>
> The problem with NOW in the schema when running SolrCloud is that each
> replica indexes the document independently, so each replica can have a
> different timestamp.  This is why the timestamp update processor exists
> -- to set the timestamp to a specific value before the document is
> duplicated to each replica, eliminating the problem.
>
> FYI, secondary sort parameters affect the order when the primary sort
> field is identical between two documents.  It may not do what you are
> intending because of that.
>
> Thanks,
> Shawn
>
>
