Thanks,

I'll create a deliberate test tomorrow and feed some random data through it
several times to see what happens.

I'm also working on simply improving the buffer to handle the situation
internally, but a few hours of testing isn't a big deal.
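
One possible shape for the buffer change, purely as a sketch: keep only the
newest copy of each document, positioned where that newest copy arrived. It
assumes every document has a unique string id and is held as an XML string;
the class and method names (DedupBuffer, add, drain) are hypothetical, not
from our application.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical ingest buffer: keeps only the newest copy of each document id. */
final class DedupBuffer {
    // LinkedHashMap iterates in insertion order, unlike HashMap.
    private final Map<String, String> docs = new LinkedHashMap<String, String>();

    /** Adds a document; any earlier copy with the same id is discarded. */
    public void add(String id, String xml) {
        // Remove first so the new copy is re-inserted at the end of the order.
        docs.remove(id);
        docs.put(id, xml);
    }

    /** Returns the buffered documents in order and empties the buffer. */
    public List<String> drain() {
        List<String> out = new ArrayList<String>(docs.values());
        docs.clear();
        return out;
    }
}
```

With this, a chunk sent to the server never contains the same id twice, so the
result no longer depends on how order is handled within a batch.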

Ta,
Greg

On 8 September 2010 21:41, Erick Erickson <erickerick...@gmail.com> wrote:

> This would be surprising behavior. If you can reliably reproduce it,
> it's worth a JIRA.
>
> But (and I'm stretching a bit here) are you sure you're committing at the
> end of the batch AND are you sure you're looking after the commit? Here's
> the scenario: Your updated document is at positions 1 and 100 in your batch.
> Somewhere around SOLR processing document 50, an autocommit occurs,
> and you're looking at your results before SOLR gets around to committing
> document 100. Like I said, it's a stretch.
>
> To test this, you need to be absolutely sure of two things before you
> search:
> 1> the batch is finished processing
> 2> you've issued a commit after the last document in the batch.
>
> If you're sure of the above and still see the problem, please let us
> know...
>
> HTH
> Erick
>
> On Tue, Sep 7, 2010 at 10:32 PM, Greg Pendlebury
> <greg.pendleb...@gmail.com> wrote:
>
> > Does anyone know with certainty how (or even if) order is evaluated when
> > updates are performed by batch?
> >
> > Our application internally buffers solr documents for speed of ingest
> > before sending them to the server in chunks. The XML documents sent to
> > the solr server contain all documents in the order they arrived without
> > any settings changed from the defaults (so overwrite = true). We are
> > careful to avoid things like HashMaps on our side since they'd lose the
> > order, but I can't be certain what occurs inside Solr.
> >
> > Sometimes if an object has been indexed twice for various reasons it
> > could appear twice in the buffer, but the most up-to-date version is
> > always last. I have however observed instances where the first copy of
> > the document is indexed and differences in the second copy are missing.
> > Does this sound likely? And if so are there any obvious settings I can
> > play with to get the behavior I desire?
> >
> > I looked at:
> > http://wiki.apache.org/solr/UpdateXmlMessages
> >
> > but there is no mention of order, just the overwrite flag (I'm unsure
> > how it is applied internally to an update message) and the deprecated
> > duplicates flag (which I have no idea about).
> >
> > Would switching to SolrInputDocuments on a CommonsHttpSolrServer help,
> > as per http://wiki.apache.org/solr/Solrj? There is no mention of order
> > there either, however.
> >
> > Thanks to anyone who took the time to read this.
> >
> > Ta,
> > Greg
> >
>
