The explicit commit delays your app until that commit completes, and Solr then sits idle until the response makes its way back to your app and your next request finds its way to Solr, maybe a few ms later, network latency included. That interval could well be more than enough for the short-interval autoCommit or commitWithin to run in the background, in parallel with the response returning to your app and your app submitting the subsequent request.

The magic of asynchronous operation in a parallel and distributed computing environment, coupled with multi-core processors and parallel threads.
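
As a rough illustration of the overlap described above, here is a toy simulation in plain Java. It does not use Solr or SolrJ at all; the class name, the 20 ms commit cost and the 5 ms round trip are made-up assumptions chosen only to make the effect visible. It models only the overlap between client round trips and a background commit, not the lower cost of a soft commit.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model only: all names and timings here are hypothetical, not Solr internals.
final class CommitOverlapSketch {
    static final long COMMIT_COST_MS = 20; // assumed cost of one commit
    static final long ROUND_TRIP_MS = 5;   // assumed add request plus network latency

    static void pause(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // The client waits for every commit before it can send the next document.
    static long explicitCommit(int docs) {
        long start = System.nanoTime();
        for (int i = 0; i < docs; i++) {
            pause(ROUND_TRIP_MS);  // add
            pause(COMMIT_COST_MS); // commit, caller is blocked
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    // The commit is scheduled 1 ms after an add and runs on another thread,
    // so the client only pays for the round trip of each add.
    static long backgroundCommit(int docs) {
        ScheduledExecutorService committer = Executors.newSingleThreadScheduledExecutor();
        AtomicBoolean scheduled = new AtomicBoolean(false);
        long start = System.nanoTime();
        for (int i = 0; i < docs; i++) {
            pause(ROUND_TRIP_MS); // add
            if (scheduled.compareAndSet(false, true)) {
                committer.schedule(() -> {
                    scheduled.set(false);  // later adds may schedule a new commit
                    pause(COMMIT_COST_MS); // commit work, off the client thread
                }, 1, TimeUnit.MILLISECONDS);
            }
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        committer.shutdown();
        try {
            committer.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        System.out.println("explicit commit:   " + explicitCommit(20) + " ms");
        System.out.println("background commit: " + backgroundCommit(20) + " ms");
    }
}
```

In this sketch one background commit covers every document added while it was pending or running, which is why the client-side time stays close to the sum of the round trips.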

-- Jack Krupansky

-----Original Message----- From: Pisarev, Vitaliy
Sent: Wednesday, February 12, 2014 10:28 AM
To: solr-user@lucene.apache.org
Subject: RE: Solr perfromance with commitWithin seesm too good to be true. I am afraid I am missing something

I absolutely agree and I even read the NRT page before posting this question.

The thing that baffles me is this:

Doing a commit after each add kills the performance.
On the other hand, when I use commitWithin and specify an (absurd) 1 ms delay, I expect the behavior to be equivalent to making a commit, from a functional perspective.

Seeing that there is no magic in the world, I am trying to understand what price I am actually paying when using the commitWithin feature. On the one hand it commits almost immediately; on the other hand, it performs wonderfully. Where is the catch?


-----Original Message-----
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: Wednesday, February 12, 2014 17:00
To: solr-user
Subject: Re: Solr perfromance with commitWithin seesm too good to be true. I am afraid I am missing something

Doing a standard commit after every document is a Solr anti-pattern.

commitWithin is a “near-realtime” commit in recent versions of Solr and not a standard commit.

https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching

- Mark

http://about.me/markrmiller

On Feb 12, 2014, at 9:52 AM, Pisarev, Vitaliy <vitaliy.pisa...@hp.com> wrote:

I am running a very simple performance experiment where I post 2000 documents to my application, which in turn persists them to a relational DB and sends them to Solr for indexing (synchronously, in the same request).
I am testing 3 use cases:

 1.  No indexing at all - ~45 sec to post 2000 documents
 2.  Indexing included - commit after each add. ~8 minutes (!) to post and index 2000 documents
 3.  Indexing included - commitWithin 1ms. ~55 seconds (!) to post and index 2000 documents

The 3rd result does not make any sense; I would expect the behavior to be similar to the one in point 2. At first I thought that the documents were not really committed, but I could actually see them being added by executing some queries during the experiment (via the Solr web UI). I am worried that I am missing something very big.

The code I use for point 2:

    SolrInputDocument doc = // get doc
    SolrServer solrConnection = // get connection
    solrConnection.add(doc);
    solrConnection.commit();

Whereas the code for point 3:

    SolrInputDocument doc = // get doc
    SolrServer solrConnection = // get connection
    solrConnection.add(doc, 1); // According to API documentation I understand there is no need to explicitly call commit with this API

Is it possible that committing after each add will degrade performance by a factor of 40?
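
For reference, the "factor of 40" can be checked with a little arithmetic on the timings quoted above. The sketch below is plain Java with no Solr dependency; the class and method names are made up for illustration, and "~8 minutes" is taken as 480 seconds, which is an approximation.

```java
// Back-of-the-envelope check of the quoted timings; names are hypothetical.
final class IndexingOverhead {
    // Extra milliseconds per document that indexing adds on top of the no-indexing baseline.
    static double perDocMs(double totalSeconds, double baselineSeconds, int docs) {
        return (totalSeconds - baselineSeconds) * 1000.0 / docs;
    }

    public static void main(String[] args) {
        int docs = 2000;
        double baseline = 45.0;                                 // case 1: no indexing
        double commitEachAdd = perDocMs(480.0, baseline, docs); // case 2: ~8 minutes
        double commitWithin = perDocMs(55.0, baseline, docs);   // case 3: ~55 seconds
        System.out.println("commit after each add: " + commitEachAdd + " ms per document");
        System.out.println("commitWithin 1ms:      " + commitWithin + " ms per document");
        System.out.println("ratio:                 " + (commitEachAdd / commitWithin));
    }
}
```

With these inputs the indexing overhead works out to 217.5 ms per document when committing after each add versus 5 ms per document with commitWithin, a ratio of about 43.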
