I repeated the experiment on a local system: a single-shard SolrCloud with one replica. I tried to index 10K docs, with all indexing operations directed to the replica Solr node. While the documents were being indexed on the replica, I shut down the leader Solr node. Out of 10K docs, only 9900 got indexed. If I repeat the experiment without shutting down the leader instance, all 10K docs get indexed. I am using curl to upload the docs, and curl reported no error while uploading them.
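A side note on the client side, as a rough sketch rather than a confirmed fix: unless it is run with --fail, curl exits with 0 even when the server answers with an HTTP error status, so "no curl error" does not by itself show that every update was accepted. One option is to check the HTTP status of each update request and resend the batches that did not get a 200. The Python below illustrates that idea; the function names, the retry count and the delay are my own assumptions, and I have not verified which status Solr returns while no leader is registered.

```python
import time
import urllib.error
import urllib.request


def post_update(url, payload, timeout=30):
    """POST one JSON payload (bytes) to an update URL; return the HTTP status.

    Returns None when the node cannot be reached at all.
    """
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except OSError:
        return None  # connection refused, timeout, node down, ...


def index_with_retry(batches, send, max_attempts=5, delay=2.0, sleep=time.sleep):
    """Send each batch with send(batch); retry until it returns 200.

    Returns the batches that still failed after max_attempts tries.
    """
    failed = []
    for batch in batches:
        for attempt in range(max_attempts):
            if send(batch) == 200:
                break
            if attempt < max_attempts - 1:
                sleep(delay)  # wait before retrying, e.g. for a leader election
        else:
            failed.append(batch)  # every attempt returned a non-200 result
    return failed
```

It could be used as index_with_retry(batches, lambda b: post_update(url, b)), where url would be something like http://localhost:8983/solr/test_collection/update (host and port are assumed here; only the collection name comes from the log). Any batch left in the returned list would need to be re-sent later or investigated in the logs.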
The following error was in the replica log file:

ERROR - 2013-10-08 16:10:32.662; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: No registered leader was found, collection:test_collection slice:shard1

The replica log file is attached.

On Thu, Sep 26, 2013 at 7:15 PM, Saurabh Saxena <ssax...@gopivotal.com> wrote:

> Sorry for the late reply.
>
> All the documents have unique id. If I repeat the experiment, the num of
> docs indexed changes (I guess it depends when I shutdown a particular
> shard). When I do the experiment without shutting down leader Shards, all
> 80k docs get indexed (which I think proves that all documents are valid).
>
> I need to dig the logs to find error message. Also, I am not tracking of
> curl return code, will run again and reply.
>
> Regards,
> Saurabh
>
>
> On Wed, Sep 25, 2013 at 3:01 AM, Erick Erickson
> <erickerick...@gmail.com> wrote:
>
>> And do any of the documents have the same <uniqueKey>, which
>> is usually called "id"? Subsequent adds of docs with the same
>> <uniqueKey> replace the earlier one.
>>
>> It's not definitive because it changes as merges happen, old copies
>> of docs that have been deleted or updated will be purged, but what
>> does your admin page show for "maxDoc"? If it's more than "numDocs"
>> then you have duplicate <uniqueKey>s. NOTE: if you optimize
>> (which you usually shouldn't) then maxDoc and numDocs will be
>> the same so if you test this don't optimize.
>>
>> Best,
>> Erick
>>
>>
>> On Tue, Sep 24, 2013 at 10:43 AM, Walter Underwood
>> <wun...@wunderwood.org> wrote:
>> > Did all of the curl update commands return success? Ane errors in the logs?
>> >
>> > wunder
>> >
>> > On Sep 24, 2013, at 6:40 AM, Otis Gospodnetic wrote:
>> >
>> >> Is it possible that some of those 80K docs were simply not valid? e.g.
>> >> had a wrong field, had a missing required field, anything like that?
>> >> What happens if you clear this collection and just re-run the same
>> >> indexing process and do everything else the same? Still some docs
>> >> missing? Same number?
>> >>
>> >> And what if you take 1 document that you know is valid and index it
>> >> 80K times, with a different ID, of course? Do you see 80K docs in the
>> >> end?
>> >>
>> >> Otis
>> >> --
>> >> Solr & ElasticSearch Support -- http://sematext.com/
>> >> Performance Monitoring -- http://sematext.com/spm
>> >>
>> >>
>> >>
>> >> On Tue, Sep 24, 2013 at 2:45 AM, Saurabh Saxena <ssax...@gopivotal.com> wrote:
>> >>> Doc count did not change after I restarted the nodes. I am doing a single
>> >>> commit after all 80k docs. Using Solr 4.4.
>> >>>
>> >>> Regards,
>> >>> Saurabh
>> >>>
>> >>>
>> >>> On Mon, Sep 23, 2013 at 6:37 PM, Otis Gospodnetic <
>> >>> otis.gospodne...@gmail.com> wrote:
>> >>>
>> >>>> Interesting. Did the doc count change after you started the nodes again?
>> >>>> Can you tell us about commits?
>> >>>> Which version? 4.5 will be out soon.
>> >>>>
>> >>>> Otis
>> >>>> Solr & ElasticSearch Support
>> >>>> http://sematext.com/
>> >>>> On Sep 23, 2013 8:37 PM, "Saurabh Saxena" <ssax...@gopivotal.com> wrote:
>> >>>>
>> >>>>> Hello,
>> >>>>>
>> >>>>> I am testing High Availability feature of SolrCloud. I am using the
>> >>>>> following setup
>> >>>>>
>> >>>>> - 8 linux hosts
>> >>>>> - 8 Shards
>> >>>>> - 1 leader, 1 replica / host
>> >>>>> - Using Curl for update operation
>> >>>>>
>> >>>>> I tried to index 80K documents on replicas (10K/replica in parallel).
>> >>>>> During indexing process, I stopped 4 Leader nodes. Once indexing is done,
>> >>>>> out of 80K docs only 79808 docs are indexed.
>> >>>>>
>> >>>>> Is this an expected behaviour ? In my opinion replica should take care of
>> >>>>> indexing if leader is down.
>> >>>>>
>> >>>>> If this is an expected behaviour, any steps that can be taken from the
>> >>>>> client side to avoid such a situation.
>> >>>>>
>> >>>>> Regards,
>> >>>>> Saurabh Saxena
>> >>>>>
>> >>>>
>> >
>> > --
>> > Walter Underwood
>> > wun...@wunderwood.org