Sorry for the late reply. All the documents have a unique id. If I repeat the experiment, the number of docs indexed changes (I guess it depends on when I shut down a particular shard). When I run the experiment without shutting down the leader shards, all 80k docs get indexed (which I think proves that all the documents are valid).
I need to dig through the logs to find the error message. Also, I am not tracking the curl return codes; I will run again and reply.

Regards,
Saurabh

On Wed, Sep 25, 2013 at 3:01 AM, Erick Erickson <erickerick...@gmail.com> wrote:

> And do any of the documents have the same <uniqueKey>, which
> is usually called "id"? Subsequent adds of docs with the same
> <uniqueKey> replace the earlier one.
>
> It's not definitive because it changes as merges happen, old copies
> of docs that have been deleted or updated will be purged, but what
> does your admin page show for "maxDoc"? If it's more than "numDocs"
> then you have duplicate <uniqueKey>s. NOTE: if you optimize
> (which you usually shouldn't) then maxDoc and numDocs will be
> the same so if you test this don't optimize.
>
> Best,
> Erick
>
>
> On Tue, Sep 24, 2013 at 10:43 AM, Walter Underwood
> <wun...@wunderwood.org> wrote:
> > Did all of the curl update commands return success? Any errors in the logs?
> >
> > wunder
> >
> > On Sep 24, 2013, at 6:40 AM, Otis Gospodnetic wrote:
> >
> >> Is it possible that some of those 80K docs were simply not valid? e.g.
> >> had a wrong field, had a missing required field, anything like that?
> >> What happens if you clear this collection and just re-run the same
> >> indexing process and do everything else the same? Still some docs
> >> missing? Same number?
> >>
> >> And what if you take 1 document that you know is valid and index it
> >> 80K times, with a different ID, of course? Do you see 80K docs in the
> >> end?
> >>
> >> Otis
> >> --
> >> Solr & ElasticSearch Support -- http://sematext.com/
> >> Performance Monitoring -- http://sematext.com/spm
> >>
> >>
> >>
> >> On Tue, Sep 24, 2013 at 2:45 AM, Saurabh Saxena <ssax...@gopivotal.com> wrote:
> >>> Doc count did not change after I restarted the nodes. I am doing a single
> >>> commit after all 80k docs. Using Solr 4.4.
> >>>
> >>> Regards,
> >>> Saurabh
> >>>
> >>>
> >>> On Mon, Sep 23, 2013 at 6:37 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:
> >>>
> >>>> Interesting. Did the doc count change after you started the nodes again?
> >>>> Can you tell us about commits?
> >>>> Which version? 4.5 will be out soon.
> >>>>
> >>>> Otis
> >>>> Solr & ElasticSearch Support
> >>>> http://sematext.com/
> >>>> On Sep 23, 2013 8:37 PM, "Saurabh Saxena" <ssax...@gopivotal.com> wrote:
> >>>>
> >>>>> Hello,
> >>>>>
> >>>>> I am testing High Availability feature of SolrCloud. I am using the
> >>>>> following setup
> >>>>>
> >>>>> - 8 linux hosts
> >>>>> - 8 Shards
> >>>>> - 1 leader, 1 replica / host
> >>>>> - Using Curl for update operation
> >>>>>
> >>>>> I tried to index 80K documents on replicas (10K/replica in parallel).
> >>>>> During indexing process, I stopped 4 Leader nodes. Once indexing is done,
> >>>>> out of 80K docs only 79808 docs are indexed.
> >>>>>
> >>>>> Is this an expected behaviour ? In my opinion replica should take care of
> >>>>> indexing if leader is down.
> >>>>>
> >>>>> If this is an expected behaviour, any steps that can be taken from the
> >>>>> client side to avoid such a situation.
> >>>>>
> >>>>> Regards,
> >>>>> Saurabh Saxena
> >>>>>
> >>>>
> >
> > --
> > Walter Underwood
> > wun...@wunderwood.org
> >
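
On the client-side question in the original message, one possible approach is to record the HTTP status of every update request and resend any batch that did not come back with 200. The sketch below is only an illustration under assumptions, not something proposed in this thread: `post_with_retry` and `http_send` are hypothetical helper names, the URL and collection name are placeholders, and it assumes the update handler accepts a JSON array of documents. Because adds with the same id replace the earlier copy, resending a batch should not create duplicates.

```python
import time
import urllib.request

# Placeholder URL: host, port and collection name are assumptions.
UPDATE_URL = "http://localhost:8983/solr/collection1/update"


def http_send(payload, url=UPDATE_URL):
    """POST one batch (JSON bytes) and return the HTTP status code.

    urllib raises HTTPError/URLError (both subclasses of OSError) on
    4xx/5xx responses and on connection failures.
    """
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def post_with_retry(send, batches, max_attempts=3, backoff_seconds=0.0):
    """Send each (batch_id, payload) pair, retrying on failure.

    `send` is any callable that takes the payload and returns an HTTP
    status code. Returns the ids of batches that never got a 200.
    """
    failed = []
    for batch_id, payload in batches:
        for attempt in range(1, max_attempts + 1):
            try:
                status = send(payload)
            except OSError:
                status = None  # connection refused, timeout, HTTP error
            if status == 200:
                break
            if attempt < max_attempts:
                time.sleep(backoff_seconds * attempt)
        else:
            # Every attempt failed: remember the batch for a later re-run.
            failed.append(batch_id)
    return failed
```

For example, `post_with_retry(http_send, batches, backoff_seconds=2.0)` returns the list of batch ids that still need to be resent, which can then be compared against the gap between the expected 80K and the indexed count.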