Hey guys,
I'm looking into a strange issue on an unhealthy 4.3.1 SolrCloud with a
3-node external ZooKeeper ensemble and 1 collection (2 shards, 2 replicas).
We are currently seeing inconsistent results from the SolrCloud when
running the same simple /select query against our collection many times.
On almost every other query, the numFound count (and the returned data)
jumps between two very different values.
Initially I suspected that one replica in a shard of the collection was
inconsistent (and that every other request hit that node), so I started
sending the same /select query directly to the individual cores of the
SolrCloud collection on each instance. I saw the same problem there:
the count jumps between two very different values!
I may be incorrect here, but I assumed when querying a single core of a
SolrCloud collection, the SolrCloud routing is bypassed and I am talking
directly to a plain/non-SolrCloud core.
As you can see here, the count for one core of my SolrCloud collection
fluctuates wildly, even though the core is only receiving updates, with
no deletes that would explain the jumps:
solrcloud [tvaillancourt@prodapp solr_cloud]$ curl -s 'http://backend:8983/solr/app_shard2_replica2/select?q=*:*&wt=json&rows=0&indent=true'|grep numFound
"response":{"numFound":123596839,"start":0,"maxScore":1.0,"docs":[]
solrcloud [tvaillancourt@prodapp solr_cloud]$ curl -s 'http://backend:8983/solr/app_shard2_replica2/select?q=*:*&wt=json&rows=0&indent=true'|grep numFound
"response":{"numFound":84739144,"start":0,"maxScore":1.0,"docs":[]
solrcloud [tvaillancourt@prodapp solr_cloud]$ curl -s 'http://backend:8983/solr/app_shard2_replica2/select?q=*:*&wt=json&rows=0&indent=true'|grep numFound
"response":{"numFound":123596839,"start":0,"maxScore":1.0,"docs":[]
solrcloud [tvaillancourt@prodapp solr_cloud]$ curl -s 'http://backend:8983/solr/app_shard2_replica2/select?q=*:*&wt=json&rows=0&indent=true'|grep numFound
"response":{"numFound":84771358,"start":0,"maxScore":1.0,"docs":[]
Could anyone help me understand why the same /select query sent directly
to a single core would return inconsistent, flapping results when my app
issues no deletes that could cause such jumps? Am I incorrect in my
assumption that I am querying the core "directly"?
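Here is a minimal sketch of one way to test that assumption. As far as I understand, a /select sent to a core that belongs to a SolrCloud collection is still distributed across the collection by default, and adding distrib=false restricts it to the local core. The helper names (select_url, extract_num_found, sample_counts) are hypothetical; the host and core name are the ones from the commands above.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Solr itself.

# Build a /select URL for one core. The third argument is the value of the
# distrib parameter ("true" or "false"); distrib=false asks Solr to answer
# from the local core only instead of fanning out across the collection.
select_url() {
  printf 'http://%s/solr/%s/select?q=*:*&wt=json&rows=0&distrib=%s\n' "$1" "$2" "$3"
}

# Pull the first numFound value out of a JSON response read from stdin.
extract_num_found() {
  sed -n 's/.*"numFound":\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Run the same query N times and print how often each count was seen.
sample_counts() {
  url=$1
  n=$2
  i=0
  while [ "$i" -lt "$n" ]; do
    curl -s "$url" | extract_num_found
    i=$((i + 1))
  done | sort | uniq -c
}

# Example (needs a reachable Solr instance, so it is left commented out):
# sample_counts "$(select_url backend:8983 app_shard2_replica2 false)" 10
# sample_counts "$(select_url backend:8983 app_shard2_replica2 true)" 10
```

If the count is stable with distrib=false but keeps flapping with distrib=true, that would suggest the replicas disagree with each other and the single core itself is consistent.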
One interesting observation: when I make an /admin/cores call to check the
docCount of the core's index, that value does not fluctuate; only the
query result does.
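For anyone who wants to reproduce that comparison, here is a rough sketch of the core status check. It assumes the value I called docCount corresponds to the numDocs field in the CoreAdmin STATUS response (numDocs excludes deleted documents, while maxDoc includes them). The helper names core_status_url and extract_number are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of Solr itself.

# Build a CoreAdmin STATUS URL for a single core.
core_status_url() {
  printf 'http://%s/solr/admin/cores?action=STATUS&core=%s&wt=json\n' "$1" "$2"
}

# Extract the first occurrence of a numeric JSON field (for example
# numDocs or maxDoc) from a response read from stdin.
extract_number() {
  sed -n 's/.*"'"$1"'":\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Example (needs a reachable Solr instance, so it is left commented out):
# curl -s "$(core_status_url backend:8983 app_shard2_replica2)" | extract_number numDocs
```

Unlike /select, the STATUS call reports on the local index only, so a steady numDocs next to a flapping numFound is at least consistent with the query being answered by more than one core.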
That was hard to explain; hopefully someone has some insight! :)
Thanks!
Tim