This may be a silly question, but I can't seem to find an answer. Perhaps my google-fu is just weak.
If I query a SolrCloud cluster with debug=true, the tracking output shows, during GET_TOP_IDS, a list of N replicas per shard:

shards.url= http://solr-node-1:8983/solr/my_collectiion_shard1_replica_1234|http://solr-node-1:8983/solr/my_collectiion_shard1_replica_5678

Does this imply that ALL of these replicas are queried, and whichever responds first is aggregated into the final response? Or is EXACTLY ONE replica queried?

Also, to reduce tail latency (i.e. waiting for the slowest core), is there a way to control how many replicas per shard are requested?

Thanks
-Doug