On 7/21/14, 4:50 PM, "Shawn Heisey" <s...@elyograg.org> wrote:

>On 7/21/2014 5:37 PM, Jeff Wartes wrote:
>> I'd like to ensure an extended warmup is done on each SolrCloud node
>>prior to that node serving traffic.
>> I can do certain things prior to starting Solr, such as pump the index
>>dir through /dev/null to pre-warm the filesystem cache, and post-start I
>>can use the ping handler with a health check file to prevent the node
>>from entering the clients' load balancer until I'm ready.
>> What I seem to be missing is control over when a node starts
>>participating in queries sent to the other nodes.
>> 
>> I can, of course, add solrconfig.xml firstSearcher queries, which I
>>assume (and fervently hope!) happens before a node registers itself in
>>ZK clusterstate.json as ready for work, but that doesn't scale so well
>>if I want that initial warmup to run thousands of queries, or run them
>>with some parallelism. I'm storing solrconfig.xml in ZK, so I'm sensitive
>>to the size.
>> 
>> Any ideas, or corrections to my assumptions?
>
>I think that firstSearcher/newSearcher (and making sure useColdSearcher
>is set to false) is going to be the only way you can do this in a way
>that's compatible with SolrCloud.  If you were doing manual distributed
>search without SolrCloud, you'd have more options available.
>
>If useColdSearcher is set to false, that should keep *everything* from
>using the searcher until the warmup has finished.  I cannot be certain
>that this is the case, but I have some reasonable confidence that this
>is how it works.  If you find that it doesn't behave this way, I'd call
>it a bug.
>
>Thanks,
>Shawn


Thanks for the quick reply. Since distributed search latency is the max of
the shard sub-requests, I'm trying my best to minimize any spikes in
cluster latency due to node restarts.
I double-checked useColdSearcher was false, but the doc says this means
requests "block until the first searcher is done warming", which
translates pretty clearly to "latency spike". The more I think about it,
the more worried I am that a node might indeed register itself in
live_nodes and get distributed requests before it's got a searcher to work
with. *Especially* if I have lots of serial firstSearcher queries.

I'll look through the code myself tomorrow, but if anyone can help
confirm/deny the order of operations here, I'd appreciate it.
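
For the "thousands of queries, with some parallelism" part, one option is to
keep the query list out of solrconfig.xml and replay it from an external
script after startup. The following is only a rough sketch under several
assumptions: the core URL, the query list, the worker count, and the helper
names (build_url, fetch, warm) are all hypothetical, and it relies on
distrib=false to keep each request on the local core. It does nothing about
when the node registers in ZK, so it does not answer the ordering question
above.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(core_url, query, rows=10):
    # distrib=false asks Solr to answer from this core only,
    # rather than fanning the request out to other shards.
    params = urlencode({"q": query, "distrib": "false", "rows": rows})
    return f"{core_url}/select?{params}"


def fetch(url, timeout=30):
    # Issue one warmup query and discard the body; return the HTTP status.
    with urlopen(url, timeout=timeout) as resp:
        resp.read()
        return resp.status


def warm(core_url, queries, workers=8, fetch_fn=fetch):
    # Run the warmup queries concurrently; results come back in input order.
    urls = [build_url(core_url, q) for q in queries]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_fn, urls))
```

Usage would be something like warm("http://localhost:8983/solr/collection1",
queries), run before the health check file is enabled, so the node stays out
of the load balancer until the replay finishes.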
