Timothy Potter created SOLR-7333:
------------------------------------
Summary: Make the poll queue time configurable and use knowledge
that a batch is being processed to poll efficiently
Key: SOLR-7333
URL: https://issues.apache.org/jira/browse/SOLR-7333
Project: Solr
Issue Type: Sub-task
Components: SolrCloud
Reporter: Timothy Potter
Assignee: Timothy Potter
{{StreamingSolrClients}} uses {{ConcurrentUpdateSolrServer}} (CUSS) to stream
documents from leader to replica. By default it sets the {{pollQueueTime}} for
CUSS to 0 so that we don't impose an unnecessary wait when processing
single-document updates or the last doc in a batch. The downside is that
replicas receive many more update requests than leaders; I've seen replicas
receive up to 40x as many update requests as the leader.
If we're processing a batch of docs, then ideally the poll queue time should be
greater than 0 up until the last doc is pulled off the queue. If we're
processing a single doc, then the poll queue time should always be 0 as we
don't want the thread to wait unnecessarily for another doc that won't come.
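The polling rule above can be sketched roughly as follows. This is a minimal,
hypothetical illustration and not the actual CUSS code: the class
{{BatchAwarePoller}}, the {{QueuedDoc}} wrapper, its {{moreToCome}} flag and
the 25 ms wait are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of batch-aware polling; not Solr's actual implementation. */
class BatchAwarePoller {

    /** A queued doc plus a flag saying whether more docs from the same batch will follow. */
    static final class QueuedDoc {
        final String id;
        final boolean moreToCome;

        QueuedDoc(String id, boolean moreToCome) {
            this.id = id;
            this.moreToCome = moreToCome;
        }
    }

    /** Wait for the next doc only when one is expected; otherwise don't wait at all. */
    static long pollTimeMs(boolean moreToCome, long batchPollMs) {
        return moreToCome ? batchPollMs : 0L;
    }

    /** Collects the docs that would go into one streamed request; ends when a poll comes back empty. */
    static List<String> drainOneRequest(BlockingQueue<QueuedDoc> queue, long batchPollMs) {
        List<String> sent = new ArrayList<>();
        long waitMs = 0L; // nothing seen yet, so there is no reason to wait
        try {
            QueuedDoc doc;
            while ((doc = queue.poll(waitMs, TimeUnit.MILLISECONDS)) != null) {
                sent.add(doc.id);
                // Mid-batch: wait for the next doc. Last doc or single doc: poll time 0.
                waitMs = pollTimeMs(doc.moreToCome, batchPollMs);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt and end the request
        }
        return sent;
    }

    public static void main(String[] args) {
        BlockingQueue<QueuedDoc> queue = new LinkedBlockingQueue<>();
        queue.add(new QueuedDoc("a", true));
        queue.add(new QueuedDoc("b", true));
        queue.add(new QueuedDoc("c", false));
        System.out.println(drainOneRequest(queue, 25L)); // prints [a, b, c]
    }
}
```

With a single doc ({{moreToCome}} false) the loop falls through immediately,
so the thread never waits for a doc that won't come.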
Rather than force indexing applications to provide this optional parameter in
an update request, it would be better for server-side code that can detect
whether an update request holds a single document or a batch of documents to
override this value internally. That is, {{pollQueueTime}} stays 0 by default,
but since {{JavaBinUpdateRequestCodec}} can determine when it has seen the last
doc in a batch, it can raise {{pollQueueTime}} above 0 for every doc that
precedes the last one.
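One way the server-side override could look, as a sketch only: the code below
does not use {{JavaBinUpdateRequestCodec}} itself. It stands in for it with a
plain {{Iterator}} and uses {{hasNext()}} to detect the last doc. The class
{{PollTimeOverride}}, the method {{assignPollTimes}} and the 25 ms value are
hypothetical names and numbers chosen for illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a server-side pollQueueTime override; not the codec's real API. */
class PollTimeOverride {

    /** Default poll time, matching the current behaviour described in this issue. */
    static final int DEFAULT_POLL_MS = 0;

    /**
     * Walks the docs of one update request and picks a poll time for each:
     * batchPollMs while more docs follow, the default of 0 for the last doc.
     */
    static Map<String, Integer> assignPollTimes(Iterator<String> docIds, int batchPollMs) {
        Map<String, Integer> pollTimes = new LinkedHashMap<>(); // keeps doc order
        while (docIds.hasNext()) {
            String id = docIds.next();
            // hasNext() after next() tells us whether this doc was the last in the batch.
            pollTimes.put(id, docIds.hasNext() ? batchPollMs : DEFAULT_POLL_MS);
        }
        return pollTimes;
    }

    public static void main(String[] args) {
        System.out.println(assignPollTimes(Arrays.asList("a", "b", "c").iterator(), 25));
        System.out.println(assignPollTimes(Arrays.asList("solo").iterator(), 25));
    }
}
```

A single-document request gets poll time 0 without the client passing any
extra parameter, which is the point of doing the detection on the server.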
This means that current indexing clients will see a performance boost when
doing batch updates without making any changes on their side.