[
https://issues.apache.org/jira/browse/SOLR-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524083#comment-16524083
]
Gus Heck commented on SOLR-12357:
---------------------------------
This is going to require some rework of MaintainRoutedAliasCmd. Presently the
code there can never delete a collection unless it is also creating one. With
this feature, deletion would then be delayed by timePartionSize -
premptiveCreateInterval, which would be significant for long partitions and
confusing in general. Also, delete time frames that are not even multiples of
the partition size probably already behave somewhat strangely, with old
partitions living somewhat longer than they should. I think the maintain
command needs to delete if delete is appropriate and create if create is
appropriate, each independently of the other.
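
To make the "independently" part concrete, here is a rough sketch of the two
decisions as separate checks. This is not the existing MaintainRoutedAliasCmd
code; the class, the method names, and the "retention" parameter are all
hypothetical, and the expiry rule (a partition is deletable once its end is at
or before now minus retention) is an assumption for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the actual MaintainRoutedAliasCmd code. */
final class MaintainSketch {

  /** Start times of partitions that end at or before (now - retention). */
  static List<Instant> toDelete(List<Instant> starts, Duration partitionSize,
                                Duration retention, Instant now) {
    Instant cutoff = now.minus(retention);
    List<Instant> expired = new ArrayList<>();
    for (Instant start : starts) {
      // Assumed rule: the whole partition must be older than the cutoff.
      if (!start.plus(partitionSize).isAfter(cutoff)) {
        expired.add(start);
      }
    }
    return expired;
  }

  /** True when the reference time has reached the end of the newest partition. */
  static boolean shouldCreate(Instant newestStart, Duration partitionSize,
                              Instant reference) {
    return !reference.isBefore(newestStart.plus(partitionSize));
  }
}
```

Neither check consults the other, so a run can delete without creating (or the
reverse).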
Also, it uses Instant.now() to check whether it should create a collection, and
it will now need to know the triggering date from the document or be sent an
implicit "force create" attribute. The latter option doesn't sound good because
I believe we are relying on this command to be idempotent. If more than one
client is updating, several documents might be processed (one by each client)
before the results of the command take effect, so several instances of the
maintain command can be given to the overseer. Synchronization in the overseer
should ensure that subsequent instances see the results of the first and then
return as a no-op. So I think we need to pass in a "docDate" or maybe
"referenceDate".
> TRA: Pre-emptively create next collection
> ------------------------------------------
>
> Key: SOLR-12357
> URL: https://issues.apache.org/jira/browse/SOLR-12357
> Project: Solr
> Issue Type: Sub-task
> Security Level: Public(Default Security Level. Issues are Public)
> Components: SolrCloud
> Reporter: David Smiley
> Priority: Major
>
> When adding data to a Time Routed Alias (TRA), we sometimes need to create
> new collections. Today we only do this synchronously – on-demand when a
> document is coming in. But this can add delays as the documents inbound are
> held up for a collection to be created. And, there may be a problem like a
> lack of resources (e.g. ample SolrCloud nodes with space) that the policy
> framework defines. Such problems could be rectified sooner rather than later
> assuming there is log alerting in place (definitely out of scope here).
> Pre-emptive TRA collection needs a time window configuration parameter,
> perhaps named something like "preemptiveCreateWindowMs". If a document's
> timestamp is within this time window _from the end time of the head/lead
> collection_ then the collection can be created pre-emptively. If no data is
> being sent to the TRA, no collections will be auto created, nor will it
> happen if older data is being added. It may be convenient to effectively
> limit this time setting to the _smaller_ of this value and the TRA interval
> window, which I think is a fine limitation.
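
For reference, the window test described in the issue could be sketched
roughly as follows. The class and method names are hypothetical, and the
half-open range (documents at or past the head collection's end fall to the
existing on-demand path) is an assumption, not something the issue specifies.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch of the pre-emptive creation check. */
final class PreemptiveWindowSketch {

  /**
   * True when docTime lies within the last `window` of the head collection's
   * interval. The window is capped at the interval, as the issue suggests.
   */
  static boolean shouldPreCreate(Instant docTime, Instant headStart,
                                 Duration interval, Duration window) {
    Duration effective = window.compareTo(interval) < 0 ? window : interval;
    Instant headEnd = headStart.plus(interval);
    // Older data (before the window) and data past headEnd both return false.
    return !docTime.isBefore(headEnd.minus(effective))
        && docTime.isBefore(headEnd);
  }
}
```

With a one-day interval and a one-hour window, only documents dated in the
final hour of the head collection would trigger creation.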