[ 
https://issues.apache.org/jira/browse/SOLR-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16524083#comment-16524083
 ] 

Gus Heck commented on SOLR-12357:
---------------------------------

This is going to require some rework of MaintainRoutedAliasCmd. Presently the 
code there can never delete a collection unless it is also creating one. With 
this feature, deletion would then be delayed by timePartionSize - 
premptiveCreateInterval... which would be significant for long partitions and 
confusing in general. Also, delete time frames that are not exact multiples of 
the partition size probably behave somewhat strangely as it is, with old 
partitions living somewhat longer than they should. I think the maintain 
command needs to delete if delete is appropriate and create if create is 
appropriate, each independently of the other.
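
To make the "independent decisions" idea concrete, here is a rough Java sketch. 
It is not the MaintainRoutedAliasCmd code; the class, method and parameter 
names (MaintainPlanSketch, plan, partitionSize, preemptWindow, maxAge) are all 
made up for illustration, and the rule that a collection is deletable once its 
end time is at or before the cutoff is an assumption.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these names are illustrative, not Solr's actual API.
final class MaintainPlanSketch {

    /** Result of one maintenance pass; the two decisions are separate fields. */
    static final class Plan {
        final boolean createNext;
        final List<Instant> toDelete;

        Plan(boolean createNext, List<Instant> toDelete) {
            this.createNext = createNext;
            this.toDelete = toDelete;
        }
    }

    /**
     * @param starts        start instants of the existing collections, oldest first
     * @param partitionSize length of each time partition
     * @param preemptWindow how long before the head collection ends we create the next one
     * @param maxAge        collections ending at or before (reference - maxAge) are deleted
     * @param reference     the date the decisions are evaluated against
     */
    static Plan plan(List<Instant> starts, Duration partitionSize,
                     Duration preemptWindow, Duration maxAge, Instant reference) {
        // Create decision: looks only at the head (newest) collection.
        Instant headEnd = starts.get(starts.size() - 1).plus(partitionSize);
        boolean createNext = !reference.isBefore(headEnd.minus(preemptWindow));

        // Delete decision: evaluated whether or not anything is being created.
        Instant cutoff = reference.minus(maxAge);
        List<Instant> toDelete = new ArrayList<>();
        for (Instant start : starts) {
            if (!start.plus(partitionSize).isAfter(cutoff)) {
                toDelete.add(start);
            }
        }
        return new Plan(createNext, toDelete);
    }
}
```

Because the delete check does not sit inside the create branch, a pass can 
delete an expired partition even when no new collection is due, which is the 
behavior the current code cannot produce.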

Also, it uses Instant.now() to check whether it should create a collection. It 
will now need to know the triggering date from the document, or be sent an 
implicit "force create" attribute. The latter option doesn't sound good because 
I believe we are relying on this command to be idempotent. If more than one 
client is updating, several documents might be processed (one by each client) 
before the results of the command take effect, so we can get several instances 
of the maintain command given to the overseer. Synchronization in the overseer 
should ensure that subsequent instances see the results of the first and then 
return as a no-op. So I think we need to pass in a "docDate" or maybe 
"referenceDate".
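
A toy illustration of why passing the date keeps the command idempotent, 
again in Java and again with invented names (IdempotentMaintainSketch, 
maintain, referenceDate); the synchronized method merely stands in for 
whatever serialization the overseer provides, and "creating" a collection is 
modeled as advancing the head start time:

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: models the alias state in memory, not Solr's overseer.
final class IdempotentMaintainSketch {
    private final Duration partitionSize;
    private final Duration preemptWindow;
    private Instant headStart;          // start of the newest collection
    private int collectionsCreated = 0; // counts simulated creations

    IdempotentMaintainSketch(Instant headStart, Duration partitionSize, Duration preemptWindow) {
        this.headStart = headStart;
        this.partitionSize = partitionSize;
        this.preemptWindow = preemptWindow;
    }

    /**
     * Decides from the current state plus the supplied date, never from a
     * "force" flag. Returns true if a collection was (simulated as) created.
     */
    synchronized boolean maintain(Instant referenceDate) {
        Instant headEnd = headStart.plus(partitionSize);
        if (referenceDate.isBefore(headEnd.minus(preemptWindow))) {
            return false; // an earlier command already did the work: no-op
        }
        headStart = headEnd; // simulate creating the next collection
        collectionsCreated++;
        return true;
    }

    synchronized int collectionsCreated() {
        return collectionsCreated;
    }
}
```

If two clients both submit the command for the same document date, the first 
call creates the next collection and the second sees the updated head and does 
nothing. A "force create" flag carries no date to re-check against, so each 
duplicate would create yet another collection.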

> TRA: Pre-emptively create next collection 
> ------------------------------------------
>
>                 Key: SOLR-12357
>                 URL: https://issues.apache.org/jira/browse/SOLR-12357
>             Project: Solr
>          Issue Type: Sub-task
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>            Reporter: David Smiley
>            Priority: Major
>
> When adding data to a Time Routed Alias (TRA), we sometimes need to create 
> new collections.  Today we only do this synchronously – on-demand when a 
> document is coming in.  But this can add delays as the documents inbound are 
> held up for a collection to be created.  And, there may be a problem like a 
> lack of resources (e.g. ample SolrCloud nodes with space) that the policy 
> framework defines.  Such problems could be rectified sooner rather than later 
> assuming there is log alerting in place (definitely out of scope here).
>
> Pre-emptive TRA collection needs a time window configuration parameter, 
> perhaps named something like "preemptiveCreateWindowMs".  If a document's 
> timestamp is within this time window _from the end time of the head/lead 
> collection_ then the collection can be created pre-emptively.  If no data is 
> being sent to the TRA, no collections will be auto created, nor will it 
> happen if older data is being added.  It may be convenient to effectively 
> limit this time setting to the _smaller_ of this value and the TRA interval 
> window, which I think is a fine limitation.
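
For reference, the window test described in the issue could be sketched in 
Java roughly as follows. This is only one reading of the description: the 
names (PreemptiveWindowSketch, shouldPreemptivelyCreate) are invented, 
durations are used instead of a raw millisecond setting, and treating a 
timestamp at or past the head collection's end as "not pre-emptive" (because 
the existing synchronous path would handle it) is an assumption.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the window check; not taken from the Solr source.
final class PreemptiveWindowSketch {

    static boolean shouldPreemptivelyCreate(Instant docTimestamp, Instant headStart,
                                            Duration interval, Duration preemptiveCreateWindow) {
        // Effective window is the smaller of the configured window and the TRA interval.
        Duration effective = preemptiveCreateWindow.compareTo(interval) < 0
                ? preemptiveCreateWindow
                : interval;
        Instant headEnd = headStart.plus(interval);
        Instant windowStart = headEnd.minus(effective);
        // True only for timestamps in [windowStart, headEnd); older data never triggers it.
        return !docTimestamp.isBefore(windowStart) && docTimestamp.isBefore(headEnd);
    }
}
```

With the cap in place, a window larger than the interval behaves exactly like 
a window equal to the interval, so a document can trigger creation of at most 
the one collection that follows the current head.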



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
