[jira] [Commented] (SOLR-11900) API command to delete oldest collections in a time routed alias

2018-01-29 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344455#comment-16344455
 ] 

David Smiley commented on SOLR-11900:
-

I was chatting with [~gus_heck] and we figured the need for this isn't very 
compelling (in either form above).  Instead, if the user wants to delete old 
collections explicitly, they could run these commands themselves (update the 
alias, delete the collections).  Collection deletion could even be enhanced to 
detect that it's part of an alias and auto-remove itself, which would make it 
easier and would eliminate a race condition of the target collection list 
getting updated at the same time more collections get added (however unlikely). 
And after SOLR-11925, the user could also temporarily adjust whatever metadata 
setting establishes the automatic collection deletion time span, assuming 
that new data is coming in to trigger the logic.
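
A minimal sketch of the manual route described above (update the alias, then 
delete the collections), assuming the standard Collections API actions 
CREATEALIAS and DELETE.  The base URL, alias name and collection names are 
hypothetical, and whether re-issuing CREATEALIAS against a time routed alias 
preserves its routing metadata is not established in this thread.  The sketch 
only builds the request URLs, in the order they would be sent:

```python
from urllib.parse import urlencode


def manual_cleanup_requests(base_url, alias, collections, to_delete):
    """Return Collections API URLs: re-point the alias first, then delete."""
    doomed = set(to_delete)
    remaining = [c for c in collections if c not in doomed]
    if not remaining:
        # The alias must always refer to at least one collection.
        raise ValueError("an alias must keep at least one collection")

    endpoint = base_url + "/admin/collections?"
    # Step 1: update the alias so it no longer lists the doomed collections.
    urls = [endpoint + urlencode({
        "action": "CREATEALIAS",
        "name": alias,
        "collections": ",".join(remaining),
    })]
    # Step 2: delete each collection that was dropped from the alias.
    for name in collections:
        if name in doomed:
            urls.append(endpoint + urlencode({"action": "DELETE", "name": name}))
    return urls
```

Updating the alias before deleting means a query through the alias never 
targets a collection that has already been removed.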

So I'll stop this now and re-use most of the code here in SOLR-11925 which 
needs most of the same stuff.

> API command to delete oldest collections in a time routed alias
> ---
>
> Key: SOLR-11900
> URL: https://issues.apache.org/jira/browse/SOLR-11900
> Project: Solr
> Issue Type: Sub-task
> Security Level: Public (Default Security Level. Issues are Public)
> Components: SolrCloud
> Reporter: David Smiley
> Assignee: David Smiley
> Priority: Major
> Fix For: 7.3
>
> Attachments: SOLR-11900.patch
>
>
> For Time Routed Aliases, we'll need an API command to delete the oldest 
> collection(s).  Perhaps the command action name is 
> DELETE_COLLECTION_OF_ROUTED_ALIAS (yes that's long).  And input is of course 
> the routed alias name, plus a mandatory "before" which is a standard time 
> input that Solr accepts that will likely include date math.  Thus if you used 
> before="NOW/DAY-90DAYS" then you're guaranteed to have the last 90 days' worth 
> of data.  If a collection overlaps past what "before" is computed to be then 
> it needs to stay.  The pattern might match any number of collections, perhaps 
> none.  But in all cases, the most recent collection must be retained -- the 
> time routed aliases must at all times refer to at least one collection.
> The underlying steps will be to first update the alias, and then delete the 
> collection(s).  It ought to return the collections that get deleted.
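
The selection rule in the description can be sketched as follows.  This is an 
illustration under stated assumptions, not the patch's logic: each collection 
is assumed to cover the span from its own start instant up to the start of the 
next collection, a collection whose span ends exactly at "before" is treated 
as deletable, and the function and collection names are hypothetical.

```python
from datetime import datetime, timezone


def collections_to_delete(starts, before):
    """Pick collections that lie entirely before the `before` instant.

    starts: dict mapping collection name -> start instant (inclusive).
    A collection's span ends where the next collection's span begins.
    """
    ordered = sorted(starts.items(), key=lambda item: item[1])
    deletable = []
    # Pairing each collection with its successor means the most recent
    # collection has no pair and is therefore never a deletion candidate.
    for (name, _start), (_next_name, next_start) in zip(ordered, ordered[1:]):
        if next_start <= before:
            deletable.append(name)
    return deletable


def utc(year, month, day):
    """Convenience constructor for a UTC midnight instant."""
    return datetime(year, month, day, tzinfo=timezone.utc)
```

For example, with monthly collections starting in October, November, December 
and January and a "before" of December 15, only the October and November 
collections qualify; December overlaps past "before" and stays.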



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Commented] (SOLR-11900) API command to delete oldest collections in a time routed alias

2018-01-29 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344109#comment-16344109
 ] 

David Smiley commented on SOLR-11900:
-

The attached patch uses the idea above, and is mostly done.  The main thing 
left is to add an alias metadata flag to control this, defaulting to false.  
Suggested: "deleteQueryDeletesCollections".  I'm not sure whether to also 
pass through the delete query as a normal query as well... there are 
differences in the time zone, since a NOW/MONTH for this code I added will use 
the TZ from the alias metadata but the delete query against Solr will use the 
TZ parameter sent in the update request.  (P.S. I believe there is another 
issue about tlog replay not serializing the update request params).  So that's 
not nice.  Maybe I'm stubbornly latching onto this idea and I ought to instead 
make yet another conventional SolrCloud collections API request.  
DELETEROUTEDALIASCOLLECTION?  Ugh.
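
To illustrate why the time zone matters here, the sketch below rounds the same 
instant down to the start of the month in two different zones, which is 
roughly what NOW/MONTH does.  It is plain Python, not Solr's date math 
implementation; the fixed UTC-5 offset standing in for the alias TZ and UTC 
standing in for the request TZ are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def start_of_month(now_utc, tz):
    """Round an instant down to the start of its month as seen in tz (result in UTC)."""
    local = now_utc.astimezone(tz)
    floored = local.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return floored.astimezone(timezone.utc)


now = datetime(2018, 2, 1, 2, 0, tzinfo=timezone.utc)
alias_tz = timezone(timedelta(hours=-5))  # hypothetical TZ stored in alias metadata
request_tz = timezone.utc                 # hypothetical TZ sent with the update request

alias_bound = start_of_month(now, alias_tz)      # 2018-01-01T05:00:00Z
request_bound = start_of_month(now, request_tz)  # 2018-02-01T00:00:00Z
```

Shortly after midnight UTC on the first of a month the two bounds land in 
different months, so the collection-deletion code and a passed-through delete 
query could disagree about what counts as old.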

It'd be interesting to see what happens if the incoming delete request is 
flowing into the oldest collection.  It will try to delete itself.  Does that 
work? I'm guessing it would, albeit with a timeout error.  If it doesn't, is it 
a big deal? I don't think so, since an incoming request to the alias will always 
route to the first collection ("soonest"), and this one is not delete-able by 
this code.




[jira] [Commented] (SOLR-11900) API command to delete oldest collections in a time routed alias

2018-01-26 Thread David Smiley (JIRA)

[ 
https://issues.apache.org/jira/browse/SOLR-11900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16341937#comment-16341937
 ] 

David Smiley commented on SOLR-11900:
-

Perhaps we don't actually need a new API.  Instead, a delete query that looks 
like {{timeRoutedField:[* TO NOW/MONTH]}} could auto-purge the old 
collections.  We've already got the URP in place to intercept and act. 
Arguably, if new data creates collections, telling it to delete old stuff should 
delete the old collections.
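
As a rough illustration of what the intercepting code would need to recognize, 
the sketch below checks whether a delete query is an open-ended range on the 
routed field and extracts its upper bound.  The function name and the regular 
expression are hypothetical, it handles only the exact inclusive 
{{field:[* TO X]}} shape shown above, and it is not how the URP or Solr's 
query parser works.

```python
import re

# Matches only "field:[* TO upper]" with an inclusive upper bound.
_OPEN_RANGE = re.compile(
    r"^\s*(?P<field>\w+):\[\*\s+TO\s+(?P<upper>[^\]\s]+)\]\s*$"
)


def purge_bound(delete_query, routed_field):
    """Return the upper bound if the query is '<routed_field>:[* TO X]', else None."""
    match = _OPEN_RANGE.match(delete_query)
    if match is None or match.group("field") != routed_field:
        return None
    return match.group("upper")
```

Any other delete query yields None and would be handled as an ordinary 
delete-by-query.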

Regardless of how this feature looks, there will be a separate issue to 
auto-delete.  The issue here is about being explicit about it.
