[ 
https://issues.apache.org/jira/browse/SOLR-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949525#comment-13949525
 ] 

Hoss Man commented on SOLR-5795:
--------------------------------

bq. the deleteByQuery commands seemed to be getting forwarded – i was seeing 
deleteByQuery that had TOLEADER and FROMLEADER params getting logged.

I'm not sure what i was looking at before, but after digging into the code a 
lot more i realized that the only deletes i was seeing were happening 
on the _control_ server -- which, it turns out, was acting as the overseer (see 
SOLR-5919) ... none of the replicas of the test collection were acting as the 
overseer, so nothing was doing periodic deletes in the test collection.

basically, when i laid out my design for dealing with cloud, i was being 
silly-stupid...

bq. if cloud mode, return No-Op unless we are running on the overseer

...because there is no guarantee that the overseer node will be hosting a core 
for every collection -- you might have 1000 nodes in your cluster, and 
"collection47" might only be using cores on 10 of those nodes -- that's a 1/100 
chance that any of the nodes hosting collection47 will be the overseer.
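
To make the 1/100 figure concrete, here is a minimal sketch of the arithmetic. It assumes the overseer is equally likely to be any node in the cluster, which is a simplification for illustration; the class and method names are hypothetical and not part of Solr.

```java
public class OverseerOdds {
    // Probability that the single overseer node is one of the nodes hosting
    // a core for the collection. Assumes the overseer is equally likely to
    // be any node in the cluster (an assumption made for illustration).
    static double chanceOverseerHostsCollection(int clusterNodes, int collectionNodes) {
        if (clusterNodes <= 0 || collectionNodes < 0 || collectionNodes > clusterNodes) {
            throw new IllegalArgumentException(
                "need clusterNodes > 0 and 0 <= collectionNodes <= clusterNodes");
        }
        return (double) collectionNodes / clusterNodes;
    }

    public static void main(String[] args) {
        // 1000 nodes in the cluster, "collection47" has cores on 10 of them.
        System.out.println(chanceOverseerHostsCollection(1000, 10));
    }
}
```

With the numbers above, the "No-Op unless we are running on the overseer" rule would leave collection47 without any periodic deletes in roughly 99 out of 100 cluster layouts.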

So i'm going to need to step back and rethink a way of ensuring that the 
distributed deletes happen, but don't happen on every node and flood the whole 
collection with N**2 delete requests.  (possibly by using a micro 
"LeaderElection" just for this purpose? constrained to the existing shard 
leaders?  or use a best-guess heuristic about the shard leaders? - it's not the 
end of the world to have some redundant deletes, we just don't want it to be 
quadratic)
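
A rough sketch of why the choice of initiator matters, under a simplified model: each core that initiates a periodic deleteByQuery causes that delete to be applied once on every core in the collection. The shard and replica counts, and the class and method names, are hypothetical; this is not Solr code, just a count of requests under that assumption.

```java
public class DeleteFanout {
    // Total delete executions per period when `initiators` cores each issue
    // one deleteByQuery and every such request is applied on all `cores`
    // cores of the collection (simplified model, assumed for illustration).
    static long deletesPerPeriod(int initiators, int cores) {
        if (initiators < 0 || cores < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        return (long) initiators * cores;
    }

    public static void main(String[] args) {
        int shards = 4;                         // hypothetical layout
        int replicasPerShard = 3;               // hypothetical layout
        int cores = shards * replicasPerShard;  // N = 12

        // Every core initiates: N * N executions.
        System.out.println("every core initiates:  " + deletesPerPeriod(cores, cores));
        // Only shard leaders initiate: shards * N executions.
        System.out.println("shard leaders only:    " + deletesPerPeriod(shards, cores));
        // A single elected initiator: N executions.
        System.out.println("one elected initiator: " + deletesPerPeriod(1, cores));
    }
}
```

Restricting the initiators to shard leaders keeps some redundancy (one delete per shard per period) while growing linearly in the number of cores for a fixed shard count, instead of with N**2.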

> Option to periodically delete docs based on an expiration field -- or ttl 
> specified when indexed.
> -------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-5795
>                 URL: https://issues.apache.org/jira/browse/SOLR-5795
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Hoss Man
>            Assignee: Hoss Man
>         Attachments: SOLR-5795.patch, SOLR-5795.patch, SOLR-5795.patch, 
> SOLR-5795.patch
>
>
> A question I get periodically from people is how to automatically remove 
> documents from a collection at a certain time (or after a certain amount of 
> time).  
> Excluding from search results using a filter query on a date field is 
> trivial, but you still have to periodically send a deleteByQuery to clean up 
> those older "expired" documents.  And in the case where you want all 
> documents to auto-expire some fixed amount of time after they were indexed, 
> you still have to set up a simple UpdateProcessor to set that expiration date.  
> So i've been thinking it would be nice if there was a simple way to configure 
> solr to do it all for you.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
