[
https://issues.apache.org/jira/browse/SOLR-5795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13949525#comment-13949525
]
Hoss Man commented on SOLR-5795:
--------------------------------
bq. the deleteByQuery commands seemed to be getting forwarded – i was seeing
deleteByQuery that had TOLEADER and FROMLEADER params getting logged.
I'm not sure what i was looking at before, but after digging into the code a
lot more i realized that the only deletes i was seeing were happening
on the _control_ server -- which, it turns out, was acting as the overseer (see
SOLR-5919) ... none of the replicas of the test collection were acting as the
overseer, so nothing was doing periodic deletes in the test collection.
basically, when i laid out my design for dealing with cloud, i was being
silly-stupid...
bq. if cloud mode, return No-Op unless we are running on the overseer
...because there is no guarantee that the overseer node will be hosting a core
for every collection -- you might have 1000 nodes in your cluster, and
"collection47" might only be using cores on 10 of those nodes -- that's a 1/100
chance that the overseer is one of the nodes hosting collection47.
So i'm going to need to step back and rethink a way of ensuring that the
distributed deletes happen, but don't happen on every node and flood the whole
collection with N**2 delete requests. (possibly by using a micro
"LeaderElection" just for this purpose? constrained to the existing shard
leaders? or use a best-guess heuristic about the shard leaders? - it's not the
end of the world to have some redundant deletes, we just don't want it to be
quadratic)
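
To make the "best-guess heuristic" idea concrete, here is a rough sketch of one
way it could work. It is not what any of the attached patches do. It assumes
each node can look up who it currently believes the leader of each shard is,
and it picks the leader of the alphabetically first shard as the single node
that issues the delete. The class, method, shard and node names are all
hypothetical, and nothing here is a Solr API.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical sketch: a node decides locally whether it should issue the
 * periodic delete of expired docs for a collection. No Solr APIs are used.
 */
class DeleteInitiatorSketch {

    /**
     * @param shardLeaders shard name -> node name this node believes is that shard's leader
     * @param localNode    name of the node making the decision
     * @return true only if localNode leads the alphabetically first shard
     */
    static boolean shouldIssueDelete(Map<String, String> shardLeaders, String localNode) {
        if (shardLeaders.isEmpty()) {
            return false; // no known shards, so nothing to delete from
        }
        // Sort by shard name so every node with the same view picks the same shard.
        TreeMap<String, String> sorted = new TreeMap<>(shardLeaders);
        String firstShardLeader = sorted.firstEntry().getValue();
        return localNode.equals(firstShardLeader);
    }

    public static void main(String[] args) {
        Map<String, String> leaders = Map.of(
            "shard1", "nodeA",
            "shard2", "nodeB",
            "shard3", "nodeC");
        for (String node : new String[] {"nodeA", "nodeB", "nodeC", "nodeD"}) {
            System.out.println(node + " issues delete: " + shouldIssueDelete(leaders, node));
        }
    }
}
```

If two nodes briefly disagree about who leads the first shard, both may issue
the delete. That is the "some redundant deletes" case, and it stays bounded by
the number of nodes with a stale view rather than growing as N**2.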
> Option to periodically delete docs based on an expiration field -- or ttl
> specified when indexed.
> -------------------------------------------------------------------------------------------------
>
> Key: SOLR-5795
> URL: https://issues.apache.org/jira/browse/SOLR-5795
> Project: Solr
> Issue Type: New Feature
> Reporter: Hoss Man
> Assignee: Hoss Man
> Attachments: SOLR-5795.patch, SOLR-5795.patch, SOLR-5795.patch,
> SOLR-5795.patch
>
>
> A question I get periodically from people is how to automatically remove
> documents from a collection at a certain time (or after a certain amount of
> time).
> Excluding them from search results using a filter query on a date field is
> trivial, but you still have to periodically send a deleteByQuery to clean up
> those older "expired" documents. And in the case where you want all
> documents to auto-expire some fixed amount of time after they were indexed,
> you still have to set up a simple UpdateProcessor to set that expiration date.
> So i've been thinking it would be nice if there was a simple way to configure
> solr to do it all for you.
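
For reference, the manual approach described in the issue can be sketched
roughly as below: compute an expiration date from a TTL at index time, filter
on it at query time, and periodically send a delete-by-query. The field name
`expire_at_dt` and the class and method names are assumptions made up for
illustration. Only the range query syntax and the XML delete message are
standard Solr.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper showing the pieces a client has to handle by hand today. */
class ExpirationSketch {

    /** Assumed name of an indexed date field holding each doc's expiration time. */
    static final String EXPIRE_FIELD = "expire_at_dt";

    /** Expiration instant for a doc indexed at indexedAt with the given time-to-live. */
    static Instant expirationFor(Instant indexedAt, Duration ttl) {
        return indexedAt.plus(ttl);
    }

    /** Range query matching every doc whose expiration is at or before NOW. */
    static String expiredDocsQuery() {
        return EXPIRE_FIELD + ":[* TO NOW]";
    }

    /** Filter query (fq) value that hides expired docs from search results. */
    static String hideExpiredFilter() {
        return "-" + expiredDocsQuery();
    }

    /** XML update message that deletes the expired docs. */
    static String deleteByQueryXml() {
        return "<delete><query>" + expiredDocsQuery() + "</query></delete>";
    }

    public static void main(String[] args) {
        Instant indexedAt = Instant.parse("2014-03-27T12:00:00Z");
        // Instant.toString() yields an ISO-8601 UTC timestamp usable as a Solr date value.
        System.out.println(EXPIRE_FIELD + " = " + expirationFor(indexedAt, Duration.ofDays(30)));
        System.out.println("fq = " + hideExpiredFilter());
        System.out.println(deleteByQueryXml());
    }
}
```

The delete message still has to be posted to the collection's update handler
on a schedule by something outside Solr, which is the part this issue proposes
to automate.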