Why not set a TTL on each document instead?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-ttl-field.html
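
A rough sketch of what I mean, assuming the `_ttl` field is enabled in the type mapping and a `ttl` value is passed when each document is indexed. The policy names, the `30d`/`365d` durations and both helper functions are made up for illustration; nothing here talks to a cluster, it only builds the mapping body and the per-request parameters.

```python
from typing import Dict, Optional

# Hypothetical policy names. The durations approximate "1 month" and "1 year"
# from the original question; None means the document never expires.
RETENTION_TTL: Dict[str, Optional[str]] = {
    "forever": None,
    "one_year": "365d",
    "one_month": "30d",
}


def ttl_mapping(doc_type: str) -> dict:
    """Build a type mapping body that enables the _ttl field.

    No default TTL is set, so documents indexed without a ttl are kept.
    """
    return {doc_type: {"_ttl": {"enabled": True}}}


def index_params(policy: str) -> Dict[str, str]:
    """Return the extra request parameters for indexing under a policy."""
    ttl = RETENTION_TTL[policy]
    # Documents kept forever are indexed with no ttl parameter at all.
    return {} if ttl is None else {"ttl": ttl}


if __name__ == "__main__":
    print(ttl_mapping("doc"))
    for name in RETENTION_TTL:
        print(name, index_params(name))
```

If a document's retention policy changes, re-indexing it with a different `ttl` is, I believe, how the new policy would be applied, which avoids moving documents between indices.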

On Tuesday, April 1, 2014 8:50:14 AM UTC+11, slushi wrote:
>
> I have varying data retention requirements I am trying to balance (I am 
> continuously indexing new documents):
>
>    - 1% of my documents need to be kept forever
>    - 10% need to be kept 1 year
>    - the remainder needs to be kept for 1 month
>    
> I can easily set properties indicating the retention policy for each 
> document and then periodically do a "delete by query". However, since the 
> delete would remove 89% of the indexed documents, would there be any 
> potential performance problems with this straightforward approach? I guess 
> this is a YMMV type thing, but I was just wondering what the typical 
> approach is here. Would it be necessary to perhaps filter the query to not 
> affect so many documents at once? Would query performance be greatly 
> impacted?
>
> The alternate approach I was thinking would be to create separate indices 
> for each retention type. Cleanup would be easier, but unfortunately a 
> document's retention policy can be upgraded/downgraded so that could be a 
> little messy to keep consistent.
>
>
