Yes, unfortunately the retention period is not completely known at index time. I would need to keep the separate indices in sync whenever a document's retention policy changes, and attempting that seems like it could open up a whole can of worms.
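
For reference, here is a rough sketch of the single-index approach from my original post, where each document carries its retention policy and a periodic job deletes whatever has expired. It only builds the query bodies and does not call Elasticsearch. The field names `retention` and `indexed_at`, the policy names, and the 30/365-day periods are all assumptions for illustration, and the `term` clause assumes `retention` is mapped as a not-analyzed string.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy names mapped to how long a document is kept.
# None means "keep forever", so no delete query is built for that policy.
RETENTION = {
    "forever": None,
    "one_year": timedelta(days=365),
    "one_month": timedelta(days=30),
}


def expired_queries(now, retention=RETENTION):
    """Return {policy: query body} matching documents past their retention."""
    queries = {}
    for policy, keep_for in retention.items():
        if keep_for is None:
            continue  # nothing ever expires under this policy
        cutoff = (now - keep_for).isoformat()
        queries[policy] = {
            "query": {
                "bool": {
                    "must": [
                        # Assumed field holding the document's current policy.
                        {"term": {"retention": policy}},
                        # Assumed timestamp field set when the doc was indexed.
                        {"range": {"indexed_at": {"lt": cutoff}}},
                    ]
                }
            }
        }
    return queries


if __name__ == "__main__":
    now = datetime(2014, 4, 1, tzinfo=timezone.utc)
    for policy, body in expired_queries(now).items():
        print(policy, body)
```

With this layout, upgrading or downgrading a document only means updating its `retention` field, so nothing has to move between indices. Whether deleting this many documents per run performs acceptably is still the open question.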

On Tuesday, April 1, 2014 1:58:04 AM UTC-4, David Pilato wrote:
>
> If you know in advance which doc should be removed (I mean at index time),
> you should send the document to an index which should be entirely removed
> after a given period.
>
> Makes sense?
>
> --
> David ;-)
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
>
> On Apr 1, 2014, at 00:00, slushi <[email protected]> wrote:
>
> I attended an Elasticsearch meetup and at some point it was mentioned
> that TTL use is discouraged, but yes, this would make a lot of sense here.
> Also, the 1 year thing is really a guesstimate; we want to keep as much of
> that data as possible. I guess maybe with TTL you may not have as much
> control over when the document deletion and possible segment merging
> happen? I am not that familiar with Elasticsearch performance stuff yet
> (we just started looking into using ES).
>
> On Monday, March 31, 2014 5:52:28 PM UTC-4, Kevin Wang wrote:
>>
>> Why not use TTL for document?
>> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-ttl-field.html
>>
>> On Tuesday, April 1, 2014 8:50:14 AM UTC+11, slushi wrote:
>>>
>>> I have varying data retention requirements I am trying to balance (I am
>>> continuously indexing new documents):
>>>
>>> - 1% of my documents need to be kept forever
>>> - 10% need to be kept 1 year
>>> - the remainder needs to be kept for 1 month
>>>
>>> I can easily set properties indicating the retention policy for each
>>> document and then periodically do a "delete by query". However, since the
>>> delete would remove 89% of the indexed documents, would there be any
>>> potential performance problems with this straightforward approach? I guess
>>> this is a YMMV type thing, but I was just wondering what the typical
>>> approach is here. Would it be necessary to perhaps filter the query to not
>>> affect so many documents at once? Would query performance be greatly
>>> impacted?
>>>
>>> The alternate approach I was thinking of would be to create separate
>>> indices for each retention type. Cleanup would be easier, but unfortunately
>>> a document's retention policy can be upgraded/downgraded, so that could be
>>> a little messy to keep consistent.
>>>

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/eec089d7-0cef-4a9b-b53f-7dce55ad2bfd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
