Hi! The easiest way is to use Elasticsearch's scroll feature: http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/search-request-scroll.html
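As a rough sketch, the scroll loop could look like the following. This is only an illustration, not a supported tool: the cluster URL, the helper names (`es_post`, `scroll_all`), and the way batches are handed off are all assumptions you would adapt to your own setup.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # hypothetical cluster address; adjust


def es_post(path, body):
    # Minimal JSON-over-HTTP helper for illustration (no error handling,
    # retries, or authentication).
    req = urllib.request.Request(
        ES_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def scroll_all(initial_search, next_batch, handle_batch):
    """Drain an index via the scroll API.

    initial_search() issues the first search (with a scroll timeout,
    e.g. scroll=5m) and returns the parsed response; next_batch(sid)
    fetches the following page for a scroll id; handle_batch(hits)
    consumes each page of documents, e.g. by writing it to Hive.
    """
    resp = initial_search()
    scroll_id = resp["_scroll_id"]
    while True:
        resp = next_batch(scroll_id)
        hits = resp["hits"]["hits"]
        if not hits:
            break                       # scroll is exhausted
        handle_batch(hits)
        scroll_id = resp["_scroll_id"]  # always pass the freshest id
```

Against a real cluster you would wire `initial_search` and `next_batch` to `es_post` calls on the search and scroll endpoints; keeping the loop separate from the HTTP layer also makes it easy to test and to rate-limit, which helps keep the impact on the live cluster low.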
That way you can iterate over all the documents in your indices and write them to Hive. We don't have a built-in way to perform archiving yet, but this should solve your immediate problem with minimal effort and impact.

Best,
Kay

On Thursday, February 27, 2014 9:36:59 AM UTC+1, ChrisDK wrote:
>
> Hi Guys,
>
> We have a requirement to archive our Graylog2 (v0.20.1) data into Hive.
> With a 400 million cap we currently keep only a couple of weeks' data,
> whereas the requirement is 36 months.
>
> Ideally these exports should run near real-time, not as batched nightly
> exports. They should also have minimal impact on our live Elasticsearch
> cluster.
>
> What would be the best way to do this?
>
> Thanks!
