Hi,

Right, this is not really about routing in the ElasticSearch sense. What's handy for indexing logs is index aliases, which I thought I had added to JIRA a while back, but it looks like I have not. Index aliases would let you keep a "last 7 days" alias fixed while underneath you push and pop an index every day, without the client app having to adjust.
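
To make the push/pop idea concrete, here is a rough sketch in Python. It only builds the request body for an ElasticSearch-style `_aliases` call (one "remove" and one "add" action) and does not talk to any server. The index naming scheme ("logs-YYYY.MM.DD"), the alias name "last-7-days" and both function names are assumptions made up for the illustration; nothing here is an existing Solr feature.

```python
from datetime import date, timedelta


def daily_index(day: date, prefix: str = "logs") -> str:
    """Name of the index holding one day of logs (hypothetical naming scheme)."""
    return f"{prefix}-{day:%Y.%m.%d}"


def rotate_alias_actions(today: date, alias: str = "last-7-days",
                         window: int = 7, prefix: str = "logs") -> dict:
    """Build an ElasticSearch-style alias update for a rolling window.

    The alias name stays fixed; each day the newest index is added (push)
    and the index that just fell out of the window is removed (pop).
    """
    newest = daily_index(today, prefix)
    expired = daily_index(today - timedelta(days=window), prefix)
    return {
        "actions": [
            {"remove": {"index": expired, "alias": alias}},
            {"add": {"index": newest, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # The client keeps querying "last-7-days" and never sees the rotation.
    print(rotate_alias_actions(date(2012, 12, 24)))
```

The point of the design is that only the job doing the daily rotation needs to know the per-day index names; everything that searches uses the alias.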

Otis
--
Performance Monitoring - http://sematext.com/spm/index.html
Search Analytics - http://sematext.com/search-analytics/index.html


On Mon, Dec 24, 2012 at 4:30 AM, Per Steffensen <st...@designware.dk> wrote:

> I believe it is a misunderstanding to use custom routing (or sharding as
> Erick calls it) for this kind of stuff. Custom routing is nice if you want
> to control which slice/shard under a collection a specific document goes to
> - mainly to be able to control that two (or more) documents are indexed on
> the same slice/shard, but also just to be able to control on which
> slice/shard a specific document is indexed. Knowing/controlling this kind
> of stuff can be used for a lot of nice purposes. But you don't want to move
> slices/shards around among collections or delete/add slices from/to a
> collection - unless it's for elasticity reasons.
>
> I think you should fill a collection every week/month and just keep those
> collections as is. Instead of ending up with a big "historic" collection
> containing many slices/shards/cores (one for each historic week/month), you
> will end up with many historic collections (one for each historic
> week/month). Searching historic data you will have to cross-search those
> historic collections, but that is no problem at all. If Solr Cloud is made
> as it is supposed to be made (and I believe it is) it shouldn't require more
> resources or be harder in any way to cross-search X slices across many
> collections, than it is to cross-search X slices under the same collection.
>
> Besides that see my answer for topic "Will SolrCloud always slice by ID
> hash?" a few days back.
>
> Regards, Per Steffensen
>
>
> On 12/24/12 1:07 AM, Erick Erickson wrote:
>
>> I think this is one of the primary use-cases for custom sharding. Solr 4.0
>> doesn't really lend itself to this scenario, but I _believe_ that the
>> patch for custom sharding has been committed...
>>
>> That said, I'm not quite sure how you drop off the old shard if you don't
>> need to keep old data. I'd guess it's possible, but haven't implemented
>> anything like that myself.
>>
>> FWIW,
>> Erick
>>
>>
>> On Fri, Dec 21, 2012 at 12:17 PM, Upayavira <u...@odoko.co.uk> wrote:
>>
>>> I'm working on a system for indexing logs. We're probably looking at
>>> filling one core every month.
>>>
>>> We'll maintain a short term index containing the last 7 days - that one
>>> is easy to handle.
>>>
>>> For the longer term stuff, we'd like to maintain a collection that will
>>> query across all the historic data, but that means every month we need
>>> to add another core to an existing collection, which as I understand it
>>> in 4.0 is not possible.
>>>
>>> How do people handle this sort of situation where you have rolling new
>>> content arriving? I'm sure I've heard people using SolrCloud for this
>>> sort of thing.
>>>
>>> Given it is logs, distributed IDF has no real bearing.
>>>
>>> Upayavira