Thanks Erick,

Sounds about right.

BTW, how long can I keep adding collections, i.e. can I keep 5-10 years of
data like this?

Also, what do you think of point 2), having collection-specific
configuration in ZooKeeper?


On Fri, Apr 25, 2014 at 11:44 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> So you're talking about 700 or so collections. That should be doable,
> especially as Solr is rapidly evolving to handle more and more
> collections, and there are two years for that to happen.
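
For reference, the "700 or so" figure follows from one collection per day over two years. Below is a rough sketch of that arithmetic, extended to the 5 and 10 year horizons. It assumes exactly one collection per day, 365-day years, and the 50-80 GB per collection quoted in the original post; replicas are not counted, and the function names are made up for illustration.

```python
# Back-of-the-envelope sizing for one collection per day.
# Assumptions (not from Solr itself): exactly one collection per day,
# 365-day years, and the 50-80 GB per-collection index size quoted in the
# original post. Replicas are not included.

DAYS_PER_YEAR = 365


def collection_count(years):
    """Number of daily collections accumulated over `years` years."""
    return years * DAYS_PER_YEAR


def index_size_tb(years, gb_per_collection):
    """Total index size in TB (1 TB = 1000 GB), single copy."""
    return collection_count(years) * gb_per_collection / 1000


if __name__ == "__main__":
    for years in (2, 5, 10):
        low = index_size_tb(years, 50)
        high = index_size_tb(years, 80)
        print(f"{years:>2} years: {collection_count(years)} collections, "
              f"{low:.1f}-{high:.1f} TB")
```

Under these assumptions two years is 730 collections, and the storage grows linearly with the retention period, so the hardware budget is likely to set the limit before the collection count does.
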
>
> The aging out bit is manual (well, you'd script it I suppose). So
> every day there'd be a script that ran and "just knew" the right
> collections to point the alias at; there's nothing automatic yet.
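
A minimal sketch of what such a daily script might look like, assuming one collection per day named sample_collection_yyyy_mm_dd (the hour suffix is dropped here), a Solr node at localhost:8983, and a two-year window. SOLR_URL, RETENTION_DAYS and the helper names are hypothetical. It only builds the Collections API CREATEALIAS request, which to my understanding replaces an existing alias of the same name, and prints it instead of sending it.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Hypothetical values for illustration; adjust to the real cluster.
SOLR_URL = "http://localhost:8983/solr"
ALIAS = "sample_collection"
RETENTION_DAYS = 730  # roughly two years


def collection_name(day):
    """Daily collection name, e.g. sample_collection_2014_04_25."""
    return f"{ALIAS}_{day:%Y_%m_%d}"


def live_collections(today, retention_days=RETENTION_DAYS):
    """Names of the collections the alias should cover, newest first."""
    return [collection_name(today - timedelta(days=n))
            for n in range(retention_days)]


def create_alias_url(today, retention_days=RETENTION_DAYS):
    """Collections API URL that repoints the alias at the live collections."""
    params = {
        "action": "CREATEALIAS",
        "name": ALIAS,
        "collections": ",".join(live_collections(today, retention_days)),
    }
    return f"{SOLR_URL}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Print the request rather than sending it; a cron job could fetch it
    # with urllib.request.urlopen once the URL looks right.
    print(create_alias_url(date.today(), retention_days=3))
```

The collection that falls out of the window could then be dropped with a separate action=DELETE call. With hundreds of names the query string gets long, so the request may need to be sent as a POST or the window kept smaller.
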
>
> Best,
> Erick
>
> On Fri, Apr 25, 2014 at 9:37 AM, Mukesh Jha <me.mukesh....@gmail.com> wrote:
> > Thanks for the quick reply, Erick.
> >
> > I want to keep my collections until I run out of hardware, which is at
> > least a couple of years' worth of data.
> > I'd like to know more about ageing out aliases; I did a quick search but
> > didn't find much.
> >
> >
> > On Fri, Apr 25, 2014 at 9:45 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> Hmmm, tell us a little more about your use-case. In particular, how
> >> long do you need to keep the data around? Days? Months? Years?
> >>
> >> Because if you only need to keep the data for a specified period, you
> >> can use the collection aliasing process to age-out collections and
> >> keep the number of cores from growing too large.
> >>
> >> Best,
> >> Erick
> >>
> >> On Fri, Apr 25, 2014 at 6:49 AM, Mukesh Jha <me.mukesh....@gmail.com> wrote:
> >> > Hi Experts,
> >> >
> >> > I need to divide my indexes by hour/day, with each index holding ~50-80
> >> > GB of data and ~50-80 million docs, so I'm planning to create daily
> >> > collections with names like *sample_collection_yyyy_mm_dd_hh*.
> >> > I'll also create an alias *sample_collection* and update it whenever I
> >> > create a new collection, so that the entire data set is searchable.
> >> >
> >> > I have a couple of questions about the above design:
> >> > 1) How far can it scale? As my collections increase (and with them the
> >> > shards & replicas), is there a breaking point where adding or searching
> >> > collections becomes an issue?
> >> > 2) As my cluster grows because of the huge number of collections, the
> >> > clusterstate.json file in ZooKeeper will grow too. Won't this be a
> >> > limiting factor? If so, instead of storing all this info in one
> >> > clusterstate.json file, shouldn't Solr keep cluster-specific details in
> >> > this file and have collection-specific config files in ZooKeeper?
> >> > 3) How can I easily manage all these collections? Are there Java
> >> > CoreAdmin APIs available? I cannot find much documentation on them.
> >> >
> >> > --
> >> > Txz,
> >> >
> >> > *Mukesh Jha <me.mukesh....@gmail.com>*
> >>
> >
> >
> >
> > --
> >
> >
> > Thanks & Regards,
> >
> > *Mukesh Jha <me.mukesh....@gmail.com>*
>



-- 


Thanks & Regards,

*Mukesh Jha <me.mukesh....@gmail.com>*
