Hmmm, tell us a little more about your use-case. In particular, how
long do you need to keep the data around? Days? Months? Years?

If you only need to keep the data for a fixed period, you can use
collection aliasing to age out old collections: re-point the alias at
the collections still inside the retention window and delete the rest.
That keeps the number of cores from growing too large.
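
To make the age-out idea concrete, here is a rough sketch of the bookkeeping. It only builds the Collections API URLs (CREATEALIAS and DELETE) and does not call Solr. The host/port, the retention window, and the helper names are all assumptions for illustration; the collection naming follows the scheme from the question below.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # assumed host and port
ALIAS = "sample_collection"               # alias name from the question
PREFIX = "sample_collection_"             # hourly collections: PREFIX + yyyy_mm_dd_hh


def collection_name(ts):
    """Name of the hourly collection that covers timestamp ts."""
    return PREFIX + ts.strftime("%Y_%m_%d_%H")


def window(now, hours_to_keep):
    """Names of the collections inside the retention window, newest first."""
    top = now.replace(minute=0, second=0, microsecond=0)
    return [collection_name(top - timedelta(hours=i)) for i in range(hours_to_keep)]


def createalias_url(names):
    """Collections API call that (re)points the alias at the given collections."""
    params = {"action": "CREATEALIAS", "name": ALIAS, "collections": ",".join(names)}
    return SOLR_BASE + "/admin/collections?" + urlencode(params)


def delete_url(name):
    """Collections API call that drops one collection."""
    params = {"action": "DELETE", "name": name}
    return SOLR_BASE + "/admin/collections?" + urlencode(params)


def age_out(now, hours_to_keep, existing):
    """Return the URLs to call, alias update first, then one DELETE per expired collection.

    `existing` is the set of collection names currently in the cluster;
    names that do not start with PREFIX are left alone.
    """
    keep = window(now, hours_to_keep)
    keep_set = set(keep)
    live = [n for n in keep if n in existing]
    expired = sorted(
        n for n in existing if n.startswith(PREFIX) and n not in keep_set
    )
    return [createalias_url(live)] + [delete_url(n) for n in expired]
```

Updating the alias before deleting means searches against the alias never reference a collection that has already been removed.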

Best,
Erick

On Fri, Apr 25, 2014 at 6:49 AM, Mukesh Jha <me.mukesh....@gmail.com> wrote:
> Hi Experts,
>
> I need to divide my indexes based on hour/day with each index having ~50-80
> GB data & ~50-80 million docs, so I'm planning to create daily collections
> with names like *sample_collection_yyyy_mm_dd_hh*.
> I'll also create an alias *sample_collection* and update it whenever I
> create a new collection so that the entire data set is searchable.
>
> I have a couple of questions about the above design:
> 1) How far can it scale? As the number of collections increases (and with
> it the shards & replicas), is there a breaking point where adding more
> collections or searching becomes an issue?
> 2) As my cluster grows because of the huge number of collections, the
> clusterstate.json file in ZooKeeper will grow too. Won't this be a
> limiting factor? If so, instead of storing all this info in one
> clusterstate.json file, shouldn't Solr keep only cluster-specific details in
> that file & have collection-specific config files in ZooKeeper?
> 3) How can I easily manage all these collections? Are there Java CoreAdmin
> APIs available? I cannot find much documentation on them.
>
> --
> Txz,
>
> *Mukesh Jha <me.mukesh....@gmail.com>*
