Not saying it's always one way or the other, just
that one shouldn't _assume_ that putting the most
recent data on a single node is a good idea.
It may well be, but not in all cases.




On Wed, Jul 3, 2013 at 12:21 PM, Otis Gospodnetic <
otis.gospodne...@gmail.com> wrote:

> Exactly.  And the newest shard can also be kept small (e.g. maybe just
> hit the last 12h first and dig deeper only if you can't find enough
> stories there), which means it will fit in memory and be crazy fast.
>
> Otis
> --
> Solr & ElasticSearch Support -- http://sematext.com/
> Performance Monitoring -- http://sematext.com/spm
>
>
>
> On Wed, Jul 3, 2013 at 10:11 AM, Mark Miller <markrmil...@gmail.com>
> wrote:
> >
> > On Jul 3, 2013, at 7:47 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> Usually most people
> >> care about today's news, and a hot story will
> >> generate lots of queries, all of which are serviced
> >> by today's shard.
> >
> > That's really the whole point though - rather than slamming your whole
> > cluster with every search, the majority of people are just searching
> > today's shard - which will have only a fraction of the data and will be
> > able to hold up very well to a large load. This is also how you can do
> > really fast NRT on a huge data set - it only has to happen on today's
> > shard.
> >
> > News sites have been using the trick forever with Lucene.
> >
> > - Mark
>

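For illustration, the "hit the newest shard first, dig deeper only if
needed" idea from the thread above could be sketched roughly as below.
This is a minimal sketch, not Solr or Lucene code: tiered_search,
make_shard and the Shard type are hypothetical names, and each shard is
modeled as a plain in-memory function rather than a real index.

```python
from typing import Callable, List, Sequence

# Hypothetical type: a shard maps (query, limit) to matching stories.
Shard = Callable[[str, int], List[str]]


def tiered_search(shards_newest_first: Sequence[Shard], query: str, wanted: int) -> List[str]:
    """Query shards from newest to oldest, stopping once enough stories are found."""
    results: List[str] = []
    for search in shards_newest_first:
        # Ask each shard only for what is still missing.
        results.extend(search(query, wanted - len(results)))
        if len(results) >= wanted:
            break  # older shards are never touched
    return results[:wanted]


def make_shard(stories: List[str], log: List[str], name: str) -> Shard:
    """Toy in-memory shard (an assumption for this sketch) that records each hit in log."""
    def search(query: str, limit: int) -> List[str]:
        log.append(name)
        return [s for s in stories if query in s][:limit]
    return search


if __name__ == "__main__":
    hits: List[str] = []
    recent = make_shard(["solr release", "solr outage"], hits, "last-12h")
    older = make_shard(["solr history"], hits, "older")
    print(tiered_search([recent, older], "solr", 2), hits)
```

Note that every query lands on the newest shard no matter what, so
whether that shard (and the node hosting it) can absorb the load is
exactly the caveat raised at the top of this message.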