I'm not saying it's always one way or the other, just that one shouldn't _assume_ that putting the most recent data on a single node is automatically good. It may well be, but not in all cases.

On Wed, Jul 3, 2013 at 12:21 PM, Otis Gospodnetic <otis.gospodne...@gmail.com> wrote:

> Exactly. And the newest shard can also be kept small (e.g. maybe just
> last 12h is OK to hit first and dig deeper only if you can't find
> enough stories in the last 12h), which means it will fit in memory and
> be crazy fast.
>
> Otis
> --
> Solr & ElasticSearch Support -- http://sematext.com/
> Performance Monitoring -- http://sematext.com/spm
>
>
> On Wed, Jul 3, 2013 at 10:11 AM, Mark Miller <markrmil...@gmail.com> wrote:
> >
> > On Jul 3, 2013, at 7:47 AM, Erick Erickson <erickerick...@gmail.com> wrote:
> >
> >> Usually most people
> >> care about today's news, and a hot story will
> >> generate lots of queries, all of which are serviced
> >> by today's shard.
> >
> > That's really the whole point though - rather than slamming your whole
> > cluster with every search, the majority of people are just searching today
> > - which will have only a fraction of the data and will be able to hold up
> > very well to a large load. This is also how you can do really fast NRT on a
> > huge data set - it only has to happen on todays shard.
> >
> > News sites have been using the trick forever with Lucene.
> >
> > - Mark
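
For anyone skimming the thread, here is a rough sketch of the "hit the newest shard first, dig deeper only if needed" idea discussed above. It is plain Python with toy in-memory shards, not Solr or Lucene code; `tiered_search`, `make_shard`, and the `(query, limit)` shard signature are all made up for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical shard interface: a callable taking (query, limit) and
# returning up to `limit` matching document ids.
Shard = Callable[[str, int], List[str]]


def tiered_search(shards: Sequence[Shard], query: str, wanted: int) -> List[str]:
    """Query shards in order (newest first) and stop once enough hits are found."""
    results: List[str] = []
    for search_shard in shards:
        # Only ask each shard for the number of hits still missing.
        results.extend(search_shard(query, wanted - len(results)))
        if len(results) >= wanted:
            break  # older shards are never touched
    return results[:wanted]


def make_shard(docs: Sequence[Tuple[str, str]]) -> Shard:
    """Toy in-memory shard: substring match over (doc_id, text) pairs."""
    def search(query: str, limit: int) -> List[str]:
        hits = [doc_id for doc_id, text in docs if query in text]
        return hits[:limit]
    return search


if __name__ == "__main__":
    last_12h = make_shard([("t1", "solr release"), ("t2", "solr news")])
    older = make_shard([("o1", "solr history"), ("o2", "lucene internals")])
    # Two hits are enough, so only the newest shard is queried.
    print(tiered_search([last_12h, older], "solr", 2))
    # Three hits are wanted, so the search falls through to the older shard.
    print(tiered_search([last_12h, older], "solr", 3))
```

The early exit is what keeps older shards idle for typical queries. It also means every popular query lands on the node holding the newest shard, which is the trade-off to measure before assuming this layout helps.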