Thanks a lot, Erick... great inputs. Currently our deployment is on Tomcat 7, and I think Solr 5.x does not support Tomcat but runs on its own Jetty server, right? I will discuss this with the team.
Thanks again.

Regards
Vishal

On Wed, May 27, 2015 at 4:16 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> I'd move to Solr 4.10.3 at least, but preferably Solr 5.x. Solr 5.2 is
> being readied for release as we speak, it'll probably be available in
> a week or so barring unforeseen problems and that's the one I'd go
> with by preference.
>
> Do be aware, though, that the 5.x Solr world deprecates using a war
> file. It's still actually produced, but Solr is moving towards start
> scripts instead. This is something new to get used to. See:
> https://wiki.apache.org/solr/WhyNoWar
>
> Best,
> Erick
>
> On Wed, May 27, 2015 at 12:51 PM, Vishal Swaroop <vishal....@gmail.com>
> wrote:
> > Thanks a lot Erick... You are right we should not delay moving to
> > sharding/SolrCloud process.
> >
> > As you all are expert... currently we are using SOLR 4.7.. Do you suggest
> > we should move to latest SOLR release 5.1.0 ? or we can manage the above
> > issue using SOLR 4.7
> >
> > Regards
> > Vishal
> >
> > On Wed, May 27, 2015 at 2:21 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> >> Hard to say. I've seen 20M docs be the place you need to consider
> >> sharding/SolrCloud. I've seen 300M docs be the place you need to start
> >> sharding. That said I'm quite sure you'll need to shard before you get
> >> to 2B. There's no good reason to delay that process.
> >>
> >> You'll have to do something about the join issue though, that's the
> >> problem you might want to solve first. The new streaming aggregation
> >> stuff might help there, you'll have to figure that out.
> >>
> >> The first thing I'd explore is whether you can denormalize your way
> >> out of the need to join. Or whether you can use block joins instead.
> >>
> >> Best,
> >> Erick
> >>
> >> On Wed, May 27, 2015 at 11:15 AM, Vishal Swaroop <vishal....@gmail.com>
> >> wrote:
> >> > Currently, we have SOLR configured on single linux server (24 GB
> >> > physical memory) with multiple cores.
> >> > We are using SOLR joins (https://wiki.apache.org/solr/Join) across
> >> > cores on this single server.
> >> >
> >> > But, as data will grow to ~2 billion we need to assess whether we’ll
> >> > need to run SolrCloud as "In a DistributedSearch environment, you can
> >> > not Join across cores on multiple nodes"
> >> >
> >> > Please suggest at what point or index size should we start
> >> > considering to run SolrCloud ?
> >> >
> >> > Regards
> >> >