Thanks for the quick responses, Shawn & Erick! Just to clarify a few more
points:
 1. Does a larger heap size affect ingesting additional documents into the
index (all CRUD operations) on a TLOG replica?
 2. Does a machine with more RAM (in this case 32gb) also affect ingestion
on TLOG replicas?
 3. We are currently routing queries via an Amazon ASG / Load Balancer. Is
this one of the recommended ways to set up Solr infrastructure?

Best Regards,

Ash


On Thu, Jul 19, 2018 at 12:56 AM Erick Erickson <erickerick...@gmail.com>
wrote:

> There's little good reason to _not_ route searches to your TLOG
> replicas. The only difference between the PULL and TLOG replicas is
> that the TLOG replicas get a raw copy of the incoming document from
> the leader and write them to the TLOG. I.e. there's some additional
> I/O.
>
> It's possible that if you have extremely heavy indexing you might
> notice some additional load on the TLOG vs. PULL replicas, but from
> what you've said I doubt you have that much indexing traffic.
>
> So basically I'd configure my TLOG and PULL replicas pretty much
> identically and search them both.
>
> Best,
> Erick
>
> On Wed, Jul 18, 2018 at 7:46 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> > On 7/18/2018 12:04 AM, Ash Ramesh wrote:
> >>
> >> I have a quick question about what the memory requirements for TLOG
> >> machines are on 7.3.1. We currently run replication where there are 3
> >> TLOGs with 8gb ram (2gb heap) and N PULL replicas with 32gb ram (4gb
> >> heap). We have > 10M documents (1 collection) with the index size
> >> being ~ 17gb. We send all read traffic to the PULLs and send Updates
> >> and Additions to the Leader TLOG.
> >>
> >> We are wondering how this setup can affect performance for replication,
> >> etc. We are thinking of increasing the heap of the TLOG to 4gb but
> >> leaving the total memory on the machine at 8gb. What will that do to
> >> performance? We also expect our index to grow 3/4x in the next 6 months.
> >
> >
> > Performance has more to do with index size and memory size than the
> > type of replication you're doing.
> >
> > SolrCloud will load balance queries across the cloud, so your low-memory
> > TLOG replicas are most likely handling queries as well.  In a SolrCloud
> > cluster, a query is not necessarily handled by the machine that you
> > send the query to.
> >
> > With memory resources that low compared to index size, the 8GB machines
> > probably do not perform queries as well as the 32GB machines.  If you
> > increase the heap to 4GB, that will only leave 4GB available for the OS
> > disk cache, and that's going to drop query performance even further.
> >
> > There is a feature in Solr 7.4 that will allow you to prefer certain
> > replica types, so you can tell Solr that it should prefer PULL replicas.
> > But since you're running 7.3.1, you don't have that feature.
> >
> > https://issues.apache.org/jira/browse/SOLR-11982
> >
> > There is also a "preferLocalShards" parameter that has existed for longer
> > than the new feature mentioned above.  This tells Solr that it should not
> > load balance queries in the cloud if there is a local index that can
> > satisfy the query.  This parameter should only be used if you have an
> > external load balancer.
> >
> > Indexing is a heap-intensive operation that doesn't benefit much from
> > having a lot of extra memory for the operating system. I have no idea
> > whether 2GB of heap is enough or not.  Increasing the heap size MIGHT
> > make performance better, or it might make no difference at all.
> >
> > Thanks,
> > Shawn
> >
>
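
A rough sketch of the memory arithmetic in Shawn's reply above, using the
figures quoted in the thread (8gb and 32gb machines, 2gb and 4gb heaps,
~17gb index). The function names are made up for illustration, and the
result is only an upper bound: it assumes everything outside the JVM heap is
available to the OS disk cache, which ignores JVM off-heap overhead and other
processes.

```python
def os_cache_gb(total_ram_gb: float, heap_gb: float) -> float:
    """Upper bound on RAM left for the OS disk cache once the heap is reserved."""
    return max(0.0, total_ram_gb - heap_gb)


def cacheable_fraction(total_ram_gb: float, heap_gb: float, index_gb: float) -> float:
    """Fraction of the on-disk index that could fit in the OS disk cache (capped at 1)."""
    return min(1.0, os_cache_gb(total_ram_gb, heap_gb) / index_gb)


INDEX_GB = 17.0  # approximate index size quoted in the thread

# (label, total RAM in GB, heap in GB) for the setups discussed in the thread
configs = [
    ("TLOG, 2gb heap (current)", 8.0, 2.0),
    ("TLOG, 4gb heap (proposed)", 8.0, 4.0),
    ("PULL, 4gb heap", 32.0, 4.0),
]

for label, ram, heap in configs:
    cache = os_cache_gb(ram, heap)
    fraction = cacheable_fraction(ram, heap, INDEX_GB)
    print(f"{label}: {cache:.0f} GB for disk cache, {fraction:.0%} of the index")
```

Raising the heap on an 8gb machine shrinks the cacheable share of the index,
which is the trade-off Shawn describes; the same arithmetic gets worse as the
index grows.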

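For reference, a minimal sketch of how the two query parameters Shawn
mentions could be attached to a request. The host, port, and collection name
are placeholders, and select_url is a hypothetical helper, not part of any
Solr client. preferLocalShards is the older parameter from the reply;
shards.preference=replica.type:PULL is the replica-type preference added by
SOLR-11982 and needs Solr 7.4 or later, so it would have no effect on 7.3.1.

```python
from urllib.parse import urlencode


def select_url(base_url, collection, query, extra_params=None):
    """Build a /select URL for a collection (hypothetical helper for illustration)."""
    params = {"q": query}
    params.update(extra_params or {})
    return f"{base_url}/{collection}/select?{urlencode(params)}"


BASE_URL = "http://localhost:8983/solr"  # placeholder host and port
COLLECTION = "mycollection"              # placeholder collection name

# Solr 7.3.1: keep the query on the node that received it when it hosts a
# suitable replica (intended for use behind an external load balancer).
local_url = select_url(BASE_URL, COLLECTION, "*:*", {"preferLocalShards": "true"})

# Solr 7.4+ only: ask SolrCloud to prefer PULL replicas for this request.
pull_url = select_url(
    BASE_URL, COLLECTION, "*:*", {"shards.preference": "replica.type:PULL"}
)

print(local_url)
print(pull_url)
```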