There are synchronization points, which become chokepoints at some number of cores. I don't know at what core count they cause Lucene to top out. Lucene apps are generally disk-bound, not CPU-bound, but yours will be CPU-bound. There are so many variables that it's really not possible to give any numbers.
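To illustrate the general shape of the problem only: Amdahl's law is the usual textbook model for how a serialized portion of work caps the benefit of more cores. The sketch below is not a Lucene or Solr measurement. The function name `amdahl_speedup` and the 5% serial fraction are made up for illustration; the real fraction depends on the workload and would have to be measured.

```python
def amdahl_speedup(cores: int, serial_fraction: float) -> float:
    """Upper bound on speedup under Amdahl's law.

    serial_fraction is the share of the work that cannot run in parallel
    (for example, time spent waiting at synchronization points).
    """
    if cores < 1:
        raise ValueError("cores must be >= 1")
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


if __name__ == "__main__":
    # Hypothetical value, NOT measured on Lucene or Solr.
    SERIAL_FRACTION = 0.05
    for cores in (1, 2, 4, 8, 12, 16):
        bound = amdahl_speedup(cores, SERIAL_FRACTION)
        print(f"{cores:2d} cores: at most {bound:.2f}x")
```

With any nonzero serial fraction the bound flattens out as cores are added, which is why the useful core count can only be found by benchmarking your own queries.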
Lance

On Mon, Aug 30, 2010 at 8:34 PM, Amit Nithian <anith...@gmail.com> wrote:
> Lance,
>
> makes sense and I have heard about the long GC times on large heaps but I
> personally haven't experienced a slowdown but that doesn't mean anything
> either :-). Agreed that tuning the SOLR caching is the way to go.
>
> I haven't followed all the solr/lucene changes but from what I remember
> there are synchronization points that could be a bottleneck where adding
> more cores won't help this problem? Or am I completely missing something.
>
> Thanks again
> Amit
>
> On Mon, Aug 30, 2010 at 8:28 PM, scott chu (朱炎詹) <scott....@udngroup.com> wrote:
>
>> I am also curious as Amit does. Can you make an example about the garbage
>> collection problem you mentioned?
>>
>> ----- Original Message -----
>> From: "Lance Norskog" <goks...@gmail.com>
>> To: <solr-user@lucene.apache.org>
>> Sent: Tuesday, August 31, 2010 9:14 AM
>> Subject: Re: Hardware Specs Question
>>
>>> It generally works best to tune the Solr caches and allocate enough
>>> RAM to run comfortably. Linux & Windows et. al. have their own cache
>>> of disk blocks. They use very good algorithms for managing this cache.
>>> Also, they do not make long garbage collection passes.
>>>
>>> On Mon, Aug 30, 2010 at 5:48 PM, Amit Nithian <anith...@gmail.com> wrote:
>>>
>>>> Lance,
>>>>
>>>> Thanks for your help. What do you mean by that the OS can keep the index in
>>>> memory better than Solr? Do you mean that you should use another means to
>>>> keep the index in memory (i.e. ramdisk)? Is there a generally accepted heap
>>>> size/index size that you follow?
>>>>
>>>> Thanks
>>>> Amit
>>>>
>>>> On Mon, Aug 30, 2010 at 5:00 PM, Lance Norskog <goks...@gmail.com> wrote:
>>>>
>>>>> The price-performance knee for small servers is 32G ram, 2-6 SATA
>>>>> disks on a raid, 8/16 cores. You can buy these servers and half-fill
>>>>> them, leaving room for expansion.
>>>>>
>>>>> I have not done benchmarks about the max # of processors that can be
>>>>> kept busy during indexing or querying, and the total numbers: QPS,
>>>>> response time averages & variability, etc.
>>>>>
>>>>> If your index file size is 8G, and your Java heap is 8G, you will do
>>>>> long garbage collection cycles. The operating system is very good at
>>>>> keeping your index in memory- better than Solr can.
>>>>>
>>>>> Lance
>>>>>
>>>>> On Mon, Aug 30, 2010 at 4:52 PM, Amit Nithian <anith...@gmail.com> wrote:
>>>>> > Hi all,
>>>>> >
>>>>> > I am curious to know get some opinions on at what point having more CPU
>>>>> > cores shows diminishing returns in terms of QPS. Our index size is about 8GB
>>>>> > and we have 16GB of RAM on a quad core 4 x 2.4 GHz AMD Opteron 2216.
>>>>> > Currently I have the heap to 8GB.
>>>>> >
>>>>> > We are looking to get more servers to increase capacity and because the
>>>>> > warranty is set to expire on our old servers and so I was curious before
>>>>> > asking for a certain spec what others run and at what point does having more
>>>>> > cores cease to matter? Mainly looking at somewhere between 4-12 cores per
>>>>> > server.
>>>>> >
>>>>> > Thanks!
>>>>> > Amit

--
Lance Norskog
goks...@gmail.com