Shawn, thank you for the tips.
I know the significant cons of virtualization, but I don't want to turn
this thread into a virtualization pros/cons debate for the Solr(Cloud) case.

I was just asking what minimal code change would be needed in order to
examine whether this is a possible solution or not.. :)
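
To make the idea concrete, here is a rough sketch of what I mean: send the
same shard request to every replica and use whichever answer comes back
first. It is plain Python rather than Solr code, and everything in it is
made up for illustration: query_replica and first_response are hypothetical
names, not Solr APIs, and the sleep only simulates a replica's qtime.

```python
import concurrent.futures
import time


def query_replica(name, delay):
    """Stand-in for one replica's shard request; `delay` simulates its qtime."""
    time.sleep(delay)
    return name


def first_response(replicas):
    """Send the same request to every replica and return the first answer."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(query_replica, name, delay) for name, delay in replicas]
    try:
        done, _ = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED
        )
        return next(iter(done)).result()
    finally:
        # Do not block on the slower replica. It still finishes its work,
        # which is the extra cluster load this approach costs.
        pool.shutdown(wait=False)


if __name__ == "__main__":
    # Two replicas of one shard (repFactor=2); one is having a slow moment.
    print(first_response([("replica_a", 0.4), ("replica_b", 0.05)]))
```

The slower request is not cancelled here, so every query still costs the
work of both replicas; the only thing that improves is the latency seen by
the caller.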


On Sun, Jul 28, 2013 at 1:06 AM, Shawn Heisey <s...@elyograg.org> wrote:

> On 7/27/2013 3:33 PM, Isaac Hebsh wrote:
> > I have about 40 shards. repFactor=2.
> > The cause of slower shards is very interesting, and this is the main
> > approach we took.
> > Note that in every query, it is another shard which is the slowest. In
> > 20% of the queries, the slowest shard takes about 4 times more than the
> > average shard qtime.
> > While continuing investigation, remember it might be the virtualization /
> > storage-access / network / gc /..., so I thought that reducing the effect
> > of the slow shards might be a good (temporary or permanent) solution.
>
> Virtualization is not the best approach for Solr.  Assuming you're
> dealing with your own hardware and not something based in the cloud like
> Amazon, you can get better results by running on bare metal and having
> multiple shards per host.
>
> Garbage collection is a very likely source of this problem.
>
> http://wiki.apache.org/solr/SolrPerformanceProblems#GC_pause_problems
>
> > I thought it should be an almost trivial code change (for proving the
> > concept). Isn't it?
>
> I have no idea what you're saying/asking here.  Can you clarify?
>
> It seems to me that sending requests to all replicas would just increase
> the overall load on the cluster, with no real benefit.
>
> Thanks,
> Shawn
>
>
