Aman, I am using a collection with implicit routing. As per the Solr 8.10 documentation, the split shard API can be used only with hash-based routing. Any help on how we can plan the shard-split activity for implicit routing? Is there a way to avoid creating a collection from scratch and taking it to production only after a full reindex?
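Since SPLITSHARD is off the table for the implicit router, one option worth testing is CREATESHARD (which does work with implicitly routed collections), followed by reindexing only the oversized shard's documents into the new sub-shards and then deleting them from the original shard. A minimal SolrJ sketch of that flow; the collection name, sub-shard names, and ZooKeeper address are hypothetical placeholders, so treat it as an idea to validate on a dev cluster, not a tested procedure:

import java.util.List;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public class ImplicitShardSplit {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                List.of("zk1:2181"), Optional.empty()).build()) {

            // CREATESHARD is allowed for the implicit router (SPLITSHARD is not).
            // Add two new sub-shards that will take over the hot shard's data.
            CollectionAdminRequest.createShard("myCollection", "hotShard_0").process(client);
            CollectionAdminRequest.createShard("myCollection", "hotShard_1").process(client);

            // Reindex one document into a new sub-shard: with the implicit
            // router, the _route_ parameter names the destination shard directly.
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "12345");
            UpdateRequest req = new UpdateRequest();
            req.add(doc);
            req.setParam("_route_", "hotShard_0");
            req.process(client, "myCollection");
            client.commit("myCollection");
        }
    }
}

Once the copied data is verified, the originals can be removed from the old shard with a delete request routed to it, which avoids rebuilding the whole collection from scratch.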
More details on [solr community mail]: *Split Shards in Solr Collection with Implicit Routing*

On Fri, Jul 19, 2024 at 3:33 PM Saksham Gupta <saksham.gu...@indiamart.com> wrote:

> Hi Aman,
> Yes, I mean the shard itself has a size of 10 GB. The index was created
> from scratch, so no recovery issue should exist.
>
> Have you tried subdividing a shard further? I was thinking of breaking the
> data of this shard using a numeric field [for instance, id mod 2, and
> assigning each value to a sub-shard]. Is there a better way to achieve
> this?
>
> On Wed, Jul 17, 2024 at 9:25 PM Aman Tandon <amantandon...@gmail.com>
> wrote:
>
>> Hi Saksham,
>>
>> When you say one replica is larger, do you mean its shard is also the
>> same 10 GB in size? If not, please check whether an old recovery issue
>> has left stale logs or index files behind.
>>
>> If the shard itself holds 10 GB of data, you can try dividing the data.
>> Please try it in a development environment before applying it to the
>> production clusters.
>>
>> Regards,
>> Aman
>>
>> On Wed, 17 Jul 2024, 18:12 Saksham Gupta,
>> <saksham.gu...@indiamart.com.invalid> wrote:
>>
>> > Hi All,
>> > Pinging again for assistance. This is a very unusual case, which is
>> > ruining the user experience for a particular type of search [searches
>> > mapped to this replica are facing timeouts], as these requests take
>> > more than 3 seconds.
>> >
>> > On Wed, Jul 17, 2024 at 11:37 AM Saksham Gupta <
>> > saksham.gu...@indiamart.com>
>> > wrote:
>> >
>> > > Hi All,
>> > >
>> > > We are using a SolrCloud cluster of 59 shards [1 replica for each
>> > > shard] spread across 8 nodes. We use implicit routing for indexing
>> > > and searching data across these shards.
>> > >
>> > > Upon analyzing the timeouts on Solr, we found that more than 85%
>> > > [3097/3693 timeouts on 9th July] were caused by just 1 replica,
>> > > whose size is larger than the others [the other replicas contain
>> > > < 5 GB of data, whereas this replica contains 10 GB].
>> > >
>> > > 1. Has anyone faced a similar issue, and how can we mitigate it? Is
>> > > there a way to increase the timeout for a particular replica/node?
>> > >
>> > > 2. Also, has anyone tried to further divide a shard's data into
>> > > multiple shards? How can we plan this, given there is already a
>> > > logical separation [implicit routing] between the 59 shards, and we
>> > > would be adding another layer of logic to subdivide the data for 1
>> > > of the shards?
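On the "id mod 2" idea above: a deterministic helper that derives the sub-shard from the numeric id keeps indexing consistent, and the same value can be passed as _route_ at query time to avoid fanning out. Another small SolrJ sketch, again with hypothetical sub-shard names and assuming the id field is numeric:

import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.common.SolrInputDocument;

public class ModTwoRouter {

    // Even ids go to hotShard_0, odd ids to hotShard_1 (names are hypothetical).
    static String routeFor(long id) {
        return (id % 2 == 0) ? "hotShard_0" : "hotShard_1";
    }

    // Build an update that the implicit router will send to the derived sub-shard.
    static UpdateRequest routedAdd(SolrInputDocument doc) {
        long id = Long.parseLong(doc.getFieldValue("id").toString());
        UpdateRequest req = new UpdateRequest();
        req.add(doc);
        req.setParam("_route_", routeFor(id));
        return req;
    }
}

The main caveat is that any query that used to target the old shard now needs to cover both sub-shards (or omit _route_ and fan out to all shards), so the same helper should be shared between the indexing and search paths.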