Re: NoHttpResponseException error between leader and replica

Varun Thacker Fri, 17 Jun 2016 02:07:05 -0700

bq. It's now part of HttpClient.

Were you referring to line 230 of HttpClientUtil on master?
cm.setValidateAfterInactivity(Integer.getInteger(VALIDATE_AFTER_INACTIVITY,
VALIDATE_AFTER_INACTIVITY_DEFAULT));
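
For anyone following along, here is a minimal, stdlib-only Java sketch of what a validate-after-inactivity setting means in principle: a pooled connection is checked for staleness only if it has sat idle for at least the configured threshold, so busy connections skip the check. The names ValidatingPool, release, lease and the isStale predicate are hypothetical and exist only for illustration; this is not the HttpClient or Solr implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

/** Hypothetical illustration only; not HttpClient or Solr code. */
class ValidatingPool<C> {
    /** A pooled connection plus the time it was last handed back. */
    private static final class Entry<C> {
        final C conn;
        final long lastUsedMillis;

        Entry(C conn, long lastUsedMillis) {
            this.conn = conn;
            this.lastUsedMillis = lastUsedMillis;
        }
    }

    private final Deque<Entry<C>> idle = new ArrayDeque<>();
    private final long validateAfterInactivityMillis;
    private final Predicate<C> isStale; // assumed staleness probe supplied by the caller

    ValidatingPool(long validateAfterInactivityMillis, Predicate<C> isStale) {
        this.validateAfterInactivityMillis = validateAfterInactivityMillis;
        this.isStale = isStale;
    }

    /** Return a connection to the pool, recording when it was last used. */
    void release(C conn, long nowMillis) {
        idle.push(new Entry<>(conn, nowMillis));
    }

    /** Lease the most recently released usable connection, or null if none is left. */
    C lease(long nowMillis) {
        while (!idle.isEmpty()) {
            Entry<C> e = idle.pop();
            long idleFor = nowMillis - e.lastUsedMillis;
            // Only run the staleness probe once the connection has been idle long enough.
            if (idleFor >= validateAfterInactivityMillis && isStale.test(e.conn)) {
                continue; // drop the stale connection and try the next one
            }
            return e.conn;
        }
        return null;
    }
}
```

Time is passed in explicitly so the behaviour is easy to reason about: with a threshold of 1000 ms, a connection leased 500 ms after release is returned without any check, while one leased 2000 ms after release is probed first and discarded if the probe reports it stale.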


On Fri, Jun 17, 2016 at 12:13 PM, Varun Thacker <[email protected]>
wrote:

> Hi Mark,
>
> We were running Solr 5.4.1 on a 4-node setup with a 2-shard, 2-replica
> collection.
> The test data is roughly 30M large documents. Indexing runs via
> map-reduce, with 80 parallel reducers each sending batches of 500
> documents to Solr at a time.
>
> In this setup, almost every run hits the NoHttpResponseException between
> leader and replica once.
>
> "It's now part of HttpClient." - Sorry, I didn't quite follow: what's part
> of HttpClient?
>
>
>
> On Fri, Jun 17, 2016 at 6:51 AM, Mark Miller <[email protected]>
> wrote:
>
>> I'm sorry, you say it's easy to reproduce, but can you explain roughly
>> what you are doing to reproduce it?
>>
>> Mark
>>
>> On Thu, Jun 16, 2016 at 9:20 PM Mark Miller <[email protected]>
>> wrote:
>>
>>> That's already how things work. It's now part of HttpClient. There are
>>> some settings you can mess with. Is it easy to reproduce?
>>>
>>> Mark
>>> On Thu, Jun 16, 2016 at 1:15 PM Varun Thacker <
>>> [email protected]> wrote:
>>>
>>>> When running a bulk index process, we occasionally see a
>>>> NoHttpResponseException error when the leader is forwarding docs to the
>>>> replica. I think this is a known issue and can be reproduced pretty easily.
>>>>
>>>> What makes me want to dig deeper is that a single such
>>>> NoHttpResponseException makes the leader put the replica into recovery.
>>>> The replica can never catch up because the indexing throughput is quite
>>>> high. This can add hours of recovery time for the replica, depending on
>>>> how many documents one is indexing.
>>>>
>>>> So, as far as I can tell, we have two options here:
>>>> 1. Implement a thread which removes stale connections. This has been
>>>> discussed on https://issues.apache.org/jira/browse/SOLR-4509 in the
>>>> past
>>>> 2. The above solution is not the right way forward. The main problem
>>>> here is that replicas can't catch up because Solr doesn't implement
>>>> backpressure yet and implementing that would be the correct solution here
>>>>
>>>> Does anyone have an opinion on how we should go forward with this
>>>> issue?
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> Regards,
>>>> Varun Thacker
>>>>
>>> --
>>> - Mark
>>> about.me/markrmiller
>>>
>> --
>> - Mark
>> about.me/markrmiller
>>
>
>
>
> --
>
>
> Regards,
> Varun Thacker
>



-- 


Regards,
Varun Thacker
