Looking at this PR from Sijie I noticed that there is a rate limiter for
our internal subclass of ZooKeeper client.
https://github.com/apache/bookkeeper/pull/264

The rate limiter is not enabled and cannot be enabled.
I wonder whether I hit a bug in our getData or in ZkRetryRunnable, or whether
enabling the rate limiter would be enough.
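
Independently of the root cause, a callback that is never invoked could at
least be surfaced on the caller side instead of hanging forever. A minimal
sketch in plain Java, assuming the call is callback-based and reports a result
code where 0 means success. `AsyncCall` and `callWithTimeout` are hypothetical
names for illustration only, not BookKeeper or ZooKeeper API:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.BiConsumer;

public class CallbackTimeout {

    /**
     * Hypothetical stand-in for a callback-style call (for example an
     * asynchronous open or getData): it receives a callback that takes
     * a result code and a value.
     */
    public interface AsyncCall<T> {
        void start(BiConsumer<Integer, T> callback);
    }

    /**
     * Starts the call and waits at most the given time for its callback.
     * Throws TimeoutException if the callback is never invoked in time.
     * Assumption: rc == 0 means success, anything else is a failure.
     */
    public static <T> T callWithTimeout(AsyncCall<T> call, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        CompletableFuture<T> future = new CompletableFuture<>();
        call.start((rc, value) -> {
            if (rc == 0) {
                future.complete(value);
            } else {
                future.completeExceptionally(
                        new IllegalStateException("call failed, rc=" + rc));
            }
        });
        // Bounded wait: a lost callback becomes a TimeoutException.
        return future.get(timeout, unit);
    }

    public static void main(String[] args) throws Exception {
        // A call whose callback fires immediately.
        String data = CallbackTimeout.<String>callWithTimeout(
                cb -> cb.accept(0, "ledger metadata"), 1, TimeUnit.SECONDS);
        System.out.println(data);

        // A call whose callback is never invoked, simulating the hang.
        try {
            CallbackTimeout.<String>callWithTimeout(
                    cb -> { }, 200, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            System.out.println("timed out instead of hanging");
        }
    }
}
```

Note that the timeout only unblocks the caller; it does not cancel the
underlying request, so it helps to diagnose the hang rather than to fix it.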

@Sijie
I left a comment on the PR. It looks OK to me, but it seems to lack support
for the client-side BookKeeper: it enables the rate limiter only on the Bookie.

-- Enrico



2017-07-19 11:27 GMT+02:00 Enrico Olivelli <eolive...@gmail.com>:

>
>
> On Wed, 19 Jul 2017, 11:11 Sijie Guo <guosi...@gmail.com> wrote:
>
>> On Wed, Jul 19, 2017 at 4:04 PM, Enrico Olivelli <eolive...@gmail.com>
>> wrote:
>>
>>> Hi,
>>> in some internal benchmarks we are experiencing openLedgerNoRecovery
>>> calls which remain hung.
>>> I see that this function basically calls ZooKeeper#getData.
>>>
>>
>>> Does anyone have an idea of how this can happen?
>>>
>>
>> What version are you testing? Is it related to your recent change bumping
>> the ZooKeeper version? If that's the case, we should consider rolling back
>> the ZooKeeper version.
>>
>
> 3.5.1 and 3.5.3
>
>>
>>
>>>
>>> Is there any implicit timeout on ZK.getData()? I did not find any way to
>>> set one, and personally I have never run into this problem before.
>>>
>>
>> As far as I know, there is no timeout on ZooKeeper requests. It would be
>> a good question for the ZooKeeper community.
>>
>
> I will ask them
>
>>
>>
>>>
>>> Maybe there is room for an improvement here, adding a timeout on
>>> openLedgerXXX operations, but in any case it is strange that the callback
>>> is never called.
>>>
>>> Unfortunately the problem happens only in integration tests; maybe I can
>>> work on reproducing it in a BK-only test case.
>>>
>>> The case is simple: start ZK + 1 Bookie + 1 BookKeeper, concurrently
>>> create many ledgers, write to them, and concurrently open them with
>>> openLedgerNoRecovery from other threads.
>>> There are no errors in either the ZK logs or the BK logs.
>>>
>>
>> Can you turn on debug logging for the BookKeeper client and also for
>> ZooKeeper? The logs might show something worth checking.
>>
>
> Yes, I am logging at info; I will try at debug
>
>>
>> Another option is to take a TCP dump to trace the ZooKeeper calls and see
>> whether the getData request and response are received on both sides.
>>
>>
>>>
>>> Any suggestions?
>>>
>>
>
> Thank you again
> Enrico
>
>>
>>> Thanks
>>>
>>> -- Enrico
>>>
>>>
>>> --
>
>
> -- Enrico Olivelli
>
