Hi Ryan,

Which Curator version did you use? Is it possibly covered by
CURATOR-673[1] (Complete BackgroundCallback if curator got closed or
exceptions from no-zookeeper world)?

[1]: https://issues.apache.org/jira/browse/CURATOR-673
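
On the question of an application-side timer: whether or not CURATOR-673
applies, a bounded wait is one way to keep worker threads from wedging on
Future.get(). Below is a minimal sketch, not Curator code. BoundedWait and
await are hypothetical names, and a plain CompletableFuture stands in for the
stage returned by an async call (this assumes the stage is a CompletionStage,
so toCompletableFuture() is available).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWait {
    // Hypothetical helper (not part of Curator): waits at most `timeout`
    // for the stage to complete instead of blocking forever.
    public static <T> T await(CompletionStage<T> stage, long timeout, TimeUnit unit)
            throws ExecutionException, InterruptedException, TimeoutException {
        // get(timeout, unit) throws TimeoutException if no result arrives in time.
        return stage.toCompletableFuture().get(timeout, unit);
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for an operation whose callback is never invoked.
        CompletableFuture<String> neverCompletes = new CompletableFuture<>();
        try {
            await(neverCompletes, 100, TimeUnit.MILLISECONDS);
            System.out.println("completed");
        } catch (TimeoutException e) {
            // The worker thread gets control back and can retry or fail the job.
            System.out.println("timed out");
        }
    }
}
```

The timeout only unblocks the caller; it does not cancel the underlying
operation, so the application still has to decide whether to retry or give up.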


Best,
Kezhu Wang

On Sat, May 20, 2023 at 12:44 PM tison <wander4...@gmail.com> wrote:
>
> Yep.
>
> You're welcome to file a ticket on our JIRA project to share your
> reproduction code - https://issues.apache.org/jira/projects/CURATOR
>
> If you don't have a JIRA account yet, you can self-request at 
> https://selfserve.apache.org/jira-account.html
>
> Best,
> tison.
>
>
> On Fri, May 19, 2023 at 22:38, Ryan Ruel <r.r...@icloud.com> wrote:
>>
>> It will require some effort to get there (without exposing proprietary 
>> code), but the goal would be reproduction (hopefully via unit test).
>>
>> Am I correct, then, that these conditions are unexpected?
>>
>> ----
>> Ryan Ruel
>> r...@ryanruel.com
>>
>> On May 19, 2023, at 9:33 AM, tison <wander4...@gmail.com> wrote:
>>
>> 
>> Hi Ryan,
>>
>> Thanks for reporting your use case!
>>
>> From your description alone, it's hard to tell which part is the root
>> cause of the hang. Could you provide a repository that reproduces it,
>> or share the relevant code snippet with related logs?
>>
>> Best,
>> tison.
>>
>>
>> On Fri, May 19, 2023 at 19:44, Ryan Ruel <r.r...@icloud.com> wrote:
>>>
>>> I’ve encountered an issue in my application where I don’t seem to be 
>>> receiving exceptions back from Curator in certain error scenarios.
>>>
>>> My application:
>>> * Has a pool of worker threads working on different jobs which need to 
>>> read/write ZNodes in ZK.
>>> * Utilizes the Curator ModeledFramework to serialize data within the ZNodes.
>>> * As this is using ModeledFramework (which is built upon Curator Async), 
>>> the application uses a Future.get() to wait for Curator to respond with 
>>> either a success or failure result for each operation.
>>>
>>> Under heavy load, where the ZK connectivity becomes flaky, I occasionally
>>> encounter a case where all my worker threads block on calls to Future.get().
>>>
>>> When a connection loss event occurs (or if ZK is just too busy to reply in
>>> a timely manner), I’d expect to see exceptions thrown by Curator, but this
>>> never happens… the application threads wedge indefinitely.
>>>
>>> When using the Async APIs, should we always expect a success/failure
>>> response?
>>>
>>> Or is the expectation that the application should implement an additional 
>>> timer in the event that Curator doesn’t respond?
>>>
>>> If it’s the former, I can dig further into why Curator is not responding.
>>>
>>> /Ryan
>>>
>>> ----
>>> Ryan Ruel
>>> r...@ryanruel.com
